Hierarchy definition classes

The Node class

class tables.Node(parentnode: Group | SoftLink, name: str, _log: bool = True)[source]

Abstract base class for all PyTables nodes.

This is the base class for all nodes in a PyTables hierarchy. It is an abstract class, i.e. it may not be directly instantiated; however, every node in the hierarchy is an instance of this class.

A PyTables node is always hosted in a PyTables file, under a parent group, at a certain depth in the node hierarchy. A node knows its own name in the parent group and its own path name in the file.

All the previous information is location-dependent, i.e. it may change when moving or renaming a node in the hierarchy. A node also has location-independent information, such as its HDF5 object identifier and its attribute set.

This class gathers the operations and attributes (both location-dependent and independent) which are common to all PyTables nodes, whatever their type is. Nonetheless, due to natural naming restrictions, the names of all of these members start with a reserved prefix (see the Group class in The Group class).

Sub-classes with no children (e.g. leaf nodes) may define new methods, attributes and properties to avoid natural naming restrictions. For instance, _v_attrs may be shortened to attrs and _f_rename to rename. However, the original methods and attributes should still be available.

Node attributes

_v_depth: The depth of this node in the tree (n non-negative integer value).

_v_file: The hosting File instance (see The File Class).

_v_name: The name of this node in its parent group (a string).

_v_pathname: The path of this node in the tree (a string).

_v_objectid: A node identifier (may change from run to run).

Changed in version 3.0: The _v_objectID attribute has been renamed into _v_objectid.

Node instance variables - location dependent

Node._v_parent: Return the parent Group instance.

Node instance variables - location independent

Node._v_attrs

AttributeSet instance associated to the Node.

Node instance variables - attribute shorthands

Node._v_title

Return the description of the node.

A shorthand for TITLE attribute.
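As a minimal sketch of the attributes and instance variables above, the following creates a small hierarchy and inspects one of its nodes. The file name and node names are made up for illustration, and an in-memory file (the H5FD_CORE driver without a backing store) is assumed so that nothing is written to disk.

```python
import tables

# In-memory file so nothing touches the disk; all names are illustrative.
with tables.open_file("node_info_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    group = h5file.create_group("/", "detector", title="Detector data")
    array = h5file.create_array(group, "readout", [1, 2, 3],
                                title="Raw readout")

    # Location-dependent information: changes if the node is moved or renamed.
    location = (array._v_name, array._v_pathname, array._v_depth)
    parent_path = array._v_parent._v_pathname

    # Location-independent information: the attribute set and its TITLE.
    title = array._v_title
    title_attr = array._v_attrs.TITLE
```

Here the root group is at depth 0, so a node under /detector is at depth 2.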

Node methods - hierarchy manipulation

Node._f_close() → None[source]

Close this node in the tree.

This releases all resources held by the node, so it should not be used again. On nodes with data, it may be flushed to disk.

You should not need to close nodes manually because they are automatically opened/closed when they are loaded/evicted from the integrated LRU cache.

Node._f_copy(newparent: Group | str | None = None, newname: str | None = None, overwrite: bool = False, recursive: bool = False, createparents: bool = False, **kwargs) → Node[source]

Copy this node and return the new node.

Creates and returns a copy of the node, maybe in a different place in the hierarchy. newparent can be a Group object (see The Group class) or a pathname in string form. If it is not specified or None, the current parent group is chosen as the new parent. newname must be a string with a new name. If it is not specified or None, the current name is chosen as the new name. If recursive copy is stated, all descendants are copied as well. If createparents is true, the needed groups for the given new parent group path to exist will be created.

Copying a node across databases is supported but cannot be undone. Copying a node over itself is not allowed, nor is recursively copying a node into itself. These result in a NodeError. Copying over another existing node is similarly not allowed, unless the optional overwrite argument is true, in which case that node is recursively removed before copying.

Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. See the documentation for the particular node type.

Using only the first argument is equivalent to copying the node to a new location without changing its name. Using only the second argument is equivalent to making a copy of the node in the same group.
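The copy rules above can be sketched as follows. This is an illustrative example rather than a reference usage: the file name, node names and the /backup/run1 path are assumptions, and an in-memory file is used to avoid writing to disk.

```python
import tables

with tables.open_file("copy_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    src = h5file.create_array("/", "data", [1, 2, 3], title="original")

    # Only newname: a copy in the same group under a different name.
    sibling = src._f_copy(newname="data_copy")

    # Only newparent: same name, new location; missing groups are created.
    relocated = src._f_copy("/backup/run1", createparents=True)
    relocated_path = relocated._v_pathname

    # Copying over an existing node fails unless overwrite is true.
    try:
        src._f_copy(newname="data_copy")
        refused = False
    except tables.NodeError:
        refused = True

    # With overwrite=True the existing node is removed first.
    replaced = src._f_copy(newname="data_copy", overwrite=True,
                           title="replaced")
    replaced_title = replaced._v_title
    replaced_data = list(replaced.read())
```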

Node._f_isvisible() → bool[source]: Return True if the node is visible.

Node._f_move(newparent: Group | str | None = None, newname: str | None = None, overwrite: bool = False, createparents: bool = False) → None[source]

Move or rename this node.

Moves a node into a new parent group, or changes the name of the node. newparent can be a Group object (see The Group class) or a pathname in string form. If it is not specified or None, the current parent group is chosen as the new parent. newname must be a string with a new name. If it is not specified or None, the current name is chosen as the new name. If createparents is true, the needed groups for the given new parent group path to exist will be created.

Moving a node across databases is not allowed, nor is moving a node into itself. These result in a NodeError. However, moving a node over itself is allowed and simply does nothing. Moving over another existing node is similarly not allowed, unless the optional overwrite argument is true, in which case that node is recursively removed before moving.

Usually, only the first argument will be used, effectively moving the node to a new location without changing its name. Using only the second argument is equivalent to renaming the node in place.
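A short sketch of both uses of _f_move() follows: first moving a node to a new parent, then renaming it in place. The file name, node names and the /archive path are hypothetical, and the file lives only in memory.

```python
import tables

with tables.open_file("move_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    arr = h5file.create_array("/", "raw", [0, 1])

    # First argument only: new location, same name.
    arr._f_move("/archive", createparents=True)
    moved_path = arr._v_pathname

    # Second argument only: rename in place.
    arr._f_move(newname="raw_v1")
    renamed_path = arr._v_pathname

    # The node no longer exists at its original path.
    old_path_exists = "/raw" in h5file
```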

Node._f_remove(recursive: bool = False, force: bool = False) → None[source]

Remove this node from the hierarchy.

If the node has children, recursive removal must be stated by giving recursive a true value; otherwise, a NodeError will be raised.

If the node is a link to a Group object, and you are sure that you want to delete it, you can do this by setting the force flag to true.
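The following sketch shows the recursive requirement for a group that has children. The names are illustrative and the file is kept in memory.

```python
import tables

with tables.open_file("remove_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    results = h5file.create_group("/", "results")
    h5file.create_array(results, "values", [1.0, 2.0])

    # A group with children cannot be removed without recursive=True.
    try:
        results._f_remove()
        refused = False
    except tables.NodeError:
        refused = True

    results._f_remove(recursive=True)
    still_there = "/results" in h5file
```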

Node._f_rename(newname: str, overwrite: bool = False) → None[source]

Rename this node in place.

Changes the name of a node to newname (a string). If a node with the same newname already exists and overwrite is true, recursively remove it before renaming.

Node methods - attribute handling

Node._f_delattr(name: str) → None[source]

Delete a PyTables attribute from this node.

If the named attribute does not exist, an AttributeError is raised.

Node._f_getattr(name: str) → Any[source]

Get a PyTables attribute from this node.

If the named attribute does not exist, an AttributeError is raised.

Node._f_setattr(name: str, value: Any) → None[source]

Set a PyTables attribute for this node.

If the node already has a large number of attributes, a PerformanceWarning is issued.
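A minimal round trip through the three attribute-handling methods might look like this. The attribute name "temperature" and the node names are assumptions made for the example.

```python
import tables

with tables.open_file("attrs_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    run = h5file.create_group("/", "run")

    run._f_setattr("temperature", 21.5)
    stored = run._f_getattr("temperature")
    # The same value is reachable through the attribute set.
    via_attrs = run._v_attrs.temperature

    run._f_delattr("temperature")
    # Reading a missing attribute raises AttributeError.
    try:
        run._f_getattr("temperature")
        missing = False
    except AttributeError:
        missing = True
```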

The Group class

class tables.Group(parentnode: Group, name: str, title: str = '', new: bool = False, filters: Filters | None = None, _log: bool = True)[source]

Basic PyTables grouping structure.

Instances of this class are grouping structures containing child instances of zero or more groups or leaves, together with supporting metadata. Each group has exactly one parent group.

Working with groups and leaves is similar in many ways to working with directories and files, respectively, in a Unix filesystem. As with Unix directories and files, objects in the object tree are often described by giving their full (or absolute) path names. This full path can be specified either as a string (like in ‘/group1/group2’) or as a complete object path written in natural naming schema (like in file.root.group1.group2).

A collateral effect of the natural naming schema is that the names of members in the Group class and its instances must be carefully chosen to avoid colliding with existing children node names. For this reason and to avoid polluting the children namespace all members in a Group start with some reserved prefix, like _f_ (for public methods), _g_ (for private ones), _v_ (for instance variables) or _c_ (for class variables). Any attempt to create a new child node whose name starts with one of these prefixes will raise a ValueError exception.

Another effect of natural naming is that children named after Python keywords or having names not valid as Python identifiers (e.g. class, $a or 44) cannot be accessed using the node.child syntax. You will have to use node._f_get_child(child) to access them (which is recommended for programmatic accesses).

You will also need to use _f_get_child() to access an existing child node if you set a Python attribute in the Group with the same name as that node (you will get a NaturalNameWarning when doing this).
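The naming restrictions above can be illustrated with a small sketch. It assumes a child deliberately named after a Python keyword ("class") and a hypothetical name with a reserved prefix ("_v_mine"); the NaturalNameWarning is silenced only to keep the example quiet.

```python
import warnings

import tables

with tables.open_file("naming_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    root = h5file.root

    # "class" is a Python keyword, so creating it triggers a NaturalNameWarning.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", tables.NaturalNameWarning)
        h5file.create_array(root, "class", [1, 2])

    # root.class would be a SyntaxError; _f_get_child() accepts any name.
    child = root._f_get_child("class")
    child_path = child._v_pathname

    # Names starting with a reserved prefix are rejected outright.
    try:
        h5file.create_group(root, "_v_mine")
        rejected = False
    except ValueError:
        rejected = True
```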

Parameters:

parentnode – The parent Group object.
name (str) – The name of this node in its parent group.
title – The title for this group
new – If this group is new or has to be read from disk
filters (Filters) – A Filters instance

Changed in version 3.0: parentNode renamed into parentnode

Notes

The following documentation includes methods that are automatically called when a Group instance is accessed in a special way.

For instance, this class defines the __setattr__, __getattr__, __delattr__ and __dir__ methods, and they set, get and delete ordinary Python attributes as normally intended. In addition to that, __getattr__ allows getting child nodes by their name for the sake of easy interaction on the command line, as long as there is no Python attribute with the same name. Groups also allow the interactive completion (when using readline) of the names of child nodes. For instance:

# get a Python attribute
nchild = group._v_nchildren

# Add a Table child called 'table' under 'group'.
h5file.create_table(group, 'table', myDescription)
table = group.table          # get the table child instance
group.table = 'foo'          # set a Python attribute

# (PyTables warns you here about using the name of a child node.)
foo = group.table            # get a Python attribute
del group.table              # delete a Python attribute
table = group.table          # get the table child instance again

Additionally, in interactive Python sessions you can autocomplete the names of children that are valid Python identifiers by pressing the [Tab] key, or list them with the dir() built-in function.

Group attributes

The following instance variables are provided in addition to those in Node (see The Node class):

_v_children: Dictionary with all nodes hanging from this group.

_v_groups: Dictionary with all groups hanging from this group.

_v_hidden: Dictionary with all hidden nodes hanging from this group.

_v_leaves: Dictionary with all leaves hanging from this group.

_v_links: Dictionary with all links hanging from this group.

_v_unknown: Dictionary with all unknown nodes hanging from this group.

Group properties

Group._v_nchildren: Return the number of children hanging from this group.

Group._v_filters

Default filter properties for child nodes.

You can (and are encouraged to) use this property to get, set and delete the FILTERS HDF5 attribute of the group, which stores a Filters instance (see The Filters class). When the group has no such attribute, a default Filters instance is used.
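A possible use of this property is sketched below: the group's default filters are set once, and a leaf created under it without explicit filters is expected to pick them up. The compression settings and node names are arbitrary choices for the example.

```python
import tables

with tables.open_file("filters_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    grp = h5file.create_group("/", "compressed")

    # Store default filter properties in the group's FILTERS attribute.
    grp._v_filters = tables.Filters(complevel=5, complib="zlib")

    # No filters argument here, so the group's defaults apply.
    carr = h5file.create_carray(grp, "block", atom=tables.Int32Atom(),
                                shape=(100,))
    inherited = (carr.filters.complevel, carr.filters.complib)
```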

Group methods

Important

Caveat: The following methods are documented for completeness, and they can be used without any problem. However, you should prefer the high-level counterpart methods in the File class (see The File Class), because they are the ones most used in documentation and examples, and they are a bit more powerful than those exposed here.

The following methods are provided in addition to those in Node (see The Node class):

Group._f_close() → None[source]

Close this group and all its descendants.

This method has the behavior described in Node._f_close(). It should be noted that this operation closes all the nodes descending from this group.

You should not need to close nodes manually because they are automatically opened/closed when they are loaded/evicted from the integrated LRU cache.

Group._f_copy(newparent: Group | None = None, newname: str | None = None, overwrite: bool = False, recursive: bool = False, createparents: bool = False, **kwargs) → Group[source]

Copy this node and return the new one.

This method has the behavior described in Node._f_copy(). In addition, it recognizes the following keyword arguments:

Parameters:

title – The new title for the destination. If omitted or None, the original title is used. This only applies to the topmost node in recursive copies.
filters (Filters) – Specifying this parameter overrides the original filter properties in the source node. If specified, it must be an instance of the Filters class (see The Filters class). The default is to copy the filter properties from the source node.
copyuserattrs – You can prevent the user attributes from being copied by setting this parameter to False. The default is to copy them.
stats – This argument may be used to collect statistics on the copy process. When used, it should be a dictionary with keys ‘groups’, ‘leaves’, ‘links’ and ‘bytes’ having a numeric value. Their values will be incremented to reflect the number of groups, leaves, links and bytes, respectively, that have been copied during the operation.

Group._f_copy_children(dstgroup: Group, overwrite: bool = False, recursive: bool = False, createparents: bool = False, **kwargs) → None[source]

Copy the children of this group into another group.

Children hanging directly from this group are copied into dstgroup, which can be a Group (see The Group class) object or its pathname in string form. If createparents is true, the needed groups for the given destination group path to exist will be created.

The operation will fail with a NodeError if there is a child node in the destination group with the same name as one of the copied children from this one, unless overwrite is true; in this case, the former child node is recursively removed before copying the latter.

By default, nodes descending from children groups of this node are not copied. If the recursive argument is true, all descendant nodes of this node are recursively copied.

Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. Arguments unknown to nodes are simply ignored. Check the documentation for copying operations of nodes to see which options they support.
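The sketch below copies the children of one group into another, recursively, while collecting statistics. The /src and /dst layout is invented for the example, and the stats dictionary uses the keys described for Group._f_copy().

```python
import tables

with tables.open_file("copy_children_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    src = h5file.create_group("/", "src")
    h5file.create_array(src, "a", [1, 2, 3])
    sub = h5file.create_group(src, "sub")
    h5file.create_array(sub, "b", [4, 5])

    # Counters are incremented in place by the copy operation.
    stats = {"groups": 0, "leaves": 0, "links": 0, "bytes": 0}
    src._f_copy_children("/dst", recursive=True, createparents=True,
                         stats=stats)

    copied_a = "/dst/a" in h5file
    copied_b = "/dst/sub/b" in h5file
```

Only the children are copied, so the destination group itself is not counted in stats["groups"].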

Group._f_get_child(childname: str) → Node[source]

Get the child called childname of this group.

If the child exists (be it visible or not), it is returned. Else, a NoSuchNodeError is raised.

Using this method is recommended over getattr() when doing programmatic accesses to children if childname is unknown beforehand or when its name is not a valid Python identifier.

Group._f_iter_nodes(classname: str | None = None) → Iterator[Node][source]

Iterate over child nodes.

Child nodes are yielded alphanumerically sorted by node name. If the name of a class derived from Node (see The Node class) is supplied in the classname parameter, only instances of that class (or subclasses of it) will be returned.

This is an iterator version of Group._f_list_nodes().

Group._f_list_nodes(classname: str | None = None) → list[Node][source]

Return a list with child nodes.

This is a list-returning version of Group._f_iter_nodes().

Group._f_walk_groups() → Iterator[Group][source]

Recursively iterate over descendant groups (not leaves).

This method starts by yielding self, and then it goes on to recursively iterate over all child groups in alphanumerical order, top to bottom (preorder), following the same procedure.
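As an illustration of the listing and walking methods, the sketch below builds a small tree with hypothetical names and collects the results. The walk is checked only for its first element and its overall contents, not for a particular ordering of the descendants.

```python
import tables

with tables.open_file("iter_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    root = h5file.root
    a = h5file.create_group(root, "a")
    h5file.create_group(a, "x")
    h5file.create_group(root, "b")
    h5file.create_array(root, "values", [1, 2, 3])

    # Direct children only, sorted by name.
    all_children = [node._v_name for node in root._f_list_nodes()]
    only_groups = [node._v_name for node in root._f_iter_nodes("Group")]

    # All groups in the subtree, starting with the group itself.
    walked = [group._v_pathname for group in root._f_walk_groups()]
```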

Group._f_walknodes(classname: str | None = None) → Iterator[Node][source]

Iterate over descendant nodes.

This method recursively walks self top to bottom (preorder), iterating over child groups in alphanumerical order, and yielding nodes. If classname is supplied, only instances of the named class are yielded.

If classname is Group, it behaves like Group._f_walk_groups(), yielding only groups. If you don’t want a recursive behavior, use Group._f_iter_nodes() instead.

Examples

# Recursively print all the arrays hanging from '/'
print("Arrays in the object tree '/':")
for array in h5file.root._f_walknodes('Array'):
    print(array)

Group special methods

The methods described below are triggered automatically when a Group instance is accessed in a special way.

This class defines the __setattr__(), __getattr__() and __delattr__() methods, and they set, get and delete ordinary Python attributes as normally intended. In addition to that, __getattr__() allows getting child nodes by their name for the sake of easy interaction on the command line, as long as there is no Python attribute with the same name. Groups also allow the interactive completion (when using readline) of the names of child nodes. For instance:

# get a Python attribute
nchild = group._v_nchildren

# Add a Table child called 'table' under 'group'.
h5file.create_table(group, 'table', my_description)
table = group.table          # get the table child instance
group.table = 'foo'          # set a Python attribute

# (PyTables warns you here about using the name of a child node.)
foo = group.table            # get a Python attribute
del group.table              # delete a Python attribute
table = group.table          # get the table child instance again

Group.__contains__(name: str) → bool[source]

Return True if there is a child with the specified name.

Returns a true value if the group has a child node (visible or hidden) with the given name (a string), false otherwise.

Group.__delattr__(name: str) → None[source]

Delete a Python attribute called name.

This method only provides an extra warning in case the user tries to delete a child node using __delattr__.

To remove a child node from this group use File.remove_node() or Node._f_remove(). To delete a PyTables node attribute use File.del_node_attr(), Node._f_delattr() or Node._v_attrs.

If there is an attribute and a child node with the same name, the child node will be made accessible again via natural naming.

Group.__getattr__(name: str) → Any[source]

Get a Python attribute or child node called name.

If the node has a child node called name it is returned, else an AttributeError is raised.

Group.__iter__() → Iterator[Node][source]

Iterate over the child nodes hanging directly from the group.

This iterator is not recursive.

Examples

# Non-recursively list all the nodes hanging from '/detector'
print("Nodes in '/detector' group:")
for node in h5file.root.detector:
    print(node)

Group.__repr__() → str[source]

Return a detailed string representation of the group.

Examples

>>> import tables
>>> f = tables.open_file('tables/tests/Tables_lzo2.h5')
>>> f.root.group0
/group0 (Group) ''
  children := ['group1' (Group), 'tuple1' (Table)]
>>> f.close()

Group.__setattr__(name: str, value: Any) → None[source]

Set a Python attribute called name with the given value.

This method stores an ordinary Python attribute in the object. It does not store new child nodes under this group; for that, use the File.create*() methods (see The File Class). Nor does it store a PyTables node attribute; for that, use File.set_node_attr(), Node._f_setattr() or Node._v_attrs.

If there is already a child node with the same name, a NaturalNameWarning will be issued and the child node will not be accessible via natural naming nor getattr(). It will still be available via File.get_node(), Group._f_get_child() and children dictionaries in the group (if visible).

Group.__str__() → str[source]

Return a short string representation of the group.

Examples

>>> import tables
>>> f = tables.open_file('tables/tests/Tables_lzo2.h5')
>>> print(f.root.group0)
/group0 (Group) ''
>>> f.close()

The Leaf class

class tables.Leaf(parentnode: Group, name: str, new: bool = False, filters: Filters | None = None, byteorder: Literal['little', 'big', None] = None, _log: bool = True, track_times: bool = True)[source]

Abstract base class for all PyTables leaves.

A leaf is a node (see The Node class) which hangs from a group (see The Group class) but, unlike a group, it cannot have any further children below it (i.e. it is an end node).

This definition includes all nodes which contain actual data (datasets handled by the Table, Array, CArray, EArray and VLArray classes; see The Table class, The Array class, The CArray class, The EArray class and The VLArray class) and unsupported nodes (see The UnImplemented class). All of these classes do in fact inherit from Leaf.

Leaf attributes

These instance variables are provided in addition to those in Node (see The Node class):

byteorder: The byte ordering of the leaf data on disk. It will be either little or big.

dtype: The NumPy dtype that most closely matches this leaf type.

extdim: The index of the enlargeable dimension (-1 if none).

nrows: The length of the main dimension of the leaf data.

nrowsinbuf

The number of rows that fit in internal input buffers.

You can change this to fine-tune the speed or memory requirements of your application.

shape: The shape of data in the leaf.

Leaf properties

Leaf.chunkshape

HDF5 chunk size for chunked leaves (a tuple).

This is read-only because you cannot change the chunk size of a leaf once it has been created.

Leaf.ndim: Return the number of dimensions of the leaf data.

Leaf.filters: Filter properties for this leaf.

See also

Filters

Leaf.maindim

Dimension along which iterators work.

Its value is 0 (i.e. the first dimension) when the dataset is not extendable, and self.extdim (where available) for extendable ones.

Leaf.flavor

Type of the data object read from this leaf.

It can be any of ‘numpy’ or ‘python’.

You can (and are encouraged to) use this property to get, set and delete the FLAVOR HDF5 attribute of the leaf. When the leaf has no such attribute, the default flavor is used.
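The effect of the flavor on read operations can be sketched as follows. The array name and contents are arbitrary, and the leaf is created from a NumPy array so that its initial flavor is 'numpy'.

```python
import numpy as np
import tables

with tables.open_file("flavor_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    arr = h5file.create_array("/", "samples", np.array([1, 2, 3]))

    default_flavor = arr.flavor
    as_numpy = arr.read()

    # Changing the flavor changes the type of object returned by reads.
    arr.flavor = "python"
    as_python = arr.read()
```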

Leaf.size_in_memory: The size of this leaf’s data in bytes when it is fully loaded into memory.

Leaf.size_on_disk

Size on disk of the object.

The size of this leaf’s data in bytes as it is stored on disk. If the data is compressed, this shows the compressed size. In the case of uncompressed, chunked data, this may be slightly larger than the amount of data, due to partially filled chunks.
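To make the attributes and properties above concrete, here is a sketch using an enlargeable array. The choice of an EArray with shape (0, 3) and the node name are assumptions for the example; values are converted to plain Python types for easy inspection.

```python
import numpy as np
import tables

with tables.open_file("leaf_info_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    # The first dimension (length 0) is the enlargeable one.
    earr = h5file.create_earray("/", "events", atom=tables.Float64Atom(),
                                shape=(0, 3))
    earr.append(np.zeros((4, 3)))

    shape = tuple(int(n) for n in earr.shape)
    nrows = int(earr.nrows)
    ndim = earr.ndim
    extdim = earr.extdim
    maindim = earr.maindim
    dtype_name = str(earr.dtype)
    # 4 rows x 3 columns x 8 bytes per float64 value.
    size_in_memory = int(earr.size_in_memory)
```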

Leaf instance variables - aliases

The following are just easier-to-write aliases to their Node (see The Node class) counterparts (indicated between parentheses):

Leaf.attrs: The associated AttributeSet instance - see The AttributeSet class (This is an easier-to-write alias of Node._v_attrs.

Leaf.name

Name of the node.

The name of this node in its parent group (This is an easier-to-write alias of Node._v_name).

Leaf.object_id

Node identifier, which may change from run to run.

(This is an easier-to-write alias of Node._v_objectid).

Changed in version 3.0: The objectID property has been renamed into object_id.

Leaf.title: A description for this node (This is an easier-to-write alias of Node._v_title).

Leaf methods

Leaf.close(flush: bool = True) → None[source]

Close this node in the tree.

This method is completely equivalent to Leaf._f_close().

Leaf.copy(newparent: Group | None = None, newname: str | None = None, overwrite: bool = False, createparents: bool = False, **kwargs) → Leaf[source]

Copy this node and return the new one.

This method has the behavior described in Node._f_copy(). Please note that there is no recursive flag since leaves do not have child nodes.

Warning

Note that unknown parameters passed to this method are ignored, so you may want to double-check their spelling (i.e. if you misspell them, they will most probably be silently ignored).

Parameters:

title – The new title for the destination. If omitted or None, the original title is used.
filters (Filters) – Specifying this parameter overrides the original filter properties in the source node. If specified, it must be an instance of the Filters class (see The Filters class). The default is to copy the filter properties from the source node.
copyuserattrs – You can prevent the user attributes from being copied by setting this parameter to False. The default is to copy them.
start (int) – Specify the range of rows to be copied; the default is to copy all the rows.
stop (int) – Specify the range of rows to be copied; the default is to copy all the rows.
step (int) – Specify the range of rows to be copied; the default is to copy all the rows.
stats – This argument may be used to collect statistics on the copy process. When used, it should be a dictionary with keys ‘groups’, ‘leaves’ and ‘bytes’ having a numeric value. Their values will be incremented to reflect the number of groups, leaves and bytes, respectively, that have been copied during the operation.
chunkshape – The chunkshape of the new leaf. It supports a couple of special values. A value of keep means that the chunkshape will be the same as original leaf (this is the default). A value of auto means that a new shape will be computed automatically in order to ensure the best performance when accessing the dataset through the main dimension. Any other value should be an integer or a tuple matching the dimensions of the leaf.
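A sketch of a few of these parameters follows: copying a row range with start, stop and step, and suppressing user attributes with copyuserattrs. The attribute name "units" and the node names are invented for the example.

```python
import numpy as np
import tables

with tables.open_file("leaf_copy_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    src = h5file.create_array("/", "series", np.arange(10))
    src.attrs.units = "mV"  # a user attribute

    # Copy rows 2, 4 and 6 only; user attributes are copied by default.
    subset = src.copy(newname="every_other", start=2, stop=8, step=2)
    subset_values = subset.read().tolist()
    subset_has_units = "units" in subset.attrs

    # Copy all rows but leave the user attributes behind.
    bare = src.copy(newname="bare", copyuserattrs=False)
    bare_has_units = "units" in bare.attrs
```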

Leaf.flush() → None[source]

Flush pending data to disk.

Saves any remaining buffered data to disk. It also releases I/O buffers, so if you are filling many datasets in the same PyTables session, please call flush() frequently to help PyTables keep memory requirements low.

Leaf.isvisible() → bool[source]

Return True if this node is visible.

This method has the behavior described in Node._f_isvisible().

Leaf.move(newparent: Group | None = None, newname: str | None = None, overwrite: bool = False, createparents: bool = False) → None[source]

Move or rename this node.

This method has the behavior described in Node._f_move().

Leaf.rename(newname: str) → None[source]

Rename this node in place.

This method has the behavior described in Node._f_rename().

Leaf.remove() → None[source]

Remove this node from the hierarchy.

This method has the behavior described in Node._f_remove(). Please note that there is no recursive flag since leaves do not have child nodes.

Leaf.get_attr(name: str) → Any[source]

Get a PyTables attribute from this node.

This method has the behavior described in Node._f_getattr().

Leaf.set_attr(name: str, value: Any) → None[source]

Set a PyTables attribute for this node.

This method has the behavior described in Node._f_setattr().

Leaf.del_attr(name: str) → None[source]

Delete a PyTables attribute from this node.

This method has the behavior described in Node._f_delattr().

Leaf.truncate(size: int) → None[source]

Truncate the main dimension to be size rows.

If the main dimension previously was larger than this size, the extra data is lost. If the main dimension previously was shorter, it is extended, and the extended part is filled with the default values.

The truncation operation can only be applied to enlargeable datasets, else a TypeError will be raised.
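The behavior described above can be sketched with an enlargeable array and a fixed-size one. Node names and values are illustrative.

```python
import tables

with tables.open_file("truncate_demo.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5file:
    earr = h5file.create_earray("/", "log", atom=tables.Int32Atom(),
                                shape=(0,))
    earr.append([10, 20, 30, 40])

    # Shrinking discards the rows beyond the new size.
    earr.truncate(2)
    after_shrink = earr.read().tolist()

    # Growing extends the main dimension.
    earr.truncate(5)
    rows_after_grow = int(earr.nrows)

    # A plain Array is not enlargeable, so truncation is refused.
    fixed = h5file.create_array("/", "fixed", [1, 2, 3])
    try:
        fixed.truncate(1)
        refused = False
    except TypeError:
        refused = True
```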

Leaf.chunk_info(coords: tuple[int, ...]) → ChunkInfo[source]

Get storage information about the chunk containing the coords.

The coordinates coords are a tuple of integers with the same rank as the dataset.

Return a ChunkInfo instance with the information.

The coordinates need not be aligned with chunk boundaries. This means that this method may be used to get the start coordinates of the chunk that contains the item at the given coordinates, for use with other direct chunking operations (see ChunkInfo.start).

If the coordinates are within the dataset’s shape but there is no such chunk in storage (missing chunk), a ChunkInfo with a valid start and filter_mask = offset = size = None is returned. If the coordinates are beyond the shape, IndexError is raised (even if the start of the chunk would fall within the shape).

Calling this method on a non-chunked dataset raises a NotChunkedError.

Leaf.read_chunk(coords: tuple[int, ...], out: bytearray | ndarray[tuple[int], dtype[uint8]] | None = None) → bytes | memoryview[source]

Get the raw chunk that starts at the given coords from storage.

The coordinates coords are a tuple of integers with the same rank as the dataset. If they are not multiples of its chunkshape, NotChunkAlignedError is raised.

If a buffer-like out argument is given, it receives chunk data. If it has insufficient storage for the chunk, ValueError is raised (use chunk_info() to get the required capacity).

The obtained data is supposed to have gone at storage time through dataset filters, minus those in the chunk’s filter mask (use chunk_info() to get it).

Return the chunk’s raw content, either as a bytes instance (if out is None) or as a memoryview over the object given as out.

Reading a chunk within the dataset’s shape, but not in storage (missing chunk) raises a NoSuchChunkError. If the chunk is beyond the shape, IndexError is raised.

Calling this method on a non-chunked dataset raises a NotChunkedError.

Leaf.write_chunk(coords: tuple[int, ...], data: bytes | bytearray | memoryview | ndarray[tuple[int], dtype[uint8]], filter_mask: int = 0) → None[source]

Write data to storage for the chunk starting at the given coords.

The coordinates coords are a tuple of integers with the same rank as the dataset. If they are not multiples of its chunkshape, NotChunkAlignedError is raised.

The content of the buffer-like data must already have gone through dataset filters, minus those in the given filter_mask (which is to be saved along data; see ChunkInfo.filter_mask).

Writing a chunk which is already in storage replaces it, otherwise it is added to storage as long as it is within the dataset’s shape (missing chunk). This means that you may use truncate() to grow an enlargeable dataset cheaply (as no chunk data is written), then sparsely write selected chunks in arbitrary order.

If the chunk is beyond the dataset’s shape, IndexError is raised.

Calling this method on a non-chunked dataset raises a NotChunkedError.
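The coordinate rules shared by chunk_info(), read_chunk() and write_chunk() come down to simple integer arithmetic. The sketch below does not call PyTables at all; the helper names, shape and chunkshape are hypothetical and only illustrate what "containing chunk", "chunk-aligned" and "within the shape" mean.

```python
def containing_chunk_start(coords, chunkshape):
    """Start coordinates of the chunk holding the item at coords."""
    return tuple((c // s) * s for c, s in zip(coords, chunkshape))


def is_chunk_aligned(coords, chunkshape):
    """True if coords are multiples of the chunkshape in every dimension."""
    return all(c % s == 0 for c, s in zip(coords, chunkshape))


def within_shape(coords, shape):
    """True if coords address an item inside the dataset's shape."""
    return all(0 <= c < n for c, n in zip(coords, shape))


shape = (25, 12)
chunkshape = (10, 4)

# Any in-shape coordinates map to the start of their containing chunk.
start = containing_chunk_start((15, 7), chunkshape)

# The last chunk starts within the shape even though it extends past it.
last_start = containing_chunk_start((24, 11), chunkshape)
```

Under these assumptions, coordinates such as (15, 7) would be acceptable where only a containing chunk is needed, whereas an operation that requires aligned coordinates would need (10, 4).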

Leaf.__len__() → int[source]

Return the length of the main dimension of the leaf data.

Please note that this may raise an OverflowError on 32-bit platforms for datasets having more than 2**31-1 rows. This is a limitation of Python that you can work around by using the nrows or shape attributes.

Leaf._f_close(flush: bool = True) → None[source]

Close this node in the tree.

This method has the behavior described in Node._f_close(). Besides that, the optional argument flush tells whether to flush pending data to disk or not before closing.