PyTables implements several classes to represent the different
nodes in the object tree. They are named File,
Group, Leaf,
Table, Array,
CArray, EArray,
VLArray and UnImplemented. Another
one allows the user to complement the information on these different
objects; its name is AttributeSet. Finally, another
important class called IsDescription lets you build
a Table record description by declaring a subclass of
it. Many other classes are defined in PyTables, but they can be regarded
as helpers whose main goal is to declare the data type
properties of the different first-class objects; they are
described at the end of this chapter.
An important function, called openFile, is
responsible for creating, opening or appending to files. In addition, a few
utility functions are defined to check whether a user-supplied file is a
PyTables or HDF5 file. These
are called isPyTablesFile() and
isHDF5File(), respectively. There is also a
function called whichLibVersion() that reports
the versions of the underlying C libraries (for example, HDF5 or
Zlib) and another called
print_versions() that prints the versions of all the
software that PyTables relies on. Finally, test()
lets you run the complete test suite interactively from a Python
console.
Let's start by discussing the first-level variables and functions available to the user, and then the different classes defined in PyTables.
The PyTables version number.
The underlying HDF5 library version number.
True for PyTables Professional edition, false otherwise.
An easy way of copying one PyTables file to another.
This function allows you to copy an existing PyTables file
named srcfilename to another file called
dstfilename. The source file must exist and be
readable. If the destination file already exists, it can be
overwritten in place by setting the overwrite
argument to true.
This function is a shorthand for the
File.copyFile() method, which acts on an
already opened file. kwargs takes keyword
arguments used to customize the copying process. See the
documentation of File.copyFile() for a description of those
arguments.
Determine whether a file is in the HDF5 format.
When successful, it returns a true value if the file is an
HDF5 file, false otherwise. If there were problems identifying the
file, an HDF5ExtError is raised.
Determine whether a file is in the PyTables format.
When successful, it returns a true value if the file is a
PyTables file, false otherwise. The true value is the format
version string of the file. If there were problems identifying the
file, an HDF5ExtError is raised.
Iterate over long ranges.
This is similar to xrange(), but it
allows 64-bit arguments on all platforms. The results of the
iteration are sequentially yielded in the form of
numpy.int64 values, but getting random
individual items is not supported.
Because of the Python 32-bit limitation on object lengths,
the length attribute (which is also a
numpy.int64 value) should be used instead of
the len() syntax.
Default start and step
arguments are supported in the same way as in
xrange(). When the standard
[x]range() Python objects support 64-bit
arguments, this iterator will be deprecated.
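The behaviour described above (sequential iteration, a length attribute instead of len(), default start and step) can be pictured with a small pure-Python sketch. This is not the PyTables implementation: the class name LongRange is hypothetical, and plain Python integers are yielded here rather than numpy.int64 values.

```python
class LongRange(object):
    """Illustrative stand-in for a 64-bit capable range iterator."""

    def __init__(self, *args):
        # Accept (stop), (start, stop) or (start, stop, step), like xrange().
        if len(args) == 1:
            start, stop, step = 0, args[0], 1
        elif len(args) == 2:
            (start, stop), step = args, 1
        elif len(args) == 3:
            start, stop, step = args
        else:
            raise TypeError("expected 1 to 3 arguments")
        if step == 0:
            raise ValueError("step must not be zero")
        self.start, self.stop, self.step = start, stop, step
        # Number of values that will be yielded; exposed as an attribute
        # because len() may not be able to return very large values.
        if step > 0:
            count = (stop - start + step - 1) // step
        else:
            count = (start - stop - step - 1) // (-step)
        self.length = max(0, count)

    def __iter__(self):
        # Values are produced one by one; random access is not offered.
        value, remaining = self.start, self.length
        while remaining > 0:
            yield value
            value += self.step
            remaining -= 1
```

Only iteration and the length attribute are modelled; indexing is deliberately left out, mirroring the restriction stated above.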
Open a PyTables (or generic HDF5) file and return a
File object.
Arguments:
The name of the file (supports environment variable
expansion). It is suggested that file names have any of the
.h5, .hdf or
.hdf5 extensions, although this is not
mandatory.
The mode in which to open the file. It can be one of the following:
'r': Read-only; no data can be modified.
'w': Write; a new file is created (an existing file with the same name would be deleted).
'a': Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
'r+': It is similar to 'a', but the
file must already exist.
If the file is to be created, a
TITLE string attribute will be set on the
root group with the given value. Otherwise, the title will
be read from disk, and this will not have any effect.
A dictionary to map names in the object tree into different HDF5 names in file. The keys are the Python names, while the values are the HDF5 names. This is useful when you need to name HDF5 nodes with invalid or reserved words in Python and you want to continue using the natural naming facility on the nodes.
The root User Entry Point. This is a group in the HDF5
hierarchy which will be taken as the starting point to
create the object tree. It can be any existing group in
the file, named by its HDF5 path. If it does not exist, an
HDF5ExtError is raised. Use this if you
do not want to build the entire object
tree, but rather only a subtree of
it.
An instance of the Filters (see
Section 4.14.1) class that provides
information about the desired I/O filters applicable to the
leaves that hang directly from the root
group, unless other filter properties are
specified for these leaves. Besides, if you do not specify
filter properties for child groups, they will inherit these
ones, which will in turn propagate to child nodes.
The number of unreferenced nodes to be kept in memory. Least recently used nodes are unloaded from memory when this number of loaded nodes is reached. To load a node again, simply access it as usual. Nodes referenced by user variables are not taken into account nor unloaded.
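The least-recently-used policy mentioned for the node cache can be illustrated with a short sketch. It is only a model of the general technique, assuming a hypothetical NodeCacheSketch class and a caller-supplied loader function; it says nothing about how PyTables stores or reloads nodes internally.

```python
from collections import OrderedDict


class NodeCacheSketch(object):
    """Toy LRU cache keyed by node path (hypothetical, not PyTables code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._nodes = OrderedDict()  # oldest entry first, newest last

    def get(self, path, loader):
        # Reuse a cached node if present, otherwise load it again.
        if path in self._nodes:
            node = self._nodes.pop(path)
        else:
            node = loader(path)
        # Re-inserting moves the node to the "most recently used" end.
        self._nodes[path] = node
        # Unload least recently used nodes once the limit is exceeded.
        while len(self._nodes) > self.capacity:
            self._nodes.popitem(last=False)
        return node

    def __contains__(self, path):
        return path in self._nodes
```

Accessing an evicted path simply calls the loader again, which mirrors the statement that an unloaded node is loaded again by accessing it as usual.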
Disable all flavors except those in
keep.
Providing an empty keep sequence implies
disabling all flavors (but the internal one). If the sequence is
not specified, only optional flavors are disabled.
Important: Once you disable a flavor, it cannot be enabled again.
Split a PyTables type into a PyTables
kind and an item size.
Returns a tuple of (kind, itemsize). If
no item size is present in the type (in the
form of a precision), the returned item size is
None.
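The splitting rule just described can be sketched with a regular expression. The function name split_type_sketch is hypothetical, and the sketch assumes that a type is a lowercase kind optionally followed by a precision in bits; it is meant to illustrate the rule, not to reproduce the library's code.

```python
import re

# A lowercase kind, optionally followed by a precision in bits.
_TYPE_RE = re.compile(r'^([a-z]+)([0-9]*)$')


def split_type_sketch(type_):
    """Split a type string such as 'int32' into (kind, itemsize)."""
    match = _TYPE_RE.match(type_)
    if match is None:
        raise ValueError("malformed type: %r" % (type_,))
    kind, precision = match.groups()
    itemsize = None
    if precision:
        # The precision is given in bits; the item size is in bytes.
        itemsize, remainder = divmod(int(precision), 8)
        if remainder:
            raise ValueError(
                "precision must be a multiple of 8: %s" % precision)
    return (kind, itemsize)
```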
>>> split_type('int32')
('int', 4)
>>> split_type('string')
('string', None)
>>> split_type('int20')
Traceback (most recent call last):
...
ValueError: precision must be a multiple of 8: 20
>>> split_type('foo bar')
Traceback (most recent call last):
...
ValueError: malformed type: 'foo bar'
Run all the tests in the test suite.
If verbose is set, the test suite will
emit messages with full verbosity (not recommended unless you are
looking into a certain problem).
If heavy is set, the test suite will be
run in heavy mode (you should be careful with
this because it can take a lot of time and resources from your
computer).
Get version information about a C library.
If the library indicated by name is
available, this function returns a 3-tuple containing the major
library version as an integer, its full version as a string, and
the version date as a string. If the library is not available,
None is returned.
The currently supported library names are
hdf5, zlib,
lzo and bzip2. If another
name is given, a ValueError is raised.
In-memory representation of a PyTables file.
An instance of this class is returned when a PyTables file is
opened with the openFile() function. It offers methods to manipulate
(create, rename, delete...) nodes and handle their attributes, as well
as methods to traverse the object tree. The user entry
point to the object tree attached to the HDF5 file is
represented in the rootUEP attribute. Other
attributes are available.
File objects support an Undo/Redo
mechanism which can be enabled with the
enableUndo() method. Once the Undo/Redo mechanism is
enabled, explicit marks (with an optional unique
name) can be set on the state of the database using the
mark()
method. There are two implicit marks which are always available: the
initial mark (0) and the final mark (-1). Both the identifier of a
mark and its name can be used in undo and
redo operations.
Hierarchy manipulation operations (node creation, movement and
removal) and attribute handling operations (setting and deleting) made
after a mark can be undone by using the undo()
method, which returns the database to the
state of a past mark. If undo() is not followed by
operations that modify the hierarchy or attributes, the
redo() method can
be used to return the database to the state of a future mark. Otherwise,
future states of the database are forgotten.
Note that data handling operations cannot currently be undone or
redone. Also, hierarchy manipulation operations on nodes that do not
support the Undo/Redo mechanism issue an
UndoRedoWarning before
changing the database.
The Undo/Redo mechanism is persistent between sessions and can
only be disabled by calling the disableUndo()
method.
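The interplay of marks, undo() and redo() described above can be modelled with a toy action log over a plain dictionary. Everything in this sketch is an assumption made for illustration: the class MarkedHistory, its set() method and the use of ValueError are hypothetical, and, as a simplification, creating a new mark after an undo also forgets future states.

```python
_MISSING = object()  # placeholder for "key did not exist before"


class MarkedHistory(object):
    """Toy model of mark-based undo/redo (not the PyTables action log)."""

    def __init__(self):
        self.state = {}
        self._log = []     # recorded actions: (key, old value, new value)
        self._pos = 0      # number of log entries currently applied
        self._marks = [0]  # log position of each mark; mark 0 is implicit
        self._names = {}   # optional mark name -> mark identifier

    def _forget_future(self):
        # Drop undone actions and the marks that pointed into them.
        del self._log[self._pos:]
        self._marks = [p for p in self._marks if p <= self._pos]
        self._names = dict((n, i) for n, i in self._names.items()
                           if i < len(self._marks))

    def set(self, key, value):
        self._forget_future()
        self._log.append((key, self.state.get(key, _MISSING), value))
        self.state[key] = value
        self._pos += 1

    def mark(self, name=None):
        self._forget_future()
        if name is not None and name in self._names:
            raise ValueError("mark name already used: %r" % (name,))
        self._marks.append(self._pos)
        ident = len(self._marks) - 1
        if name is not None:
            self._names[name] = ident
        return ident

    def _position(self, mark):
        if mark == -1:  # implicit final mark: the end of the log
            return len(self._log)
        ident = self._names[mark] if isinstance(mark, str) else mark
        return self._marks[ident]

    def undo(self, mark=None):
        if mark is None:
            older = [p for p in self._marks if p < self._pos]
            if not older:
                raise ValueError("no past marks")
            target = older[-1]
        else:
            target = self._position(mark)
            if target >= self._pos:
                raise ValueError("mark is not older than the current state")
        while self._pos > target:
            self._pos -= 1
            key, old, _new = self._log[self._pos]
            if old is _MISSING:
                del self.state[key]
            else:
                self.state[key] = old

    def redo(self, mark=None):
        if mark is None:
            newer = [p for p in self._marks if p > self._pos]
            target = newer[0] if newer else len(self._log)
        else:
            target = self._position(mark)
        if target <= self._pos:
            raise ValueError("no future state to return to")
        while self._pos < target:
            key, _old, new = self._log[self._pos]
            self.state[key] = new
            self._pos += 1
```

The sketch keeps the whole log in memory and only tracks dictionary assignments, whereas the mechanism described above is persistent between sessions and covers hierarchy and attribute operations.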
File objects can also act as context managers when using the
with statement introduced in Python 2.5. When
exiting a context, the file is automatically closed.
The name of the opened file.
The PyTables version number of this file.
True if the underlying file is open, false otherwise.
The mode in which the file was opened.
The title of the root group in the file.
A dictionary that maps node names between PyTables and
HDF5 domain names. Its initial values are set from the
trMap parameter passed to the
openFile() function. You cannot change its
contents after a file is opened.
The UEP (user entry point) group name in the file (see
the openFile() function).
Default filter properties for the root group (see Section 4.14.1).
The root of the object tree
hierarchy (a Group instance).
Copy the contents of this file to
dstfilename.
dstfilename must be a path string
indicating the name of the destination file. If it already exists,
the copy will fail with an IOError, unless the
overwrite argument is true, in which case the
destination file will be overwritten in place. In that case,
the destination file must be closed, or errors will
occur.
Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. Arguments unknown to nodes are simply ignored. Check the documentation for copying operations of nodes to see which options they support.
Copying a file usually has the beneficial side effect of creating a more compact and cleaner version of the original file.
Return a short string representation of the object tree.
Example of use:
>>> f = tables.openFile('data/test.h5')
>>> print f
data/test.h5 (File) 'Table Benchmark'
Last modif.: 'Mon Sep 20 12:40:47 2004'
Object Tree:
/ (Group) 'Table Benchmark'
/tuple0 (Table(100L,)) 'This is the table title'
/group0 (Group) ''
/group0/tuple1 (Table(100L,)) 'This is the table title'
/group0/group1 (Group) ''
/group0/group1/tuple2 (Table(100L,)) 'This is the table title'
/group0/group1/group2 (Group) ''
Copy the children of a group into another group.
This method copies the nodes hanging from the source group
srcgroup into the destination group
dstgroup. Existing destination nodes can be
replaced by setting the overwrite argument to true.
If the recursive argument is true, all
descendant nodes of srcgroup are recursively
copied. If createparents is true, any groups
missing from the destination parent group path will
be created.
kwargs takes keyword arguments used to
customize the copying process. See the documentation of
Group._f_copyChildren() for a description of those
arguments.
Copy the node specified by where and
name to
newparent/newname.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The destination group that the node will be copied
into (a path name or a Group
instance). If not specified or None, the
current parent group is chosen as the new parent.
The name to be assigned to the new copy in its
destination (a string). If it is not specified or
None, the current name is chosen as the
new name.
Additional keyword arguments may be passed to customize the
copying process. The supported arguments depend on the kind of
node being copied. See Group._f_copy() and Leaf.copy()
for more information on their
allowed keyword arguments.
This method returns the newly created copy of the source
node (i.e. the destination node). See
Node._f_copy()
for further details on the semantics of copying nodes.
Create a new array with the given name in
where location. See the
Array class (in Section 4.7) for more information on
arrays.
The array or scalar to be saved. Accepted types are
NumPy arrays and scalars, numarray arrays
and string arrays, Numeric arrays and scalars, as well as
native Python sequences and scalars, provided that values
are regular (i.e. they are not like
[[1,2],2]) and homogeneous (i.e. all the
elements are of the same type).
Also, objects that have some of their dimensions equal
to 0 are not supported (use an EArray
node (see Section 4.9) if you want to store an array
with one of its dimensions equal to 0).
The byteorder of the data on
disk, specified as 'little' or
'big'. If this is not specified, the
byteorder is that of the given
object.
See File.createTable() for more
information on the rest of the parameters.
Create a new chunked array with the given
name in where location. See
the CArray class (in Section 4.8) for more
information on chunked arrays.
An Atom (see Section 4.13.3)
instance representing the type and
shape of the atomic objects to be
saved.
The shape of the new array.
The shape of the data chunk to be read or written in a
single HDF5 I/O operation. Filters are applied to those
chunks of data. The dimensionality of
chunkshape must be the same as that of
shape. If None, a
sensible value is calculated (which is recommended).
See File.createTable() for more
information on the rest of the parameters.
Create a new enlargeable array with the given
name in where location. See
the EArray (in Section 4.9) class for more information on
enlargeable arrays.
An Atom (see Section 4.13.3)
instance representing the type and
shape of the atomic objects to be
saved.
The shape of the new array. One (and only one) of the
shape dimensions must be 0. The
dimension being 0 means that the resulting
EArray object can be extended along it.
Multiple enlargeable dimensions are not supported right
now.
A user estimate about the number of row elements that
will be added to the growable dimension in the
EArray node. If not provided, the
default value is 1000 rows. If you plan to create either a
much smaller or a much bigger array try providing a guess;
this will optimize the HDF5 B-Tree creation and management
process time and the amount of memory used. If you want to
specify your own chunk size for I/O purposes, see also the
chunkshape parameter below.
The shape of the data chunk to be read or written in a
single HDF5 I/O operation. Filters are applied to those
chunks of data. The dimensionality of
chunkshape must be the same as that of
shape (beware: no dimension should be 0
this time!). If None, a sensible value
is calculated (which is recommended).
The byteorder of the data on
disk, specified as 'little' or
'big'. If this is not specified, the
byteorder is that of the platform.
See File.createTable() for more
information on the rest of the parameters.
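The constraints on shape and chunkshape stated above (exactly one dimension of shape equal to 0, the same dimensionality for chunkshape, and no 0 in chunkshape) can be written down as a pair of checks. The helper names below are hypothetical and only restate those rules; they are not part of PyTables.

```python
def extendable_dimension(shape):
    """Return the index of the single enlargeable (zero) dimension."""
    zeros = [index for index, extent in enumerate(shape) if extent == 0]
    if len(zeros) != 1:
        raise ValueError(
            "exactly one dimension must be 0, got shape %r" % (shape,))
    return zeros[0]


def check_chunkshape(shape, chunkshape):
    """Validate a user-supplied chunkshape against the array shape."""
    if len(chunkshape) != len(shape):
        raise ValueError("chunkshape must have the same rank as shape")
    if any(extent <= 0 for extent in chunkshape):
        raise ValueError("chunkshape dimensions must all be positive")
    return tuple(chunkshape)
```

For instance, a shape of (0, 256) is enlargeable along its first dimension, and (64, 256) would be an acceptable chunkshape for it under these rules.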
Create a new group with the given name in
where location. See the
Group class (in Section 4.4) for more information on
groups.
An instance of the Filters class
(see Section 4.14.1) that provides information
about the desired I/O filters applicable to the leaves that
hang directly from this new group (unless other filter
properties are specified for these leaves). Besides, if you
do not specify filter properties for its child groups, they
will inherit these ones.
See File.createTable() for more
information on the rest of the parameters.
Create a new table with the given name in
where location. See the
Table (in Section 4.6) class for more information on
tables.
The parent group where the new table will hang from.
It can be a path string (for example
'/level1/leaf5'), or a
Group instance (see Section 4.4).
The name of the new table.
This is an object that describes the table, i.e. how many columns it has, their names, types, shapes, etc. It can be any of the following:
A user-defined class: This should inherit from the
IsDescription class (see Section 4.13.1), where table fields are specified.
A dictionary: Useful, for example, when you do not know beforehand which structure your table will have.
See Section 3.4 for an example of using a dictionary to describe a table.
A Description instance: You can use the description
attribute of another table to create a new one with
the same structure.
A NumPy (record) array instance: You can use a NumPy array, whether nested or
not, and its field structure will be reflected in the
new Table object. Moreover, if the
array has actual data it will be injected into the
newly created table. If you are using
numarray instead of NumPy, you may
use one of the objects below for the same
purpose.
A RecArray instance: This object from the numarray
package is also accepted, but it does not give you the
possibility to create a nested table. Array data is
injected into the new table.
A NestedRecArray instance: Finally, if you want to have nested columns in
your table and you are using
numarray, you can use this
object. Array data is injected into the new
table.
See Appendix C for a description of the
NestedRecArray class.
A description for this node (it sets the
TITLE HDF5 attribute on disk).
An instance of the Filters class
(see Section 4.14.1) that provides information
about the desired I/O filters to be applied during the life
of this object.
A user estimate of the number of records that will be
in the table. If not provided, the default value is
appropriate for tables up to 10 MB in size (more or
less). If you plan to create a bigger table try providing a
guess; this will optimize the HDF5 B-Tree creation and
management process time and memory used. If you want to
specify your own chunk size for I/O purposes, see also the
chunkshape parameter below.
See Section 5.1 for a discussion on the issue of providing a number of expected rows.
The shape of the data chunk to be read or written in a
single HDF5 I/O operation. Filters are applied to those
chunks of data. The rank of the
chunkshape for tables must be 1. If
None, a sensible value is calculated
(which is recommended).
The byteorder of data on disk,
specified as 'little' or
'big'. If this is not specified, the
byteorder is that of the platform, unless you passed an
array as the description, in which case
its byteorder will be used.
Whether to create the needed groups for the parent path to exist (not done by default).
Create a new variable-length array with the given
name in where location. See
the VLArray (in Section 4.10) class
for more information on variable-length arrays.
An Atom (see Section 4.13.3)
instance representing the type and
shape of the atomic objects to be
saved.
A user estimate of the size (in MB) of the final
VLArray node. If not provided, the
default value is 1 MB. If you plan to create either a much
smaller or a much bigger array try providing a guess; this
will optimize the HDF5 B-Tree creation and management
process time and the amount of memory used. If you want to
specify your own chunk size for I/O purposes, see also the
chunkshape parameter below.
The shape of the data chunk to be read or written in a
single HDF5 I/O operation. Filters are applied to those
chunks of data. The dimensionality of
chunkshape must be 1. If
None, a sensible value is calculated
(which is recommended).
See File.createTable() for more
information on the rest of the parameters.
Move the node specified by where and
name to
newparent/newname.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The destination group the node will be moved into (a
path name or a Group instance). If it is
not specified or None, the current parent
group is chosen as the new parent.
The new name to be assigned to the node in its
destination (a string). If it is not specified or
None, the current name is chosen as the
new name.
The other arguments work as in
Node._f_move().
Remove the object node name under where location.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
If not supplied or false, the node will be removed
only if it has no children; if it does, a
NodeError will be raised. If supplied
with a true value, the node and all its descendants will be
completely removed.
Change the name of the node specified by
where and name to
newname.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The new name to be assigned to the node (a string).
Whether to recursively remove a node with the same
newname if it already exists (not done by
default).
Get the node under where with the given
name.
where can be a Node
instance (see Section 4.3) or a path string leading to a node. If no
name is specified, that node is
returned.
If a name is specified, this must be a
string with the name of a node under where. In
this case the where argument can only lead to a
Group (see Section 4.4) instance (else a
TypeError is raised). The node called
name under the group where
is returned.
In both cases, if the node to be returned does not exist, a
NoSuchNodeError is raised. Please note that
hidden nodes are also considered.
If the classname argument is specified,
it must be the name of a class derived from
Node. If the node is found but it is not an
instance of that class, a NoSuchNodeError is
also raised.
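The lookup rules for where and name can be mimicked on a toy tree of nested dictionaries, where a dictionary plays the role of a group and any other value plays the role of a leaf. The function get_node and the exception NoSuchNode below are hypothetical stand-ins chosen for this sketch (the classname filter is left out); they only illustrate which error arises in which situation.

```python
class NoSuchNode(LookupError):
    """Stand-in for NoSuchNodeError in this sketch."""


def get_node(tree, where, name=None):
    """Resolve a path (and optional child name) in a nested-dict tree."""
    node = tree
    # Walk the path component by component, starting at the root.
    for part in [p for p in where.split('/') if p]:
        if not isinstance(node, dict) or part not in node:
            raise NoSuchNode(where)
        node = node[part]
    if name is None:
        return node
    # With a name, 'where' must lead to a group (a dict in this model).
    if not isinstance(node, dict):
        raise TypeError("%r does not lead to a group" % (where,))
    if name not in node:
        raise NoSuchNode(where.rstrip('/') + '/' + name)
    return node[name]
```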
Is the node under path visible?
If the node does not exist, a
NoSuchNodeError is raised.
Iterate over children nodes hanging from
where.
This argument works as in
File.getNode(), referencing the node to be acted
upon.
If the name of a class derived from
Node (see Section 4.3) is supplied, only instances of
that class (or subclasses of it) will be returned.
The returned nodes are alphanumerically sorted by their
name. This is an iterator version of
File.listNodes().
Return a list with children nodes
hanging from where.
This is a list-returning version of
File.iterNodes().
Recursively iterate over groups (not leaves) hanging from
where.
The where group itself is listed first
(preorder), then each of its child groups (following an
alphanumerical order) is also traversed, following the same
procedure. If where is not supplied, the root
group is used.
The where argument can be a path string
or a Group instance (see Section 4.4).
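The traversal order described above (the starting group first, then its child groups in alphanumerical order, each handled the same way) is a preorder walk. The sketch below shows that order on a toy tree of nested dictionaries, where dictionaries stand for groups and other values for leaves; the function name walk_groups is hypothetical and the sketch yields path strings rather than Group instances.

```python
def walk_groups(tree, where='/'):
    """Yield the paths of all groups under 'where' in preorder."""
    # The group itself comes first.
    yield where
    # Then each child group, in sorted order, is walked the same way.
    for name in sorted(tree):
        child = tree[name]
        if isinstance(child, dict):  # leaves are skipped
            child_path = where.rstrip('/') + '/' + name
            for path in walk_groups(child, child_path):
                yield path
```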
Recursively iterate over nodes hanging from
where.
If supplied, the iteration starts from (and includes)
this group. It can be a path string or a
Group instance (see Section 4.4).
If the name of a class derived from
Node (see Section 4.3) is supplied, only instances of
that class (or subclasses of it) will be returned.
Example of use:
# Recursively print all the EArray nodes hanging from '/detector'.
print "EArray nodes hanging from group '/detector':"
for node in h5file.walkNodes('/detector', classname='EArray'):
    print node
Is there a node with that path?
Returns True if the file has a node with
the given path (a string),
False otherwise.
Recursively iterate over the nodes in the tree.
This is equivalent to calling
File.walkNodes() with no arguments.
Example of use:
# Recursively list all the nodes in the object tree.
import tables
h5file = tables.openFile('vlarray1.h5')
print "All nodes in the object tree:"
for node in h5file:
    print node
Disable the Undo/Redo mechanism.
Disabling the Undo/Redo mechanism leaves the database in the
current state and forgets past and future database states. This
makes File.mark(), File.undo(), File.redo() and other methods fail with an
UndoRedoError.
Calling this method when the Undo/Redo mechanism is already
disabled raises an UndoRedoError.
Enable the Undo/Redo mechanism.
This operation prepares the database for undoing and redoing
modifications in the node hierarchy. This allows
File.mark(),
File.undo(),
File.redo()
and other methods to be called.
The filters argument, when specified,
must be an instance of class Filters (see Section 4.14.1) and is
meant for setting the compression values for the action log. The
default is having compression enabled, as the gains in terms of
space can be considerable. You may want to disable compression if
you want maximum speed for Undo/Redo operations.
Calling this method when the Undo/Redo mechanism is already
enabled raises an UndoRedoError.
Get the identifier of the current mark.
Returns the identifier of the current mark. This can be used
to know the state of a database after an application crash, or to
get the identifier of the initial implicit mark after a call to
File.enableUndo().
This method can only be called when the Undo/Redo mechanism
has been enabled. Otherwise, an UndoRedoError
is raised.
Go to a specific mark of the database.
Returns the database to the state associated with the
specified mark. Both the identifier of a mark
and its name can be used.
This method can only be called when the Undo/Redo mechanism
has been enabled. Otherwise, an UndoRedoError
is raised.
Is the Undo/Redo mechanism enabled?
Returns True if the Undo/Redo mechanism
has been enabled for this file, False
otherwise. Please note that this mechanism is persistent, so a
newly opened PyTables file may already have Undo/Redo
support enabled.
Mark the state of the database.
Creates a mark for the current state of the database. A
unique (and immutable) identifier for the mark is returned. An
optional name (a string) can be assigned to the
mark. Both the identifier of a mark and its name can be used in
File.undo()
and File.redo() operations. When the name has already been
used for another mark, an UndoRedoError is
raised.
This method can only be called when the Undo/Redo mechanism
has been enabled. Otherwise, an UndoRedoError
is raised.
Go to a future state of the database.
Returns the database to the state associated with the
specified mark. Both the identifier of a mark
and its name can be used. If the mark is
omitted, the next created mark is used. If there are no future
marks, or the specified mark is not newer than
the current one, an UndoRedoError is
raised.
This method can only be called when the Undo/Redo mechanism
has been enabled. Otherwise, an UndoRedoError
is raised.
Go to a past state of the database.
Returns the database to the state associated with the
specified mark. Both the identifier of a mark
and its name can be used. If the mark is
omitted, the last created mark is used. If there are no past
marks, or the specified mark is not older than
the current one, an UndoRedoError is
raised.
This method can only be called when the Undo/Redo mechanism
has been enabled. Otherwise, an UndoRedoError
is raised.
Copy PyTables attributes from one node to another.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The destination node where the attributes will be
copied to. It can be a path string or a
Node instance (see Section 4.3).
Delete a PyTables attribute from the given node.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The name of the attribute to delete. If the named
attribute does not exist, an
AttributeError is raised.
Get a PyTables attribute from the given node.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The name of the attribute to retrieve. If the named
attribute does not exist, an
AttributeError is raised.
Set a PyTables attribute for the given node.
These arguments work as in
File.getNode(), referencing the node to be acted
upon.
The name of the attribute to set.
The value of the attribute to set. Any kind of Python
object (like strings, ints, floats, lists, tuples, dicts,
small NumPy/Numeric/numarray objects...) can be stored as an
attribute. However, if necessary, cPickle
is automatically used so as to serialize objects that you
might want to save. See the AttributeSet
class (in Section 4.12) for details.
If the node already has a large number of attributes, a
PerformanceWarning is issued.
Abstract base class for all PyTables nodes.
This is the base class for all nodes in a PyTables hierarchy. It is an abstract class, i.e. it may not be directly instantiated; however, every node in the hierarchy is an instance of this class.
A PyTables node is always hosted in a PyTables
file, under a parent group,
at a certain depth in the node hierarchy. A node
knows its own name in the parent group and its
own path name in the file. When using a
translation map (see the File class in Section 4.2), its
HDF5 name might differ from its PyTables
name.
All the previous information is location-dependent, i.e. it may change when moving or renaming a node in the hierarchy. A node also has location-independent information, such as its HDF5 object identifier and its attribute set.
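To make the notion of location-dependent information concrete, the sketch below derives a parent path, a name and a depth from a path name alone. The NodeLocation tuple and the locate function are hypothetical, and the sketch assumes that the root group has the path '/', the name '/' and depth 0.

```python
from collections import namedtuple

# Location-dependent facts about a node, all derived from its path name.
NodeLocation = namedtuple('NodeLocation', 'pathname parent name depth')


def locate(pathname):
    """Split an absolute path name into parent path, name and depth."""
    if not pathname.startswith('/'):
        raise ValueError("an absolute path is required: %r" % (pathname,))
    if pathname == '/':
        # Assumption of this sketch: the root has no parent and depth 0.
        return NodeLocation('/', None, '/', 0)
    parts = pathname.strip('/').split('/')
    parent = '/' + '/'.join(parts[:-1])
    return NodeLocation(pathname, parent, parts[-1], len(parts))
```

Moving a node from '/a/x' to '/b/c/x' changes its parent and depth but not its name, while renaming changes only the name and the path name.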
This class gathers the operations and attributes (both
location-dependent and independent) which are common to all PyTables
nodes, whatever their type is. Nonetheless, due to natural naming
restrictions, the names of all of these members start with a reserved
prefix (see the Group class in Section 4.4).
Sub-classes with no children (i.e. leaf
nodes) may define new methods, attributes and properties to
avoid natural naming restrictions. For instance,
_v_attrs may be shortened to
attrs and _f_rename to
rename. However, the original methods and
attributes should still be available.
The depth of this node in the tree (a non-negative integer value).
The hosting File instance (see Section 4.2).
The name of this node in the hosting HDF5 file (a string).
The name of this node in its parent group (a string).
The parent Group instance (see Section 4.4).
The path of this node in the tree (a string).
The associated AttributeSet instance
(see Section 4.12).
Whether this node is open or not.
A node identifier (may change from run to run).
A description of this node. A shorthand for
the TITLE attribute.
Close this node in the tree.
This releases all resources held by the node, so it should not be used again. On nodes with data, it may be flushed to disk.
The closing operation is not recursive, i.e. closing a group does not close its children.
Copy this node and return the new node.
Creates and returns a copy of the node, maybe in a different
place in the hierarchy. newparent can be a
Group object (see Section 4.4) or a
pathname in string form. If it is not specified or
None, the current parent group is chosen as the
new parent. newname must be a string with a
new name. If it is not specified or None, the
current name is chosen as the new name. If
a recursive copy is requested, all descendants are
copied as well. If createparents is true, any
groups missing from the given new parent group path will be
created.
Copying a node across databases is supported but cannot be
undone. Copying a node over itself is not allowed, nor is
recursively copying a node into itself. These result in a
NodeError. Copying over another existing node
is similarly not allowed, unless the optional
overwrite argument is true, in which case that
node is recursively removed before copying.
Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may be or may not be copied, data may be sub-sampled, stats may be collected, etc. See the documentation for the particular node type.
Using only the first argument is equivalent to copying the node to a new location without changing its name. Using only the second argument is equivalent to making a copy of the node in the same group.
Move or rename this node.
Moves a node into a new parent group, or changes the name of
the node. newparent can be a
Group object (see Section 4.4) or a
pathname in string form. If it is not specified or
None, the current parent group is chosen as the
new parent. newname must be a string with a
new name. If it is not specified or None, the
current name is chosen as the new name. If
createparents is true, the needed groups for
the given new parent group path to exist will be created.
Moving a node across databases is not allowed, nor is
moving a node into itself. These result in a
NodeError. However, moving a node
over itself is allowed and simply does
nothing. Moving over another existing node is similarly not
allowed, unless the optional overwrite argument
is true, in which case that node is recursively removed before
moving.
Usually, only the first argument will be used, effectively moving the node to a new location without changing its name. Using only the second argument is equivalent to renaming the node in place.
Remove this node from the hierarchy.
If the node has children, recursive removal must be stated
by giving recursive a true value; otherwise, a
NodeError will be raised.
Delete a PyTables attribute from this node.
If the named attribute does not exist, an
AttributeError is raised.
Get a PyTables attribute from this node.
If the named attribute does not exist, an
AttributeError is raised.
Basic PyTables grouping structure.
Instances of this class are grouping structures containing child instances of zero or more groups or leaves, together with supporting metadata. Each group has exactly one parent group.
Working with groups and leaves is similar in many ways to
working with directories and files, respectively, in a Unix
filesystem. As with Unix directories and files, objects in the object
tree are often described by giving their full (or absolute) path
names. This full path can be specified either as a string (like in
'/group1/group2') or as a complete object path
written in natural naming schema (like in
file.root.group1.group2).
See Section 1.2 for more information on natural naming.
A side effect of the natural naming
schema is that the names of members in the Group
class and its instances must be carefully chosen to avoid colliding
with existing children node names. For this reason and to avoid
polluting the children namespace all members in a
Group start with some reserved prefix, like
_f_ (for public methods), _g_
(for private ones), _v_ (for instance variables) or
_c_ (for class variables). Any attempt to create a
new child node whose name starts with one of these prefixes will raise
a ValueError exception.
Another effect of natural naming is that children named after
Python keywords or having names not valid as Python identifiers (e.g.
class, $a or
44) can not be accessed using the
node.child syntax. You will be forced to use
node._f_getChild(child) to access them (which is
recommended for programmatic accesses). You can also make use of the
trMap (translation map dictionary) parameter in the
openFile() function (see description) in order to translate HDF5 names not
suited for natural naming into more convenient ones, so that you can
go on using file.root.group1.group2 syntax or
getattr().
You will also need to use _f_getChild() to
access an existing child node if you set a Python attribute in the
Group with the same name as that node (you will get
a NaturalNameWarning when doing this).
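The naming rules above can be illustrated with a small, self-contained sketch. It is not PyTables code: the helper names has_reserved_prefix and is_natural_name are hypothetical, and the identifier pattern assumes plain ASCII names. It only shows which names collide with the reserved prefixes and which names cannot be reached with the node.child syntax.

```python
import keyword
import re

# Prefixes reserved for Group members, as listed in the text above.
RESERVED_PREFIXES = ('_f_', '_g_', '_v_', '_c_')

# Pattern for an ASCII Python identifier (an assumption of this sketch).
_IDENTIFIER = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


def has_reserved_prefix(name):
    """Return True if `name` starts with a prefix reserved for Group members."""
    return name.startswith(RESERVED_PREFIXES)


def is_natural_name(name):
    """Return True if `name` could be written after a dot (node.child)."""
    if keyword.iskeyword(name):
        return False
    return _IDENTIFIER.match(name) is not None
```

With these helpers, 'class', '$a' and '44' are all rejected by is_natural_name(), so such children would have to be fetched with _f_getChild(), while a name like '_v_data' is flagged by has_reserved_prefix().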
The following instance variables are provided in addition to
those in Node (see Section 4.3):
The number of children hanging from this group.
Default filter properties for child nodes.
You can (and are encouraged to) use this property to
get, set and delete the FILTERS HDF5
attribute of the group, which stores a
Filters instance (see Section 4.14.1). When
the group has no such attribute, a default
Filters instance is used.
Dictionary with all groups hanging from this group.
Dictionary with all hidden nodes hanging from this group.
Dictionary with all leaves hanging from this group.
Dictionary with all nodes hanging from this group.
Caveat: The following methods are
documented for completeness, and they can be used without any
problem. However, you should use the high-level counterpart methods
in the File class (see Section 4.2), because they
are the ones most used in documentation and examples, and are a bit more
powerful than those exposed here.
The following methods are provided in addition to those in
Node (see Section 4.3):
Close this node in the tree.
This method has the behavior described in
Node._f_close() (see description). It should be noted that this
operation disables access to nodes descending from this group.
Therefore, if you want to explicitly close them, you will need to
walk the nodes hanging from this group before
closing it.
Copy this node and return the new one.
This method has the behavior described in
Node._f_copy() (see description). In addition, it recognizes the
following keyword arguments:
The new title for the destination. If omitted or
None, the original title is used. This
only applies to the topmost node in recursive copies.
Specifying this parameter overrides the original
filter properties in the source node. If specified, it must
be an instance of the Filters class (see
Section 4.14.1). The default is to copy the
filter properties from the source node.
You can prevent the user attributes from being copied
by setting this parameter to False. The
default is to copy them.
This argument may be used to collect statistics on the
copy process. When used, it should be a dictionary with keys
'groups', 'leaves' and
'bytes' having a numeric value. Their
values will be incremented to reflect the number of groups,
leaves and bytes, respectively, that have been copied during
the operation.
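As a rough illustration of the bookkeeping described for the statistics dictionary, the following sketch updates such a dictionary by hand. The helpers new_stats and record_copy are hypothetical and stand in for whatever PyTables does internally; only the three keys come from the text above.

```python
def new_stats():
    """Create a statistics dictionary with the three expected keys."""
    return {'groups': 0, 'leaves': 0, 'bytes': 0}


def record_copy(stats, is_group, nbytes=0):
    """Update `stats` in place for one copied node (hypothetical helper)."""
    if stats is None:
        return  # statistics were not requested
    if is_group:
        stats['groups'] += 1
    else:
        stats['leaves'] += 1
        stats['bytes'] += nbytes


stats = new_stats()
record_copy(stats, is_group=True)                # one group copied
record_copy(stats, is_group=False, nbytes=4096)  # a leaf of 4096 bytes
record_copy(stats, is_group=False, nbytes=1024)  # a leaf of 1024 bytes
```

Because the values are incremented rather than reset, the same dictionary can be passed to several copy operations to accumulate totals.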
Copy the children of this group into another group.
Children hanging directly from this group are copied into
dstgroup, which can be a
Group (see Section 4.4) object or its pathname in string
form. If createparents is true, the needed
groups for the given destination group path to exist will be
created.
The operation will fail with a NodeError
if there is a child node in the destination group with the same
name as one of the copied children from this one, unless
overwrite is true; in this case, the former
child node is recursively removed before copying the latter.
By default, nodes descending from children groups of this
node are not copied. If the recursive argument
is true, all descendant nodes of this node are recursively
copied.
Additional keyword arguments may be passed to customize the copying process. For instance, title and filters may be changed, user attributes may or may not be copied, data may be sub-sampled, stats may be collected, etc. Arguments unknown to nodes are simply ignored. Check the documentation for copying operations of nodes to see which options they support.
Get the child called childname of this
group.
If the child exists (be it visible or not), it is returned.
Else, a NoSuchNodeError is raised.
Using this method is recommended over
getattr() when doing programmatic accesses to
children if the childname is unknown beforehand
or when its name is not a valid Python identifier.
Iterate over children nodes.
Child nodes are yielded alphanumerically sorted by node
name. If the name of a class derived from Node
(see Section 4.3)
is supplied in the classname parameter, only
instances of that class (or subclasses of it) will be
returned.
This is an iterator version of
Group._f_listNodes() (see description).
Return a list with children nodes.
This is a list-returning version of
Group._f_iterNodes() (see description).
Recursively iterate over descendent groups (not leaves).
This method starts by yielding self, and then it goes on to recursively iterate over all child groups in alphanumerical order, top to bottom (preorder), following the same procedure.
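The traversal order can be sketched on a toy tree. This is not the PyTables implementation: groups are modelled as plain dictionaries, leaves as any other value, and Python's default string sorting stands in for the alphanumerical order. The function name walk_groups is hypothetical.

```python
def walk_groups(name, group):
    """Yield (name, group) pairs in preorder, children sorted by name."""
    yield name, group  # the starting group comes first
    for child_name in sorted(group):
        child = group[child_name]
        if isinstance(child, dict):  # only groups are visited, never leaves
            for item in walk_groups(child_name, child):
                yield item


# A toy object tree: dictionaries are groups, lists are leaves.
tree = {
    'detector': {'readout': [1, 2, 3], 'calib': {}},
    'columns': {'names': ['a', 'b']},
    'array1': [0.0, 1.0],
}

visited = [name for name, _ in walk_groups('/', tree)]
```

Here visited ends up as ['/', 'columns', 'detector', 'calib']: each group is yielded before its own child groups, and the leaves never appear.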
Iterate over descendent nodes.
This method recursively walks self top
to bottom (preorder), iterating over child groups in
alphanumerical order, and yielding nodes. If
classname is supplied, only instances of the
named class are yielded.
If classname is
Group, it behaves like
Group._f_walkGroups() (see the section called “_f_walkGroups()”), yielding only groups. If you
don't want a recursive behavior, use
Group._f_iterNodes() (see description) instead.
Example of use:
# Recursively print all the arrays hanging from '/'
print "Arrays in the object tree '/':"
for array in h5file.root._f_walkNodes('Array', recursive=True):
    print array

The methods described next automatically trigger
actions when a Group instance is accessed in a
special way.
This class defines the __setattr__,
__getattr__ and __delattr__
methods, and they set, get and delete ordinary Python
attributes as normally intended. In addition to that,
__getattr__ allows getting child
nodes by their name for the sake of easy interaction on
the command line, as long as there is no Python attribute with the
same name. Groups also allow the interactive completion (when using
readline) of the names of child nodes. For
instance:
nchild = group._v_nchildren  # get a Python attribute
# Add a Table child called 'table' under 'group'.
h5file.createTable(group, 'table', myDescription)
table = group.table          # get the table child instance
group.table = 'foo'          # set a Python attribute
# (PyTables warns you here about using the name of a child node.)
foo = group.table            # get a Python attribute
del group.table              # delete a Python attribute
table = group.table          # get the table child instance again
Is there a child with that name?
Returns a true value if the group has a child node (visible or hidden) with the given name (a string), false otherwise.
Delete a Python attribute called
name.
This method deletes an ordinary Python
attribute from the object. It does
not remove children nodes from this group;
for that, use File.removeNode() (see description) or
Node._f_remove() (see description). Nor does it
delete a PyTables node attribute; for that, use
File.delNodeAttr() (see description),
Node._f_delAttr() (see description) or Node._v_attrs
(see Section 4.3.2).
If there is an attribute and a child node with the same
name, the child node will be made accessible
again via natural naming.
Get a Python attribute or child node called
name.
If the object has a Python attribute called
name, its value is returned. Else, if the node
has a child node called name, it is returned.
Else, an AttributeError is raised.
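The lookup order just described (Python attribute first, then child node, then AttributeError) can be modelled with a toy class. ToyGroup is a hypothetical stand-in, not the PyTables Group class: it keeps its children in a plain dictionary and issues no NaturalNameWarning. It relies on the fact that Python calls __getattr__ only when normal attribute lookup has failed.

```python
class ToyGroup(object):
    """Toy stand-in for a group; NOT the PyTables implementation."""

    def __init__(self, children):
        # Children are kept in a plain dictionary for this sketch.
        self._v_children = dict(children)

    def __getattr__(self, name):
        # Reached only when no ordinary Python attribute called `name`
        # exists, so Python attributes always take precedence.
        children = self.__dict__.get('_v_children', {})
        if name in children:
            return children[name]
        raise AttributeError("no attribute or child named %r" % name)


group = ToyGroup({'table': '<Table node>'})
first = group.table      # the child node, found through __getattr__
group.table = 'foo'      # an ordinary Python attribute shadows the child
shadowed = group.table   # now the Python attribute is returned
del group.table          # removing the attribute...
restored = group.table   # ...makes the child reachable again
```

The three reads return the child, the string 'foo', and the child again, which mirrors the sequence shown in the earlier group.table example.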
Iterate over the child nodes hanging directly from the group.
This iterator is not recursive. Example of use:
# Non-recursively list all the nodes hanging from '/detector'
print "Nodes in '/detector' group:"
for node in h5file.root.detector:
    print node

Return a detailed string representation of the group.
Example of use:
>>> f = tables.openFile('data/test.h5')
>>> f.root.group0
/group0 (Group) 'First Group'
  children := ['tuple1' (Table), 'group1' (Group)]

Set a Python attribute called name with
the given value.
This method stores an ordinary Python
attribute in the object. It does
not store new children nodes under this
group; for that, use the File.create*() methods
(see the File class in Section 4.2). Nor does it
store a PyTables node attribute; for
that, use File.setNodeAttr() (see description),
Node._f_setAttr() (see description) or Node._v_attrs
(see Section 4.3.2).
If there is already a child node with the same
name, a NaturalNameWarning
will be issued and the child node will not be accessible via
natural naming or getattr(). It will still be
available via File.getNode() (see description), Group._f_getChild()
(see description) and children
dictionaries in the group (if visible).
Abstract base class for all PyTables leaves.
A leaf is a node (see the Node class in Section 4.3) which hangs
from a group (see the Group class in Section 4.4) but, unlike a
group, it can not have any further children below it (i.e. it is an
end node).
This definition includes all nodes which contain actual data
(datasets handled by the Table —see Section 4.6—,
Array —see Section 4.7—, CArray —see Section 4.8—,
EArray —see Section 4.9— and VLArray —see
Section 4.10—
classes) and unsupported nodes (the UnImplemented
class —Section 4.11) —these classes do in fact inherit from
Leaf.
These instance variables are provided in addition to those in
Node (see Section 4.3):
The byte ordering of the leaf data on disk.
The HDF5 chunk size for chunked leaves (a tuple).
This is read-only because you cannot change the chunk size of a leaf once it has been created.
The index of the enlargeable dimension (-1 if none).
Filter properties for this leaf —see
Filters in Section 4.14.1.
The type of data object read from this leaf.
It can be any of 'numpy',
'numarray', 'numeric' or
'python' (the set of supported flavors
depends on which packages you have installed on your
system).
You can (and are encouraged to) use this property to
get, set and delete the FLAVOR HDF5
attribute of the leaf. When the leaf has no such attribute,
the default flavor is used.
The dimension along which iterators work.
Its value is 0 (i.e. the first dimension) when the
dataset is not extendable, and self.extdim
(where available) for extendable ones.
The length of the main dimension of the leaf data.
The number of rows that fit in internal input buffers.
You can change this to fine-tune the speed or memory requirements of your application.
The shape of data in the leaf.
The following are just easier-to-write aliases to their
Node (see Section 4.3) counterparts (indicated between
parentheses):
The associated AttributeSet instance
—see Section 4.12— (Node._v_attrs).
The name of this node in the hosting HDF5 file
(Node._v_hdf5name).
The name of this node in its parent group
(Node._v_name).
A node identifier (may change from run to run)
(Node._v_objectID).
A description for this node
(Node._v_title).
Close this node in the tree.
This method is completely equivalent to
Leaf._f_close() (see description).
Copy this node and return the new one.
This method has the behavior described in
Node._f_copy() (see description). Please note that there is no
recursive flag since leaves do not have child
nodes. In addition, this method recognizes the following keyword
arguments:
The new title for the destination. If omitted or
None, the original title is used.
Specifying this parameter overrides the original
filter properties in the source node. If specified, it must
be an instance of the Filters class (see
Section 4.14.1). The default is to copy the
filter properties from the source node.
You can prevent the user attributes from being copied
by setting this parameter to False. The
default is to copy them.
Specify the range of rows to be copied; the default is to copy all the rows.
This argument may be used to collect statistics on the
copy process. When used, it should be a dictionary with keys
'groups', 'leaves' and
'bytes' having a numeric value. Their
values will be incremented to reflect the number of groups,
leaves and bytes, respectively, that have been copied during
the operation.
Delete a PyTables attribute from this node.
This method has the behavior described in
Node._f_delAttr() (see description).
Flush pending data to disk.
Saves any remaining buffered data to disk. It also
releases I/O buffers, so if you are filling many datasets in the
same PyTables session, call flush()
frequently to help PyTables keep memory requirements
low.
Get a PyTables attribute from this node.
This method has the behavior described in
Node._f_getAttr() (see description).
Is this node visible?
This method has the behavior described in
Node._f_isVisible() (see description).
Move or rename this node.
This method has the behavior described in
Node._f_move() (see description).
Rename this node in place.
This method has the behavior described in
Node._f_rename() (see description).
Remove this node from the hierarchy.
This method has the behavior described in
Node._f_remove() (see description). Please note that there is no
recursive flag since leaves do not have child
nodes.
Set a PyTables attribute for this node.
This method has the behavior described in
Node._f_setAttr() (see description).
Close this node in the tree.
This method has the behavior described in
Node._f_close() (see description). Besides that, the optional argument
flush tells whether to flush pending data to
disk or not before closing.
This class represents heterogeneous datasets in an HDF5 file.
Tables are leaves (see the Leaf class in
Section 4.5) whose
data consists of a unidimensional sequence of
rows, where each row contains one or more
fields. Fields have an associated unique
name and position, with the
first field having position 0. All rows have the same fields, which
are arranged in columns.
Fields can have any type supported by the Col
class (see Section 4.13.2)
and its descendants, which support multidimensional data. Moreover, a
field can be nested (to an arbitrary depth),
meaning that it includes further fields inside. A field named
x inside a nested field a in a
table can be accessed as the field a/x (its
path name) from the table.
The structure of a table is declared by its description, which
is made available in the Table.description
attribute (see Section 4.6.1).
This class provides new methods to read, write and search table data efficiently. It also provides special Python methods to allow accessing the table as a normal sequence or array (with extended slicing supported).
PyTables supports in-kernel searches
working simultaneously on several columns using complex conditions.
These are faster than selections using Python expressions. See the
Table.where() method (see description) for more information on in-kernel searches.
See also Section 5.2.1
for a detailed review of the advantages and shortcomings of in-kernel
searches.
Non-nested columns can be indexed. Searching an indexed column can be several times faster than searching a non-indexed one. Search methods automatically take advantage of indexing where available.
Note: Column indexing is only available in PyTables Pro.
When iterating a table, an object from the
Row (see Section 4.6.7) class is used. This object allows
reading and writing data one row at a time, as well as performing queries
which are not supported by in-kernel syntax (at a much lower speed, of
course).
See the tutorial sections in Chapter 3 on how to use the Row interface.
Objects of this class support access to individual columns via
natural naming through the
Table.cols accessor (see Section 4.6.1).
Nested columns are mapped to Cols instances, and
non-nested ones to Column instances. See the
Column class in Section 4.6.9 for examples of this feature.
The following instance variables are provided in addition to
those in Leaf (see Section 4.5). Please note that there are several
col* dictionaries to ease retrieving information
about a column directly by its path name, avoiding the need to walk
through Table.description or
Table.cols.
Automatically keep column indexes up to date?
Setting this value states whether existing indexes should be automatically updated after an append operation or recomputed after an index-invalidating operation (i.e. removal and modification of rows). The default is true.
This value takes effect whenever a column is
altered. If you don't have automatic indexing activated and
you want to do an immediate update, use
Table.flushRowsToIndex() (see Section ); for immediate reindexing of invalidated indexes, use
Table.reIndexDirty() (see Section ).
This value is persistent.
Note: Column indexing is only available in PyTables Pro.
Maps the name of a column to its Col
description (see Section 4.13.2).
Maps the name of a column to its default value.
Maps the name of a column to its NumPy data type.
Is the column whose name is used as a key indexed?
Note: Column indexing is only available in PyTables Pro.
Maps the name of a column to its
Column (see Section 4.6.9) or
Cols (see Section 4.6.8) instance.
A list containing the names of top-level columns in the table.
A list containing the pathnames of bottom-level columns in the table.
These are the leaf columns obtained when walking the
table description left-to-right, bottom-first. Columns inside
a nested column have slashes (/) separating
name components in their pathname.
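To make the relation between top-level names and bottom-level pathnames concrete, here is a small sketch that flattens a nested description by hand. It does not use PyTables: the description is modelled as a list of (name, spec) pairs whose order stands for the column positions, a nested column is a spec that is itself such a list, and the function name leaf_pathnames and the type strings are purely illustrative.

```python
def leaf_pathnames(description, prefix=''):
    """Return the pathnames of leaf columns, walking left to right."""
    paths = []
    for name, spec in description:
        path = prefix + name
        if isinstance(spec, list):   # nested column: descend into it
            paths.extend(leaf_pathnames(spec, path + '/'))
        else:                        # non-nested (leaf) column
            paths.append(path)
    return paths


# A toy table description with two levels of nesting.
description = [
    ('id', 'int32'),
    ('a', [('x', 'float64'), ('y', 'float64')]),
    ('info', [('name', 'string'),
              ('pos', [('lat', 'float32'), ('lon', 'float32')])]),
]

top_level = [name for name, _ in description]
pathnames = leaf_pathnames(description)
```

In this toy case top_level is ['id', 'a', 'info'], while pathnames contains only the leaf columns, such as 'a/x' and 'info/pos/lat', with slashes separating the name components.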
A Cols instance that provides
natural naming access to non-nested
(Column, see Section 4.6.9) and
nested (Cols, see Section 4.6.8)
columns.
Maps the name of a column to its PyTables data type.
A Description instance (see Section 4.6.6)
reflecting the structure of the table.
The index of the enlargeable dimension (always 0 for tables).
Does this table have any indexed columns?
Note: Column indexing is only available in PyTables Pro.
List of the pathnames of indexed columns in the table.
Note: Column indexing is only available in PyTables Pro.
Filters used to compress indexes.
Setting this value to a Filters (see
Section 4.14.1) instance determines the
compression to be used for indexes. Setting it to
None means that no filters will be used for