Natural Browsing by Using Natural Naming
Considerable effort has gone into making browsing the hierarchical data structure a pleasant experience. To this end, PyTables implements the natural naming convention: the attributes of a group object have the same names as its child nodes. This is best understood by looking at some examples.
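To make the idea concrete, here is a minimal sketch of how natural naming can be implemented in plain Python. It is an illustration only, not PyTables' actual code: the Group class, its _add helper and the sample values are all hypothetical. Child nodes are stored in a dictionary and exposed as attributes through __getattr__, while __dir__ makes them visible to dir(), which interactive shells typically consult for TAB completion.

```python
class Group:
    """Toy stand-in for a hierarchical group (hypothetical, not the PyTables class)."""

    def __init__(self, name):
        self._name = name
        self._children = {}  # child name -> child node

    def _add(self, name, node):
        """Attach a child node under the given name and return it."""
        self._children[name] = node
        return node

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so real
        # attributes such as _name and _children are found first.
        try:
            return self.__dict__["_children"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Add the child names to the regular attribute listing.
        return sorted(set(super().__dir__()) | set(self._children))


root = Group("/")
columns = root._add("columns", Group("columns"))
columns._add("pressure", [1.0, 2.1, 2.0])

# The child nodes are now reachable as plain attributes.
print(root.columns.pressure)  # [1.0, 2.1, 2.0]
```

In this sketch the helper names start with an underscore so that they are unlikely to collide with the names of child nodes.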
Example 1
Suppose we want to access the data in the dataset /columns/pressure in a file named, say, "tutorial1.h5". We will use the IPython shell for a better interactive experience.
In [1]: import tables
# Open the file
In [2]: fileh=tables.openFile("tutorial1.h5")
# Look at the children hanging from the root group (hit TAB twice)
In [3]: fileh.root.
fileh.root.columns fileh.root.detector
# Access the columns group
In [3]: fileh.root.columns
Out[3]:
/columns (Group) 'Pressure and Name'
  children := ['pressure' (Array), 'name' (Array)]
# We can see that there are two arrays hanging from there.
# Get the metadata of the pressure leaf.
In [4]: fileh.root.columns.pressure
Out[4]:
/columns/pressure (Array(3L,)) 'Pressure column selection'
  type = Float64
  stype = 'Float64'
  shape = (3L,)
  itemsize = 8
  nrows = 3
  flavor = 'NumArray'
  byteorder = 'little'
# Good, we see that it is a floating point vector with 3 entries
# Read its contents:
In [5]: fileh.root.columns.pressure[:]
Out[5]: array([ 1. , 2.1, 2. ])
# We get the data in an array container (memory efficient)
# Easy, eh?
Example 2
Let's now look at the contents of heterogeneous datasets (i.e. tables):
In [1]: import tables

In [2]: fileh=tables.openFile("tutorial1.h5")

In [3]: fileh.root.
fileh.root.columns fileh.root.detector

In [3]: fileh.root.detector.readout
Out[3]:
/detector/readout (Table(10L,)) 'Readout example'
  description := {
    "ADCcount": Col(dtype='UInt16', shape=1, dflt=0, pos=0, indexed=False),
    "TDCcount": Col(dtype='UInt8', shape=1, dflt=0, pos=1, indexed=False),
    "energy": Col(dtype='Float64', shape=1, dflt=0.0, pos=2, indexed=False),
    "grid_i": Col(dtype='Int32', shape=1, dflt=0, pos=3, indexed=False),
    "grid_j": Col(dtype='Int32', shape=1, dflt=0, pos=4, indexed=False),
    "idnumber": Col(dtype='Int64', shape=1, dflt=0L, pos=5, indexed=False),
    "name": StringCol(length=16, dflt=CharArray(['']), shape=1, pos=6, indexed=False),
    "pressure": Col(dtype='Float32', shape=1, dflt=0.0, pos=7, indexed=False)}
  byteorder := little
# Look at the contents of readout/ADCcount
In [5]: fileh.root.detector.readout.cols.ADCcount[:]
Out[5]: array([ 0, 1, 512, 768, 2, 2560, 2816, 3072, 3328, 3584], type=UInt16)
# Get the values from row 1 up to (but not including) row 10, taking every second value:
In [6]: fileh.root.detector.readout.cols.ADCcount[1:10:2]
Out[6]: array([ 1, 768, 2560, 3072, 3584], type=UInt16)
# The combination of natural naming and extended slicing is great, isn't it?
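The extended slice used in the last step follows ordinary Python slicing rules (start, stop, step). As a small sketch, the same selection can be reproduced on a plain Python list holding the ADCcount values printed in the session; using a list instead of a table column is an assumption made here so that the example runs without a data file.

```python
# Values copied from the ADCcount output of the session.
adc_count = [0, 1, 512, 768, 2, 2560, 2816, 3072, 3328, 3584]

# start=1, stop=10 (exclusive), step=2 selects indices 1, 3, 5, 7 and 9.
every_other = adc_count[1:10:2]
print(every_other)  # [1, 768, 2560, 3072, 3584]

# The same selection written with an explicit slice object.
same = adc_count[slice(1, 10, 2)]
print(same == every_other)  # True
```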
Example 3
You can dig into the contents of tables even when they have nested columns. Look at the following interactive session:
In [1]: import tables
# Open the file that has a table with nested fields.
In [2]: fileh=tables.openFile("nested1.h5")
# Print out the meta-information about this table
In [3]: fileh.root.table
Out[3]:
/table (Table(10L,)) ''
  description := {
    "x": Col(dtype='Int32', shape=(2,), dflt=0, pos=0, indexed=False),
    "info": {
      "value": Col(dtype='Float64', shape=1, dflt=0.0, pos=0, indexed=False),
      "y2": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=1, indexed=False),
      "info2": {
        "info3": {
          "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=0, indexed=False),
          "value": TimeCol(dflt=0.0, shape=1, itemsize=8, pos=1, indexed=False),
          "y4": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=2, indexed=False),
          "z4": Col(dtype='UInt8', shape=1, dflt=1, pos=3, indexed=False)},
        "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=1, indexed=False),
        "value": EnumCol(Enum({'blue': 2L, 'green': 1L, 'red': 0L}), 'blue', dtype='UInt32', shape=(1,), pos=2, indexed=False),
        "y3": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=3, indexed=False),
        "z3": Col(dtype='UInt8', shape=1, dflt=1, pos=4, indexed=True)},
      "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=3, indexed=False),
      "z2": Col(dtype='UInt8', shape=1, dflt=1, pos=4, indexed=False)},
    "Info": {
      "Name": Col(dtype='UInt32', shape=1, dflt=0L, pos=0, indexed=False),
      "Value": Col(dtype='Float64', shape=1, dflt=0.0, pos=1, indexed=False)},
    "color": EnumCol(Enum({'blue': 2L, 'green': 1L, 'red': 0L}), 'red', dtype='UInt32', shape=(2,), pos=3, indexed=False),
    "y": Col(dtype='Float64', shape=(2, 3), dflt=1.2, pos=4, indexed=False),
    "z": Col(dtype='UInt8', shape=1, dflt=1, pos=5, indexed=False)}
  indexprops := IndexProps(auto=1, reindex=1, filters=Filters(complevel=1, complib='zlib', shuffle=1, fletcher32=0))
  byteorder := little
# Wow, a lot of columns (as well as nested ones)
# Let's see the top-level columns by asking the .cols accessor for
# its attributes (press TAB twice):
In [6]: p fileh.root.table.cols.
fileh.root.table.cols.Info fileh.root.table.cols.x
fileh.root.table.cols.color fileh.root.table.cols.y
fileh.root.table.cols.info fileh.root.table.cols.z
# Good. Now, look into Info and press TAB again
In [6]: p fileh.root.table.cols.Info.
fileh.root.table.cols.Info.Name fileh.root.table.cols.Info.Value
# We have two levels of nesting.
# Let's see the meta-information for the column Info/Value
In [7]: p fileh.root.table.cols.Info.Value
/table.cols.Info.Value (Column(1,), Float64, idx=None)
# Let's see its contents
In [8]: p fileh.root.table.cols.Info.Value[:]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
# i.e. all the values are zero.
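With this many nested columns it can help to see every leaf column as a single path. The following is a rough sketch in plain Python, not a PyTables feature: the leaf_paths function is hypothetical, and the dictionary is a trimmed-down stand-in for the description printed in the session, with strings in place of the real column definitions. It walks the nested structure recursively and yields one slash-separated path per leaf column.

```python
def leaf_paths(description, prefix=""):
    """Yield a slash-separated path for every leaf column in a nested dict."""
    for name, entry in description.items():
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(entry, dict):
            # Nested column: descend one level.
            yield from leaf_paths(entry, path)
        else:
            yield path


# Trimmed-down, hypothetical version of the table description; the
# string values only stand in for the real column definitions.
description = {
    "x": "Int32",
    "info": {
        "value": "Float64",
        "info2": {"info3": {"name": "String", "z4": "UInt8"}, "z3": "UInt8"},
    },
    "Info": {"Name": "UInt32", "Value": "Float64"},
}

paths = list(leaf_paths(description))
for path in paths:
    print(path)
```

Note that names are case-sensitive, so info and Info are kept as two separate nested columns, just as in the session.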
More examples
PyTables also implements a few easy-to-use methods for browsing the hierarchy. See the tutorials chapter in the documentation for more details.
