Natural Browsing by Using Natural Naming

Considerable effort has gone into making browsing the hierarchical data structure a pleasant experience. To this end, PyTables implements the natural naming convention, i.e. the attributes of a group object have the same names as its child nodes. This is best understood by looking at some examples.
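Before turning to real sessions, the idea can be sketched in plain Python. The toy class below is only an illustration of the convention and not PyTables' actual implementation; the class name Group, its add method and the node names are all assumptions made for this sketch. Children become reachable as attributes through __getattr__, and __dir__ advertises them so that TAB completion in an interactive shell can list them.

```python
class Group:
    """Toy hierarchy node (hypothetical; not the PyTables Group class)."""

    def __init__(self, name):
        self.name = name
        self._children = {}

    def add(self, child):
        # Register a child node under its own name and return it.
        self._children[child.name] = child
        return child

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails:
        # fall back to looking the name up among the children.
        try:
            return self.__dict__["_children"][attr]
        except KeyError:
            raise AttributeError(attr) from None

    def __dir__(self):
        # Expose child names so interactive completion can list them.
        return sorted(set(super().__dir__()) | set(self._children))


root = Group("/")
columns = root.add(Group("columns"))
detector = root.add(Group("detector"))
pressure = columns.add(Group("pressure"))

# Attribute access walks the hierarchy, one child per dot.
assert root.columns.pressure is pressure
public_names = [n for n in dir(root) if not n.startswith("_")]
print(public_names)
```

The point of the design is that the path /columns/pressure and the expression root.columns.pressure name the same node, so no separate lookup call is needed while exploring.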

Example 1

Let's suppose that we want to access the data in the dataset /columns/pressure in a file named, say, "tutorial1.h5". We will use the powerful IPython shell for an improved interactive experience.

    In [1]: import tables
    # Open the file
    In [2]: fileh=tables.openFile("tutorial1.h5")
    # Look at the children hanging from the root group (hit TAB twice)
    In [3]: fileh.root.
    fileh.root.columns   fileh.root.detector
    # Access the columns group
    In [3]: fileh.root.columns
    Out[3]:
    /columns (Group) 'Pressure and Name'
      children := ['pressure' (Array), 'name' (Array)]
    # We can see that there are two arrays hanging from there.
    # Get the metadata of the pressure leaf.
    In [4]: fileh.root.columns.pressure
    Out[4]:
    /columns/pressure (Array(3L,)) 'Pressure column selection'
      type = Float64
      stype = 'Float64'
      shape = (3L,)
      itemsize = 8
      nrows = 3
      flavor = 'NumArray'
      byteorder = 'little'
    # Good, we see that it is a floating-point vector with 3 entries.
    # Read its contents:
    In [5]: fileh.root.columns.pressure[:]
    Out[5]: array([ 1. ,  2.1,  2. ])
    # We get the data in an array container (memory efficient)
    # Easy, eh?
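For readers who want to try this outside an interactive shell, here is a minimal script sketch. It assumes a PyTables 3.x installation, where the functions are spelled open_file, create_group, create_array and get_node rather than the openFile used in the session above, and it assumes NumPy as the array container. The file name tutorial1_sketch.h5 and the temporary directory are made up for the example; the script builds only the /columns/pressure part of the hierarchy.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical file location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "tutorial1_sketch.h5")

# Build a small file containing the group /columns and the array
# /columns/pressure, with the same titles as in the session above.
with tables.open_file(path, mode="w") as h5:
    columns = h5.create_group("/", "columns", "Pressure and Name")
    h5.create_array(columns, "pressure", np.array([1.0, 2.1, 2.0]),
                    "Pressure column selection")

# Reopen it read-only and reach the leaf through natural naming.
with tables.open_file(path, mode="r") as h5:
    pressure = h5.root.columns.pressure
    values = pressure[:].tolist()  # read the whole array into memory
    # The same node can also be fetched by its path string.
    node_path = h5.get_node("/columns/pressure")._v_pathname

print(values)
print(node_path)
```

Natural naming is most convenient interactively; in scripts where the path is only known at run time, fetching the node by its path string with get_node is the usual alternative.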

Example 2

Let's now look at the contents of heterogeneous datasets (i.e. tables):

    In [1]: import tables

    In [2]: fileh=tables.openFile("tutorial1.h5")

    In [3]: fileh.root.
    fileh.root.columns   fileh.root.detector

    In [3]: fileh.root.detector.readout
    Out[3]:
    /detector/readout (Table(10L,)) 'Readout example'
      description := {
      "ADCcount": Col(dtype='UInt16', shape=1, dflt=0, pos=0, indexed=False),
      "TDCcount": Col(dtype='UInt8', shape=1, dflt=0, pos=1, indexed=False),
      "energy": Col(dtype='Float64', shape=1, dflt=0.0, pos=2, indexed=False),
      "grid_i": Col(dtype='Int32', shape=1, dflt=0, pos=3, indexed=False),
      "grid_j": Col(dtype='Int32', shape=1, dflt=0, pos=4, indexed=False),
      "idnumber": Col(dtype='Int64', shape=1, dflt=0L, pos=5, indexed=False),
      "name": StringCol(length=16, dflt=CharArray(['']), shape=1, pos=6, indexed=False),
      "pressure": Col(dtype='Float32', shape=1, dflt=0.0, pos=7, indexed=False)}
      byteorder := little
    # Look at the contents of readout/ADCcount
    In [5]: fileh.root.detector.readout.cols.ADCcount[:]
    Out[5]: array([   0,    1,  512,  768,    2, 2560, 2816, 3072, 3328, 3584], type=UInt16)
    # Get the values from row 1 up to (but not including) row 10, taking every second value:
    In [6]: fileh.root.detector.readout.cols.ADCcount[1:10:2]
    Out[6]: array([   1,  768, 2560, 3072, 3584], type=UInt16)
    # The combination of natural naming and extended slicing is great, isn't it?
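The slice [1:10:2] used on the column follows the ordinary Python extended-slicing rules (start, stop, step). As a small sketch, the same slice applied to a plain Python list holding the ADCcount values shown above picks the same elements; the variable names are chosen only for this illustration.

```python
# ADCcount values copied from the session output above.
adc_counts = [0, 1, 512, 768, 2, 2560, 2816, 3072, 3328, 3584]

# start=1, stop=10 (exclusive), step=2
picked = adc_counts[1:10:2]

# The row indices that the slice selects for a sequence of this length.
indices = list(range(*slice(1, 10, 2).indices(len(adc_counts))))

print(indices)  # [1, 3, 5, 7, 9]
print(picked)   # [1, 768, 2560, 3072, 3584]
```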

Example 3

You can dig into the contents of tables even when they have nested columns. Look at the following interactive session:

    In [1]: import tables
    # Open the file that has a table with nested fields.
    In [2]: fileh=tables.openFile("nested1.h5")
    # Print out the metadata of this table
    In [3]: fileh.root.table
    Out[3]:
    /table (Table(10L,)) ''
      description := {
      "x": Col(dtype='Int32', shape=(2,), dflt=0, pos=0, indexed=False),
      "info": {
        "value": Col(dtype='Float64', shape=1, dflt=0.0, pos=0, indexed=False),
        "y2": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=1, indexed=False),
        "info2": {
          "info3": {
            "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=0, indexed=False),
            "value": TimeCol(dflt=0.0, shape=1, itemsize=8, pos=1, indexed=False),
            "y4": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=2, indexed=False),
            "z4": Col(dtype='UInt8', shape=1, dflt=1, pos=3, indexed=False)},
          "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=1, indexed=False),
          "value": EnumCol(Enum({'blue': 2L, 'green': 1L, 'red': 0L}), 'blue', dtype='UInt32', shape=(1,), pos=2, indexed=False),
          "y3": Col(dtype='Float64', shape=(2, 3), dflt=1.0, pos=3, indexed=False),
          "z3": Col(dtype='UInt8', shape=1, dflt=1, pos=4, indexed=True)},
        "name": StringCol(length=10, dflt=CharArray(['']), shape=1, pos=3, indexed=False),
        "z2": Col(dtype='UInt8', shape=1, dflt=1, pos=4, indexed=False)},
      "Info": {
        "Name": Col(dtype='UInt32', shape=1, dflt=0L, pos=0, indexed=False),
        "Value": Col(dtype='Float64', shape=1, dflt=0.0, pos=1, indexed=False)},
      "color": EnumCol(Enum({'blue': 2L, 'green': 1L, 'red': 0L}), 'red', dtype='UInt32', shape=(2,), pos=3, indexed=False),
      "y": Col(dtype='Float64', shape=(2, 3), dflt=1.2, pos=4, indexed=False),
      "z": Col(dtype='UInt8', shape=1, dflt=1, pos=5, indexed=False)}
      indexprops := IndexProps(auto=1, reindex=1, filters=Filters(complevel=1, complib='zlib', shuffle=1, fletcher32=0))
      byteorder := little
    # Wow, a lot of columns (as well as nested ones)
    # Let's see the top-level columns by asking the .cols accessor about
    # its attributes (press TAB twice):
    In [6]: p fileh.root.table.cols.
    fileh.root.table.cols.Info   fileh.root.table.cols.x
    fileh.root.table.cols.color  fileh.root.table.cols.y
    fileh.root.table.cols.info   fileh.root.table.cols.z
    # Good. Now, look into Info and press TAB again
    In [6]: p fileh.root.table.cols.Info.
    fileh.root.table.cols.Info.Name   fileh.root.table.cols.Info.Value
    # We have two levels of nesting.
    # Let's see the metadata for the column Info/Value
    In [7]: p fileh.root.table.cols.Info.Value
    /table.cols.Info.Value (Column(1,), Float64, idx=None)
    # Let's see its contents
    In [8]: p fileh.root.table.cols.Info.Value[:]
    [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
    # i.e. all the values are zero.
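The session shows that a nested column written as a path, Info/Value, corresponds to one attribute lookup per path component, cols.Info.Value. The following sketch illustrates that correspondence with plain Python objects standing in for the .cols accessor. The stand-in object and the helper resolve are hypothetical and exist only for this illustration; they are not part of PyTables.

```python
from functools import reduce
from types import SimpleNamespace

# Plain-Python stand-in for part of the .cols accessor shown above.
cols = SimpleNamespace(
    Info=SimpleNamespace(Name=[0] * 10, Value=[0.0] * 10),
    z=[1] * 10,
)


def resolve(accessor, path):
    """Follow a '/'-separated column path with one getattr per component."""
    return reduce(getattr, path.split("/"), accessor)


# Both spellings reach the same object.
assert resolve(cols, "Info/Value") is cols.Info.Value
print(resolve(cols, "Info/Value"))
```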

More examples

PyTables also implements a few easy-to-use methods for browsing the hierarchy. See the tutorials chapter in the documentation for more details.



NaturalNaming (last edited 2008-04-21 11:12:45 by localhost)