Contents
User's Manual
The online documentation, including tutorials, is available in HTML and PDF formats.
Frequently Asked Questions
Questions about PyTables? Be sure to read the FAQ section before asking on the list.
Mailing Lists
If you have a problem or question that the FAQ and manual can't answer, you may want to ask on the users' mailing list to see whether other users can help. You can also search the archives of the users' list to check whether your question has already been discussed.
You may prefer to subscribe only to the PyTables announcement list. This is a very low-traffic list where only messages about new releases of PyTables and related software are posted.
Videos
These videos are part of a series that introduces the main features of PyTables in a visual, easy-to-grasp manner. More videos will be made available over time:
PyTables, part I: Introduction: HDF5 file creation, the object tree, homogeneous array storage, natural naming, working with attributes.
PyTables, part II: Working with tables: Creation of tables with multidimensional and nested columns, and how to efficiently query them.
Hints for SQL users
If you are a seasoned user of SQL or relational databases, you may be interested in HintsForSQLUsers, a gentle introduction and cookbook to PyTables based on the concepts of SQL and RDBMS.
Presentations
Here are the slides of some presentations about PyTables that you may find useful:
An Overview of Future Improvements to OPSI. Informal talk given at the THG headquarters in Urbana-Champaign, Illinois, USA (October 2007).
Finding Needles in a Huge DataStack. Talk given at the EuroPython 2006 Conference, held at CERN, Genève, Switzerland (July 2006).
Presentation given at the HDF Workshop 2005, held at San Francisco, USA (December 2005).
I and II Workshop in Free Software and Scientific Computing given at the Universitat Jaume I, Castelló, Spain (October 2004). In Catalan.
Presentation given at the SciPy Workshop 2004, held at Caltech, Pasadena, USA (September 2004).
Slides of presentation given at EuroPython Conference in Charleroi, Belgium (June 2003).
Presentation for the iParty5 held at Castelló, Spain (May 2003). In Spanish.
Talk on PyTables given at the PyCon 2003 Convention, held in Washington, USA (March 2003).
Reports
White Paper on OPSI indexes, explaining the new and powerful indexing engine in PyTables Pro.
Performance study on how the new object tree cache introduced in PyTables 1.2 can accelerate the opening of files with a large number of objects, while using considerably less memory.
Paper version of the presentation at PyCon2003.
User Contributed Documents
You may also want to check the UserDocuments page, where PyTables users post documents that explain how they have dealt with their own problems. If you want to add your own document, you are more than welcome!
Usage Examples
Besides the tutorials in the documentation above, you can see several simple examples here.
Getting Started
Here is a short script that creates a table in a group.
from tables import *

# Define a user record to characterize some kind of particles
class Particle(IsDescription):
    name = StringCol(16)      # 16-character String
    idnumber = Int64Col()     # Signed 64-bit integer
    ADCcount = UInt16Col()    # Unsigned short integer
    TDCcount = UInt8Col()     # Unsigned byte
    grid_i = Int32Col()       # Integer
    grid_j = Int32Col()       # Integer
    pressure = Float32Col()   # Float (single-precision)
    energy = FloatCol()       # Double (double-precision)

filename = "test.h5"
# Open a file in "w"rite mode
h5file = openFile(filename, mode = "w", title = "Test file")
# Create a new group under "/" (root)
group = h5file.createGroup("/", 'detector', 'Detector information')
# Create one table on it
table = h5file.createTable(group, 'readout', Particle, "Readout example")
# Fill the table with 10 particles
particle = table.row
for i in xrange(10):
    particle['name'] = 'Particle: %6d' % (i)
    particle['TDCcount'] = i % 256
    particle['ADCcount'] = (i * 256) % (1 << 16)
    particle['grid_i'] = i
    particle['grid_j'] = 10 - i
    particle['pressure'] = float(i*i)
    particle['energy'] = float(particle['pressure'] ** 4)
    particle['idnumber'] = i * (2 ** 34)
    # Insert a new particle record
    particle.append()
# Close (and flush) the file
h5file.close()
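The script above is written for the camelCase API (openFile, createGroup, createTable) and the Python 2 idioms of older PyTables releases. As a rough sketch only, assuming PyTables 3.x on Python 3, the same table might be written with the snake_case names open_file, create_group and create_table. The helper functions write_particles and read_energies, and the temporary directory used for the file, are illustrative choices that do not appear in the original example. The energy is computed from a plain Python float here, rather than from the single-precision value read back from the row, to keep the arithmetic in double precision.

```python
import os
import tempfile

import tables  # assumption: PyTables 3.x is installed


class Particle(tables.IsDescription):
    name = tables.StringCol(16)      # 16-byte string
    idnumber = tables.Int64Col()     # signed 64-bit integer
    ADCcount = tables.UInt16Col()    # unsigned 16-bit integer
    TDCcount = tables.UInt8Col()     # unsigned byte
    grid_i = tables.Int32Col()       # 32-bit integer
    grid_j = tables.Int32Col()       # 32-bit integer
    pressure = tables.Float32Col()   # single-precision float
    energy = tables.Float64Col()     # double-precision float


def write_particles(filename, n=10):
    """Hypothetical helper: create /detector/readout and fill it with n rows."""
    with tables.open_file(filename, mode="w", title="Test file") as h5file:
        group = h5file.create_group("/", "detector", "Detector information")
        table = h5file.create_table(group, "readout", Particle, "Readout example")
        particle = table.row
        for i in range(n):
            pressure = float(i * i)
            particle["name"] = b"Particle: %6d" % i  # string columns hold bytes
            particle["TDCcount"] = i % 256
            particle["ADCcount"] = (i * 256) % (1 << 16)
            particle["grid_i"] = i
            particle["grid_j"] = 10 - i
            particle["pressure"] = pressure
            particle["energy"] = pressure ** 4
            particle["idnumber"] = i * (2 ** 34)
            particle.append()
        table.flush()  # write buffered rows to disk before the file is closed


def read_energies(filename, condition="pressure > 10"):
    """Hypothetical helper: return the energies of the rows matching condition."""
    with tables.open_file(filename, mode="r") as h5file:
        table = h5file.root.detector.readout
        return [float(row["energy"]) for row in table.where(condition)]


# The file location is an assumption made to keep the sketch self-contained.
filename = os.path.join(tempfile.mkdtemp(), "test.h5")
write_particles(filename)
energies = read_energies(filename)
print(energies)
```

Using the file as a context manager closes it even if an exception is raised while filling the table, which is why no explicit close() call appears in this variant.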
Browsing the object tree
You can browse the contents of the file that we have created above. For this, we will use the convenient IPython shell.
In [1]:import tables
In [2]:f=tables.openFile("test.h5")
In [3]:f.root
Out[3]:
/ (RootGroup) 'Test file'
children := ['detector' (Group)]
In [4]:f.root.detector
Out[4]:
/detector (Group) 'Detector information'
children := ['readout' (Table)]
In [5]:f.root.detector.readout
Out[5]:
/detector/readout (Table(10L,)) 'Readout example'
description := {
"ADCcount": Col(dtype='UInt16', shape=1, dflt=0, pos=0, indexed=False),
"TDCcount": Col(dtype='UInt8', shape=1, dflt=0, pos=1, indexed=False),
"energy": Col(dtype='Float64', shape=1, dflt=0.0, pos=2, indexed=False),
"grid_i": Col(dtype='Int32', shape=1, dflt=0, pos=3, indexed=False),
"grid_j": Col(dtype='Int32', shape=1, dflt=0, pos=4, indexed=False),
"idnumber": Col(dtype='Int64', shape=1, dflt=0L, pos=5, indexed=False),
"name": StringCol(length=16, dflt=CharArray(['']), shape=1, pos=6, indexed=False),
"pressure": Col(dtype='Float32', shape=1, dflt=0.0, pos=7, indexed=False)}
byteorder := little
In [6]:f.root.detector.readout.attrs.TITLE
Out[6]:'Readout example'
Getting actual data
Here you can see how to get actual data from the readout table. Slicing and field selection are shown.
In [7]:p f.root.detector.readout[1]
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0)
In [8]:p f.root.detector.readout[1:3]
NestedRecArray[
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0),
(512, 2, 256.0, 2, 8, 34359738368L, 'Particle: 2', 4.0)
]
In [9]:p f.root.detector.readout[1::3]
NestedRecArray[
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0),
(1024, 4, 65536.0, 4, 6, 68719476736L, 'Particle: 4', 16.0),
(1792, 7, 5764801.0, 7, 3, 120259084288L, 'Particle: 7', 49.0)
]
In [10]:p f.root.detector.readout[1::3].field('energy')
[ 1.00000000e+00 6.55360000e+04 5.76480100e+06]
In [11]:f.root.detector.readout.cols.energy[:]
Out[11]:
array([ 0.00000000e+00, 1.00000000e+00, 2.56000000e+02,
6.56100000e+03, 6.55360000e+04, 3.90625000e+05,
1.67961600e+06, 5.76480100e+06, 1.67772160e+07,
4.30467210e+07])
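Selecting values
The queries below use ro as a shorthand for the readout table (presumably defined as ro = f.root.detector.readout).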
In [12]:p [row['energy'] for row in ro.where('pressure > 10')]
[65536.0, 390625.0, 1679616.0, 5764801.0, 16777216.0, 43046721.0]
In [13]:p [row['name'] for row in ro.where('energy < 10**6')]
['Particle: 0', 'Particle: 1', 'Particle: 2', 'Particle: 3', 'Particle: 4', 'Particle: 5']
In [14]:p [row['energy'] for row in ro.where('pressure > 10')]
[65536.0, 390625.0, 1679616.0, 5764801.0, 16777216.0, 43046721.0]
In [15]:sum(row['energy'] for row in ro.where('pressure > 10'))
Out[15]:67724515.0
In [16]:[row['energy'] for row in ro.where('pressure > 10')
....: if row['energy'] < 10**7 and row['TDCcount'] < 6 ]
Out[16]:[65536.0, 390625.0]
In [17]:sum(row['energy'] for row in ro.where('(pressure > 10) & (energy < 10**7)')
....: if row['TDCcount'] < 6 )
Out[17]:456161.0
In [18]:[row.nrow() for row in ro.where('(pressure > 10) & (energy < 10**7) & (TDCcount < 6)')]
Out[18]:[4L, 5L]
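To make it easier to see which rows each condition selects, the following sketch rebuilds the ten records as plain Python dictionaries and applies the same conditions with ordinary predicates. It does not use PyTables at all; the records list and the select helper are illustrative names, and the field values simply follow the formulas from the Getting Started script.

```python
# Plain-Python model of the ten records written in "Getting Started".
# This only mirrors the arithmetic; it is not how PyTables evaluates queries.
records = [
    {
        "nrow": i,
        "name": "Particle: %6d" % i,
        "TDCcount": i % 256,
        "ADCcount": (i * 256) % (1 << 16),
        "pressure": float(i * i),
        "energy": float(i * i) ** 4,
    }
    for i in range(10)
]


def select(rows, predicate):
    """Hypothetical helper: keep the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Counterpart of ro.where('pressure > 10')
high_pressure = select(records, lambda r: r["pressure"] > 10)
energies = [r["energy"] for r in high_pressure]
total = sum(energies)

# Counterpart of a query that combines all three conditions with "and"
combined = select(
    records,
    lambda r: r["pressure"] > 10 and r["energy"] < 10**7 and r["TDCcount"] < 6,
)
row_numbers = [r["nrow"] for r in combined]

print(energies)
print(total)
print(row_numbers)
```

Only rows 4 and 5 satisfy all three conditions: the pressure exceeds 10 from row 4 onwards, and TDCcount is below 6 only up to row 5.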
Other sources for examples
The examples above show only a small part of the full capabilities of PyTables. Please check out the documentation and the examples/ directory in the source package for more examples.
