Quotes from users
This is what people are saying about their experience with PyTables. You can leave your own quote here or cite other people's quotes. Also, it would be really cool if you could add the Powered by PyTables button to your project website.
PyTables has been a huge help to pymc. Pymc produces lots of samples from certain probability distributions, some of which can be high-dimensional. David Huard wrote a PyTables-based backend that compresses and saves the samples as they're created. The backend makes it possible to work with larger models and recover long simulations after crashes... and when you're using it, you can hardly tell the data aren't in memory. Thanks for a terrific product.
-- Anand Patil
PyTables offers the best programming interface for using HDF-5 files available in any language. It is the most elegant HDF-5 API around, far better than the native HDF-5 C-interface. Don't be fooled by its tables and database moniker: if you have to deal with HDF-5 files, you'll enjoy PyTables.
-- Maarten Sneep
The nice thing about PyTables, compared with TIFF, for me, is that I can use my data as one large memmapped NumPy array and let PyTables do the rest, while still saving lots of disk space compared to normal memmapped arrays.
-- Vincent Schut, Sarvision
So in summary I am very happy that I found PyTables. You save me from having to use ROOT. Please keep up the good work. You guys help me enjoy my data again!
-- Jan Strube, Stanford Linear Accelerator Center
Many, many thanks for making such an extraordinary and excellent library freely available. I use it a lot for my research, and I simply can't work without it anymore.
-- Gabriel J.L. Beckers, Group Neurobiology of Behaviour, Max Planck Institute for Ornithology
I have been using PyTables with great success in a shared data access application for quite some time now and I am pleased to say that it has never let me down. My praise is endless regarding this excellent package.
-- Elias Collas, Stress Methods, Gulfstream Aerospace
PyTables is a very well designed interface to the HDF5 libraries. It fills a gap for people using Python/Numeric/NumPy/numarray who need to deal with large data sets and convenient and fast data analysis tools.
-- Ernesto Rodríguez, Group Supervisor in the Radar Science and Engineering Section, Jet Propulsion Laboratory
We are very pleased with the PyTables functionality. We are especially pleased by your prompt replies to problem reports. We had evaluated other products for storing our data in HDF5 files, but we are happy we chose PyTables.
-- Berthold Höllmann, Germanischer Lloyd
We are developing a new engineering application, with PyTables as its core data base. We have found PyTables to be well designed, fast, well integrated into Python, and, perhaps more importantly, very robust.
-- Farzin Shakib, President, ACUSIM Software, Inc.
For large arrays and our RAID 5 server, I can get read speeds approaching 1 GB per second. That is just awesome performance! Thanks to the PyTables team.
-- Lou Wicker, National Severe Storms Lab, Norman OK USA
I've recently started using PyTables for storing large datasets and I'd give it 10/10! Access is fast enough you can just access the data you need and leave the full array on disk.
-- Bryan Cole, TeraView Ltd.
Success stories
Have you been using PyTables in your work? Has it been useful to you? Then let the world know by telling your story on this page. Explain what your problem was, how PyTables helped you to solve it, and how the solution fared. (The source text of this page contains a sample success story.)
The nicest success stories get a cool Cárabos t-shirt. Just remember to identify yourself!
PyTables in multi-camera tracking of flying flies
We use PyTables extensively to save data from our multi-camera realtime fly tracking system. A typical experiment tracks multiple flies in a flight arena for durations of several hours or more and generates roughly 2 GB of uncompressed data in PyTables format (in addition to video footage). PyTables has proven very amenable for logging data in this environment, but its real benefit comes later. Its integration with Python's numerical array packages and its fast searching set it apart from other possible solutions. The ease and speed with which this significant amount of data can be analyzed interactively far surpasses other systems I've worked with. Cárabos has been extremely responsive to bug reports and feature requests, on both a paid and an unpaid basis. It's clear to me that the HDF5 format is a wonderful beast, but without PyTables, there's no way I, as a scientist first and programmer second, would have learned to master its low levels, and I would simply be stuck with a far lesser solution.
-- Andrew Straw
PyTables as a high-performance container for multi-gigabyte logfiles
PyTables ROCKS! I work for one of the largest online travel sites and we produce many gigabytes of logfiles over the month with data on how the various services are performing. Originally we loaded all the data into a database to generate SLA reports at the end of the month. PostgreSQL just wasn't up to the task -- or at least I wasn't up to the task of tuning Postgres to handle the load. The code was a combination of Python and the proprietary Postgres Procedural Language and it was hideous. Not only was it difficult to maintain but the report would take hours (sometimes more than a day) to run and brought our server to its knees.
When I stumbled on PyTables I had to write a prototype to see how it would perform in our situation. In just a few days I had rewritten the whole system. The number of lines was drastically reduced. It was much easier to read. It was all Python. But best of all -- it worked really well and really fast! Because of the ability to turn on compression, our disk space consumption was drastically reduced. It now takes me 90 minutes to convert one month's worth of log files into HDF5 format and then about 3 minutes to do all the computations. And it can run easily on any developer's machine since a database isn't needed!
I used to be saddled with the original system, but since I rewrote it using PyTables I have handed it off to another developer to maintain, and it has since been transitioned successfully to two other developers because it is so much easier to understand and work with now.
I didn't have any major issues with PyTables and you were fantastic about replying to any questions I did have. Word has gotten around our organization and PyTables is now a serious contender for quite a few different applications. Keep up the good work!
-- Chuck Clark
PyTables in animal communication research
Social animals such as birds and humans can produce tens of thousands of vocalizations per day. An essential first step in getting insight into how vocal communication is organized is to record complete acoustic scenes for extended periods of time, and organize this data in such a way that it can be looked at efficiently and in flexible ways. We use PyTables to store both the primary data (sound recordings) and measurements (pitch, duration, etc.) of each sound present in acoustic communication scenes that are recorded continuously for weeks. PyTables allows us to very rapidly select sets of vocalizations based on those measurements and evaluate patterns, or perform additional analyses on the actual sounds. Given that the data sets are very large (say 20 GB for a week of communication between two birds), this would be very impractical with traditional methods. One of the great things about PyTables is that it is very easy to work with; it essentially allows anyone with at least basic Python skills to work with a very high-performance system to organize, store, access, discover, analyze, and share huge amounts of complex data. Highly recommended!
-- Gabriel Beckers
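As an illustration of the kind of selection described in this story, the following is a minimal sketch of picking out records by their measurements with an in-kernel query. It is not the author's actual code: the table layout, the field names (onset, duration, pitch), the helper functions, and the synthetic data are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import tables


class Vocalization(tables.IsDescription):
    """Hypothetical per-sound measurements (illustrative field names)."""
    onset = tables.Float64Col()     # seconds from start of recording
    duration = tables.Float64Col()  # seconds
    pitch = tables.Float64Col()     # Hz


def build_demo_file(path, n=1000):
    """Write n synthetic vocalization records to a new HDF5 file."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "vocalizations", Vocalization)
        # Fill a structured array whose dtype matches the table, then append it.
        rows = np.zeros(n, dtype=table.dtype)
        idx = np.arange(n)
        rows["onset"] = idx * 0.5
        rows["duration"] = 0.05 + 0.001 * (idx % 100)
        rows["pitch"] = 500.0 + 100.0 * (idx % 50)
        table.append(rows)
        table.flush()


def select_vocalizations(path, min_pitch, max_duration):
    """Return the rows with pitch above min_pitch and duration below max_duration."""
    with tables.open_file(path, mode="r") as h5:
        table = h5.root.vocalizations
        # The condition string is evaluated by PyTables row by row on disk,
        # so only the matching rows are returned as a NumPy structured array.
        return table.read_where(
            "(pitch > min_pitch) & (duration < max_duration)",
            condvars={"min_pitch": min_pitch, "max_duration": max_duration},
        )


# Small demonstration on synthetic data.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "vocalizations.h5")
    build_demo_file(demo_path)
    selected = select_vocalizations(demo_path, min_pitch=3000.0, max_duration=0.1)
    print(len(selected), "vocalizations matched")
```

The returned array can then be handed to ordinary NumPy code for further analysis, while the full table stays on disk.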
PyTables in multi-physics micromagnetic simulator Nmag
The Nmag simulation package is a finite element solver for micromagnetic problems. Ferromagnetic nanostructures are discretised using a tetrahedral mesh, and the temporal behaviour of a multitude of physical fields is computed on the mesh. Nmag takes a novel (multi-physics) approach in which we do not know at compile and coding time what type of data the user may use and how often they decide to save the data (or just part of it), so flexibility is crucial. PyTables provides just that flexibility.
We also use PyTables to save the mesh on which the calculation is done. The inbuilt compression allows us to reduce memory consumption significantly without slowing the process down: saving a mesh is approximately 4 times more space efficient than saving it in an ASCII-based file format. We get much more significant space savings when saving field data, as we have some fields that change very little over space or time and thus compress excellently.
In summary, PyTables allows us to quickly write code that saves complicated and hard-to-predict data structures with very reasonable compression.
-- Hans Fangohr, University of Southampton, United Kingdom
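To make the compression point concrete, here is a minimal sketch of storing arrays with and without compression and comparing the resulting file sizes. It does not reproduce Nmag's own storage code: the helper names save_arrays and load_array, the array names, the zlib settings, and the synthetic constant field are assumptions chosen for the example.

```python
import os
import tempfile

import numpy as np
import tables


def save_arrays(path, arrays, complevel=5):
    """Save a dict of name -> array as chunked arrays; complevel=0 disables compression."""
    filters = tables.Filters(complevel=complevel, complib="zlib")
    with tables.open_file(path, mode="w") as h5:
        for name, data in arrays.items():
            h5.create_carray("/", name, obj=np.asarray(data), filters=filters)


def load_array(path, name):
    """Read one named array back into memory."""
    with tables.open_file(path, mode="r") as h5:
        return h5.get_node("/", name).read()


# Synthetic stand-ins: a few mesh points and a field that does not vary at all.
points = np.linspace(0.0, 1.0, 3000).reshape(1000, 3)
field = np.zeros(200_000)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.h5")
    packed_path = os.path.join(tmp, "packed.h5")
    save_arrays(raw_path, {"points": points, "field": field}, complevel=0)
    save_arrays(packed_path, {"points": points, "field": field}, complevel=5)
    print("uncompressed bytes:", os.path.getsize(raw_path))
    print("compressed bytes:  ", os.path.getsize(packed_path))
```

A constant field is the best case for any compressor, so the ratio seen here is an upper bound rather than a typical figure; real fields that vary slowly fall somewhere between this and incompressible data.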
