
PyTables Professional Edition

PyTables Professional Edition (aka PyTables Pro) is a commercial and much enhanced version of the free and open source software PyTables Standard Edition. In brief, PyTables Pro is designed to get the most out of the hardware behind it, allowing you to perform complex data analysis on datasets that are typically larger (and sometimes much larger) than your available memory. With typical usage and similar resources, PyTables Pro can cope with datasets up to 5x larger than traditional databases would allow.

PyTables Pro’s simple and elegant database design and operation capabilities reduce the need for database administrators. In addition, its outstanding test suite and years of accumulated experience guarantee exceptionally high levels of quality and stability.

PyTables Pro is actively used in a wide spectrum of fields, such as aeronautics, drug discovery, financial analysis, telecommunications switching systems, data mining and statistical analysis, to name just a few, where it lets users deal with extremely large datasets with maximum efficiency.


Main features

Performance of queries for different optimization levels

It comes with OPSI included

OPSI is a powerful and innovative indexing engine that allows PyTables Pro to perform really fast queries on arbitrarily large tables. Moreover, it offers a wide range of optimization levels for its indexes, so that users can choose the one that best suits their needs (more or less size, more or less performance). The indexing code also takes advantage of the vectorization capabilities of the NumPy and Numexpr packages to ensure really short indexing and search times.

By using OPSI it is possible to complete a query on a table on the order of ten thousand million (10,000,000,000) rows in a matter of hundredths of a second. Moreover, it can index columns in times between 1.5x and 25x shorter (depending on the table size and the desired optimization level) than other databases, while producing indexes that are between 3x and 15x smaller.

Performing queries in very large (indexed) tables, fast

The main reason behind this incredible performance and compactness is that OPSI is geared towards tables that are mostly used for read-only or append-only purposes, and this is the scenario where it absolutely shines. If the user needs to frequently update or delete the values of indexed columns, then OPSI will take much more time than other solutions to keep its indexes up to date. So, for situations that don't require fast updates or deletions, OPSI is probably one of the best indexing engines available.
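To make the read-mostly trade-off concrete, here is a minimal sketch of a generic sorted-column index. It is not OPSI's actual design, which this page does not describe; the class `SortedColumnIndex` and the helper `scan_between` are hypothetical names used only for illustration. The sketch shows why a range query against a sorted index can skip most rows, while changing an indexed value would force entries to be moved around in the sorted arrays.

```python
from bisect import bisect_left, bisect_right


class SortedColumnIndex:
    """Toy index (illustrative only): column values kept sorted next to row numbers."""

    def __init__(self, values):
        # Sorting once up front is the expensive step; queries reuse the result.
        pairs = sorted((value, row) for row, value in enumerate(values))
        self._keys = [value for value, _ in pairs]
        self._rows = [row for _, row in pairs]

    def rows_between(self, low, high):
        """Return the row numbers whose value v satisfies low <= v <= high."""
        # Two binary searches locate the matching slice without scanning the table.
        start = bisect_left(self._keys, low)
        stop = bisect_right(self._keys, high)
        return sorted(self._rows[start:stop])


def scan_between(values, low, high):
    """Reference answer: test every row, as an unindexed query would."""
    return [row for row, value in enumerate(values) if low <= value <= high]


values = [42, 7, 19, 7, 88, 23, 61, 5]
index = SortedColumnIndex(values)
print(index.rows_between(7, 23))  # same rows as scan_between(values, 7, 23)
```

In a structure like this, appending rows only requires merging new sorted entries, whereas updating or deleting a value in place means locating and shifting entries, which is consistent with the behaviour described above for read-only and append-only tables.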

Last but not least, and thanks to OPSI, PyTables Pro provides the ability to sort arbitrarily large tables by a specific field.
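As an illustration of how a table larger than memory can be sorted at all, the following sketch uses a classic external merge sort. This is an assumption made for the sake of example: the page does not say how PyTables Pro implements sorting, and the function `sort_by_field` and its helpers are hypothetical names. Rows are plain Python dictionaries here, not PyTables rows.

```python
import heapq
import itertools
import operator
import pickle
import tempfile


def _spill(sorted_rows):
    """Write one sorted run to a temporary file and return the open file."""
    run = tempfile.TemporaryFile()
    for row in sorted_rows:
        pickle.dump(row, run)
    run.seek(0)
    return run


def _read_run(run):
    """Yield rows back from a spilled run, one at a time."""
    while True:
        try:
            yield pickle.load(run)
        except EOFError:
            return


def sort_by_field(rows, field, chunk_size):
    """Yield dict rows ordered by `field`, sorting about chunk_size rows at a time."""
    key = operator.itemgetter(field)
    rows = iter(rows)
    runs = []
    while True:
        # Only one chunk is held in memory while it is being sorted.
        chunk = list(itertools.islice(rows, chunk_size))
        if not chunk:
            break
        runs.append(_spill(sorted(chunk, key=key)))
    try:
        # The merge step keeps a single pending row per run in memory.
        yield from heapq.merge(*(_read_run(run) for run in runs), key=key)
    finally:
        for run in runs:
            run.close()


prices = [5.0, 1.5, 9.2, 3.3, 7.1, 0.4, 6.6]
table = [{"id": i, "price": p} for i, p in enumerate(prices)]
ordered = list(sort_by_field(table, "price", chunk_size=3))
print([row["id"] for row in ordered])
```

The point of the design is that memory use depends on `chunk_size` and the number of runs, not on the total number of rows.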

For more information about the operational details and benchmarks of this innovative indexing engine, see the OPSI White Paper.

Improved cache implementation

PyTables Pro features a fine-tuned LRU cache for both metadata (nodes) and regular data that lets you achieve maximum speed for intensive object tree browsing and during data reads and queries. It complements the already efficient cache present in HDF5, but is geared more towards the high-level structures that are specific to PyTables Pro and that are critical for achieving very high performance.
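For readers unfamiliar with the term, an LRU (least recently used) cache keeps the most recently accessed items and evicts the one that has gone unused the longest. The sketch below shows the general policy only; it is not the PyTables Pro cache, and the class `LRUCache`, the `load` callback and the node paths are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache (illustrative only)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the cached value for key, calling load(key) on a miss."""
        if key in self._items:
            # Mark the entry as the most recently used one.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = load(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict the entry that has gone unused the longest.
            self._items.popitem(last=False)
        return value

    def __contains__(self, key):
        return key in self._items


def load_node(path):
    """Stand-in for an expensive lookup of a node in an object tree."""
    return {"path": path}


cache = LRUCache(capacity=2)
for path in ["/a", "/b", "/a", "/c", "/a", "/b"]:
    cache.get(path, load_node)
print(cache.hits, cache.misses)
```

Repeatedly browsing the same few nodes, as in the loop above, is the access pattern where such a cache pays off.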

The PyTables Pro installer

Professional installers

An all-in-one PyTables Pro installer for Windows is provided, so that the user only has to download and run it to get the job done. All the software prerequisites (except Python itself) are included in this package, minimizing the risk of installing wrong versions on the user side. Although all-in-one installers are not available for other platforms, PyTables Pro can still be quickly deployed on them by using Python's distutils.

Meant for production

More than 50,000 carefully designed unit tests (in version 2.1) check every detail and feature of PyTables Pro. In addition, for every new version of the product, all the tests are verified to pass on the most common platforms (Windows, Mac OS X, Linux 32-bit, Linux 64-bit). This way, you can relax and concentrate your efforts on solving your own problems.


What is new in PyTables Pro 2.1

With the 2.1 version of PyTables Pro a series of new and exciting features for OPSI have been released. Among the main improvements you will find:

Size of indexes in PyTables Pro 2.1

Index creation time in PyTables Pro 2.1

Performance of queries for different compressors

Performance of queries (unsorted vs sorted tables)


As a result, PyTables Pro has become a mature library for handling extremely large tables very quickly, with a rich set of features and, most importantly, one that is fun to use. You can get more detailed information about these powerful new developments in the informal talk that I gave at The HDF Group headquarters.


Getting PyTables Pro

Please go to the pricing schema page for PyTables Pro and, if it fits your budget, follow the instructions there. Needless to say, by acquiring a PyTables Pro license, you are not only making FrancescAlted (the main person responsible for the beast) happier but also helping to secure the future of the PyTables project.

You can download the evaluation version if you want to check first that PyTables Pro actually meets your expectations. Please be sure to read the evaluation license before using this version.

PyTablesPro (last edited 2009-06-18 09:19:29 by FrancescAlted)