Release notes for PyTables 2.2 series

Author: Francesc Alted i Abad

Changes from 2.1.1 to 2.2b1


  • Added Expr, a class for evaluating expressions containing array-like objects. It can evaluate expressions (like '3*a+4*b') that operate on arbitrarily large arrays while optimizing the resources (basically main memory and CPU cache memory) required to perform them. It is similar to the Numexpr package, but in addition to NumPy objects, it also accepts disk-based homogeneous arrays, like the Array, CArray, EArray and Column PyTables objects.

  • Added support for NumPy's extended slicing in all Leaf objects. With that, you can do the following kinds of selections:

    array1 = array[4]                       # simple selection
    array2 = array[4:1000:2]                # slice selection
    array3 = array[1, ..., ::2, 1:4, 4:]    # general slice selection
    array4 = array[1, [1,5,10], ..., -1]    # fancy selection
    array5 = array[np.where(array[:] > 4)]  # point selection
    array6 = array[array[:] > 4]            # boolean selection

    Thanks to Andrew Collette for implementing this for h5py, from which it has been backported. Closes #198 and #209.

  • Numexpr updated to 1.3.1. This can lead to up to a 25% improvement in the time for both in-kernel and indexed queries on unaligned tables.

  • HDF5 1.8.3 supported.
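
One way to picture how an expression such as '3*a+4*b' can be evaluated on large operands without large temporaries is to process the operands block by block. The sketch below is only a NumPy illustration of that idea, not the implementation of Expr; the helper name eval_blockwise and the block_size parameter are hypothetical.

```python
import numpy as np

def eval_blockwise(a, b, block_size=4096):
    """Compute 3*a + 4*b one block at a time (hypothetical helper)."""
    out = np.empty(len(a), dtype=np.result_type(a, b))
    for start in range(0, len(a), block_size):
        stop = min(start + block_size, len(a))
        # The temporaries created on this line are at most block_size long,
        # instead of being as long as the full operands.
        out[start:stop] = 3 * a[start:stop] + 4 * b[start:stop]
    return out

a = np.arange(10_000, dtype=np.float64)
b = np.ones(10_000, dtype=np.float64)
result = eval_blockwise(a, b)
```

The result matches what 3*a + 4*b gives in plain NumPy; only the size of the intermediate arrays differs.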

Bugs fixed

  • Fixed problems when modifying multidimensional columns in Table objects. Closes #228.
  • Row attribute is no longer stale after a table move or rename. Fixes #224.
  • Array.__getitem__(scalar) returns a NumPy scalar now, instead of a 0-dim NumPy array. This should not be noticed by normal users, unless they check the type of the returned value. Fixes #222.
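
The difference between a NumPy scalar and a 0-dim NumPy array only shows up in type checks. The following sketch uses plain NumPy to mimic both return types; it does not open a PyTables file, and the variable names are illustrative only.

```python
import numpy as np

data = np.array([1.5, 2.5, 3.5])

# Indexing with a scalar yields a NumPy scalar (the new return type).
scalar = data[1]

# Wrapping that value gives a 0-dim array (the shape of the old return type).
zero_dim = np.array(data[1])

# The values compare equal; only the types differ.
same_value = bool(scalar == zero_dim)
scalar_is_ndarray = isinstance(scalar, np.ndarray)
zero_dim_is_ndarray = isinstance(zero_dim, np.ndarray)
```

Code that tests isinstance(value, np.ndarray) on the returned value is the kind of code that would notice the change.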

API changes

  • Added a dtype attribute for all leaves. This is the NumPy dtype that most closely matches the leaf type. This allows for a quick-and-dirty check of leaf types. Closes #230.

  • Added a shape attribute for Column objects. This is formed by concatenating the length of the column and the shape of its type. Also, the representation of columns has changed and now includes the length of the column as the leading dimension. Closes #231.

  • Added a new maindim attribute for Column, whose value is 0 (the leading dimension). This makes Column more consistent with the other *Array objects.

  • In order to be consistent and allow the extended slicing to happen in VLArray objects too, VLArray.__setitem__() is no longer able to partially modify rows based on the second dimension passed as key. Attempting this now raises an IndexError. Closes #210.

  • The forceCSI flag has been replaced by checkCSI in the following Table methods: copy(), readSorted() and itersorted(). The change reflects the fact that a re-index operation cannot be triggered from these methods anymore. The rationale for the change is that indexing is a potentially very expensive operation that should be carried out explicitly instead of being triggered by methods that should not be in charge of this task. Closes #216.
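
The rule for the new Column shape attribute (length of the column followed by the shape of its type) can be written as a tuple concatenation. This is a pure-Python sketch of the rule as stated above; column_shape and MAINDIM are hypothetical names, not part of the PyTables API.

```python
def column_shape(nrows, cell_shape=()):
    """Shape of a column: its length followed by the shape of its type."""
    return (nrows,) + tuple(cell_shape)

# The leading dimension, matching the value stated for Column.maindim.
MAINDIM = 0

scalar_col = column_shape(100)          # column of scalar cells
matrix_col = column_shape(100, (2, 3))  # column whose cells are 2x3 blocks
```

Indexing either shape with MAINDIM gives back the number of rows in the column.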

Backward incompatible changes

  • After the introduction of the shape attribute for Column objects, the shape information for multidimensional columns has been removed from the dtype attribute (it is set to the base type of the column now). Closes #232.
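
NumPy draws the same distinction between a base type and a per-cell shape in its sub-array dtypes, which may help when adapting code that read the shape from dtype. The sketch below uses only NumPy; the 2x3 int32 cell is an assumed example and not taken from a real table.

```python
import numpy as np

# Assumed example: each cell of a multidimensional column is a 2x3 block
# of int32 values, described here as a NumPy sub-array dtype.
cell_type = np.dtype((np.int32, (2, 3)))

# The base type carries no shape information.
base = cell_type.base

# The per-cell shape is kept separately.
cell_shape = cell_type.shape
```

Under the change described above, code that needs the cell shape should take it from the shape attribute of the column (dropping the leading dimension) rather than from dtype.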

    Enjoy data!

    -- The PyTables Team

ReleaseNotes/Release_2.2b1 (last edited 2009-06-23 17:54:58 by FrancescAlted)