Release notes for PyTables 2.2 series

Author: Francesc Alted i Abad
Contact: faltet@pytables.org

Changes from 2.2b2 to 2.2b3

  • The Blosc compressor has been added as a new filter, alongside the existing Zlib, LZO and bzip2. It is designed for fast compression and extremely fast decompression. Fixes #265.
  • In the File.copyFile() method, copyuserattrs defaulted to false. This was inconsistent with other methods, where the default value for copyuserattrs is true. The default is now true. Closes #261.
  • tables.copyFile and File.copyFile now recognize the parameters present in tables/parameters.py. Fixes #262.
  • Backported fix for issue #25 in Numexpr (OP_NEG_LL treats the argument as an int, not a long long). Thanks to David Cooke for this.
  • CHUNK_CACHE_NELMTS in tables/parameters.py is now set to a prime number, as Neil Fortner suggested.
  • Workaround for a problem in Python 2.6.4 (and probably other versions too) for pickling strings like "0" or "0.". Fixes #253.
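
A likely motivation for choosing a prime value for CHUNK_CACHE_NELMTS can be seen with a toy model. The sketch below assumes a cache that maps a chunk address to a slot with a plain modulo. This is a simplification for illustration, not HDF5's actual hashing code, and the helper slots_used, the stride of 64 and the slot counts 512 and 521 are all made up for the example.

```python
def slots_used(addresses, nslots):
    """Count the distinct slots hit when each address maps to address % nslots."""
    return len({addr % nslots for addr in addresses})

# Hypothetical access pattern: 100 chunks whose addresses are 64 apart.
addresses = [64 * i for i in range(100)]

power_of_two = slots_used(addresses, 512)  # 512 shares the factor 64 with the stride
prime = slots_used(addresses, 521)         # 521 is prime, so it shares no factor

print(power_of_two, prime)  # 8 100
```

With the power-of-two slot count the 100 addresses collide into 8 slots, while the prime slot count gives each address its own slot.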

Changes from 2.2b1 to 2.2b2

Enhancements

  • Support for HDF5 hard links, soft links and external links (when PyTables is compiled against the HDF5 1.8.x series). A new tutorial about their usage has been added to the 'Tutorials' chapter of the User's Manual. Closes #239 and #247.
  • Added support for setting HDF5 chunk cache parameters at file opening/creation time. 'CHUNK_CACHE_NELMTS', 'CHUNK_CACHE_PREEMPT' and 'CHUNK_CACHE_SIZE' are the new parameters. See the "PyTables' parameter files" appendix in the User's Manual for more info. Closes #221.
  • New Unknown class added so that objects that HDF5 identifies as H5G_UNKNOWN can be mapped to it and continue operations gracefully.
  • Added the --dont-create-sysattrs flag to ptrepack to skip the creation of sys attrs (they are created by default).
  • Support for native compound types in attributes. This allows for better compatibility with HDF5 files. Closes #208.
  • Support for native NumPy dtype in the description parameter of File.createTable(). Closes #238.

Bugs fixed

  • Added the missing _c_classId attribute to the UnImplemented class. ptrepack no longer chokes while copying UnImplemented classes.
  • The FIELD_* sys attrs are no longer copied when the PYTABLES_SYS_ATTRS parameter is set to false.
  • File.createTable() no longer segfaults if description=None. Closes #248.
  • Workaround for avoiding a Python issue causing a segfault when saving and then retrieving a string attribute with values "0" or "0.". Closes #253.

API changes

  • Row.__contains__() has been disabled because querying for a key in a Row makes little sense; the correct way is to query Table.colnames or Table.colpathnames instead. Closes #241.

  • [Semantic change] To avoid a common pitfall when asking for the string representation of a Row class, Row.__str__() has been redefined. Now, it prints something like:

    >>> for row in table:
    ...     print row
    ...
    /newgroup/table.row (Row), pointing to row #0
    /newgroup/table.row (Row), pointing to row #1
    /newgroup/table.row (Row), pointing to row #2

    instead of:

    >>> for row in table:
    ...     print row
    ...
    ('Particle:      0', 0, 10, 0.0, 0.0)
    ('Particle:      1', 1, 9, 1.0, 1.0)
    ('Particle:      2', 2, 8, 4.0, 4.0)

    Use the print row[:] idiom if you want to reproduce the old behaviour. Closes #252.
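
The difference between a Row and the data it points to can be sketched without PyTables. The example below assumes that the pitfall in question is treating the Row as if it were the record itself, when it is a single cursor reused for every record. FakeRow is a hypothetical stand-in written for this illustration; it is not the PyTables Row implementation.

```python
class FakeRow:
    """Hypothetical stand-in for a Row: one cursor object reused for every record."""

    def __init__(self, records):
        self._records = records
        self._nrow = -1

    def __iter__(self):
        for nrow in range(len(self._records)):
            self._nrow = nrow
            yield self  # always the same object; only the position moves

    def __getitem__(self, key):
        # row[:] returns the values of the record the cursor currently points to
        return self._records[self._nrow][key]

    def __str__(self):
        return "FakeRow pointing to row #%d" % self._nrow


records = [("Particle: 0", 0, 10), ("Particle: 1", 1, 9), ("Particle: 2", 2, 8)]

kept_rows = [row for row in FakeRow(records)]     # three references to one cursor
kept_data = [row[:] for row in FakeRow(records)]  # the values of each record

print(str(kept_rows[0]))  # the cursor has already moved on to the last row
print(kept_data[0])
```

Keeping row[:] captures the values at each step, whereas keeping row itself only keeps the cursor, which is the distinction the new string representation makes visible.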

Other changes

  • After some improvements in both HDF5 and PyTables, the limit before emitting a PerformanceWarning on the number of children in a group has been raised from 4096 to 16384.

Changes from 2.1.1 to 2.2b1

Enhancements

  • Added Expr, a class for evaluating expressions containing array-like objects. It can evaluate expressions (like '3*a+4*b') that operate on arbitrarily large arrays while optimizing the resources (basically main memory and CPU cache memory) required to perform them. It is similar to the Numexpr package, but in addition to NumPy objects, it also accepts disk-based homogeneous arrays, like the Array, CArray, EArray and Column PyTables objects.

  • Added support for NumPy's extended slicing in all Leaf objects. This allows selections such as the following:

    array1 = array[4]                       # simple selection
    array2 = array[4:1000:2]                # slice selection
    array3 = array[1, ..., ::2, 1:4, 4:]    # general slice selection
    array4 = array[1, [1,5,10], ..., -1]    # fancy selection
    array5 = array[np.where(array[:] > 4)]  # point selection
    array6 = array[array[:] > 4]            # boolean selection

    Thanks to Andrew Collette for implementing this for h5py, from which it has been backported. Closes #198 and #209.

  • Numexpr updated to 1.3.1. This can lead to up to a 25% improvement in the time for both in-kernel and indexed queries on unaligned tables.

  • HDF5 1.8.3 supported.
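
The extended slicing forms listed in this section follow NumPy semantics, so their results can be previewed on a plain in-memory array. The sketch below uses only NumPy; the 2x3x4 array is a made-up stand-in for a leaf, and it is an assumption that a PyTables leaf selects the same elements for these particular keys.

```python
import numpy as np

# In-memory stand-in for a leaf (hypothetical 2x3x4 dataset).
array = np.arange(24).reshape(2, 3, 4)

simple = array[1]                     # simple selection: fixes the first index
sliced = array[..., ::2]              # general slice: every other element of the last axis
fancy = array[1, [0, 2], ..., -1]     # fancy selection: a list picks entries along one axis
boolean = array[array > 20]           # boolean selection: elements where the mask is True
points = array[np.where(array > 20)]  # point selection: explicit coordinates

print(simple.shape, sliced.shape)
print(fancy, boolean, points)
```

The boolean and point selections return the same elements here, since np.where turns the mask into the coordinates of its True entries.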

Bugs fixed

  • Fixed problems when modifying multidimensional columns in Table objects. Closes #228.
  • The Row attribute is no longer stale after a table move or rename. Fixes #224.
  • Array.__getitem__(scalar) returns a NumPy scalar now, instead of a 0-dim NumPy array. This should not be noticed by normal users, unless they check for the type of returned value. Fixes #222.
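
The change in the return type of Array.__getitem__(scalar) only matters to code that checks types. The following NumPy-only sketch shows the difference between the two kinds of object; it builds them directly with NumPy and does not call PyTables, so the variable names are illustrative only.

```python
import numpy as np

zero_dim = np.array(7)  # a 0-dim NumPy array, the kind of object returned before
scalar = zero_dim[()]   # a NumPy scalar, the kind of object returned now

# Both compare equal to 7, but only the first one is an ndarray.
was_ndarray = isinstance(zero_dim, np.ndarray)
is_ndarray = isinstance(scalar, np.ndarray)
is_numpy_scalar = isinstance(scalar, np.generic)

print(was_ndarray, is_ndarray, is_numpy_scalar)
```

Code that tested the result with isinstance(value, np.ndarray) would therefore need to test for np.generic instead.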

API changes

  • Added a dtype attribute for all leaves. This is the NumPy dtype that most closely matches the leaf type. This allows for a quick-and-dirty check of leaf types. Closes #230.

  • Added a shape attribute for Column objects. This is formed by concatenating the length of the column and the shape of its type. Also, the representation of columns has changed and now includes the length of the column as the leading dimension. Closes #231.

  • Added a new maindim attribute for Column, whose value is 0 (the leading dimension). This improves consistency with other *Array objects.

  • In order to be consistent and allow extended slicing in VLArray objects too, VLArray.__setitem__() can no longer partially modify rows based on the second dimension passed as key. If this is attempted, an IndexError is now raised. Closes #210.

  • The forceCSI flag has been replaced by checkCSI in the following Table methods: copy(), readSorted() and itersorted(). The change reflects the fact that a re-index operation can no longer be triggered from these methods. The rationale for the change is that indexing is a potentially very expensive operation that should be carried out explicitly instead of being triggered by methods that should not be in charge of this task. Closes #216.

Backward incompatible changes

  • After the introduction of the shape attribute for Column objects, the shape information for multidimensional columns has been removed from the dtype attribute (it is set to the base type of the column now). Closes #232.
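
How the shape and dtype attributes of a multidimensional column relate can be sketched with plain NumPy dtypes. The example below assumes a hypothetical table of 10 rows with a column whose cells are 2x3 float64 arrays; it computes the values by hand from a NumPy dtype and does not call PyTables.

```python
import numpy as np

nrows = 10                                # hypothetical table length
col_type = np.dtype(("float64", (2, 3)))  # hypothetical column type: a 2x3 float64 cell

column_shape = (nrows,) + col_type.shape  # column length first, then the cell shape
column_dtype = col_type.base              # base type only, without shape information

print(column_shape, column_dtype)  # (10, 2, 3) float64
```

Code that used to read the cell shape from the column dtype would, under this reading, take it from the trailing dimensions of shape instead.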

    Enjoy data!

    -- The PyTables Team

ReleaseNotes/Release_2.2b3 (last edited 2010-02-24 19:03:09 by FrancescAlted)