Version 2.11 of the Python Zarr package has just been released. 🎉

This blog post aims to provide an overview of new features in this release, especially a new parameter that may impact the performance of writing arrays.

Empty chunks will no longer be written by default

One of the advantages of the Zarr format is that it is sparse: chunks with no data (more precisely, chunks whose data all equal the fill value, which is usually 0) don't need to be written to disk at all. They are simply assumed to be empty at read time. Until this release, however, the Zarr library wrote these empty chunks to disk anyway. This release changes that: a small performance penalty at write time leads to significant speedups at read time and in filesystem operations for sparse arrays. To revert to the old behaviour, pass the argument write_empty_chunks=True to the array creation function.
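The idea can be illustrated with a toy model. The sketch below is not Zarr's implementation: ToyChunkStore and its methods are hypothetical names, and chunks are plain Python lists held in a dict. It only shows the write-time check (compare the chunk against the fill value) and the read-time fallback for chunks that were never stored.

```python
class ToyChunkStore:
    """Toy chunk store (hypothetical, not part of Zarr)."""

    def __init__(self, fill_value=0, write_empty_chunks=False):
        self.fill_value = fill_value
        self.write_empty_chunks = write_empty_chunks
        self.chunks = {}  # chunk key -> list of values

    def write_chunk(self, key, values):
        # This extra comparison is the small write-time cost.
        is_empty = all(v == self.fill_value for v in values)
        if is_empty and not self.write_empty_chunks:
            # Do not store the chunk; also drop any stale copy.
            self.chunks.pop(key, None)
        else:
            self.chunks[key] = list(values)

    def read_chunk(self, key, size):
        # A missing chunk is assumed to hold only the fill value.
        stored = self.chunks.get(key)
        if stored is None:
            return [self.fill_value] * size
        return list(stored)


sparse = ToyChunkStore(write_empty_chunks=False)
sparse.write_chunk("0.0", [0, 0, 0, 0])  # equal to the fill value: skipped
sparse.write_chunk("0.1", [0, 7, 0, 0])  # contains data: stored

dense = ToyChunkStore(write_empty_chunks=True)
dense.write_chunk("0.0", [0, 0, 0, 0])   # stored anyway (old behaviour)

print(sorted(sparse.chunks))        # ['0.1']
print(sparse.read_chunk("0.0", 4))  # [0, 0, 0, 0]
print(sorted(dense.chunks))         # ['0.0']
```

With fewer stored chunks there are fewer objects to list, copy, or delete, which is where the filesystem savings for sparse arrays come from.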

This feature was added by Davis Bennett and Juan Nunez-Iglesias with PRs #738 and #853, respectively.

Some preliminary benchmark results are shown in PR #853. For example:

[Figure: benchmark results]

The figure shows that with the previous setting of write_empty_chunks=True, writing a fill-valued chunk takes as long as writing a randomly filled chunk. With the new default of write_empty_chunks=False, writing the fill-valued chunk is much faster, while writing a randomly filled chunk is slightly slower.

Fancy NumPy-style indexing

Zarr arrays now support NumPy-style fancy indexing with arrays of integer coordinates. This is equivalent to using zarr.Array.vindex. Mixing slices and integer arrays is not supported.

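    >>> # Setup inferred from the outputs below: a 3x5 array holding 0..14
    >>> import numpy as np
    >>> import zarr
    >>> z = zarr.array(np.arange(15).reshape(3, 5))
    >>> z.vindex[[0, 2], [1, 3]] = [-1, -2]
    >>> z.vindex[[0, 2], [1, 3]]
    array([-1, -2])
    >>> z.vindex[[0, 2], [1, 3]] = [-3, -4]
    >>> z[:]
    array([[ 0, -3,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [10, 11, 12, -4, 14]])
    >>> z[[0, 2], [1, 3]]
    array([-3, -4])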

See Advanced indexing in the tutorial for more information.
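As a plain-Python illustration of what coordinate (pointwise) selection means, the sketch below pairs up the row and column indices one by one. The helper select_points is a hypothetical name used only here, and nested lists stand in for an array; the values mirror the example above.

```python
def select_points(grid, rows, cols):
    """Return grid[r][c] for each (r, c) pair, taken pointwise."""
    if len(rows) != len(cols):
        raise IndexError("coordinate lists must have the same length")
    return [grid[r][c] for r, c in zip(rows, cols)]


grid = [
    [0, -3, 2, 3, 4],
    [5, 6, 7, 8, 9],
    [10, 11, 12, -4, 14],
]

# Picks the elements at (0, 1) and (2, 3), not a 2x2 block.
print(select_points(grid, [0, 2], [1, 3]))  # [-3, -4]
```

The result has one element per coordinate pair, which is why the two index lists must have matching lengths.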

This feature was added by Juan Nunez-Iglesias with PR #725.

New base class

This release of Zarr Python introduces a new BaseStore class that all provided store classes implemented in Zarr Python now inherit from. This is done as part of refactoring to enable future support of the Zarr version 3 spec. Existing third-party stores that are a MutableMapping (e.g. dict) can be converted to a new-style key/value store inheriting from BaseStore by passing them as the argument to the new zarr.storage.KVStore class. For backwards compatibility, various higher-level array creation and convenience functions still accept plain Python dicts or other mutable mappings for the store argument, but will internally convert these to a KVStore.
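The wrapping pattern described here can be sketched in plain Python. Everything in the block below is a hypothetical analogue rather than Zarr's actual code: ToyBaseStore, ToyKVStore, and as_store are made-up names that only mimic the roles of a common base class, a wrapper around a mutable mapping, and the internal conversion done by higher-level functions.

```python
from collections.abc import MutableMapping


class ToyBaseStore(MutableMapping):
    """Hypothetical stand-in for a common store base class."""


class ToyKVStore(ToyBaseStore):
    """Hypothetical wrapper that delegates to any mutable mapping."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, key):
        return self._mapping[key]

    def __setitem__(self, key, value):
        self._mapping[key] = value

    def __delitem__(self, key):
        del self._mapping[key]

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)


def as_store(store):
    """Hypothetical helper: wrap plain mappings, pass stores through."""
    if isinstance(store, ToyBaseStore):
        return store
    if isinstance(store, MutableMapping):
        return ToyKVStore(store)
    raise TypeError(f"unsupported store type: {type(store).__name__}")


legacy = {}                 # an old-style dict store
store = as_store(legacy)    # converted internally
store["a/.zarray"] = b"{}"  # writes go through to the dict

print(isinstance(store, ToyBaseStore))  # True
print(legacy)                           # {'a/.zarray': b'{}'}
print(as_store(store) is store)         # True
```

Because the wrapper delegates every operation, data written through the new-style store remains visible in the original mapping, which is what keeps existing dict-based code working.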

This feature was added by Greggory Lee with PRs #839, #789 and #950.

More information

Details on these features, as well as the full list of changes in 2.11.0, are available in the release notes.

Appreciation 🙌🏻

Shout-out to all the contributors who made release 2.11.0 possible:

If you find the above features useful and end up using them, please mention @zarr_dev on Twitter and tweet using #ZarrData and we'll make sure to get it featured! ✌🏻

~Sanket Verma