Zarr Python 2.11 Release
Version 2.11 of the Python Zarr package has just been released.
This blog post aims to provide an overview of new features in this release, especially a new parameter that may impact the performance of writing arrays.
Empty chunks will no longer be written by default
One of the advantages of the Zarr format is that it is sparse, which means that
chunks with no data (more precisely, with data equal to the fill value, which
is usually 0) don't need to be written to disk at all. They will simply be
assumed to be empty at read time. However, until this release, the Zarr library
would write these empty chunks to disk anyway. This changes in this version: a
small performance penalty at write time leads to significant speedups at read
time and in filesystem operations in the case of sparse arrays. To revert to
the old behaviour, pass the argument write_empty_chunks=True to the array
creation function.
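The mechanism can be illustrated without Zarr itself. The following is a minimal pure-Python sketch, not Zarr's actual implementation: write_chunks and read_chunks are hypothetical helpers, a plain dict stands in for the store, and the data is a one-dimensional list. Chunks that contain only the fill value are skipped on write unless write_empty_chunks=True is passed, and missing chunks are rebuilt from the fill value on read.

```python
def write_chunks(store, data, chunk_size, fill_value=0, write_empty_chunks=False):
    """Split `data` into chunks and store them (hypothetical helper, not Zarr API)."""
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        key = str(start // chunk_size)
        is_empty = all(value == fill_value for value in chunk)
        if is_empty and not write_empty_chunks:
            # Skip the write, and drop any stale copy of this chunk.
            store.pop(key, None)
            continue
        store[key] = list(chunk)


def read_chunks(store, length, chunk_size, fill_value=0):
    """Rebuild the data, treating missing chunks as filled with `fill_value`."""
    out = []
    for start in range(0, length, chunk_size):
        size = min(chunk_size, length - start)
        chunk = store.get(str(start // chunk_size))
        out.extend(chunk if chunk is not None else [fill_value] * size)
    return out


data = [0, 0, 0, 0, 7, 0, 0, 0]
sparse_store, dense_store = {}, {}
write_chunks(sparse_store, data, chunk_size=2)
write_chunks(dense_store, data, chunk_size=2, write_empty_chunks=True)

print(sorted(sparse_store))  # ['2']
print(sorted(dense_store))   # ['0', '1', '2', '3']

roundtrip = read_chunks(sparse_store, len(data), chunk_size=2)
print(roundtrip == data)     # True
```

The extra comparison against the fill value on every chunk is where the small write-time cost comes from; the payoff is that the sparse store holds one entry instead of four.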
This feature was added by Davis Bennett and Juan Nunez-Iglesias with PRs #738 and #853, respectively.
Some preliminary benchmark results are shown in PR #853. They show that with
the previous setting of write_empty_chunks=True, the speed of writing a
fill-valued chunk is the same as writing a randomly-filled chunk. With the new
default of write_empty_chunks=False, writing the fill-valued chunk is much
faster while writing a randomly-filled chunk is slightly slower.
Fancy NumPy-style indexing
Zarr arrays now support NumPy-style fancy indexing with arrays of integer
coordinates. This is equivalent to using zarr.Array.vindex. Mixing slices and
integer arrays is not supported.
>>> z.vindex[[0, 2], [1, 3]]
array([-1, -2])
>>> z.vindex[[0, 2], [1, 3]] = [-3, -4]
>>> z[:]
array([[ 0, -3,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, -4, 14]])
>>> z[[0, 2], [1, 3]]
array([-3, -4])
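To make the coordinate semantics explicit, here is a small pure-Python sketch. It is an illustration only, not Zarr's implementation: coordinate_select is a hypothetical helper and a nested list stands in for the array. The two index lists are paired element by element, so [0, 2] and [1, 3] select the items at (0, 1) and (2, 3), not a 2x2 block.

```python
def coordinate_select(grid, row_coords, col_coords):
    """Pick one item per (row, column) pair (hypothetical helper, not Zarr API)."""
    if len(row_coords) != len(col_coords):
        raise ValueError("row and column coordinate lists must have equal length")
    return [grid[r][c] for r, c in zip(row_coords, col_coords)]


# Same values as the array printed in the session above.
grid = [
    [0, -3, 2, 3, 4],
    [5, 6, 7, 8, 9],
    [10, 11, 12, -4, 14],
]

selected = coordinate_select(grid, [0, 2], [1, 3])
print(selected)  # [-3, -4]
```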
See Advanced indexing in the tutorial for more information.
This feature was added by Juan Nunez-Iglesias with PR #725.
New base class
This release of Zarr Python introduces a new BaseStore class that all store
classes provided by Zarr Python now inherit from. This is done as part of
refactoring to enable future support of the Zarr version 3 spec. Existing
third-party stores that are a MutableMapping (e.g. dict) can be converted to a
new-style key/value store inheriting from BaseStore by passing them as the
argument to the new zarr.storage.KVStore class. For backwards compatibility,
various higher-level array creation and convenience functions still accept
plain Python dicts or other mutable mappings for the store argument, but will
internally convert these to a KVStore.
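The wrapping pattern described above can be sketched in plain Python. This is an assumption-laden illustration rather than Zarr's code: SketchBaseStore, SketchKVStore and normalize_store are hypothetical names standing in for the roles of BaseStore, KVStore and the internal conversion step, and the real classes may differ in their details.

```python
from collections.abc import MutableMapping


class SketchBaseStore(MutableMapping):
    """Hypothetical stand-in for a common base class shared by all stores."""


class SketchKVStore(SketchBaseStore):
    """Hypothetical adapter that wraps any mutable mapping as a new-style store."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, key):
        return self._mapping[key]

    def __setitem__(self, key, value):
        self._mapping[key] = value

    def __delitem__(self, key):
        del self._mapping[key]

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)


def normalize_store(store):
    """Accept new-style stores as-is and wrap plain mutable mappings."""
    if isinstance(store, SketchBaseStore):
        return store
    if isinstance(store, MutableMapping):
        return SketchKVStore(store)
    raise TypeError("store must be a mutable mapping")


backing = {}
store = normalize_store(backing)
store["chunk/0"] = b"data"

print(isinstance(store, SketchBaseStore))  # True
print(backing)                             # {'chunk/0': b'data'}
```

Because the adapter delegates every operation to the wrapped mapping, writes made through the store remain visible in the original dict, which is what lets older code that passes a dict keep working.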
This feature was added by Gregory Lee with PRs #839, #789 and #950.
More information
Details on these features, as well as the full list of all changes in 2.11.0, are available in the release notes.
Appreciation
Shout-out to all the contributors who made release 2.11.0 possible:
- Juan Nunez-Iglesias
- Davis Bennett
- Gregory Lee
- Ryan Abernathey
- Matthias Bussonnier
- Oren Watson
- Joe Hamman
- Dimitri Papadopoulos Orfanos
- Ray Bell
- John Kirkham
- Mads R. B. Kristensen
- Josh Moore
If you find the above features useful and end up using them, please mention @zarr_dev on Twitter and tweet using #ZarrData and we'll make sure to get it featured!
~Sanket Verma