Zarr is a community project to develop specifications and software for storage of large N-dimensional typed arrays, also commonly known as tensors. A particular focus of Zarr is to provide support for storage using distributed systems like cloud object stores, and to enable efficient I/O for parallel computing applications.

Description

Zarr is motivated by the need for a simple, transparent, open, and community-driven format that supports high-throughput distributed I/O on different storage systems. Zarr data can be stored in any storage system that can be represented as a key-value store: most commonly POSIX file systems and cloud object storage, but also zip files and relational or document databases.
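To make the key-value idea concrete, here is a minimal conceptual sketch in plain Python. It is not the Zarr API or the Zarr on-disk format: the helper names (`save_chunked`, `read_element`), the `meta.json` key, and the `row.col` chunk keys are all assumptions made for illustration. It only shows why any store that maps keys to bytes can hold a chunked array.

```python
import json
from array import array


def save_chunked(store, name, rows, chunk_rows, chunk_cols):
    """Split a 2-D list of ints into chunks and write each under its own key.

    `store` is any mutable mapping from str keys to bytes (here a dict).
    The key layout is hypothetical and chosen only for this sketch.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    # Metadata gets its own key so a reader can interpret the chunks.
    store[f"{name}/meta.json"] = json.dumps(
        {"shape": [n_rows, n_cols], "chunks": [chunk_rows, chunk_cols]}
    ).encode()
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            block = array("i")  # C ints, row-major within the chunk
            for row in rows[r0:r0 + chunk_rows]:
                block.extend(row[c0:c0 + chunk_cols])
            store[f"{name}/{r0 // chunk_rows}.{c0 // chunk_cols}"] = block.tobytes()


def read_element(store, name, r, c):
    """Fetch one element by loading only the chunk that contains it."""
    meta = json.loads(store[f"{name}/meta.json"])
    chunk_rows, chunk_cols = meta["chunks"]
    n_cols = meta["shape"][1]
    i, j = r // chunk_rows, c // chunk_cols
    block = array("i")
    block.frombytes(store[f"{name}/{i}.{j}"])
    # Chunks on the right edge may be narrower than chunk_cols.
    width = min(chunk_cols, n_cols - j * chunk_cols)
    return block[(r - i * chunk_rows) * width + (c - j * chunk_cols)]


store = {}  # stand-in for a directory, an object-store bucket, a zip file, ...
data = [[r * 10 + c for c in range(5)] for r in range(4)]
save_chunked(store, "a", data, chunk_rows=2, chunk_cols=2)
value = read_element(store, "a", 3, 4)
```

Because the only operations used are "put bytes under a key" and "get bytes by key", the same logic would work if the dict were swapped for a file system directory or an object-store bucket.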

See the following GitHub repositories for more information:

Applications

  • Simple and fast serialization of NumPy-like arrays, accessible from languages including Python, C, C++, Rust, JavaScript and Java
  • Multi-scale n-dimensional image storage, e.g. in light and electron microscopy
  • Geospatial rasters, e.g. following the NetCDF / CF metadata conventions

Features

  • Chunk multi-dimensional arrays along any dimension.
  • Store arrays in memory, on disk, inside a Zip file, on S3, etc.
  • Read and write arrays concurrently from multiple threads or processes.
  • Organize arrays into hierarchies via annotatable groups.
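The features above can be illustrated with a small standard-library sketch. It is a conceptual model under stated assumptions, not Zarr's API or storage layout: `create_group`, `write_chunk`, `list_children`, the `group.json` key, and the `c/` chunk prefix are hypothetical names. The point it shows is that when every chunk has its own key, workers writing different chunks do not need to coordinate, and groups reduce to key prefixes carrying annotations.

```python
import json
from concurrent.futures import ThreadPoolExecutor

store = {}  # stand-in for any key-value store (memory, disk, zip, S3, ...)


def create_group(store, path, **attrs):
    """Record a group as a metadata key holding user annotations."""
    store[f"{path}/group.json"] = json.dumps({"attributes": attrs}).encode()


def write_chunk(store, array_path, chunk_index, values):
    """Write one chunk under its own key; distinct chunks never share a key."""
    key = f"{array_path}/c/" + "/".join(str(i) for i in chunk_index)
    store[key] = bytes(values)


def list_children(store, group_path):
    """List the direct members of a group by scanning key prefixes."""
    prefix = group_path + "/"
    names = {k[len(prefix):].split("/")[0] for k in store if k.startswith(prefix)}
    return sorted(names - {"group.json"})


# A two-level hierarchy with annotations on each group.
create_group(store, "experiment", instrument="hypothetical-scope")
create_group(store, "experiment/run1", operator="alice")

# Four workers each fill a different chunk of a 2 x 2 chunk grid.
grid = [(i, j) for i in range(2) for j in range(2)]
with ThreadPoolExecutor(max_workers=4) as pool:
    for idx in grid:
        pool.submit(write_chunk, store, "experiment/run1/image", idx, [idx[0], idx[1]] * 8)
# Leaving the `with` block waits for all submitted writes to finish.

chunk_keys = sorted(k for k in store if "/c/" in k)
```

This sketch relies on an in-memory dict, where writes to distinct keys from several threads are safe in CPython. Writers that touch the same chunk at the same time would still need some form of synchronization.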

Sponsorship

Zarr is a Sponsored Project of NumFOCUS, a US 501(c)(3) public charity.

NumFOCUS Sponsored Projects rely on the generous support of corporate sponsors, institutional partners, and individual donors.

Videos

Videos of community members talking about Zarr. If you have a video you’d like us to share, let us know!