<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://zarr.dev//blog/feed.xml" rel="self" type="application/atom+xml" /><link href="https://zarr.dev//blog/" rel="alternate" type="text/html" /><updated>2026-03-05T21:04:28+00:00</updated><id>https://zarr.dev//blog/feed.xml</id><title type="html">Zarr Blog</title><subtitle>Zarr is a community project to develop specifications and software for storage of large N-dimensional typed arrays,  also commonly known as tensors.</subtitle><entry><title type="html">Evolving Zarr Governance</title><link href="https://zarr.dev//blog/governance-update/" rel="alternate" type="text/html" title="Evolving Zarr Governance" /><published>2026-03-05T00:00:00+00:00</published><updated>2026-03-05T00:00:00+00:00</updated><id>https://zarr.dev//blog/zarr-governance-update</id><content type="html" xml:base="https://zarr.dev//blog/governance-update/"><![CDATA[<h1 id="evolving-zarrs-governance">Evolving Zarr’s Governance</h1>

<p>Authors: Ryan Abernathey, Alistair Miles, Josh Moore, Norman Rzepka</p>

<p>In the ten years since its inception, Zarr has transformed the landscape of scientific data management. What started as a side project for a single genomics researcher frustrated with existing storage options has blossomed into critical scientific infrastructure for bioinformatics, bioimaging, geospatial, and Earth-system science. Zarr is in use in production at organizations such as NASA, NOAA, ESA, EMBL-EBI, Allen Institutes, Bi[o]hub, RIKEN, NVIDIA, Google, and Microsoft. Zarr was recently chosen as the next-generation format for the Copernicus Sentinel missions, which produce 40 Petabytes of data per year.</p>

<p>The reasons for this widespread adoption fall into two broad categories:</p>

<ul>
  <li><strong>Technical</strong> - Zarr’s design offers better performance, scalability, and extensibility than alternatives, especially in the cloud. The data model can accommodate a wide range of domain-specific scientific data schemas.</li>
  <li><strong>Social</strong> - Compared to data formats controlled by a single vendor, Zarr’s open-source, community-based governance model appeals to organizations that care about long-term data sovereignty and stewardship.</li>
</ul>

<p>The social reasons are not to be discounted; for many organizations, these are just as important as the technical ones. So as the Zarr community evolves and grows, so must its governance. Learning, evolving, and growing are essential to any well-functioning organization, and we have learned many lessons from the past few years. Today we are proposing updates to Zarr’s governance aimed at streamlining the development process and encouraging new contributors to get involved while recognizing the importance of maintaining stability and continuity.</p>

<p><strong>Current Challenges</strong></p>

<p>Zarr governance started as a “<a href="https://en.wikipedia.org/wiki/Benevolent_dictator_for_life">BDFL model</a>”; the creator of Zarr, Alistair Miles, wrote the initial spec, built the first implementation (zarr-python), and was the sole owner of these things. Recognizing the value of empowering the community, in 2020 the Zarr Steering Council and the current governance framework were established, and the project moved under the umbrella of NumFocus as a fiscal sponsor. During this period, a significant number of new Zarr implementations emerged.</p>

<p>Building and maintaining a data format in a decentralized and community-driven way, without one organization or individual ultimately calling the shots, is not without its challenges. Today the Zarr community is feeling growing pains around governance in several ways:</p>

<ul>
  <li>Development of extensions and conventions has been somewhat slow-moving due to an ambiguous approval process.</li>
  <li>The agency of some individual projects within the zarr-developers GitHub org (e.g. zarr-python) is constrained by permissions limitations.</li>
  <li>Ambiguity around the process for membership on Zarr’s various committees and groups.</li>
  <li>Fragmentation of Zarr-related projects across many different GitHub organizations and repositories.</li>
</ul>

<p>Our diagnosis of the root cause of these challenges is that Zarr governance has not fully evolved from its single-project, single-repo structure to the multi-project reality of today. Additionally, project administrative issues have become intermingled with the constraints of GitHub’s IAM system, conflating governance with technical capabilities of this specific platform.</p>

<p>Going forward, our new proposed framework aims to clarify the roles and responsibilities of different stakeholders via more precise definitions.</p>

<p><strong>What is Zarr?</strong></p>

<p>The Zarr ecosystem today consists of the following entities.</p>

<ul>
  <li><strong>The Zarr Project</strong> - The umbrella entity which is fiscally sponsored by NumFocus. NumFocus-sponsored projects are required to have a formal governance process and may receive financial donations via the NumFocus 501(c)(3).</li>
  <li><strong>The Zarr Specification</strong> - A collection of documents defining the on-disk Zarr format.</li>
  <li><strong>Affiliated Software Projects</strong> - Specific individual software projects (including the specification but typically Zarr implementations) which are considered <em>part of the Zarr Project</em>. These projects are eligible to receive funds via NumFocus and are therefore subject to the overarching Project governance framework.</li>
  <li><strong>Non-Affiliated Software Projects</strong> - Other software projects which implement Zarr or interact with it in some way, but are not under the umbrella of the Zarr Project.</li>
</ul>

<p>Our current governance challenges mostly stem from incomplete disambiguation between these different entities, and the resulting overloading of the authority of the ZSC. This ambiguity is exemplified by the current <a href="https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md">governance doc</a>, which lives at the zarr-developers org level. It’s unclear whether this applies to all repos within the zarr-developers org, and whether it applies to zarr-affiliated projects under other orgs. We aim to resolve this through the following resolutions.</p>

<ol>
  <li>The Zarr Project as a whole will remain governed by the Zarr Steering Council (ZSC). Going forward, the Steering Council’s responsibilities include:
    <ol>
      <li>Interface between The Project and its fiscal sponsor</li>
      <li>Manage the copyrights and trademarks associated with the Project</li>
      <li>Manage the list of affiliated projects</li>
      <li>Manage responsibilities which don’t belong to a specific affiliated project (e.g. Zarr website, GitHub org) in order to ensure smooth operations and effective collaboration.</li>
    </ol>
  </li>
  <li>The Zarr Specification will be governed by a separate Spec Committee. The Spec Committee’s responsibilities are to
    <ol>
      <li>Manage changes to the Zarr core Specification</li>
      <li>Maintain the list of Zarr Extensions</li>
    </ol>
  </li>
  <li>Each Affiliated Software Project will be governed independently by a Core Developers Group (CDG) and will adhere to a simple governance process defined in a standardized document (see below for more detail), without oversight from the ZSC.</li>
  <li>We are introducing a simple and transparent process for welcoming new Affiliated Software Projects.</li>
  <li>Non-affiliated Software Projects can of course continue to do their own thing however they wish, outside the boundaries of this framework, with whatever governance (or lack thereof) they choose. Non-affiliated projects are not eligible for direct funding via NumFocus. We welcome and encourage all Zarr-related projects to become affiliated.</li>
</ol>

<p>These changes are formalized as a Zarr Enhancement Proposal (<a href="https://github.com/zarr-developers/zeps/blob/main/draft/ZEP0011.md">ZEP-11</a>), which will now go through an approval process and receive feedback from the community.
Feedback should be shared via the <a href="https://github.com/zarr-developers/zeps/issues/69">dedicated discussion issue</a> in the <code class="language-plaintext highlighter-rouge">zeps</code> repo.</p>

<p><strong>The Spec Committee</strong></p>

<p>The Zarr Spec is the focal point that brings together all of the different Zarr implementations. Because it addresses the on-disk format, decisions about the spec will persist for decades. Ensuring responsible and careful evolution of the spec, and balancing the need for innovation with the need for stability, is the challenge of the Spec Committee.</p>

<p>The evolution of the spec is currently governed by the ZEP process (see <a href="https://zarr.dev/zeps/active/ZEP0000.html">ZEP0</a>). Changes to the spec currently require unanimous approval of the ZSC and majority approval of the Zarr Implementation Council (ZIC). In practice, the ZIC didn’t take shape in the way we had hoped. As implementations appeared and became inactive, the council’s membership adapted too slowly. Members frequently didn’t have the time needed to follow the ongoing GitHub conversations. The steering council also never set expectations around meetings or time limits on when decisions would be made. Looking back, the lack of clear time commitments and processes made it difficult for the ZIC to function effectively.</p>

<p>Instead, we propose moving to a single Spec Committee. This is what other successful, widely adopted file formats (like Apache Parquet) do. To provide continuity with the current system, initial membership of the Spec Committee will include the current members of both the ZSC and the ZIC; however, these members are encouraged to resign if they don’t plan to actively participate. It is expected that the Spec Committee will contain members of the most active Zarr implementations, including both Affiliated and Non-affiliated Software Projects.</p>

<p>The Spec Committee’s primary mandate is to manage the core specification and to review and approve Zarr Extensions. To streamline decision-making and ensure agility, the first task of the newly formed Spec Committee will be to establish and document its own operational governance model, including decision-making procedures (e.g., voting mechanisms and quorum requirements) and membership review. This new process will supersede the unanimous ZSC/majority ZIC approval process currently defined in ZEP0 for all future core specification changes.</p>

<p><strong>Governance for Affiliated Software Projects</strong></p>

<p>Affiliated Software Projects will now follow a simple, standard, merit-based governance process akin to the Apache model. The default governance for Affiliated Software Projects is spelled out in an accompanying governance document: <a href="https://github.com/zarr-developers/governance/pull/52/changes#diff-ff22f45c73526074150d0fd44087e20ee6bfdc7b2b3c45c45ed1e167e7b3ed2e">Draft: Governance for Zarr Affiliated Software Projects</a>.</p>

<p>This governance initially consists of the following key elements:</p>

<ul>
  <li>Each project has a group called the Core Developers Group (CDG) which makes decisions, e.g. about accepting PRs. (A group of one is fine for small projects.)</li>
  <li>Projects aim for consensus, falling back on majority vote of Core Developers when necessary.</li>
  <li>Any contributor is eligible to join the CDG. Existing Core Developers can nominate new members. Nominations should be based on evidence of sustained, quality contribution to the project. Nominations are accepted by majority vote of existing Core Developers.</li>
  <li>Core Developers who become inactive can and should be removed, via a majority vote.</li>
  <li>Larger groups should have a chair, who acts as a facilitator / coordinator.</li>
  <li>Projects must adhere to the Zarr code of conduct, which is a requirement for NumFocus fiscal sponsorship.</li>
</ul>

<p>Projects are free to evolve and change their governance as they see fit, provided it remains within the accepted norms of community open-source projects. If an affiliated project abandons open and transparent governance, the ZSC reserves the right to remove its affiliation.</p>

<p>Current affiliated projects are:</p>

<ul>
  <li>The Zarr Specification</li>
  <li>Zarr-Python (incl. Numcodecs)</li>
  <li>Zarr-java</li>
  <li>jzarr</li>
  <li>GeoZarr</li>
  <li>VirtualiZarr</li>
</ul>

<p>Going forward, we welcome new affiliated projects to join. To be considered for affiliation, a project must be open source, be directly related to Zarr, and demonstrate a critical mass of sustained development activity. The ZSC will make decisions about affiliation based on these criteria and the overall strategic direction of The Project.</p>

<p><strong>Non-Affiliated Software Projects</strong></p>

<p>The Zarr ecosystem contains several important implementations that are not formally affiliated with the NumFocus-sponsored Zarr Project. These include Tensorstore, Zarrs, and Zarr.jl. This is great and speaks to the vitality of the Zarr ecosystem.</p>

<p>For projects which do wish to be formally affiliated with the Zarr Project, and thereby benefit from the NumFocus fiscal sponsorship, we extend an open invitation to join. We hope the governance changes introduced here make this appealing and clarify the lightweight governance expectations around affiliated projects. Conversely, non-affiliated projects are totally fine as-is and are under zero pressure to change their approach to governance.</p>

<p><strong>The Role of GitHub</strong></p>

<p>The use of GitHub by the Zarr Project is purely an implementation detail. GitHub is a tool which helps us manage code. GitHub roles, groups, and repos do not have any formal status within the Zarr governance framework.</p>

<p>That said, in practice, we are heavily reliant on GitHub for day-to-day operations, and clarity is needed about how to map various roles and responsibilities to specific capabilities and functions within GitHub. Zarr is mission-critical code for hundreds of organizations, and security must be taken seriously. Because privileges in GitHub map to the ability to commit and release code, we aim to adhere to a “principle of least privilege” model, wherein different actors have only the minimal privileges required to perform their function. This minimizes the blast radius for any potential threat. For example, a compromised GitHub account for a maintainer in one affiliated project should not be able to impact a different affiliated project.<br />
Consistent with existing practice, we make the choice that all affiliated software projects will live within the same GitHub organization: <code class="language-plaintext highlighter-rouge">zarr-developers</code>. The reasons for this are:</p>

<ul>
  <li>It provides a central entry point into all outputs of the NumFocus-sponsored Zarr Project, aiding discoverability</li>
  <li>It allows affiliated projects to share resources, including paid capabilities of the GitHub platform</li>
</ul>

<p>Each Affiliated Software Project, including the Zarr Spec itself, will have responsibility for one or more repos under this parent organization. Each affiliated project will manage its own committee membership via a GitHub group. Ownership of the GitHub org as a whole will be managed by the ZSC, including creation of new repos and the creation of groups.</p>

<p>There is one technical limitation of GitHub which must be addressed: new members to the organization can only be added to existing groups by organization owners. This makes the ZSC a bottleneck on the autonomous operation of each affiliated project. As a workaround, we will set up a GitHub bot which automatically updates group membership based on a list of committee members within each repo. This will effectively allow each affiliated project to manage its own membership independently.</p>
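
<p>As a rough illustration of the core step such a bot could perform, the sketch below compares a committee list with a group’s current membership and works out who to add and who to remove. The function name <code class="language-plaintext highlighter-rouge">plan_membership_sync</code> and the usernames are hypothetical, and the calls to GitHub itself are deliberately left out; this is a sketch under those assumptions, not the bot’s actual implementation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def plan_membership_sync(desired, current):
    """Return (to_add, to_remove) so that the group matches the committee list."""
    # Compare case-insensitively, since GitHub usernames are case-insensitive.
    desired_set = {name.lower() for name in desired}
    current_set = {name.lower() for name in current}
    to_add = sorted(desired_set - current_set)
    to_remove = sorted(current_set - desired_set)
    return to_add, to_remove


# Hypothetical committee list (as it might be read from a file in the repo)
# and hypothetical current group membership.
committee = ["alice", "Bob", "carol"]
group = ["bob", "dave"]

to_add, to_remove = plan_membership_sync(committee, group)
print(to_add)     # ['alice', 'carol']
print(to_remove)  # ['dave']
</code></pre></div></div>

<p>Keeping the comparison separate from the GitHub calls means the planned changes can be reviewed or logged before anything is applied.</p>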

<p><strong>What’s Next?</strong></p>

<p>It’s an exciting time for Zarr. The recent creation of the extensions and conventions frameworks is motivating lots of creative new directions for development. Our goal with these governance updates is to cast off the last vestiges of Zarr’s early governance and replace them with a model that is appropriate to the multifaceted character of the Zarr Project today. This model is based on proven, established best practices for multi-stakeholder, community-developed open-source software projects. Once fully implemented, these governance updates will empower developers to work more effectively and eliminate unnecessary centralization of authority within the ZSC.</p>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Updates to Zarr governance to streamline and simplify operations.]]></summary></entry><entry><title type="html">Zarr-Python 3 is here!</title><link href="https://zarr.dev//blog/zarr-python-3-release/" rel="alternate" type="text/html" title="Zarr-Python 3 is here!" /><published>2025-01-09T00:00:00+00:00</published><updated>2025-01-09T00:00:00+00:00</updated><id>https://zarr.dev//blog/zarr-python-3-release</id><content type="html" xml:base="https://zarr.dev//blog/zarr-python-3-release/"><![CDATA[<p>After more than a year of development, we’re thrilled to announce the release of <a href="https://zarr.readthedocs.io/en/v3.0.0/">Zarr-Python 3</a>! This major release brings full support for the Zarr v3 specification, including the new chunk-sharding extension, major performance enhancements, and a thoroughly modernized codebase. Whether you use Zarr to manage large multi-dimensional datasets in the cloud or for high-performance machine learning applications, we’ve built Zarr-Python 3 to help you. Let’s dive into some of the details of this release!</p>

<p>Zarr-Python 3 is available today on <a href="https://pypi.org/project/zarr/">PyPI</a> and <a href="https://anaconda.org/conda-forge/zarr">Conda-Forge</a>. It is compatible with Python 3.11 and above.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install</span> <span class="nt">--upgrade</span> zarr
<span class="c"># or</span>
conda <span class="nb">install</span> <span class="nt">--channel</span> conda-forge zarr
</code></pre></div></div>

<h3 id="support-for-zarrs-v3-specification">Support for Zarr’s v3 specification</h3>

<p>The most notable addition in Zarr-Python 3 is complete support for Zarr’s <a href="https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html">v3 specification</a>. The v3 specification brought greater multi-language interoperability and new extension points for customizing Zarr (codecs, chunk grids, data types, and stores).</p>

<p>Beyond supporting the core v3 specification, Zarr-Python 3 also includes support for the <a href="https://zarr.dev/zeps/accepted/ZEP0002.html">chunk-sharding</a> extension. This feature allows for multiple chunks to be stored in a single file (or object), allowing users to utilize much smaller chunks without increasing the total number of objects in a dataset. Without chunk sharding, users optimizing for read-heavy applications had a difficult choice: either use a small chunk size, but create a huge number of stored objects, or use a large chunk size, but suffer poor IO for random reads into the data. With chunk-sharding, the number of stored objects is decoupled from the chunk size. Users can safely create very large Zarr arrays with very small chunks without generating a glut of stored objects. For more on how sharding works, see Zarr-Python’s <a href="https://zarr.readthedocs.io/en/latest/user-guide/arrays.html#sharding">sharding documentation page</a>.</p>

<p>The code block below shows off Zarr-Python’s new API for creating sharded arrays:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">zarr</span>

<span class="n">arr</span> <span class="o">=</span> <span class="n">zarr</span><span class="p">.</span><span class="n">create_array</span><span class="p">(</span>
    <span class="s">"data/example-1.zarr"</span><span class="p">,</span>
    <span class="n">dtype</span><span class="o">=</span><span class="s">"int32"</span><span class="p">,</span>
    <span class="n">zarr_format</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
    <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">),</span>
    <span class="n">shards</span><span class="o">=</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">100</span><span class="p">),</span>
    <span class="n">chunks</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span>
<span class="p">)</span>
<span class="n">arr</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">))</span>
</code></pre></div></div>
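
<p>To make the decoupling of chunk size from object count concrete, the arithmetic below counts stored data objects for the array above with and without sharding. It uses only the shape, shard, and chunk sizes from the example, assumes one stored object per shard (or per chunk when sharding is off), and ignores metadata objects; it is an illustration rather than a call into Zarr-Python.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math

shape = (1000, 1000)
shards = (100, 100)
chunks = (10, 10)

# Number of chunks and shards along each axis (ceiling division covers partial edges).
chunks_per_axis = [math.ceil(s / c) for s, c in zip(shape, chunks)]
shards_per_axis = [math.ceil(s / sh) for s, sh in zip(shape, shards)]

n_chunks = math.prod(chunks_per_axis)  # data objects stored without sharding
n_shards = math.prod(shards_per_axis)  # data objects stored with sharding
chunks_per_shard = math.prod(sh // c for sh, c in zip(shards, chunks))

print(n_chunks, n_shards, chunks_per_shard)  # 10000 100 100
</code></pre></div></div>

<p>With these sizes, sharding cuts the number of stored data objects by a factor of 100 while reads can still address individual 10 x 10 chunks.</p>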

<p>Note that Zarr-Python 3 maintains read and write support for data stored according to Zarr’s v2 specification. Some features (e.g. sharding) are not available for v2 data. Users can set <code class="language-plaintext highlighter-rouge">zarr_format=2</code> in the top level API to continue using Zarr v2’s specification.</p>

<h3 id="major-performance-improvements">Major performance improvements</h3>

<p>Zarr-Python 3 delivers significant performance improvements across the board. A large part of the refactor focused on making the core of the library fully asynchronous, using Python’s <a href="https://docs.python.org/3/library/asyncio.html">asyncio</a> library. The new asynchronous core enables efficient I/O operations and better utilization of system resources. This means that multiple I/O operations can be performed concurrently, leading to faster data access and reduced latency, especially when data is stored on high-latency storage backends (like cloud object storage).</p>

<p>For compute-bound operations (like compression/decompression), Zarr now dispatches to a managed thread pool. Combined with asynchronous IO, this threaded parallelization allows Zarr to take full advantage of the compute resources available when reading and writing data.</p>
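
<p>The general pattern described above, concurrent I/O combined with threads for compute-bound work, can be sketched with the standard library alone. This is not Zarr-Python’s internal code: the function names and chunk keys are hypothetical, and <code class="language-plaintext highlighter-rouge">asyncio.sleep</code> stands in for a request to a high-latency store.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import asyncio
import zlib


async def fetch_chunk(key):
    """Stand-in for a high-latency read, e.g. from object storage."""
    await asyncio.sleep(0.01)  # simulated network latency
    return zlib.compress(key.encode() * 100)


async def load_chunk(key):
    compressed = await fetch_chunk(key)
    # Run the CPU-bound decompression in a worker thread so the
    # event loop stays free to service other reads.
    return await asyncio.to_thread(zlib.decompress, compressed)


async def load_all(keys):
    # Issue all reads concurrently instead of one after another.
    return await asyncio.gather(*(load_chunk(k) for k in keys))


keys = ["c/0/0", "c/0/1", "c/1/0", "c/1/1"]
results = asyncio.run(load_all(keys))
print(len(results), len(results[0]))  # 4 500
</code></pre></div></div>

<p>Because the simulated reads overlap, the total wait is close to the latency of a single request rather than the sum of all four.</p>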

<p align="center">
  <img src="../assets/images/zarr3-performance.png" alt="zarr3perf" />
  <center> Performance analysis of Zarr-Python 3 relative to Zarr-Python 2.18.4. The test wrote and read a 1GB array (shape=(512, 512, 512), chunks=(512, 512, 8), dtype='float64') to and from AWS S3 from an <em>m6i.4xlarge</em> VM in the same region. </center>
</p>

<p>While early benchmarks show very promising results relative to prior versions of Zarr-Python, we have yet to do dedicated performance tuning. Users should expect further performance improvements as Zarr-Python 3 matures. In fact, we are already working on identifying and addressing a number of known performance bottlenecks to further enhance the library’s speed and efficiency.</p>

<h3 id="built-with-extensions-in-mind">Built with extensions in mind</h3>

<p>Zarr-Python 3 is <a href="https://zarr.readthedocs.io/en/latest/user-guide/extending.html">designed to be highly extensible</a>. Key features include:</p>

<ul>
  <li>
    <p><strong>New <code class="language-plaintext highlighter-rouge">Store</code> ABC:</strong> A new abstract base class for defining custom storage backends, making it easier to integrate Zarr with various storage systems. This allows for seamless integration with cloud storage solutions, distributed file systems, and other data storage technologies.</p>

    <p>Zarr-Python 3 includes the following stores:</p>

    <ul>
      <li><code class="language-plaintext highlighter-rouge">LocalStore</code> - for reading/writing to a <a href="https://zarr-specs.readthedocs.io/en/latest/v3/stores/filesystem/v1.0.html">local file system</a></li>
      <li><code class="language-plaintext highlighter-rouge">FsspecStore</code> - for reading/writing to remote/cloud storage (based on <a href="https://filesystem-spec.readthedocs.io/">fsspec</a>)</li>
      <li><code class="language-plaintext highlighter-rouge">ZipStore</code> - for reading/writing to a ZipFile (experimental)</li>
    </ul>

    <p>Additional stores are also in development (like <a href="https://earthmover.io/">Earthmover’s</a> <a href="https://icechunk.io/icechunk-python/quickstart/">Icechunk</a> store).</p>
  </li>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">Codec</code> and <code class="language-plaintext highlighter-rouge">CodecPipeline</code> Entrypoints:</strong> Zarr-Python 3 provides <a href="https://packaging.python.org/en/latest/specifications/entry-points/">Python entry points</a> for defining custom codecs and codec pipelines, enabling flexible data compression and encoding strategies. This empowers developers to tailor data compression and encoding to specific use cases and optimize storage and performance.</p>

    <p><a href="https://numcodecs.readthedocs.io/en/stable/zarr3.html">Numcodecs</a> has been adapted to use Zarr’s <code class="language-plaintext highlighter-rouge">Codec</code> entrypoint system and <a href="https://zarrs-python.readthedocs.io/en/latest/"><code class="language-plaintext highlighter-rouge">Zarrs-python</code></a> has already developed an experimental Rust-based <code class="language-plaintext highlighter-rouge">CodecPipeline</code>.</p>
  </li>
</ul>
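
<p>For readers unfamiliar with Python entry points, the sketch below lists whatever codecs the installed packages advertise, using only the standard library. The group name <code class="language-plaintext highlighter-rouge">zarr.codecs</code> is an assumption here and should be checked against the Zarr-Python extension documentation; the helper <code class="language-plaintext highlighter-rouge">discover</code> is hypothetical and is not part of Zarr-Python.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from importlib.metadata import entry_points

# Assumed group name; check the Zarr-Python extension docs for the exact value.
CODEC_GROUP = "zarr.codecs"


def discover(group):
    """Map entry point names to their 'module:attr' targets without importing them."""
    return {ep.name: ep.value for ep in entry_points(group=group)}


found = discover(CODEC_GROUP)
print(sorted(found))  # names depend on the packages installed in the environment
</code></pre></div></div>

<p>A package that ships a custom codec would declare an entry point in that group in its packaging metadata, and it would then show up in this listing once installed.</p>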

<h3 id="modernized-codebase">Modernized Codebase</h3>

<p>The Zarr-Python 3 codebase has been significantly modernized:</p>

<ul>
  <li><strong>100% Type Hint Coverage:</strong> Comprehensive type hints improve code readability, maintainability, and IDE support. This makes the code easier to understand, debug, and refactor, leading to higher code quality and reduced development time.</li>
  <li><strong>Cleanly Defined Public/Private API:</strong> A clear distinction between public and private APIs enhances code organization and stability. This ensures that the public API remains stable and consistent, while allowing for flexibility and future development in the private API.</li>
  <li><strong>Improved Development Environment, CI/CD, and Testing:</strong> A streamlined development workflow, robust CI/CD pipelines, and comprehensive testing ensure high-quality releases. This rigorous development process helps to identify and fix bugs early, leading to more reliable and robust software.</li>
</ul>

<h3 id="migration-from-zarr-python-2-to-3">Migration from Zarr-Python 2 to 3</h3>

<p>We have done everything possible to make the migration from Zarr-Python 2 to 3 as easy as possible. The <a href="https://zarr.readthedocs.io/en/latest/user-guide/v3_migration.html">3.0 migration guide</a> details the parts of the Zarr-Python API that have changed and suggests actions for migration. Additionally, libraries such as <a href="https://xarray.dev/">Xarray</a> and <a href="https://www.dask.org/">Dask</a> have already added support for Zarr-Python 3.</p>

<h3 id="conclusion">Conclusion</h3>

<p>Zarr-Python 3.0.0 marks the beginning of a new chapter for the Zarr-Python project. We encourage you to try out this new version and provide feedback. We’re also excited to see the development of new extensions built on top of this solid foundation, such as <a href="https://icechunk.io">Icechunk</a>, <a href="https://zarrs-python.readthedocs.io">Zarrs-Python</a>, and <a href="https://virtualizarr.readthedocs.io">VirtualiZarr</a>.</p>

<p>The development of Zarr-Python 3 was a huge effort, spanning over 12 months and including contributions from over 30 contributors. Special thanks to <a href="https://github.com/d-v-b">Davis Bennett</a> and <a href="https://github.com/normanrz">Norman Rzepka</a> who helped me kick off the initial refactor in Potsdam, Germany in December 2023.</p>

<p><strong>Further reading</strong></p>

<ul>
  <li><a href="https://zarr.readthedocs.io/">Zarr-Python 3 documentation</a></li>
  <li><a href="https://zarr.readthedocs.io/en/latest/developers/roadmap.html">Zarr-Python 3 design doc</a></li>
  <li><a href="https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html">Zarr v3 specification</a></li>
</ul>

<p>~Joe Hamman</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Zarr-Python 3 is here! This release brings support for Zarr's v3 specification, new extensions, and major]]></summary></entry><entry><title type="html">Versioning Zarr with EffVer</title><link href="https://zarr.dev//blog/versioning-with-effver/" rel="alternate" type="text/html" title="Versioning Zarr with EffVer" /><published>2025-01-09T00:00:00+00:00</published><updated>2025-01-09T00:00:00+00:00</updated><id>https://zarr.dev//blog/switching-to-effver</id><content type="html" xml:base="https://zarr.dev//blog/versioning-with-effver/"><![CDATA[<p>Back in January we released <a href="https://zarr.readthedocs.io/en/v3.0.0/">Zarr-Python 3</a>, the first new major version of the library since 2016. After making this release, we found that Zarr-Python’s versioning policy didn’t quite fit the needs of the project as it stands today. So we have modified that versioning policy to make it better suited to the needs of Zarr-Python developers and users.</p>

<p>This post will explain what our old versioning policy was, why it wasn’t working well for us, and why we are switching to <a href="https://jacobtomlinson.dev/effver/">“Intended Effort Versioning”</a>, or “EffVer”.</p>

<h3 id="our-old-versioning-policy">Our old versioning policy</h3>

<p>The old Zarr-Python versioning policy was effectively <a href="https://semver.org/">Semantic Versioning</a>, or “SemVer”. In this scheme, backwards-incompatible API changes may only be released in major versions, backwards-compatible API changes (i.e., adding new APIs) may only be released in minor or major versions, and bug fixes can be released in major, minor, or patch versions.</p>

<h4 id="friction-with-semver">Friction with SemVer</h4>

<p>The impetus for releasing Zarr-Python 3 was to support a new version of the Zarr format. So Zarr-Python 3 was released with a lot of new APIs; some of these APIs have bugs or warts.</p>

<p>For example, we released some functions that <em>should</em> have consistent default values, but due to developer error (this author’s error, in fact) those functions are inconsistent. <a href="https://github.com/zarr-developers/zarr-python/pull/2819">The fix</a> is simple, in terms of code changes, but it requires breaking changes to our public API. According to SemVer, fixing this bug would require releasing Zarr-Python 4, only a few months after we released 3.</p>

<p>Besides fixing bugs in new APIs, we released Zarr-Python 3 with deprecation notices for many old APIs. We would like to eventually remove these routines from Zarr-Python, but we don’t think it would match user expectations if we released a new major version of the library just to signify removing code. Again, this is contrary to SemVer.</p>

<p>Releasing version 4 of Zarr-Python because we changed the default values of a few functions would likely confuse people. Our users assume that major releases of Zarr-Python will contain sweeping changes, not minor refinements of public APIs. This suggests SemVer does not actually fit our project very well, so we decided to update the versioning policy accordingly. If you are interested in the developer discussion about this topic, see <a href="https://github.com/zarr-developers/zarr-python/issues/2889">GitHub issue 2889</a>.</p>

<h3 id="our-new-versioning-policy">Our new versioning policy</h3>

<p>We were accustomed to thinking about major releases of Zarr-Python as epochal events that offer lots of new functionality to users (like a brand new version of the underlying Zarr format), but also require substantial changes to existing code. In other words, while major releases likely contain backwards-incompatible changes, backwards-incompatible changes on their own don’t really warrant a major release.</p>

<p><a href="https://jacobtomlinson.dev/">Jacob Tomlinson</a> extended this framing to a full versioning scheme, which he calls <a href="https://jacobtomlinson.dev/effver/">Intended Effort Versioning</a>, or “EffVer” for short. The basic idea of EffVer is that you version your project according to the expected effort a user will spend in upgrading to that version.</p>

<ul>
  <li>Major releases should contain changes that have the most impact on users, and thus require the most effort to adopt.</li>
  <li>Minor releases can require some adoption effort from some users.</li>
  <li>Patch releases should require no adoption effort from users.</li>
</ul>

<p>While SemVer indexes code changes by whether they are backwards compatible or not, EffVer indexes changes on how much effort is required for users to adapt to them. Thus EffVer allows us to ship small-but-breaking changes – like changing default values of some recently-added functions – in a minor release, so long as we think these changes will be easy for users to integrate.</p>
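
<p>The difference between the two schemes can be sketched in a few lines of Python. This is only an illustration: the <code>effort</code> labels, the function names, and the starting version <code>3.0.5</code> are assumptions made for the example, not part of SemVer, EffVer, or Zarr-Python.</p>

```python
from enum import Enum


class Bump(Enum):
    MAJOR = "major"
    MINOR = "minor"
    PATCH = "patch"


def semver_bump(breaking: bool, adds_api: bool) -> Bump:
    """SemVer: the bump depends on backwards compatibility."""
    if breaking:
        return Bump.MAJOR
    if adds_api:
        return Bump.MINOR
    return Bump.PATCH


def effver_bump(effort: str) -> Bump:
    """EffVer: the bump depends on the expected adoption effort.

    `effort` is a hypothetical label: "none", "some", or "large".
    """
    table = {"none": Bump.PATCH, "some": Bump.MINOR, "large": Bump.MAJOR}
    return table[effort]


def bump_version(version: str, bump: Bump) -> str:
    """Apply a bump to a "major.minor.patch" version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump is Bump.MAJOR:
        return f"{major + 1}.0.0"
    if bump is Bump.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


# A small-but-breaking change to default values, from a hypothetical 3.0.5.
print(bump_version("3.0.5", semver_bump(breaking=True, adds_api=False)))  # 4.0.0
print(bump_version("3.0.5", effver_bump("some")))  # 3.1.0
```

<p>The same change leads to a new major version under SemVer but only a minor release under EffVer, because the deciding input is the expected effort rather than the compatibility flag.</p>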

<h3 id="conclusion">Conclusion</h3>

<p>We think switching to EffVer is right for Zarr-Python development. It lets us refine newly-added APIs while reserving major releases for epochal changes, like the Zarr-Python 2 -&gt; 3 transition.</p>

<p>If you have any thoughts or concerns about this decision, we would love to hear from you. The best way to reach us is to open an <a href="https://github.com/zarr-developers/zarr-python/issues">issue</a> or <a href="https://github.com/zarr-developers/zarr-python/discussions">discussion</a> on our <a href="https://github.com/zarr-developers/zarr-python">Github page</a>. Thanks for your time!</p>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[We are updating the versioning policy for the Zarr python library.]]></summary></entry><entry><title type="html">Steering council membership update</title><link href="https://zarr.dev//blog/steering-council-update-2024/" rel="alternate" type="text/html" title="Steering council membership update" /><published>2024-11-28T00:00:00+00:00</published><updated>2024-11-28T00:00:00+00:00</updated><id>https://zarr.dev//blog/steering-council-membership-update</id><content type="html" xml:base="https://zarr.dev//blog/steering-council-update-2024/"><![CDATA[<h1 id="steering-council-membership-update">Steering council membership update</h1>

<p>From the <a href="https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-steering-council">Governance documentation</a>:</p>

<blockquote>
  <p>The Zarr Steering Council (ZSC) members are core developers who have
additional responsibilities to ensure the smooth running of the project. ZSC
members are expected to participate in strategic planning, approve changes to
the governance model, and make decisions about funding granted to the project
itself… The purpose of the ZSC is to ensure smooth progress from the
big-picture perspective. Changes that impact the full project require
analysis informed by long experience with both the project and the larger
ecosystem. When the core developer community (including the ZSC members)
fails to reach such a consensus in a reasonable timeframe, the ZSC is the
entity that resolves the issue.</p>
</blockquote>

<p>On October 14th, Ryan Williams, a founding member of the ZSC, stepped back. In
the name of the community, the other members would like to thank Ryan for his
support and leadership over the past four years and wish him the best in his
new pursuits!</p>

<p>In his place, <a href="https://github.com/normanrz">Norman Rzepka</a> will join the
council. Norman has been a key driver of the Zarr v3 specification, the zarrita
prototype, a zarr-java implementation, and more recently the rewrite of
zarr-python for v3.</p>

<p>Please join us in welcoming him! 👏🏽</p>

<p>~ <a href="https://github.com/joshmoore">Josh Moore</a> for the ZSC</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async=""> </script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Please welcome the new steering council member, Norman Rzepka (normanrz)!]]></summary></entry><entry><title type="html">Get Together OME Transforms 🧠</title><link href="https://zarr.dev//blog/get-your-brain-together-hackathon/" rel="alternate" type="text/html" title="Get Together OME Transforms 🧠" /><published>2024-07-15T00:00:00+00:00</published><updated>2024-07-15T00:00:00+00:00</updated><id>https://zarr.dev//blog/get-together-ome-transforms</id><content type="html" xml:base="https://zarr.dev//blog/get-your-brain-together-hackathon/"><![CDATA[<h1 id="join-the-3rd-get-your-brain-together-hackathon">Join the 3rd Get Your Brain Together Hackathon!</h1>

<p>Join us for the third edition of the <a href="https://insightsoftwareconsortium.github.io/GetYourBrainTogether/HCK03_2024_UNC_Hybrid/">Get Your Brain Together</a> Hackathon! This exciting event invites neuroimage data generators, image registration researchers, and neurodata compute infrastructure providers to come together for a hands-on, collaborative experience. <a href="https://forms.gle/LL4quQsbSWawKYSa6">Register</a> now and be part of this vibrant community working towards creating reproducible, open-source resources that unlock the mysteries of brain structure and function.</p>

<p>This hackathon will focus on <strong>advancing <a href="http://dx.doi.org/10.1007/s00418-023-02209-1">OME-Zarr</a> spatial transformations</strong>.</p>

<h2 id="overview-of-ome-zarr">Overview of OME-Zarr</h2>

<p><a href="http://dx.doi.org/10.1007/s00418-023-02209-1">OME-Zarr</a> is a cloud-optimized bioimaging file format that enjoys international community support and widespread adoption in neuroscience. It supports large-scale bioimages with spatial metadata, making it a critical tool for scientific research. The current OME-Zarr standard is enhanced by the <a href="https://github.com/ome/ngff/pull/138">coordinate transformations draft</a>, which introduces robust support for spatial transformations. This is particularly important for neuroimaging and other scientific imaging practices, as it facilitates:</p>

<ul>
  <li><strong>Reproducibility and Consistency:</strong> Explicit support for spatial transformations ensures consistent application across various platforms and applications. This feature aligns with the <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4792175/">FAIR</a> principles, enabling independent researchers to verify results.</li>
  <li><strong>Integration with Analysis Workflows:</strong> By treating spatial transformations as a first-class entity within file formats, OME-Zarr allows seamless integration with diverse image analysis workflows, eliminating the need for additional conversion steps.</li>
  <li><strong>Efficiency and Accuracy:</strong> Embedding transformations within the file format minimizes the need for re-sampling, thereby reducing sampling errors and preserving analysis accuracy. This standardization is crucial for handling the massive data volumes generated by modern microscopy techniques.</li>
  <li><strong>Flexibility in Analysis:</strong> Native support for spatial transformations provides researchers with the flexibility to apply, modify, or reverse transformations as needed, facilitating longitudinal studies, multi-modal imaging, and comparative analysis.</li>
</ul>
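
<p>As a rough illustration of why stored transformations reduce the need for re-sampling, the sketch below maps an array index to physical coordinates by applying a list of transforms in order, leaving the pixel data untouched. The dictionary layout is modeled loosely on scale and translation metadata; the function name and all values are assumptions for this example, and the draft itself covers a richer set of transformations.</p>

```python
def apply_transforms(index, transforms):
    """Map an array index to physical coordinates.

    `transforms` is a list of dicts applied in order, each either
    {"type": "scale", "scale": [...]} or
    {"type": "translation", "translation": [...]}.
    """
    point = [float(i) for i in index]
    for t in transforms:
        if t["type"] == "scale":
            point = [p * s for p, s in zip(point, t["scale"])]
        elif t["type"] == "translation":
            point = [p + d for p, d in zip(point, t["translation"])]
        else:
            raise ValueError(f"unsupported transform: {t['type']}")
    return point


# Hypothetical metadata: voxel size per axis, then an offset along the first axis.
transforms = [
    {"type": "scale", "scale": [2.0, 0.5, 0.5]},
    {"type": "translation", "translation": [10.0, 0.0, 0.0]},
]
print(apply_transforms((3, 4, 8), transforms))  # [16.0, 2.0, 4.0]
```

<p>Because the transform lives in metadata, changing or reversing it means editing a few numbers rather than rewriting the image.</p>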

<h2 id="hackathon-agenda">Hackathon Agenda</h2>

<p>The hackathon is structured into three key components:</p>

<ol>
  <li><strong>Day 1:</strong> Tutorial sessions covering the application needs for coordinate transformations, mathematical principles, and current computational standards and tools available in the open-source ecosystem.</li>
  <li><strong>Day 2:</strong> Small working groups will review and propose enhancements to the current coordinate transformations draft and relevant neuroimaging additions.</li>
  <li><strong>Day 3:</strong> Hands-on activities where participants will implement and apply the proposed improvements to the standards.</li>
</ol>

<h2 id="event-details">Event Details</h2>

<ul>
  <li><strong>Dates:</strong> Friday, July 26th - Sunday, July 28th, 2024</li>
  <li><strong>Location:</strong> Hybrid event at the University of North Carolina-Chapel Hill, via Google Meet videoconferencing, Image.sc Island Gather.Town virtual space, and Image.sc Zulip Chat.</li>
  <li><strong>Cost:</strong> <a href="https://forms.gle/LL4quQsbSWawKYSa6">Registration</a> is free!</li>
</ul>

<p><strong><a href="https://insightsoftwareconsortium.github.io/GetYourBrainTogether/HCK03_2024_UNC_Hybrid/">Register now and add the event to your calendar!</a></strong></p>

<p>~ <a href="https://github.com/thewtex">Matt McCormick</a></p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Join the 3rd Get Your Brain Together Hackathon!]]></summary></entry><entry><title type="html">NASA POWER 🤝🏻 Zarr</title><link href="https://zarr.dev//blog/nasa-power-and-zarr/" rel="alternate" type="text/html" title="NASA POWER 🤝🏻 Zarr" /><published>2024-06-11T00:00:00+00:00</published><updated>2024-06-11T00:00:00+00:00</updated><id>https://zarr.dev//blog/nasa-power-and-zarr</id><content type="html" xml:base="https://zarr.dev//blog/nasa-power-and-zarr/"><![CDATA[<h2 id="hi-zarr-community-">Hi Zarr Community! 👋🏻</h2>

<p>Zarr’s user, developer, and contributor base is growing every day across several scientific domains, including those responsible for mitigating climate change, solving complex biomedical issues, pushing the boundaries of AI development, and more.</p>

<p>The National Aeronautics and Space Administration (NASA) is a prominent Zarr user within the geospatial community and is deeply invested in the format. In this blog post, we’d like to highlight the NASA Prediction Of Worldwide Energy Resources (POWER) project, which has been using Zarr for its data storage needs. The POWER project is based at the NASA Langley Research Center (LaRC), which is located in Hampton, Virginia, USA.</p>

<h3 id="introduction-to-power-and-zarr-">Introduction to POWER and Zarr 🎙</h3>

<p>The <a href="https://power.larc.nasa.gov/">Prediction Of Worldwide Energy Resources (POWER)</a> project is a cornerstone “Energy and Infrastructure” Earth Action Program project. The Project’s mission is to improve learning, decisions, and outcomes in the renewable energy, sustainable infrastructure, and agroclimatology user communities. For any location in the world, the project provides easily accessible, customized, and trusted NASA solar and meteorological data for past, current, and soon future climates. POWER improves the public and private capability for integrating these NASA Earth Observations (EO) and assimilation model data into their workflows by offering a diverse suite of tools and services to access this data. The project provides access to its Analysis Ready Data (ARD) via POWER’s Application Programming Interface (API), Data Access Viewer (DAV), geospatial services, and cloud-enabled data store. POWER offers no-cost, no-account-needed access to all of its tools, services, and data, which lowers the barrier to entry for many users across the globe. Additionally, through each of its tools and services, POWER’s multi-decadal, low-latency, high-accuracy, community-specific datasets are offered in user-customizable units and a wide variety of formats.</p>

<p>POWER currently uses Zarr as the backend data store for the API services that many of our users rely on in their own projects. We have made the backend data stores directly available and freely accessible to the public with no use constraints through Amazon Web Services® (AWS®). We also plan to work with digital twin projects so that they can integrate POWER data directly into their data modelling. The POWER Project will continue leveraging Zarr to store and serve data, which includes dynamic updates in Near Real Time (NRT) by the POWER Project’s data processing code base. Zarr enables the POWER Project to more efficiently support its user communities and impact decision-making for government agencies, non-profit organizations, universities, and private companies around the world.</p>

<p>🗄️ <strong>Check POWER’s Zarrs on AWS® Registry of Open Data: <a href="https://registry.opendata.aws/nasa-power/">https://registry.opendata.aws/nasa-power/</a></strong></p>

<h3 id="a-brief-history-of-data-archives-">A Brief History of Data Archives 📚</h3>

<p>POWER’s meteorological parameters, such as temperature, humidity, precipitation, or wind, are derived from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) assimilation model of NASA’s Global Modeling and Assimilation Office (GMAO). MERRA-2 is a version of NASA’s Goddard Earth Observing System (GEOS) Data Assimilation System. This data is available starting in 1981.</p>

<p>The energy flux parameters, like solar irradiance and cloud properties, are derived from NASA’s <a href="https://science.larc.nasa.gov/gewex-srb/">GEWEX SRB</a> archive and NASA’s <a href="https://ceres.larc.nasa.gov/data/">CERES SYN1deg and FLASHFlux</a> projects. This data is available dating back to 1984.</p>

<p>POWER’s ability to provide historical data allows users to assess variability, make decisions, and conduct analyses based on past and current information. For more information on POWER’s history, please see our <a href="https://power.larc.nasa.gov/docs/methodology/">documentation</a>.</p>

<h3 id="what-format-did-power-use-before-">What format did POWER use before? 🔙</h3>

<p>Previously, POWER used a NetCDF file that was structured to support temporal access by chunking along the time dimension, in conjunction with OPeNDAP software used as middleware to support the POWER API’s temporal requests efficiently. While this system met the initial requirement to provide daily data, the team had to explore new formats to meet the growing demand for hourly data.
NetCDF was no longer efficient for hourly data because it stores the data in a single condensed structure, which prompted us to assess Zarr, whose storage is segmented into chunks.</p>

<h3 id="why-power-switched-">Why POWER switched? 🔍</h3>

<p>The POWER Project selected the Zarr format as our Analysis Ready Data (ARD) format as we were transitioning from a monolithic server architecture to a microservice-based hybrid cloud hosted architecture environment, with the foresight of fully transitioning to a cloud environment. To support the key component of our services endpoints, the efficient and fast distribution of time-series data, we wanted to remove any middleware software, improve data access, and implement higher levels of data compression.</p>

<h3 id="benefits-after-switching-">Benefits after switching 💪🏻</h3>

<p>Switching to Zarr enabled complete and direct access to the POWER data archives in an Analysis-Ready Cloud Optimized (ARCO) data store. Zarr also allows for asynchronous writing, JSON metadata, and a folder-based structure, which lets us add to the datastore faster and keep the data neatly sorted. Furthermore, POWER is able to use a higher level of data compression, which reduces the time needed for data acquisition. Lastly, being able to load small parts of the data improves efficiency and relevance and reduces costs.</p>

<p>For POWER’s future datastore, Zarr’s enhanced compression was recently used in initial testing, which reduced the total storage volume by orders of magnitude without losing any data precision. This compression saves on storage costs and improves the read/write performance of the system.</p>

<p>As a part of NASA’s Space Act Agreement with AWS®, this archive is hosted in S3 as part of the Open Data Registry, which provides the data freely to the public.</p>

<h3 id="future-plans-for-zarr-">Future plans for Zarr 🔮</h3>

<p>The POWER Project plans to include both spatially chunked and time-series chunked data structures to better meet user demand, fulfill data
orders more quickly, and support more efficient access and analysis. To promote further understanding and enable effective search and discovery, the team will develop enhanced slice-based metadata.</p>
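
<p>A back-of-the-envelope sketch shows why offering both chunk layouts can help. It counts how many chunks a request touches under each layout. The array shape, chunk shapes, and function name are hypothetical and chosen only for illustration; they do not describe POWER’s actual data stores.</p>

```python
import math


def chunks_touched(request_shape, chunk_shape):
    """Count the chunks read by a request aligned to the array origin."""
    return math.prod(
        math.ceil(r / c) for r, c in zip(request_shape, chunk_shape)
    )


# Hypothetical (time, lat, lon) layout: one year of hourly data on a global grid.
time_series_chunks = (8760, 1, 1)  # long in time, one grid cell wide
spatial_chunks = (1, 361, 576)     # one time step, the whole globe

point_request = (8760, 1, 1)       # a full year at a single location
map_request = (1, 361, 576)        # a global map for a single hour

for name, chunks in [("time-series", time_series_chunks), ("spatial", spatial_chunks)]:
    print(name, chunks_touched(point_request, chunks), chunks_touched(map_request, chunks))
```

<p>In this toy setup each layout serves its matching request with a single chunk, while the mismatched request touches thousands of chunks, so keeping both layouts trades extra storage for faster access in both directions.</p>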

<p>POWER plans to use Zarr spatial data stores for ArcGIS Image Services in future data versions to cater to the needs of the GIS community.</p>

<p>~ <a href="https://www.linkedin.com/in/zoe-waring/">Zoe Waring</a>, <a href="https://power.larc.nasa.gov/docs/team/">NASA POWER team</a></p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Blog post on Zarr usage at NASA LARC's POWER project]]></summary></entry><entry><title type="html">Toward Zarr-Python 3.0</title><link href="https://zarr.dev//blog/zarr-python-v3-update/" rel="alternate" type="text/html" title="Toward Zarr-Python 3.0" /><published>2024-05-09T00:00:00+00:00</published><updated>2024-05-09T00:00:00+00:00</updated><id>https://zarr.dev//blog/zarr-python-v3-update</id><content type="html" xml:base="https://zarr.dev//blog/zarr-python-v3-update/"><![CDATA[<p>We released Zarr-Python <a href="https://zarr.readthedocs.io/en/stable/release.html#release-2-18-0">2.18.0</a> this week. Although this release was quite light in terms of user-facing changes, it represents the beginning of a new phase for the project. In this post, we’ll walk through our plan for Zarr-Python 3.0 and what users of the library can expect in the coming months.</p>

<h2 id="zarr-python-218">Zarr-Python 2.18</h2>

<p>Before we get into the 3.0 release, we’ll first cover a few details about the 2.18 release series. The first thing to know is that we will continue to support 2.18 with bug fixes up until the release of 3.0. Additionally, we expect to use the 2.18 series to communicate changes in the Zarr-Python API, which will come in 3.0. For example, this week’s release included a number of new deprecation warnings for parts of the Zarr-Python API that we expect to remove in 3.0 (e.g. exotic stores, experimental v3 API).</p>

<h2 id="what-to-expect-with-zarr-python-30">What to expect with Zarr-Python 3.0</h2>

<p>In mid-2023, we formed a <a href="https://github.com/zarr-developers/zarr-python/discussions/1480">working group</a> to look at modernizing Zarr-Python and, crucially, adding support for the <a href="https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html">V3 specification</a>. One of the early outcomes of this effort was a <a href="https://github.com/zarr-developers/zarr-python/blob/056657ca5ed70aa3d77a9e2db42253fca39800b0/v3-roadmap-and-design.md">design document</a> detailing the plan for a major refactor to the library. The goals for the refactor effort are to:</p>

<ul>
  <li>Provide a complete implementation of Zarr V3 through the Zarr-Python API,</li>
  <li>Clear the way for exciting extensions / ZEPs (i.e. <a href="https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/v1.0.html">sharding</a>, <a href="https://zarr.dev/zeps/draft/ZEP0003.html">variable chunking</a>, etc.),</li>
  <li>Provide a developer API that can be used to implement and register V3 extensions,</li>
  <li>Improve the performance of Zarr-Python by streamlining the interface between the Store layer and higher level APIs (e.g. Groups and Arrays),</li>
  <li>Clean up the internal and user-facing APIs,</li>
  <li>Improve code quality and robustness (e.g. achieve 100% type hint coverage), and</li>
  <li>Align the Zarr-Python array API with the <a href="https://data-apis.org/array-api/latest/">array API Standard</a>.</li>
</ul>

<p>In late 2023, we started working on the next version of the library, iterating on core concepts and restructuring the code base. While this effort continues today, here are a few highlights that we are particularly excited about:</p>

<ul>
  <li>New asynchronous APIs across the library, including at the <code class="language-plaintext highlighter-rouge">Store</code>, <code class="language-plaintext highlighter-rouge">Group</code>, <code class="language-plaintext highlighter-rouge">Array</code>, and <code class="language-plaintext highlighter-rouge">Codec</code> levels. The ability for Zarr-Python to leverage asynchronous computation will dramatically improve performance in the library, particularly for workloads that depend on data coming from high-latency stores. We expect most users will interact with these classes through a synchronous interface but the asynchronous alternatives will be available for users that can take advantage of them.</li>
  <li>Complete spec-compliant implementation supporting both V2 and V3. Zarr-Python will support reading and writing in either format. Additionally, the V2 and V3 code paths will benefit from the new asynchronous interfaces as well as other performance improvements.</li>
  <li>New plugin interface for codecs. Previously, codec support was required to run through <a href="https://numcodecs.readthedocs.io/en/stable/">Numcodecs</a>. Going forward, additional codecs may be registered with Zarr-Python using the <code class="language-plaintext highlighter-rouge">zarr.codecs</code> <a href="https://packaging.python.org/en/latest/specifications/entry-points/">Entry Point</a>. While Numcodecs will continue to supply the Zarr-Python project with most codecs, the plugin interface will support the integration of codecs from other libraries.</li>
</ul>
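
<p>To give a feel for why asynchronous access helps with high-latency stores, here is a minimal sketch using only the standard library. <code>SlowStore</code> and the helper functions are hypothetical stand-ins written for this example; they are not Zarr-Python classes or APIs.</p>

```python
import asyncio


class SlowStore:
    """A stand-in for a high-latency store (hypothetical, not a Zarr class)."""

    def __init__(self, data, latency=0.05):
        self._data = data
        self._latency = latency

    async def get(self, key):
        await asyncio.sleep(self._latency)  # simulate a network round trip
        return self._data[key]


async def read_sequential(store, keys):
    # Each request waits for the previous one, so latencies add up.
    return [await store.get(k) for k in keys]


async def read_concurrent(store, keys):
    # All requests are in flight at once, so latencies overlap.
    return await asyncio.gather(*(store.get(k) for k in keys))


def read_sync(store, keys):
    """A synchronous wrapper, similar in spirit to what most users would call."""
    return asyncio.run(read_concurrent(store, keys))


keys = [f"c/{i}" for i in range(20)]
store = SlowStore({key: bytes([i]) for i, key in enumerate(keys)})
chunks = read_sync(store, keys)
print(len(chunks))
```

<p>With 20 keys and a simulated 50 ms latency, the sequential version waits roughly 20 round trips, while the concurrent version waits roughly one. Both return the same data in the same order.</p>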

<h2 id="release-plan">Release plan</h2>

<p>We are still working hard on the 3.0 development branch. You can follow our progress on our <a href="https://github.com/orgs/zarr-developers/projects/5">GitHub Project Board</a>. In the coming weeks, we expect to move our development to the <code class="language-plaintext highlighter-rouge">main</code> branch of the library and make a series of pre-releases.</p>

<h2 id="get-involved">Get involved</h2>

<p>It’s not too late to get involved with the 3.0 effort. The <a href="https://github.com/orgs/zarr-developers/projects/5">GitHub Project Board</a> provides an up-to-date summary of outstanding issues. If you maintain a library that depends on Zarr-Python, the 3.0.0-alpha release will be a great time to start testing against the upcoming release. Finally, we continue to hold bi-weekly developer meetings to discuss and coordinate work on Zarr-Python. This is an open meeting so please come if you are interested in getting involved. Check out the Zarr community calendar <a href="https://zarr.dev/community-calls/">here</a>.</p>

<p>~Joe Hamman</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[A status update on the development of Zarr-Python 3]]></summary></entry><entry><title type="html">Zarr Sprint Recap</title><link href="https://zarr.dev//blog/zarr-sprint-2024/" rel="alternate" type="text/html" title="Zarr Sprint Recap" /><published>2024-04-04T00:00:00+00:00</published><updated>2024-04-04T00:00:00+00:00</updated><id>https://zarr.dev//blog/zarr-sprint</id><content type="html" xml:base="https://zarr.dev//blog/zarr-sprint-2024/"><![CDATA[<p>A few weeks ago, a group of us met up in person, at LEAP in New York City, and virtually to hack on the Zarr specifications and ecosystem.</p>

<p>In this blog, I give a very brief overview of each of the topic areas. More importantly, I link out to the open issues, pull requests, discussions, and meeting opportunities for continued development. You can follow these links to both better understand each effort and also to contribute yourself.</p>

<p>Please keep this work going by adding reviews and comments to online conversations and joining any relevant meetings/working sessions.</p>

<h2 id="zarr-specification">Zarr Specification</h2>

<p><a href="https://zarr.readthedocs.io/en/stable/spec.html">Zarr Specification</a> refers to the specification for a chunked, compressed, N-dimensional array format primarily designed for storing large numerical arrays efficiently. It is commonly used in scientific computing, geospatial, bioimaging, and data analysis contexts.</p>

<p>The specification defines how data should be organized within a Zarr store, including details on chunking, compression, metadata, and other attributes necessary to efficiently store and retrieve array data. This specification helps ensure interoperability between different software implementations that support the Zarr format.</p>

<p>The latest version of the specification is <a href="https://zarr.readthedocs.io/en/stable/spec/v2.html">V2</a>, but <a href="https://zarr-specs.readthedocs.io/en/latest/specs.html">Version 3</a> is in the works.</p>

<h3 id="chunk-manifest--virtual-concatenation">Chunk Manifest / Virtual Concatenation</h3>

<p>In this breakout session, the group engaged in a long technical discussion about a way to define arrays in a Zarr store as concatenations of other arrays in the store. You can read a full ZEP-like description of the discussion <a href="https://github.com/zarr-developers/zarr-specs/issues/288#issuecomment-1939265240">here</a>. Shoutout to Tom Nicholas for documenting this so well!</p>
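
<p>The core idea of a virtual concatenation can be sketched as an index lookup: a position in the combined array is resolved to a source array and a local position within it. The class, method, and array names below are hypothetical and are not taken from the linked proposal or from any Zarr specification.</p>

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualConcat:
    """Resolve indices into a virtual concatenation along one axis.

    `sources` is a list of (name, length) pairs describing existing arrays.
    """

    def __init__(self, sources):
        self.names = [name for name, _ in sources]
        # offsets[i] is the global index at which source i starts.
        self.offsets = [0] + list(accumulate(length for _, length in sources))

    def __len__(self):
        return self.offsets[-1]

    def locate(self, index):
        """Return (source name, local index) for a global index."""
        if index not in range(len(self)):
            raise IndexError(index)
        i = bisect_right(self.offsets, index) - 1
        return self.names[i], index - self.offsets[i]


virtual = VirtualConcat([("year_2021", 365), ("year_2022", 365), ("year_2023", 365)])
print(len(virtual))          # 1095
print(virtual.locate(0))     # ('year_2021', 0)
print(virtual.locate(365))   # ('year_2022', 0)
print(virtual.locate(1094))  # ('year_2023', 364)
```

<p>No data is copied in this sketch: the combined array exists only as a table of offsets pointing at the original arrays.</p>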

<h3 id="zarr-python">Zarr-Python</h3>

<p>Joe Hamman led a group focusing on enabling support for V3 in Zarr-Python. This was part of an ongoing effort working toward Zarr-Python version 3.0 (<a href="https://github.com/zarr-developers/zarr-python/blob/main/v3-roadmap-and-design.md">roadmap</a>).
The focus of this group was on closing outstanding issues on the roadmap and testing the development branch in common geospatial applications. Zarr-Python has traditionally been the canonical implementation of Zarr, and it is therefore a current priority since this effort delivers immediate impact to the largest swath of users, including those that use Zarr through downstream libraries (e.g. Xarray, Dask, Anndata, etc.).</p>

<h3 id="geospatial-multi-scalespyramids">Geospatial Multi-Scales/Pyramids</h3>

<p>In the Zarr pyramids breakout group, Thomas Maschler and Max Jones discussed the motivations for following the <a href="https://docs.ogc.org/is/17-083r4/17-083r4.html">OGC TileMatrixSet 2.0</a> specification within the GeoZarr specification, which will be shared as a new issue to supersede <a href="https://github.com/zarr-developers/geozarr-spec/issues/30">GeoZarr Issue #30</a>. They also discussed reading those TMS into <a href="https://github.com/cogeotiff/rio-tiler">rio-tiler</a> using Xarray and started a refactor of ndpyramid to support the TMS specification.</p>

<h3 id="alternate-backend-for-reading-remote-zarr-stores">Alternate backend for reading remote Zarr stores</h3>

<p>Kyle Barron worked on a prototype for an alternate store for Zarr Python using new async Python bindings to Rust’s object-store project. You can see a prototype of object-store-based store implementation at <a href="https://github.com/zarr-developers/zarr-python/pull/1661">zarr-python#1661</a>.</p>

<h2 id="geozarr-specification">GeoZarr Specification</h2>

<p>Throughout the sprint, the GeoZarr focus group worked on examining the interoperability of GeoZarr and different existing tooling and store support. You can see the table <a href="https://github.com/zarr-developers/geozarr-spec/blob/main/geozarr-interop-table.md">here</a>.</p>

<p>One of the biggest realizations was that ArcGIS has a lot of existing support for Zarr, which is really exciting news! For other tools, there is still work to be done, especially for GeoTIFF-like data being stored in Zarr, which translates to updates needed within the GeoZarr specification. For example, there are functionality issues tied to support or lack thereof for specific compression algorithms. The GeoZarr Steering Working Group is working on providing a list of supported compressions for commonly used tools. There is also work to be done on specifying the organizational structure of GeoZarr and understanding where requirements from CF diverge from the Zarr data model. For this, we are focusing efforts on involving folks with CF expertise to guide these conversations.</p>

<p>If you are interested in helping out, please join the bi-weekly GeoZarr meeting, held every other Wednesday at 11 EST. The next one will be on April 17th, and you can find the invite on the Zarr calendar or join directly from <a href="https://meet.google.com/jth-rstn-fwb">this link</a>. Check out the notes from past meetings at the <a href="https://hackmd.io/@briannapagan/geozarr-spec-swg/edit">hackmd</a>.</p>

<h2 id="http-extension">HTTP Extension</h2>

<p>A final priority of the Zarr Sprint was to get efforts rolling on how to better visualize Zarr on the web.</p>

<p>Kevin Booth is leading this effort. So far, he has added sidecar files with links that reference parent, child, and root relationships in a Zarr store. These make it possible to use something like <a href="https://github.com/xaviernogueira/traverzarr">traverzarr</a>, a first attempt at traversing Zarr JSON as if it were a filesystem that Xavier Nogueira developed during the sprint, to navigate a Zarr store in a manner similar to the Spatio-Temporal Asset Catalog (STAC). A more detailed blog post with updates on this work will follow in the next week.</p>

<p>This work has continued since the sprint. The Cloud-Native Geospatial Foundation has started holding bi-weekly meetings to hack on it. If you would like to be involved, email hello@cloudnativegeo.org to be added to the meeting invite, or find the meeting link at the Zarr calendar <a href="https://zarr.dev/community-calls/">here</a>.</p>

<h2 id="more-efforts-to-come">More efforts to come!</h2>

<p>It was great to get a group of people together to spend some dedicated time on Zarr, and the work is nowhere near done. Please help keep the momentum of these efforts going by responding to any GitHub Pull Requests, Issues, or Discussions that you have opinions on and joining any of the established Zarr meetings that are of interest to you. Again, the Zarr calendar can be found <a href="https://zarr.dev/community-calls/">here</a>.</p>

<p>~ Michelle Roby</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[A summary of what we did at the 2024 Zarr Sprint.]]></summary></entry><entry><title type="html">Levelling Up: Zarr Community Transitions to Zulip</title><link href="https://zarr.dev//blog/zulip-transition/" rel="alternate" type="text/html" title="Levelling Up: Zarr Community Transitions to Zulip" /><published>2024-02-27T00:00:00+00:00</published><updated>2024-02-27T00:00:00+00:00</updated><id>https://zarr.dev//blog/moving-to-zulip</id><content type="html" xml:base="https://zarr.dev//blog/zulip-transition/"><![CDATA[<p>Hi, Zarr Community! 👋🏻</p>

<p>We’ve got an exciting announcement for you all! 😄</p>

<h2 id="community-update-moving-to-zulip-">Community Update: Moving to Zulip 💬</h2>

<p>We’re excited to announce that the Zarr community has made a move
from <a href="https://gitter.im/zarr-developers/community">Gitter</a> to <a href="https://ossci.zulipchat.com/">Zulip</a>
as our primary chat platform. This transition marks a new chapter for our
community and offers several advantages for our members.</p>

<p>Join here → <a href="https://ossci.zulipchat.com/">https://ossci.zulipchat.com/</a></p>

<h3 id="why-zulip-">Why Zulip? 🤔</h3>

<p>Zulip offers a robust and versatile platform for communication and
collaboration. Its threading model allows for organized and focused
discussions, making it easier for community members to follow and participate
in conversations effectively. Additionally, Zulip provides powerful search
capabilities, ensuring that valuable information shared in the past remains
accessible to all.</p>

<p>Zulip’s message sharing feature allows conversations to be easily shared
around the web via unique links. In addition, search engines can index all
content on Zulip, which keeps the knowledge base easily accessible to all
users.</p>

<h3 id="hosting-thanks-">Hosting Thanks 🙏🏻</h3>

<p>We extend our sincere gratitude to the good humans at the <a href="https://opensource.science/">Open Source Science
Initiative</a> (OSSCi) for generously hosting the
Zulip server. Their commitment to supporting open science and collaborative
research is commendable, and we’re thrilled to partner with them on this
endeavour.</p>

<p>Shoutout to <a href="https://www.linkedin.com/in/jonathan-starr-b04032284/">Jonathan Starr</a>
for helping us! 🙌🏻</p>

<h3 id="building-a-hub-for-open-science-">Building a Hub for Open Science 🧬</h3>

<p>The OSSCi Zulip server will serve as a hub for various projects in the
scientific Python ecosystem, starting with Zarr. By centralising communication
within this platform, we aim to foster greater collaboration, knowledge
sharing, and community building among like-minded individuals passionate about
open science and research.</p>

<h3 id="official-chat-platform-️">Official Chat Platform ™️</h3>

<p>With this migration, the OSSCi Zulip server becomes the official chat platform
for the Zarr community. We encourage all Zarr users, contributors, and
enthusiasts to join us on Zulip to stay updated on the latest developments,
seek assistance, and engage with fellow community members.</p>

<h3 id="your-feedback-matters-">Your Feedback Matters 🔁</h3>

<p>At Zarr, we value the input and ideas of our community members. We’re committed
to continuously improving our platform and user experience. Therefore, we
welcome any feedback, suggestions, or ideas you may have regarding the Zulip
migration or any other aspect of our community. Your input helps us better
serve the needs of our users and advance our shared goals.</p>

<p>Please create an issue in <a href="https://github.com/zarr-developers/community/issues">zarr-developers/community</a>
or join one of our <a href="https://zarr.dev/community-calls/">community meetings</a>
if you’d like to chat with us!</p>

<h3 id="join-us-on-zulip-">Join Us on Zulip! 🔗</h3>

<p>Ready to join the conversation? Head over to the <a href="https://ossci.zulipchat.com/">OSSCi Zulip</a>
server and dive into discussions surrounding Zarr and other exciting projects 
in the scientific Python ecosystem.</p>

<p>With our shift from Gitter to Zulip, it’s worth mentioning that the majority of
discussions on Zulip have involved the core developers of Zarr. Now, we’re
extending our warm invitation to the wider community to join us on Zulip. Your
involvement is crucial as we foster a more inclusive and vibrant community.</p>

<p>We look forward to connecting with you there! ✌🏻</p>

<p>~Sanket Verma</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Blog Post on Zarr Community Update → Moving to Zulip]]></summary></entry><entry><title type="html">Zarr, as seen in the public 📣</title><link href="https://zarr.dev//blog/zarr-talks/" rel="alternate" type="text/html" title="Zarr, as seen in the public 📣" /><published>2023-06-23T00:00:00+00:00</published><updated>2023-06-23T00:00:00+00:00</updated><id>https://zarr.dev//blog/zarr-talks</id><content type="html" xml:base="https://zarr.dev//blog/zarr-talks/"><![CDATA[<h2 id="hi-zarr-community-">Hi Zarr Community! 👋🏻</h2>

<p>Recently, several community members and I have been speaking at various conferences and events. There have been exciting developments in the Zarr ecosystem, such as finalising the V3 specification, submitting new ZEPs, and initiating new implementations.</p>

<p>While I’m mostly giving beginner talks on Zarr, which answer the how, why, and what, enthusiastic community members have been talking about other exciting stuff!</p>

<p>In this blog post, I highlight a few talks that were delivered in the past two months. We’re also maintaining playlists on YouTube, which have a more extensive collection of talks from various domains and diverse speakers. Check out the playlists: <a href="https://youtube.com/playlist?list=PLvkeNUPrCU04Xvcph4ErxsRkZq28Oucr7">Zarr: Introductory Talks</a> and <a href="https://youtube.com/playlist?list=PLvkeNUPrCU05qHkZso_T74yoayqLFHzkI">Zarr: Projects, Uses, Research and Workflows</a>.</p>

<h2 id="pycon-de-and-pydata-berlin-2023-">PyCon DE and PyData Berlin 2023 🇩🇪</h2>

<p>I went to Berlin, Germany, in April to speak at <a href="https://2023.pycon.de/">PyCon DE and PyData Berlin 2023</a>. My talk was titled “<a href="https://2023.pycon.de/program/JY3R3Z/">The Beauty of Zarr</a>”, where I explained Zarr’s inner workings using some neat illustrations by <a href="https://github.com/manzt">Trevor Manz</a>. I highlighted how simple, convenient and hackable it is to use Zarr. After going through various explanations, I focused on some critical issues that Zarr eradicates thanks to design features such as chunking, compression, and cloud-enabled storage.</p>

<p>Towards the end, I presented a <a href="https://github.com/MSanKeys963/presentations/blob/main/pycon_de_pydata_berlin_2023/notebook.ipynb">Jupyter notebook</a> in which I walked through <a href="https://zarr.readthedocs.io/en/stable/tutorial.html">Zarr 101</a> code to create, read, write and manipulate arrays. I also converted the Zarr pixelated logo from <code class="language-plaintext highlighter-rouge">.png</code> to <code class="language-plaintext highlighter-rouge">.zarr</code> format, which was a neat closing for my talk.</p>

<blockquote>
  <p>The slides and notebook can be accessed <a href="https://github.com/MSanKeys963/presentations/tree/main/pycon_de_pydata_berlin_2023">here</a>.</p>
</blockquote>

<p>Please watch the video here: 👇🏻</p>

<iframe width="800" height="500" src="https://www.youtube.com/embed/OYaMi9WnQpA" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen="">
</iframe>

<h2 id="esip-meetings-">ESIP Meetings 🌏</h2>

<p><a href="https://www.esipfed.org/">Earth Science Information Partners (ESIP)</a> is a community of data and information technology practitioners working together to coordinate earth science interoperability efforts. ESIP has various <a href="https://www.esipfed.org/get-involved/collaborate">collaboration areas</a>. ESIP Collaboration areas are made up of administrative committees and small working groups that are called clusters. Some of them are:</p>

<ul>
  <li>Agriculture &amp; Climate</li>
  <li>Open Science</li>
  <li>Cloud Computing</li>
  <li>Soil Ontology and Informatics</li>
  <li>Data Management Training Clearinghouse</li>
  <li>Council of Data Facilities</li>
</ul>

<p>And many more.</p>

<p>The ESIP <a href="https://wiki.esipfed.org/Cloud_Computing">Cloud Computing Cluster</a> organised a three-part series on Zarr titled “<a href="https://discourse.pangeo.io/t/join-the-esip-cloud-computing-cluster-session-april-24-for-zarr-the-next-generation-part-2-3/3354">Zarr: The Next Generation</a>”. In each part, Zarr Community members covered topics ranging from V3 to conventions to ZEPs.</p>

<blockquote>
  <p>The first part took place on March 27th where:</p>
</blockquote>

<ul>
  <li><a href="https://github.com/rabernat/">Ryan Abernathey</a> presented on the <a href="https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html">Zarr V3 Specification</a>, i.e. <a href="https://zarr.dev/zeps/draft/ZEP0001.html">ZEP0001</a></li>
  <li><a href="https://github.com/martindurant/">Martin Durant</a> presented on variable chunking, i.e. <a href="https://zarr.dev/zeps/draft/ZEP0003.html">ZEP0003</a></li>
</ul>

<p>The video recording of the session can be seen here: 👇🏻</p>

<iframe width="800" height="500" src="https://www.youtube.com/embed/50_LwbIUXi0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen="">
</iframe>

<blockquote>
  <p>The second part took place on April 24th where:</p>
</blockquote>

<ul>
  <li><a href="https://github.com/briannapagan">Brianna Pagán</a> spoke about the current state of the <a href="https://github.com/zarr-developers/geozarr-spec">GeoZarr specification</a> and its working group</li>
  <li><a href="https://github.com/normanrz">Norman Rzepka</a> spoke about the Sharding specification, i.e. <a href="https://zarr.dev/zeps/draft/ZEP0002.html">ZEP0002</a></li>
</ul>

<p>The video recording of the session can be seen here:</p>

<iframe width="800" height="500" src="https://www.youtube.com/embed/a4-vmJRQcrg" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen="">
</iframe>

<blockquote>
  <p>The third part took place on May 22nd where:</p>
</blockquote>

<ul>
  <li><a href="https://github.com/hailiangzhang/">Hailiang Zhang</a> presented the accumulation proposal, i.e. <a href="https://zarr.dev/zeps/draft/ZEP0005.html">ZEP0005</a></li>
  <li><a href="https://github.com/maxrjones">Max Jones</a> spoke about <a href="https://github.com/fsspec/kerchunk">Kerchunk</a> and <a href="https://pangeo-forge.org/">Pangeo-Forge</a> recipes developments</li>
</ul>

<p>The video recording of the session can be seen here:</p>

<iframe width="800" height="500" src="https://www.youtube.com/embed/ROsHdJI3-yw" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen="">
</iframe>

<p>These meetings covered many recent developments in the Zarr ecosystem. The ZEPs mentioned above explained the V3 specification, sharding, and a couple of exciting new features the community is working on. It is worth noting that ZEP0003 and ZEP0005 were written by community members to support use cases in their own domains. This shows the openness and flexibility of the Zarr open-source community and how we support everyone. Though these ZEPs are still in the draft state, they’ll be finalised soon for adoption.</p>

<p>I will discuss the V3 specification in a separate blog post, so I won’t go into the details here. But it’s worth highlighting the GeoZarr specification and what Brianna presented. GeoZarr is one of the conventions built on top of the Zarr specification; it supports various use cases of the geospatial community for storing their data and metadata. The GeoZarr SWG (Steering Working Group) has been working quickly despite the roadblocks (as mentioned by Brianna). The progress and specification can be seen <a href="https://github.com/zarr-developers/geozarr-spec">here</a>.</p>

<h2 id="conclusion">Conclusion</h2>

<p>These are some of the public engagements by Zarr Community members in the past few months. If you spoke on Zarr recently or in the past and would like me to highlight your talk, please don’t hesitate to contact <a href="mailto:svsanketverma5@gmail.com">me</a>. If you’re working on something interesting that involves Zarr and want to share it with the community, please say ‘Hi’ to me!</p>

<p>I’ll be talking to you all soon.</p>

<p>Until next time, peace! ✌🏻</p>

<p>~Sanket Verma</p>

<script src="https://giscus.app/client.js" data-repo="zarr-developers/blog" data-repo-id="R_kgDOGxrWVg" data-category="General" data-category-id="DIC_kwDOGxrWVs4CU5q_" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="top" data-theme="light" data-lang="en" crossorigin="anonymous" async="">
</script>]]></content><author><name></name></author><category term="blog" /><summary type="html"><![CDATA[Blog Post on Zarr talks]]></summary></entry></feed>