2024-05-15

Attending: Davis Bennett (DB), Sanket Verma (SV), Dennis Heimbigner (DH), Brianna Pagán (BP), Thomas Nicholas (TN), Jeremy Maitin-Shepard (JMS)

TL;DR:

The team discussed the new Zarr-Python 2.18.0 release, ongoing work on validators, and the potential integration of VirtualiZarr into zarr-developers. They also explored the challenges and future directions for supporting V2 and V3 implementations, codecs, and storage transformers.

Updates:

Meeting Minutes:

  • SV: How’s the implementation coming along?
    • DH: ArraytoArray codecs still need to be added
    • DB: Are you trying to support implicit groups?
    • DH: Trying to add it
    • DB: We’re trying to remove it
    • DH: Focusing on adding support for V2 and V3
    • DB: That poses more of a problem than it makes things easier
  • Zarr Validators - a tool to check whether Zarr data conforms to the specification
  • Codecov reports are lagging in recent PRs - any ideas how to fix it?
  • TN: Interested in having VirtualiZarr under zarr-developers!
    • DB: Sounds good!
    • SV: We can transfer the repo to zarr-developers and give TN the required privileges
  • TN: Storage transformers discussion
    • DH: A recap of the past discussion would be appreciated
    • TN: starts recapping
    • DH: Can this be handled by codecs in V3? - you already have ArraytoArray, ArraytoBytes, BytestoBytes…
    • JMS: It could be! - would be interesting to see
    • DH: Maybe some experimentation first would be a good idea?
    • DB: Using a codec might not be desirable if you want to change the data
    • DH: A codec would only take care of the existing dimensions and nothing else
    • JMS: ArraytoBytes covers endianness
    • DH: Perhaps the virtual concept of storage transformers can be defined as a codec
    • DB: V2 codecs don’t know anything about the storage - they only convert bytes to nd-arrays and vice versa
    • DH: But in V3 the concept of codecs has been expanded
    • DH: Shards cannot be moved around or compressed because the sharding codec doesn’t have a mechanism to track the shards - needs to be looked at
    • TN: With Kerchunk, the manifests for arrays, metadata, and groups are all combined, whereas VirtualiZarr keeps them separate, which makes tasks like concatenation easier
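
Illustrative note on the codec discussion: the three V3 codec kinds DH listed (ArraytoArray, ArraytoBytes, BytestoBytes) can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions, not the zarr-python API: every function name is hypothetical, a transpose stands in for an array-to-array codec, `struct` packing with an explicit byte order stands in for the array-to-bytes step (where endianness is handled, as JMS noted), and `zlib` stands in for a bytes-to-bytes compressor.

```python
import struct
import zlib

# All helper names below are hypothetical; this is not the zarr-python API.


def transpose_encode(rows):
    """Array -> array: swap the two axes of a 2-D nested list."""
    return [list(col) for col in zip(*rows)]


def transpose_decode(cols):
    """Inverse of transpose_encode (transposing again restores the input)."""
    return [list(row) for row in zip(*cols)]


def bytes_encode(rows, endian="little"):
    """Array -> bytes: flatten row-major and pack as int32 with a chosen byte order."""
    flat = [value for row in rows for value in row]
    prefix = "<" if endian == "little" else ">"
    return struct.pack(f"{prefix}{len(flat)}i", *flat)


def bytes_decode(buf, shape, endian="little"):
    """Bytes -> array: unpack int32 values and rebuild the 2-D nested list."""
    n_rows, n_cols = shape
    prefix = "<" if endian == "little" else ">"
    flat = struct.unpack(f"{prefix}{n_rows * n_cols}i", buf)
    return [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]


def compress_encode(buf):
    """Bytes -> bytes: compress the serialized chunk."""
    return zlib.compress(buf)


def compress_decode(buf):
    """Bytes -> bytes: decompress back to the serialized chunk."""
    return zlib.decompress(buf)


# Encode one 2x3 chunk through the three stages in order.
chunk = [[1, 2, 3], [4, 5, 6]]
transposed = transpose_encode(chunk)            # now 3 rows x 2 columns
raw = bytes_encode(transposed, endian="big")    # 6 int32 values, big-endian
stored = compress_encode(raw)

# Decode by applying the inverse stages in reverse order.
restored = transpose_decode(
    bytes_decode(compress_decode(stored), shape=(3, 2), endian="big")
)
assert restored == chunk
```

In this picture each stage only sees the chunk handed to it and knows nothing about where the bytes end up in the store, which is the distinction DB raised between codecs and storage-level concerns.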