2023-10-19

Attending: Sanket Verma (SV), Mark Kittisopikul (MK), Ward Fisher (WF), Davis Bennett (DB), Norman Rzepka (NR), Isaac Virshup (IV)

TL;DR:

In this meeting, the attendees discussed updates on ZEP0002, with a focus on potential changes related to handling checksums and compatibility with Kerchunk. ZEP0006 (Zarr Object Model) discussions included the suggestion of obtaining a JSON schema for ZOM and the idea of a separate ZEP for serializing children metadata. Additionally, there were considerations about namespacing codecs, discussions on ZEP0003 progress, and the potential adoption of HDF filters in Zarr. The meeting also touched on accommodating changes to ZEP0001 and the upcoming ZEP0002 voting deadline on October 31.

Updates:

Meeting Minutes:

  • MK: Changes to ZEP0001 are still coming in - how do we handle them?
    • SV: ZEP0001 was provisionally accepted but is not yet at the final stage
    • NR: These changes are minor and would need voting from ZIC
    • MK: Mentioning a potential grace period in ZEP0000 would be helpful
    • NR: Needs to be written out!
  • MK: Zarr shards as HDF5 file
    • ZEP0002 should proceed as it is at the moment
    • Having a null codec to ignore the checksum would be helpful - can work on this
    • Recent numcodecs release helps a lot - it adds the Jenkins lookup3 checksum
    • Relation between ZEP0002 and Kerchunk?
    • NR: Don’t know if there is one
    • IV: Zarrita can read the HDF5 file using Kerchunk
  • IV: Multiple arrays in a single Zarr shard
    • NR: I don’t think it’ll be possible
  • DB: Why are the chunks in a directory called ‘c’?
    • NR: Helps when scanning down the groups and arrays
  • DB: Any questions for ZEP0006?
    • NR: Getting a JSON schema out of ZEP0006 would be helpful
    • DB: This would also help us get container-level validation
    • IV: How does this differ from consolidated metadata?
    • DB: Consolidated metadata
    • NR: Serializing children metadata would be helpful - could be a different ZEP
    • DB: Flattening array
    • DB: Pydantic-Zarr would have the reference implementation for ZOM
  • ZEP0002 discussions at Unidata
    • Mostly going for ‘Yes’ - but looking for resources who can handle and complete it
  • MK: NetCDF has adopted HDF filters? Making them a part of Zarr filters?
  • DB: Getting away from storing F9 transformations in metadata
  • IV: Thoughts on namespacing codecs?
    • NR: Would be helpful
    • WF: The overhead of maintenance and administration is daunting - experience from Unidata - How would I guarantee that information would be there in 10 years?
    • IV: Pointing at URI where the codec is defined - Having this in Zenodo would be helpful
  • IV: ZEP0003 progress
    • SV: Waiting for technical review - would help if Martin/IV could make it to the ZEP meetings to raise the discussion
    • IV: Sure
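
Illustration for the ZEP0006 item above (serializing children metadata, flattening): a minimal sketch, not taken from the ZEP text or from Pydantic-Zarr. It assumes a nested dict as the in-memory model, where a group keeps its children under a hypothetical "members" key, and shows one way such a tree could be flattened into a path-to-metadata mapping. The key names, example paths, and the flatten helper are all assumptions made for this example.

```python
from typing import Any, Dict

# Hypothetical nested model: groups carry their children under "members",
# arrays are leaves. Key names are assumptions for illustration only.
hierarchy: Dict[str, Any] = {
    "node_type": "group",
    "attributes": {"title": "example"},
    "members": {
        "raw": {"node_type": "array", "shape": [100, 100], "attributes": {}},
        "labels": {
            "node_type": "group",
            "attributes": {},
            "members": {
                "cells": {"node_type": "array", "shape": [100, 100], "attributes": {}},
            },
        },
    },
}


def flatten(node: Dict[str, Any], path: str = "") -> Dict[str, Dict[str, Any]]:
    """Map each node's path to its own metadata, without the nested children."""
    own = {key: value for key, value in node.items() if key != "members"}
    flat = {path or "/": own}
    for name, child in node.get("members", {}).items():
        flat.update(flatten(child, f"{path}/{name}"))
    return flat


flat = flatten(hierarchy)
for key in sorted(flat):
    print(key, flat[key]["node_type"])
```

The nested form keeps parent and child metadata in one document, while the flat form is closer to how keys appear in a store; which of the two a future ZEP would serialize was left open in the discussion.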