2023-05-18

Attending: Sanket Verma (SV), Josh Moore (JM), Ryan Abernathey (RA), Norman Rzepka (NR), Jeremy Maitin-Shepard(JMS)

TL;DR:

The meeting covered the acceptance of ZEP0001, signaling the initiation of implementations. Discussions revolved around updates for ZEP0001 and V3, with Zarrita noted as a comprehensive PoC. ZEP0002 discussions focused on shard structures and an iterative ZEP process. Additionally, there were insights into experimentation with virtualization and metadata linking, and considerations for handling unmaintained Zarr implementations, emphasizing a shift towards V3 developments.

Meeting minutes:

  • Discussions about climate and weather 🌡️
  • ZEP0001 is finally accepted! 🎉
    • Implementors can start working on their implementations
      • Will be moving the ZEP0001 under the new Accepted section
      • Will move it under Final section once we have atleast one complete reference implementation
    • SV - updates and next steps for ZEP0001 and V3:
      • 1 year into the process… (2019 first discussion)
      • gdal moving quickly
      • will be checking in on the various implementations
      • zarrita as one of the most complete (in terms of code, not docs or tests), i.e. PoC
        • JM: Reference implementation needs to be useable!
        • NR: Yeah!
        • RA: why it’s not a complete implementation?
        • NR: no optimizations (sharding, etc.). meant to be easy to read code.
          • lacks features like buffer protocol, etc. but could be used.
          • don’t currently plan to maintain it over a long period of time.
      • SV: was thinking less of end-user and more of supporting all the features so others can refer to it.
        • NR: that probably could be done now.
        • NR: could write an intro for people to read. (don’t want to write end-user docs)
      • RA: NR’s production implementation? use different file format currently. must have sharding.
        • considering using zarrita as the implementation (for Python stack)
        • also have a scala stack (baked into software)
  • JM: ZEP0002 voting and discussions
    • JM: We could open up the voting for ZEP0002 and give a month/two month/full summer for voting - any open issues?
    • NR: None!
    • JM: Shard as recrusive Zarrs? - treat internals of shards like another Zarr
    • NR: Sharding being a codec would work that way but it’s more of a implementation detail
    • JM: What would it look like from a URI structure? - similar to what Saalfeld is doing in N5 ecosystem - if I access a chunk inside a shard and I could treat it as a Zarr array and not as a blob
    • NR: Fair enough! I could have something like this in Zarrita
    • NR: Not have implementation details in the spec but rather point to the reference implementation for the details
    • RA: ZEP process should be more of an iterative process and not an ultimatum
    • JMS: I feel, most of the implementors would be working on sharding and V3 together
  • RA: flat files to virtualization (experimentation). (Once ZEP0003 lands!)
    • Also want to see linked referencing in metadata (to allow browsing through HTTP). Better than consolidated. For e.g. infinite hierarchies, allows browsing like a catalog
    • Allow parent to list it’s children in Zarr groups
    • Also multiscales. Relates the arrays. Within the array directory?
    • NR: listing parent/children
    • NR: paths…
    • RA: see “link” in https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#link-object for any node (self, root, parent, child)
  • NR: Discussions on umaintained Zarr implementations
    • SV: Dissolve them with the help of the maintainers
    • JM: Tricky to get a hold of them
    • SV: Start deprecating them and then removing them from https://zarr.dev/implementations/
    • JMS: But all of that is V2 - if there’s going to be something on V3 we can certainly help them