6th April, 2022

Attending:

Agenda: Josh Moore (JM), Sanket Verma (SV), Norman Rzepka (NR), Parth Tripathi (PT), Davis Bennet (DB), Shivank, Gregory Lee (GL), Ishan Bansal (IB), Hailey Johnson (HJ), Martin Durant (MD), Isaac Vishup (IV), Ward Fisher (WF), John Kirkham (JK)

  • Introductions (incl. favorite food!)
  • Cloud Native Outreach Day is happening in 2 weeks i.e. April 19th. Register here if you still haven’t.
  • GSoC contributors proposal are open now until April 19th (ideas-lists)
  • Guest blog posts are welcome for http://zarr.dev/blog
  • A small surprise for y’all! 🎉 ()
  • write_empty_chunks debacle (DB)
    • Wrong choice? If we can’t handle the edge cases, probably.
    • DB: make fill_value required? (break change). don’t see zero as the obvious fill value even for numeric types.
    • MD: would want a default since there is data that doesn’t need a fill_value
    • “auto” which sets write_empty_chunks to False if a fill_value is passed
    • DB: and/or a global mapping from data_type to fill_value
    • WF: that’s how NetCDF handles it (on by default, can be turned off by API). Anecdotally, few complaints.
    • DB: proposing “auto” if no one objects (post-2.11.3)
    • WF: is there a way to know what was requested? NC captures provenance metadata (now).
    • MD: other provenance metadata that could be included: “written in python with zarr-python 2.11” (JK: great spec issue!)
  • multi-resolution images (MD)
    • JM: description of the xarray/datatree work
      • https://github.com/zarr-developers/zarr-specs/issues/125
    • MD: ok to concatenate multiple non-T volumes into a time-series? JM: think so
    • DB: some clients may not be able to
  • TBD in gitter: fsspec (MD)
    • can fetch many chunks in an array concurrently
    • not an API for fetching from many zarrs in a group
    • DB: sounds like it is undercovering a larger issue (Futures)
    • MD: like dask
  • moving the sharding spec forward (NR)
    • https://github.com/zarr-developers/zarr-specs/issues/127
    • JM: getting the abstraction correct / we seem to be risk-adverse
    • JM: editor group? but regardless major decision is does sharding get defined before or after v3.
    • NR: (voluntary) implementors group? Then a question of voting vs. consensus vs. …
    • JK: looks like there are different asks at the moment, so maybe the roadmap is the most important thing

Action Items

  • Davis Bennet to submit a follow-up PR to this to propose an “auto” setting for write_emtpy_chunks