9th February, 2022

Attending: Jonathan Striebel, Davis Bennett, Eric Perlman, Josh Moore (JAM), Sanket Verma, Jeremy Maitin-Shepard, Hailey Johnson, Erik Welch (Anaconda → NVIDIA), John Kirkham, Dennis Heimbigner, Dave Mellert, Greggory Lee, Matt McCormick

Davis: gdoc → hackmd? Sure!
- Sanket: suggestions
Sanket:
- updated webpage, blog update
- https://zarr.dev/blog/welcoming-community-manager/
Intros & various links here
- Erik: Mr. Sparse
- ….
Jeremy:
- C(++) implementations: quick report to Dennis from
  
  https://imagesc.zulipchat.com
- v3 support for non-zero origin
  
  (issue)
  - Dennis: would attribute specifying the origin be enough? have benefited from cfconventions which standardize the meaning of attributes (in atmospheric/related domains)
  - JMS: benefit of allowing indexing to be affected.
  - DH: second system syndrome problem (link)
  - DB: see the utility (working with cutouts) but the workflow puts us outside what’s in the core spec. zarr array shouldn’t have metadata that references another array. perhaps a nice formalization of transforms, since it defines a coordinate space.
  - JMS: julia has offset arrays (numpy is always 0-origin’ed)
  - DB: meaning of 0-origin is that there’s a coordinate space
  - EP: like the functionality, but it can be at a different level.
  - DH: specify the array it came from as well as the origin?
  - zaJMS: have translate to origin
  - DB: in xarray use piece of data to coordinate-aware indexing, two methods of getting into an array.
  - JAM: prioritizing the many recent spec proposals
  - DB: suggest talking to other consumers and see what they want.
Addition of sharding in spec v3 (issue/py → issue/spec)
- Status: 2 prototype PRs (plus a [*minimal
  
  implementation*](https://github.com/alimanfoo/zarrita/pull/40) in zarrita)
- Moving forward with v3 would be fine.
- Try a PR as a on the v3 spec for a translation layer (i.e.
  
  generically)
  - sharding, checksumming, IPFS, etc.
- JMS: don’t see the relationship between sharding & checksumming
  - JAM: due to content-addressable storage
  - DH: just as a compressor/filter that attaches the checksum
  - JS: partial read would need to be handled somewhere that’s not in the compression
  - JAM: kerchunk API of `key → (uri, offset, length)`
  - JMS: for the write path it is more complicated
- DH: makes me nervous when we worry about limitations of the
  
  underlying store w.r.t to the specification. spec should be storage-agnostic.
  - DH: v2 is agnostic of relative location of chunk/metadata are laid out on disc. Don’t need to be “next” to one another. This is introducing pieces and that they are together.
  - Be sure you want to get rid of the independence property
  - JK: more simple way of describing it. (An ordering).
  - DH: the proposal needs to specify the relationships between chunks that are supported.
  - DB: agreed complicated but 100% worth it for some domains.
- re-writing methods?
  - disallowed
  - only if uncompressed?
    - JS: not yet, but doesn’t currently exist for simplicity
  - rewrite index
v3 (Greg)
- In terms of the dtypes supported, I have not worked on those
  
  protocol extensions related to that. Is that something I should spend time on? The other thing I could do is make a WIP PR to Dask and Xarray with minimal changes for how they could support v3 as currently implemented in that branch.
‘zarr_implementations’: remote implementations: http/s3/etc
- good point
- EP to create an issue
- JAM:
Tabled
- xarray:
  
  dataclasses / datatree (issue) - (if Matt McCormick shows up)