29th June, 2022

Attending: Davis Bennett (DB), Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Parth Tripathi (PT), Ward Fisher (WF), Hailey Johnson (HJ), Shivank Chaudhary (SC), Ryan Abernathey (RA) +30 min

Updates (SV):

Zarr-Python 2.12.0 has been released! 🎉

Agenda:

ZEP1: https://github.com/zarr-developers/zeps/pull/1
- Authored by Alistair and Jonathan
- includes details on sharding & transformers
- addresses pain points & lack of clarity in v2
- Alistair to open spec changes against zarr-specs repo
- see https://zarr.dev/zeps for these changes
- comment on PR as desired
- otherwise, merging very soon
- further discussion to take place on the zarr-specs PR
Briefly (Josh), NFDI recommended for funding :tada:
- https://twitter.com/notjustmoore/status/1541776908043567104
JMS spec discussions
- NB: right forum? JM: just need to communicate thoughts back on the PR since there is no requirement to be at the community calls
- Dimension labels
  - there seemed to be interest in writing it up as a spec
  - requirement that they are unique strings OR the empty string to say that they are unlabeled
  - DB: motivation for unlabeled? Currently all are unlabeled. DB: disagree, they are all labeled with integers.
  - JMS: then strings are optional/additional alternatives.
  - DB: see it leading to issues. potentially: “if you add labels then you must add all”
  - JMS: case of automating inputs to outputs could lead to inventing fake labels but perhaps that’s preferable to empty
  - DB: drawback from type theory is that you want the unlabeled case to be a different type. JMS: Use Null? disallow "" anyway?
  - WF: dimensions are label and id parent? or conflating NC/H5
  - JMS: was just thinking within a given array. goal would be to not need to know it’s “dimension[2]”
  - DB: could see having arrays logically identical with different dimension ordering. want to enable use of, e.g., "z"
  - WF: “dim_$N” gets assigned automatically.
  - JM: need for buy-in from xarray and nczarr
  - JM: in .zattrs? .zarray? JMS: don’t really mind.
  - DB: err on the side of having zarrs more like numpy arrays
  - JM: names in numpy are part of the dtype
  - DB: backwards-compatible way to specify the defaults if they don’t exist
  - JMS: and added to the zarr-python library? Yes.
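
Sketch of the labeling rules discussed above (not from the call and not zarr-python API): labels must be unique strings, `None` or `""` marks a dimension as unlabeled, and unlabeled dimensions fall back to a default in the style of WF's “dim_$N”. The function name `normalize_dimension_labels` and the exact default format `dim_{i}` are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Sequence


def normalize_dimension_labels(
    ndim: int, labels: Optional[Sequence[Optional[str]]] = None
) -> List[str]:
    """Return one label per dimension, filling in defaults where unlabeled.

    Hypothetical helper: the name and the "dim_{i}" default are assumptions.
    """
    if labels is None:
        # No labels stored at all: every dimension is unlabeled.
        labels = [None] * ndim
    if len(labels) != ndim:
        raise ValueError(f"expected {ndim} labels, got {len(labels)}")

    result: List[str] = []
    for i, label in enumerate(labels):
        if label is None or label == "":
            # Unlabeled dimension: assign a positional default.
            result.append(f"dim_{i}")
        elif not isinstance(label, str):
            raise TypeError(f"label {i} must be a string, got {type(label).__name__}")
        else:
            result.append(label)

    # Labels must be unique, including any generated defaults.
    if len(set(result)) != len(result):
        raise ValueError(f"dimension labels must be unique: {result}")
    return result
```

With this sketch, a partially labeled array such as `["z", "", "x"]` stays valid and a reader can refer to "z" without knowing it is dimension 0. DB's stricter “if you add labels then you must add all” would instead reject any mix of labeled and unlabeled entries.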
- Single string to identify zarr root path + zarr array/dataset within root
  - SV: Greg left a comment today. See also shoyer issue
  - DB: an issue. problematic ergonomics
  - JMS: was hoping to find a resolution
  - JM: a couple have been proposed
    - sensible defaults
  - DB: reason for separate hierarchy
    - JM: possible extensions (like consolidated)
    - JMS: range-requests to see full listings
  - RA: strongly believe that V3 shouldn’t introduce such a breaking change
  - RA: NC uses path/to/file.h5/path/to/group
    - JM: would require an increased number of lookups for the root JSON
    - WF: correction – NC uses two strings
  - JMS: neuroglancer has a data source URL. can make up a convention but it would be nice to preserve the single-string semantics
  - RA: xarray only opens groups. more complicated for arrays.
  - RA: good to formalize the URI/URL semantics (good to specify your data with a string)
  - JMS: applies to groups, too.
  - RA: xarray supports extra path to a sub-group. also gaining datatree functionality.
    - DB: going into mainline? RA: Yes. DB: super cool.
  - DB: couldn’t you just pass the absolute path?
    - JM: you don’t pass “data” or “meta”. only the logical group.
    - DB: that means that code completion won’t work. could irritate people.
    - DB: would pass the array. job of library is to find the array.
  - RA: use hash tag or standardized file ending (.zarr) to parse URL
    - DB: .zarr seems 100% reasonable (since slash is taken)
    - DB: recommendation for people who want to live their truth
    - JMS: would like to make this a MUST
  - DB: jpeg vs jpg vs …
    - RA: mimetype
  - JM: make the .json files the default?
    - RA: getting Zarr into STAC was problematic because it’s to a URL rather than a file. i.e. it fundamentally becomes a JSON file. Becomes a catalog.
    - DB: like it. Directories are not real, files are real.
    - JMS: could define a different ending?
    - RA: .json is good
    - JM: it’s .zarr.json which isn’t bad
    - DB: natural when moving from local file system to a KVS
    - RA: opens up absolute paths to chunks potentially
    - JMS: with more changes to the spec, yeah.
    - JM: consolidated metadata will be problematic.
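
Sketch of the single-string idea above, assuming the “.zarr” ending convention RA and DB discussed: the first path segment ending in `.zarr` marks the root, and everything after it is the logical path to a group or array inside that root. The function name `split_zarr_url` is hypothetical, nothing here is part of the spec or of zarr-python, and query strings and `#` fragments are deliberately not handled.

```python
from typing import Tuple


def split_zarr_url(url: str, suffix: str = ".zarr") -> Tuple[str, str]:
    """Split a single string into (root, path inside the root).

    Hypothetical helper: assumes the root is the first path segment
    that ends with `suffix`.
    """
    parts = url.split("/")
    for i, part in enumerate(parts):
        if part.endswith(suffix):
            root = "/".join(parts[: i + 1])
            # Drop empty segments so a trailing slash yields an empty inner path.
            inner = "/".join(p for p in parts[i + 1 :] if p)
            return root, inner
    raise ValueError(f"no path segment ending in {suffix!r} found in {url!r}")


root, inner = split_zarr_url("s3://bucket/path/to/data.zarr/group/array")
print(root)   # s3://bucket/path/to/data.zarr
print(inner)  # group/array
```

The sketch only works if the ending is required, which is why JMS wanted it to be a MUST: a root without a recognizable ending cannot be separated from the inner path without extra lookups.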
DB: PR
- mypy issues
- annotations break the linter
- JM: generally :+1: for type annotations, also ok to start looking at dropping 3.7 now
Tabled
- Support for inf/nan/binary data in attributes
- Endianness
- Zarr’s website
  - What do you feel about our current website?
  - What would you like to see in the new website?
  - Any ideas for good Jekyll/any static website generator themes?