2022-11-16

Attending: Sanket Verma (SV), Josh Moore (JM), Davis Bennett (DB), Ryan Abernathey (RA), Dennis Heimbigner (DH), Hailey Johnson (HJ), Ward Fisher (WF), Jonathan Striebel (JS)

TL;DR:

SV will speak at PyData Global 2022 next week and will also run a sprint with JM. Details for the talk are here and for the sprint here. During the meeting, DB raised a proposal to host a Discord server for chunked formats, which later evolved into creating a Discord server for the Zarr community. JM gave updates on zarr-java, and lastly SV initiated a discussion on how to separate Zarr (the file storage format) from zarr-python (the Python implementation of the Zarr spec).

Updates:

ZEP1 Update, see here
- Check out the ZEP1 GH Project board here; maintained by Jonathan Striebel
PyData Global 2022 Sprint the week after next (1st-3rd Dec.); anyone interested in helping out?
- Need to know by this week
- Also: Sanket giving a talk, “The Beauty of Zarr”

Meeting Minutes:

DB: Discord for Chunked Formats?
- Had a problem, and the TypeScript community was very helpful.
- A new thread gets created per issue
- Downside: not indexed by Google
- WF: like it, use it socially, want this to be a solution
- WF: have people pushing people to GitHub
- RA: Pangeo uses Discourse. Bringing more dialogue together?
  - granularity is critical so that we get a take-home
- DB: activation for a forum post is 10x that of a Discord message
- WF: Discord seems more synchronous
  - wouldn’t work for NC since there wouldn’t be enough critical mass
  - Gitter straddles the line, since there’s a lot to catch up on
  - but one for sync and async would be good
- RA: don’t see the other chunked formats being psyched to be in a channel with us. Workflow might be the more useful framing
zarr-java: discussions tomorrow about bringing back two forks of jzarr
- JM (re-surfacing): care with Python Discord? Yes.
  - HJ: always start by clarifying library vs file format
- WF: good feedback on Zarr at Unidata user committee meeting
  - being asked for in THREDDS
- DH: recent discussion around ragged arrays
  - most people encounter Zarr through Python
  - that leads to an imprinting
  - when they run into NCZarr, they are perplexed (things missing)
  - still a big problem
- JS: incompatibilities between libraries could bring us down
  - v3 is spec first with feedback from different implementors
  - hopefully we can drop Python specifics and make it possible to know what the incompatibilities are
  - claim: “Zarr is X”, not just “a community project”
    - posters, repos, webpages, etc…
    - currently hard to grasp
- SV: how to separate from Python
  - DH: go through the tutorial and move things that are not in the v2 spec
  - DH: probably many go through the tutorial
  - DH: e.g. Fortran community
  - JM: good points, but the same will be true for NCZarr. v3 will give us the chance to label things more clearly.
  - WF: have to be clear about “NetCDF”. Was specific when talking to the user committee about the “Zarr data model” or “Zarr data storage”
    - specificity in language. data model and the format should be cross-language.
  - DH: ultimate goal is for the NCZarr extensions to be v3 extensions
  - JM: wonderful :tada: when would be a good time to plan for that?
  - DH: currently working on DAP4, but will shoehorn in some time for a bullet-point list of extensions (and why)