2022-11-16

Attending: Sanket Verma (SV), Josh Moore (JM), Davis Bennett (DB), Ryan Abernathey (RA), Dennis Heimbigner (DH), Hailey Johnson HJ), Ward Fisher (WF), Jonathan Striebel (JS)

TL;DR:

SV is going to speak at PyData Global 2022 next week and also going to run a sprint along with JM. Details for the talk are here and sprint here. During the meeting a proposal to host a Discord server for chunked formats was raised by DB which later converted to creating a Discord server for the Zarr community. JM gave updates revolving around zarr-java, and lastly SV initiated the discussion on how we can separate Zarr (file storage format) from zarr-python (Python implementation of the Zarr spec).

Updates:

  • ZEP1 Update, see here
    • Check out the ZEP1 GH Project board here; maintained by Jonathan Striebel
  • PyData Global 2022 Sprint next-to-next week(1st-3rd Dec.), anyone interested in helping out?
    • Need to know by this week
    • Also: Sanket giving a talk, “The Beauty of Zarr”

Meeting Minutes:

  • DB: Discord for Chunked Formats?
    • Had problem and TypeScript Community was very helpful.
    • New thread gets created per issue
    • Downside: not indexed by google
    • WF: like it, use it socially, want this to be a solution
    • WF: have people pushing people to github
    • RA: pangeo uses discourse. bringing more dialogue together?
      • critical is the granularity so we get take home
    • DB: activation for forum post is 10x to discord message
    • WF: discord seems more synchronous
      • wouldn’t work for NC since there wouldn’t be enough critical mass
      • gitter straddles the line, since there’s a lot to catch up on
      • but one for sync and async would be good
    • RA: don’t see the other chunked formats being psyched to be in a channel with us. workflow might be the more useful framing
  • zarr-java: discussions tomorrow about bringing back two forks of jzarr
    • JM (re-surfacing) care with Python discord? Yes.
      • HJ: always start by clarifying library vs file format
    • WF: good feedback on Zarr at unidata user committee meeting
      • being asked for in THREDSS
    • DH: recent discussion around ragged arrays
      • most people encounter zarr through python
      • that leads to an imprinting
      • when they run into nczarr, they are perplexed (things missing)
      • still a big problem
    • JS: incompatibilities between libraries could bring us down
      • v3 is spec first with feedback from different implementors
      • hopefully to drop python-specifics and possible to know what the incompatibilities are
      • claim: “Zarr is X”, not just “a community project”
        • posters, repos, webpages, etc…
        • currently hard to grasp
    • SV: how to separate from Python
      • DH: go through tutorial and move things that are not in the v2 spec
      • DH: probably many go through the tutorial
      • DH: e.g. fortran community
      • JM: good points, but the same will be true for nczarr. v3 will give us the chance to label things more clearly.
      • WF: have to be clear about “NetCDF”. Was specific talking to the user committee about the “Zarr data model”, or “Zarr data storage”
        • specificity in language. data model and the format should be cross-language.
      • DH: ultimately goal is the nczarr extensions to be v3 extensions
      • JM: wonderful :tada: when would be a good time to plan for that?
      • DH: currently working on DAP4, but will shoehorn some time for bullet point list of extensions (and why)