21st April, 2021

Attendees: Ward, Josh, Matthias, Dennis, Jackson, RyanW., Greg

  • EOSS 4: received invitation (May 19)

    • Ward: chair Unidata DEI. Developer staff is above industry

      standard.

    • Matthias: NumFOCUS overhead goes to inclusive ecosystem

    • Quansight numbers? Unclear. Split overhead.
  • Genomic data in Zarr → Alistair

  • Ward: nczarr 4.8 released (!)

    • Represents a different data model? Wasn’t the intent

    • But has been released. Kudos to Dennis.

    • 4.8.1 to be released ASAP with xarray support.

    • Josh: similar to xarray.

    • Dennis: added notes to the documentation about compatibility

      • great lengths that existing zarr implementation should read any nczarr file. Works to some degree. Can also read any existing zarr file.

      • Pure superset for supporting netcdf 4 features

      • Limitation: zarr can read nczarr if willing to:

        • ignore keys that are unknown

        • ignore objects that are unknown

      • Treading on thin ice. Hopefully people will ignore

    • Josh: good to clarify these holes in the spec on the repo

    • Dennis: looking at the extensions if there’s a way to adapt, but

      good to have those conversations.

    • The HDF Group (THG) is hosting a presentation on May 14 to lay

      out their own cloud efforts. Registration is free, information here: https://zoom.us/meeting/register/tJYud-qgrjwjG91UF7SM7PiRU_QQQZ3jE6nG

  • (v2) Nested arrays

    • Dennis: trying to implement. One minor bug but working ok.

    • Those who chimed in were happy with it.

  • (v3) Matthias

    • Getting dimension_separator in

    • Rebasing base store work on that

    • Moving forward!

  • Ryan: next step on contract round 1

    • Report due at the end of the month

    • Ask for another extension? Yeah, let’s go for it.

  • European add on: Tobias Koelling, Stephan Saalfeld

    • Timezones

    • Saalfeld: caching JSON is a huge win. Also think about file

      listing.
      Anything that incurs latency.

      • Containers are very full, multiscale.
      • Tree discovery is very slow.
      • Async helps, but caching metadata is really useful.
      • Josh: tricky semantics that you
      • S. for a read instance on one computer, changes will be consistent
      • Josh: timeout? by size?
      • S: not yet. Really small. Also assume that these readers are short-lived.
    • Tobias: consolidating data in atmospheric sciences (EUREC4A

      ATOMIC)