13th January, 2021

Attending: Matthias Bussonnier, Josh Moore, Ryan Williams, John Kirkham

  • Matthias: numcodecs

    • possible have osx build on GHA (GCC flag)

    • Josh: will re-test bump of blosc in numcodecs

    • … unless we don’t need it?


    • John: all dependencies outside of blosc had wheels


      • Can have pure numcodecs
      • Bigger fix, but longer term
    • Matthias: ok to figure this work. Was just turning off warnings


    • Josh: ok, won’t move forward

      numcodecs#259 instead looking to delete subdir

    • John: will need to pick up new dependencies and slightly adjust

      the APIs that are different (e.g. LZ4)

    • John: Everything is on conda-forge and and wheels

    • Josh: versions? Drop of Python 3.5 is 0.8.0; need to go to 0.9.0

      for dropping the subdir? Don’t think so.

    • … testing of M1

      • John: in conda-forge can either emulate with ARM/Linux, new SDK, or also can use Mac-Stadium
      • https://xkcd.com/1172/ …
  • Partial read: https://github.com/zarr-developers/zarr-python/pulls

    • Opt-in. Could have bad performance impact if used incorrectly.

    • John: relationship to memory-mapping? (since memmap follows

      Python array protocol) Matthias: Dunno, but more important in latent situations (S3)

    • Matthias: could grow in heuristics. Currently can’t open an

      array with both states.

    • Josh: need to be considered in the spec? Matthias:

      implementation detail.

    • John: looked into caching? (since multiple reads) Matthias: not


    • i.e. assumption is that person using knows exactly what they are

      doing it.

    • Driver: very large filesets, chunked in a particular way (and

      chunking is too expensive)

    • Benefit: when accessing the first or last index of a dimension.

    • If we can get it in, then we can move on BaseStore, etc.
  • John: NumFOCUS, got email re: annual project reports

    • deadline is next Wednesday.

    • Ryan: Nicole asked for a couple of sentences by this Friday

    • John: deadline by Jan. 20

    • Asks for:

      • 3 applications

      • 3 New Feature / Upgrade

        • Partial Read

        • Remove support for Python 3.5

        • Async multi get via fsspec.

      • 3 Project Need / Estimated Cost or Hours to complete

        • Getting us to V3

          • Josh: would like to actual move development to V3…

          • Matthias: get PartialRead, then BaseStore, clean heuristics to BaseStore, then we should be ready. (Write some first extensions, hammer out registration,etc.) but fewer hours per week (cf. napari)

          • Matthias: hesitant to give timeline since 2020 seems to not be over …

          • scipy as a target? (Virtual) Posters/talks due by Feb. 16th.

            • “The State of Zarr V3”

            • “Surgeon General Warning:

              Zarr V2 considered harmful”

  • Josh: 1st of February deadline for interim report

    • Matthias: plus “fixed GHA due to travis credits”

    • Ryan: working on that from the EOSS content.

    • Matthias: if shared, then can add to the gdoc. (Presenting to

      Quansight meeting tomorrow)

  • Misc:

    • Matthias: working on documentation (for Jupyter) with

      cross-linking, inline popups, etc. Something needed for zarr/numcodecs?

      • Josh: maybe something around heuristics for getting the most out of chunking, etc.
      • John: yeah, have something like that in dask. graphics? (Yes!)
      • Matthias: do have “X is mentioned over there…” (docs as a graph)
      • John: indexing comes up a bunch. (Differences between projects)
      • Matthias: rather than sphinx that generates HTML, but generates intermediate representation and then HTML is generated on the fly. Can then, e.g., disable showing signatures. or patch older version representation to say a new method is coming. (Have prototype: papyri)
    • RyanW: grant period extended to Mar 1, report deadline is May 1.

      • Do we need more time for the 80 hours
      • Engaging with Ralf for more allocation