Skip to main content Link Search Menu Expand Document (external link)

12th August, 2020

Attending: Dennis, Ola Tarkowska, Josh Moore, Martin Durant, John Kirkham, Matthias, Andrew, Ryan Williams

Apologies:

  • Standing items

    • Introductions: new joinees should feel welcome to introduce

      themselves and their interest in zarr/n5.

    • Ola: looking into zarr (IO performance, features, compare to

      HDF5) after being introduced to Alistair. Context of imaging data.

  • Add community stuff here (30 minutes)

    • Josh: likely to have tileDB versions of the zarr images for

      (the start of) a benchmarking comparision

      • Martin: running locally? Josh: would think both. Martin: would like to look into mult-get/get_items and fsspec. (async s3fs isn’t released yet; is breaking pandas currently) Got blocked by the various implementations Nested (which only shows up in the tests)
      • Josh: could also see an intermediate solution (pre-v3) for the storage issue.
    • Matthias: also fighting with Windows. For async everywhere

      need to rethink the zarr API. Martin: work at the moment is focused on the fetching of multiple keys (lots of overhead on s3). E.g. xarray with a few hundred bytes per chunk. Looked at Matthias’ implementation (async def), but am closer to dask-distributed (i.e. asyncio). Can be added to v2 though not very deep (e.g. file listings)

    • **John: **saturated with rapids

    • **Dennis: **just gave talk to NASA about netcdf cloud work with

      Ward

  • Add core stuff here (30 minutes)

    • Matthias: Quantstack got contract on Monday. Will be joining

      Friday meetings. Wednesdays are too late. Have them participate on the C implementation or green field? At least list in one place. People are confused that Zarr isn’t just Python. Something early in the docs. As well as some form of test suite to compare some set of files.

    • John: cf.

    • Matthias: tests passing (though not on Windows due to

      slashes). Talked to Sylvain. May need to rethink types not multiple of 8. Worth the complexity?

    • Matthias: could use testers / mergers for various smallish

      pull requests. Josh: any priority? Not really.