22nd September, 2021
Attending: Josh Moore, Norman Rzepka, Erik Welch, Ward Fisher, John Kirkham, Greg Lee, Hailey Johnson
- 2.10 released 🎉
- Josh: https://www.outreachy.org/ draft proposals → community issue (& Twitter) soon
- Norman (if he can make it): Sharded chunk storage
  - built: https://webknossos.org/ for collaboratively looking at large 3D volumes
  - presented: Slides on sharding
    - Caterva → Webknossos
    - Chunk file → Shard
    - Blocks → Chunks
  - about: Format implementation (similar to neuroglancer)
  - Interested in adopting Zarr as a primary format, but needs something like shards to get the same performance when streaming the data while storing on a cluster system.
  - Looking for how to get involved, spec it out, etc.
    - John: familiar with Caterva (#713)? Read up about it
    - Josh: cloud IO problem. Any solution? No, only on (local/distributed) filesystem
    - Neuroglancer accesses buckets directly (Python for disk)
    - Only need the file URI.
    - Conversation on Caterva (15.Sep)
    - Neuroglancer: Uses range queries to read the index (size known beforehand). Cached locally. Another range query to pick out the chunk data.
  - Languages X Cloud providers+FS
    - John: Raised an fsspec issue for support of range queries. Didn’t see existing support. So hopefully we can discuss more there.
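The two-step read described for Neuroglancer above (one range query for a fixed-size index, cached locally, then a second range query for the chunk bytes) can be sketched roughly as follows. This is only an illustration: the shard layout (index at the start, one little-endian `(offset, nbytes)` pair per chunk) and the names `build_shard` and `ShardReader` are assumptions made up for the example, not the Neuroglancer, webKnossos, or Zarr format. A bytes slice stands in for a real ranged request.

```python
import struct

# Hypothetical index entry: (offset, nbytes) as two little-endian uint64 values.
ENTRY = struct.Struct("<QQ")


def build_shard(chunks):
    """Pack chunks into one shard: fixed-size index first, then the chunk bytes."""
    offset = ENTRY.size * len(chunks)  # chunk data starts right after the index
    index = b""
    for chunk in chunks:
        index += ENTRY.pack(offset, len(chunk))
        offset += len(chunk)
    return index + b"".join(chunks)


class ShardReader:
    """Reads chunks from a shard using only ranged reads."""

    def __init__(self, store, n_chunks):
        self.store = store        # bytes here; a file or bucket object in practice
        self.n_chunks = n_chunks  # known beforehand, so the index size is known too
        self._index = None        # cached after the first read
        self.requests = 0         # counts ranged reads, for illustration

    def _read_range(self, start, nbytes):
        # Stand-in for an HTTP Range request or a seek + read on disk.
        self.requests += 1
        return self.store[start:start + nbytes]

    def index(self):
        # First range query: fetch the whole index once and cache it.
        if self._index is None:
            raw = self._read_range(0, ENTRY.size * self.n_chunks)
            self._index = [
                ENTRY.unpack_from(raw, k * ENTRY.size) for k in range(self.n_chunks)
            ]
        return self._index

    def chunk(self, i):
        # Second range query: fetch only the bytes of the requested chunk.
        offset, nbytes = self.index()[i]
        return self._read_range(offset, nbytes)


shard = build_shard([b"aaa", b"bb", b"cccc"])
reader = ShardReader(shard, n_chunks=3)
second = reader.chunk(1)
third = reader.chunk(2)
```

After the first chunk read the index is served from the cache, so each further chunk costs a single ranged read. This is also why range-query support in the storage layer matters for the approach.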
- Ward: V3 questions
  - Meeting @ Unidata. NSF solicitation. Focus on continued maintainability of software.
  - Adoption of V3 into netCDF could play a part in that.
    - Josh: if need be can put a stamp on it, otherwise waiting on Alistair
    - However, people are starting to code against it.
  - Ward: have a couple of months before proposals are due.
- Greg: looked at xarray & dask test suites with v3
  - for dask, it works but you need to specify “to_zarr(component='')”. None is OK for V2.
  - for xarray, a lot of dtypes in their tests aren’t supported (need to write to the metadata). complex, datetime, structured, and object arrays are all in zarr tests. xarray also has unicode and byte ← need adding.
  - Josh: possible “low hanging fruit” or “needs help” issue for Outreachy contributors
  - see: https://github.com/zarr-developers/zarr-python/pull/789
- Erik: “have a plan” (for sparse data in zarr and non-zarr)
  - Just presented/led a discussion about designing a sparse file format at HPEC
  - https://docs.google.com/presentation/d/e/2PACX-1vRFN9gy5Rexzge63kxwakZvQqk1zVlWUjqPtPNXDlllP7jaZ_uQ9nD46yDfADEpMRfnQrnS4p3egaQH/pub