2022-10-20
Attending: Ward Fisher (WF), John Kirkham (JK), Jeremy Maitin-Shepard (JMS)
TL;DR
WF is working on the maintenance NetCDF release candidate (v4.9.1-rc1
), and JMS added CMake support to TensorStore. After this, JMS initiated a discussion on Path structure and was stretched for the remaining meeting.
Updates:
- (WF) Working on maintenance netcdf release candidate (
v4.9.1-rc1
). No new features, just bug fixes and improvements. - (JMS) Added CMake support to TensorStore
- Discussion about CMake, dependency management
- https://cmake.org/cmake/help/book/mastering-cmake/chapter/CDash.html
- https://github.com/cpm-cmake/CPM.cmake
Meeting Minutes:
- (JMS) Path structure
- Require or encourage root directory to end in .zarr
- How to name all the metadata files?
- Root metadata could contain extension information
- (JK) Mentioned
.zmeta
metadata file with paths to metadata file- (JMS) About listing
- (WF) Possible issues with writing
- (WF) Spec vs. library tension
- (JK) Have file expire?
- (JMS) Handle as read-only
- (JK) Could also delete as part of writing?
- (JMS) HDF5 has hierachary and Zarr replicates this
- Have some array and non-array data next to each other
- (JK) Examples?
- (JMS) Segmentations & mesh representations
- (JMS) Collection of volumes with annotations related to them
- (WF) Have Zarr hierarchy with non-Zarr?
- (JMS) Only have single individual arrays
- (WF) Wouldn’t have considered this structure
- (WF) Does there need to be something in the spec about interleaving data?
- (WF) Maybe interleaving poses some challenges
- (JMS) Doesn’t NetCDF have extra files as well?
- (WF) Yes. Extra metadata used to map Zarr model to NetCDF model.
- (JMS) Reason to use this structure as opposed to Zarr metadata files?
- (WF) NetCDF supports different formats HDF5, Zarr, etc.
- (JMS) Have user defined attributes. Types are stored in metadata file? Could those be in zattrs?
- (WF) Yes. Not sure
- (JMS) Hierarchy becomes more apparent with V3 as opposed to V2
- (WF) Groups were a new feature that users were slow to pick up on
- (JK) Does adding more top-level metadata cause issues?
- (JMS) Could it contain the metadata?
- (WF) Maybe include subset of metadata
- (WF/JMS) Perhaps special case single array use case
- (JK) How does data relate in non-hierachical form
- (JMS) Related, but not all Zarr data
- (JK) Would other kinds of chunk formats (standardizing on kerchunk) be useful
- (JMS) Meshes probably don’t make sense in this way
- (JMS) Neuroglancer meshes are a good example
- (JMS) Sparse arrays seem similar in that they might be better handled by being their own file format
- (WF) NetCDF users mention performance issues in moving to new version. Usually suggest using old NetCDF. Maybe same with V2/V3?
- (JMS) Want to use V3 (sharding being of value).
- (JK) Including unstructured binary blobs in Zarr?
- (JMS) Has a group of files for mesh
- (JK) Maybe ignore specific paths?
- (WF) Having mixed media is valuable though can be logisticially tricky
- (WF) What defines Zarr as a data model? At least need to say some behavior is undefined (mixed media). Ideally ignores mixed media files.