2022-10-20

Attending: Ward Fisher (WF), John Kirkham (JK), Jeremy Maitin-Shepard (JMS)

TL;DR

WF is working on the maintenance NetCDF release candidate (v4.9.1-rc1), and JMS added CMake support to TensorStore. JMS then opened a discussion on path structure, which took up the rest of the meeting.

Updates:

  • (WF) Working on maintenance NetCDF release candidate (v4.9.1-rc1). No new features, just bug fixes and improvements.
  • (JMS) Added CMake support to TensorStore
  • Discussion about CMake, dependency management
    • https://cmake.org/cmake/help/book/mastering-cmake/chapter/CDash.html
    • https://github.com/cpm-cmake/CPM.cmake

Meeting Minutes:

  • (JMS) Path structure
    • Require or encourage root directory to end in .zarr
    • How to name all the metadata files?
    • Root metadata could contain extension information
    • (JK) Mentioned a .zmeta metadata file with paths to the other metadata files
      • (JMS) About listing
      • (WF) Possible issues with writing
      • (WF) Spec vs. library tension
      • (JK) Have file expire?
      • (JMS) Handle as read-only
      • (JK) Could also delete as part of writing?
    • (JMS) HDF5 has a hierarchy and Zarr replicates this
      • Have some array and non-array data next to each other
      • (JK) Examples?
        • (JMS) Segmentations & mesh representations
        • (JMS) Collection of volumes with annotations related to them
      • (WF) Have Zarr hierarchy with non-Zarr?
        • (JMS) Only have single individual arrays
        • (WF) Wouldn’t have considered this structure
        • (WF) Does there need to be something in the spec about interleaving data?
        • (WF) Maybe interleaving poses some challenges
        • (JMS) Doesn’t NetCDF have extra files as well?
        • (WF) Yes. Extra metadata used to map Zarr model to NetCDF model.
        • (JMS) Reason to use this structure as opposed to Zarr metadata files?
        • (WF) NetCDF supports different formats HDF5, Zarr, etc.
        • (JMS) Have user defined attributes. Types are stored in metadata file? Could those be in zattrs?
        • (WF) Yes. Not sure
      • (JMS) Hierarchy becomes more apparent with V3 as opposed to V2
        • (WF) Groups were a new feature that users were slow to pick up on
        • (JK) Does adding more top-level metadata cause issues?
        • (JMS) Could it contain the metadata?
        • (WF) Maybe include subset of metadata
        • (WF/JMS) Perhaps special case single array use case
      • (JK) How does data relate in non-hierarchical form?
        • (JMS) Related, but not all Zarr data
        • (JK) Would other kinds of chunk formats (standardizing on kerchunk) be useful?
        • (JMS) Meshes probably don’t make sense in this way
        • (JMS) Neuroglancer meshes are a good example
        • (JMS) Sparse arrays seem similar in that they might be better handled by being their own file format
        • (WF) NetCDF users mention performance issues in moving to new version. Usually suggest using old NetCDF. Maybe same with V2/V3?
        • (JMS) Want to use V3 (sharding being of value).
        • (JK) Including unstructured binary blobs in Zarr?
        • (JMS) Has a group of files for mesh
        • (JK) Maybe ignore specific paths?
        • (WF) Having mixed media is valuable, though it can be logistically tricky
        • (WF) What defines Zarr as a data model? At least need to say some behavior is undefined (mixed media). Ideally ignores mixed media files.
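
Illustration (not discussed in the meeting): a minimal sketch of the "ignore mixed media files" idea combined with the single index file of metadata paths that JK mentioned. It assumes a V2-style directory store, where metadata lives in files named .zarray, .zgroup, and .zattrs. The function names, the example.zarr layout, and the mesh/fragment.bin file are hypothetical, and the .zarray content is abbreviated rather than a complete array description.

```python
import json
import os
import tempfile

# Zarr V2 metadata file names. Anything else is treated here as chunk data
# or "mixed media" and is skipped (an assumption of this sketch, not spec text).
ZARR_V2_METADATA = {".zarray", ".zgroup", ".zattrs"}


def collect_metadata(root):
    """Return {relative path: parsed JSON} for each V2 metadata file under root."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name not in ZARR_V2_METADATA:
                continue  # chunk or non-Zarr file: ignore it
            full = os.path.join(dirpath, name)
            # Use "/"-separated keys so the index is the same on every platform.
            key = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full) as f:
                found[key] = json.load(f)
    return found


def demo():
    """Build a tiny hierarchy with array and non-array data, then index it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "example.zarr")  # hypothetical layout
        os.makedirs(os.path.join(root, "volume"))
        os.makedirs(os.path.join(root, "mesh"))
        with open(os.path.join(root, ".zgroup"), "w") as f:
            json.dump({"zarr_format": 2}, f)
        with open(os.path.join(root, "volume", ".zarray"), "w") as f:
            # Abbreviated metadata, for illustration only.
            json.dump({"zarr_format": 2, "shape": [4, 4], "chunks": [2, 2]}, f)
        # Non-Zarr data stored next to the array (hypothetical mesh blob).
        with open(os.path.join(root, "mesh", "fragment.bin"), "wb") as f:
            f.write(b"\x00\x01")
        return collect_metadata(root)


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2, sort_keys=True))
```

An index built this way goes stale as soon as a writer adds or removes a node, which is the writing concern WF raised; treating it as read-only or deleting it on write, as suggested above, are two ways around that.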