2022-10-06
Attending: Ward Fisher (WF), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Greg Lee (GL), Jonathan Striebel (JS)
TL;DR
JM shared that there were some good conversations around OME-Zarr yesterday. The summary is available here. WF shared that Kitware is looking for partners and a link to the sign-up form. GL shared that during the CZI Open-Science Summit 2022, he worked on writing tests for Xarray. After this, there was an extensive discussion on URL syntax initiated by JMS.
Updates:
- miscellaneous reading before the meeting (JM)
- NGFF (JM)
- https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10
- Good conversations around OME-Zarr yesterday
- Enthusiasm for Kitware (WF)
- Looking for partners. Have form on webpage.
- Unidata an option. They’ve mentioned Zarr a couple of times (Kitware Blog).
- xarray test (GL)
- during the CZI Open-Science Summit 2022.
- release of 2.13 hopefully fixed it all :tada:
Meeting Minutes:
- URL syntax? (JMS)
- helps to figure out the metadata location.
- Josh: great idea. have several ongoing discussions at the NGFF level
- current proposal would be to support URIs internally (relative, absolute, remote)
- however, in V2
- JMS: in v3 the root exists
- though not entirely clear that the new metadata organization is necessary
- designed for S3 where there’s no directory, but other problems exist
- Josh: summarized previous discussions for Greg
- GL, thoughts on the V3 situation?
- GL: at the moment, you need helper methods to do that.
- JM: one proposal was to have the metadata be the main directory which lets you then bootstrap the chunk loading
- JMS: support multiple?
- JM: conceivably. as extension or configuration.
- JM: downside for consolidated metadata is that nothing exists in the metadata hierarchy
- workaround of having a thin-hierarchy only with references to where the metadata exists
- JS: losing the ability to nest any hierarchy. (everything is a root)
- JM: are we proposing rolling it back completely
- JMS: problem is the URI+rootpath metadata
- JS: walking up the hierarchy would be an option (URL doesn’t actually point)
- JMS: would be nicer if you don’t have to perform a search
- Use case
- URL case
- Desktop double click on something
- Similar issue: Zips :warning:
- JMS: have an additional level
- JM: except ZipStore v2 assumes the whole zip is a zgroup
- JS: propose zip is a special case which is easier
- JMS: unless you are mixing volumetric with a zarr then it wouldn’t be at the top-level
- Btree (JMS): need to be able to compose multiple layers (similar to fsspec and double colons)
- Remote chunk store (or point to V2 chunks)
- Renaming folders (keep data with arrays)
- Options
- Keep “/meta”, clients must know
- Drop “/meta”, direct URLs
- “?param” syntax
- “#param” syntax
- Separator syntax (e.g. “//”)
- root dir ends in .zarr
- fsspec “::” separator
- multiple protocols (git+ssh, zip+zarr)
- further discussion
- JS: without /meta and .zarr requirement, you still don’t know where the root is
- JS: if you drop “/meta” then you can’t name anything “/data”
- JMS: could use something more obfuscated
- JS: why split?
- JMS: if you are not using the filesystem (s3 or gcs) and you want to list all the metadata, it’s not (as) efficient
- JM: “data” could be registered in the metadata so it’s a known (and configurable) thing
- WF: in NetCDF (NC), anything with a leading underscore is assumed reserved for the library
- permitted to create them, but the spec says “please don’t”
- JM: “.z” prefix
- WF: utilities and tools can scrape everything with that
- WF: also don’t have to put too much thought into new features
- JMS: would prefer not a “.” prefix because of archiving tools, etc.
- then “_z”?
- JMS: root metadata file doesn’t really do anything
- JS: creates ambiguity
- JM: think it was largely for bootstrapping global plugins (e.g. transformers)
- JS: perhaps V2 compatibility
- JMS: not clear you would nest sharding with other transformers. it would be the thing applied to the chunks.
- JS: the metadata needs to be somewhere, and for that it can be at the array level
- brief summary
- zarrs are essentially a metadata hierarchy
- that configure (possibly remote) chunk stores
- and the root is identified with .zarr
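
Illustration of the “root dir ends in .zarr” option (not something agreed in the meeting): a minimal Python sketch of how a client could split a direct URL into the hierarchy root and the path inside it without performing a search. The helper name `split_zarr_url` and the rule "the first path component ending in .zarr is the root" are assumptions made for this example only.

```python
from urllib.parse import urlsplit


def split_zarr_url(url, suffix=".zarr"):
    """Split a URL into (root URL, path inside the hierarchy).

    Hypothetical helper: assumes the root is the first path
    component whose name ends in `suffix`.
    """
    parts = urlsplit(url)
    # Drop empty components produced by leading or doubled slashes.
    components = [c for c in parts.path.split("/") if c]
    for i, comp in enumerate(components):
        if comp.endswith(suffix):
            root_path = "/" + "/".join(components[: i + 1])
            inner = "/".join(components[i + 1:])
            # Rebuild the URL up to and including the root directory.
            root = parts._replace(path=root_path, query="", fragment="").geturl()
            return root, inner
    # JS's point from the discussion: without the suffix (or "/meta"),
    # the root cannot be located from the URL alone.
    raise ValueError(f"no path component ending in {suffix!r}: {url!r}")


root, inner = split_zarr_url("https://example.org/data/image.zarr/labels/0")
print(root)   # root of the hierarchy
print(inner)  # node path relative to the root
```

Under these assumptions the URL points directly at a node, and the client recovers the root by inspecting the path rather than walking up the hierarchy with store requests.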