2024-04-03
Attending: Josh Moore (JM), Sanket Verma (SV), Thomas Nicholas (TN), Alfonso Ladino Rincon (ALR), Davis Bennett (DB), Florian Ziemen (FZ), Norman Rzepka (NR), Eric Perlman (EP), Gabor Kovacs (GB)
TL;DR:
The team discussed updates on the CZI EOSS6 application, upcoming Zarr-Python core-devs meeting, and VirtualiZarr project. Topics included improving chunk loading, cloud storage examples, monthly PR/issue meetings, and resolving ABSStore test failures.
Updates:
- CZI EOSS6 Application Update
- Not funded
- Looking for other grant opportunities if anyone has ideas
- Zarr-Python core-devs assemble!
- Please respond to the poll: https://whenisgood.net/tswj9kd (meeting timeline: 4/15-4/30)
- Hoping to reach the people who don’t usually attend this meeting :smile:
Meeting Minutes:
- Introductions w/ where did you grow up
- Sanket - Delhi, India
- Tom - Small village in the English countryside
- Alfonso - Bogota, Colombia
- Davis - Chapel Hill, North Carolina
- Eric - Berkeley, California
- Josh - Fairhope, Alabama
- Florian - Places in Germany and US
- Norman - mid-sized city in Germany
- Gabor - Chaperone, Hungary
- VirtualiZarr
- (Tom: Sanket told me to talk about it in this meeting)
- repo
- Zulip thread
- TN: explains VirtualiZarr
- DB: Pydantic model for Zarr?
- TN: Yes, because it loads in-memory objects
- DB: Made Pydantic model for Zarr: https://github.com/janelia-cellmap/pydantic-zarr - currently under Janelia but can ask to move it under zarr-developers
- DB: Several classes can be decorated in ZP-V3
- DB: Would be super interested in VirtualiZarr - have already uploaded legacy data to cloud
- TN: Would be good to go in a direction to have multiple readers for various file formats
- FZ: One index file has all the metadata in Kerchunk - is there something in your package?
- TN: Instead of storing chunks in disk you’re storing `.JSON`s in memory - which would be language agnostic
- FZ: Have started using parquet instead of `.JSON` as it doesn’t scale up
- TN: We can store in parquet as well
- JM: Could write the filepaths/byte ranges into their own specialized zarr arrays - that would be scalable
- TN: To deal with the scaling -
- JM: Could place in `must_understand` flag while writing ZEP which looks up for Zarr
- DB: Is concatenation always chunk aligned?
- TN: Treating it as chunk aligned for now
- DB: shares screen and shows `test_array.py` from VirtualiZarr
- TN: Need to think a bit more about the implication of changing the concatenation style
- DB: Will create an issue in VirtualiZarr for slicing issue
- TN: Has there been any resolution? NR: Not yet, to not break the existing API
- TN: Lazy indexing problem - see xarray for example https://github.com/pydata/xarray/issues/5081
- EP: Will definitely take a look!
- NR: Steps to standardise it?
- TN: Waiting for Zarr-Python refactor to complete and then chunk manifest ZEP accepted
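
A minimal sketch of the manifest idea from the VirtualiZarr discussion, assuming concatenation is always chunk aligned (as TN is treating it for now). `ChunkRef`, `concat_manifests`, and the file names are hypothetical and only illustrative; this is not VirtualiZarr's actual API or on-disk format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical pointer to one chunk inside an existing (legacy) file."""
    path: str    # file that holds the bytes
    offset: int  # byte offset of the chunk within that file
    length: int  # number of bytes in the chunk


def concat_manifests(first, second, first_nchunks, axis=0):
    """Concatenate two chunk-aligned manifests along `axis`.

    Keys are dot-separated chunk indices such as "0.1". Every chunk of
    `second` is shifted by `first_nchunks` along `axis`; no bytes are
    read or copied, only the references are rearranged.
    """
    merged = dict(first)
    for key, ref in second.items():
        index = [int(part) for part in key.split(".")]
        index[axis] += first_nchunks
        merged[".".join(str(i) for i in index)] = ref
    return merged


# Two arrays, each one chunk long on axis 0 and two chunks long on axis 1.
a = {"0.0": ChunkRef("a.nc", 100, 50), "0.1": ChunkRef("a.nc", 150, 50)}
b = {"0.0": ChunkRef("b.nc", 100, 50), "0.1": ChunkRef("b.nc", 150, 50)}

combined = concat_manifests(a, b, first_nchunks=1, axis=0)

# The manifest is plain data, so it can be serialised in a
# language-agnostic form (JSON here; a columnar format would also work).
as_json = json.dumps({key: asdict(ref) for key, ref in combined.items()})
```

Because only references move, the operation stays cheap regardless of chunk size; the open question raised by DB is what should happen when the inputs are not chunk aligned, which this sketch does not handle.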
- Monthly meetings to close issues and merge PRs
- https://ossci.zulipchat.com/#narrow/stream/423692-Zarr/topic/.5BPROPOSAL.5D.20Monthly.20meetings.20to.20close.20PRs.20.26.20issues
- :thumbsup: from Davis, Joe and Josh; thoughts?
- EP: Napari did this and there were some good outcomes - PR and issues were merged and closed respectively but also some of them were kept open for continued discussion
- JM: Would also be good to have them quarterly
- SV to find a good time and schedule meeting
- The tests related to ABSStore are failing with an internal server error `azure.core.exceptions.HttpResponseError` very frequently these days - any workaround?
- DB: Pulling it out would be the best idea!
- https://github.com/zarr-developers/zarr-python/pull/1714 - good to go?
- Thanks, JM for merging!
- ALR: shares screen and starts presenting
- ALR: Discovered issues
- TN: Haven’t optimised `open_datatree` so the issue is not surprising
- TN: Development effort is to move datatree upstream to Xarray - part of the Xarray
- ALR: Will be opening issue in datatree repo
- ALR: Slides: https://drive.google.com/file/d/1N9Zq4Uly3O1bNYzFVfeFNavyyS1Jn26o/view?usp=sharing
- DB: shares screen and shows PyDantic-Zarr new features
- TABLED
- Appetite for Jupyter notebooks in tutorials? - https://github.com/zarr-developers/zarr-python/pull/1163