2024-04-03

Attending: Josh Moore (JM), Sanket Verma (SV), Thomas Nicholas (TN), Alfonso Ladino Rincon (ALR), Davis Bennett (DB), Florian Ziemen (FZ), Norman Rzepka (NR), Eric Perlman (EP), Gabor Kovacs (GB)

TL;DR:

The team discussed updates on the CZI EOSS6 application, upcoming Zarr-Python core-devs meeting, and VirtualiZarr project. Topics included improving chunk loading, cloud storage examples, monthly PR/issue meetings, and resolving ABSStore test failures.

Updates:

  • CZI EOSS6 Application Update
    • Not funded
    • Looking for other grant opportunities if anyone has ideas
  • Zarr-Python core-devs assemble!
    • Please respond to the poll: https://whenisgood.net/tswj9kd (meeting timeline: 4/15-4/30)
    • Looking to get the people who don’t attend the meeting :smile:

Meeting Minutes:

  • Introductions w/ where did you grow up
    • Sanket - Delhi, India
    • Tom - Small village in english countryside
    • Alfonso - Bogota, Columbia
    • Davis - Chapel Hill, North Carolina
    • Eric - Berkeley, California
    • Josh - Fairhope, Alabama
    • Florian - Places in Germany and US
    • Norman - mid-sized city in Germany
    • Gabor - Chaperone, Hungary
  • VirtualiZarr
    • (Tom: Sanket told me to talk about it in this meeting)
    • repo
    • Zulip thread
    • TN: explains VirtualiZarr
    • DB: Pydantic model for Zarr?
    • TN: Yes, because it loads in-memory objects
    • DB: Made Pydantic model for Zarr: https://github.com/janelia-cellmap/pydantic-zarr - currently under Janelia but can ask to move it under zarr-developers
    • DB: Several class can be decorated in ZP-V3
    • DB: Would be super interested in VirtualiZarr - have already uploaded legacy data to cloud
    • TN: Would be good to go in a direction to have multiple readers for various file formats
    • FZ: One index file has all the metadata in Kerchunk - is there something in your package?
    • TN: Instead of storing chunks in disk you’re storing .JSONs in memory - which would be language agnostic
    • FZ: Have started using parquet instead of .JSON as it doesn’t scale up - TN: We can store in parquet as well
    • JM: Could write the filepaths/byte ranges into their own specialized zarr arrays - that would be scalable
    • TN: To deal with the scaling -
    • JM: Could place in must_understand flag while writing ZEP which looks up for Zarr
    • DB: Is concatenation always chunk aligned? - TN: Treating it as chunk aligned for now
    • DB: shares screen and shows test_array.py from VirtualiZarr
    • TN: Need to think a bit more about the implication of changing the concatenation style
    • DB: Will create an issue in VirtualiZarr for slicing issue
    • TN: Has there been any resolution? NR: Not yet, to not break the existing API
    • TN: Lazy indexing problem - see xarray for example https://github.com/pydata/xarray/issues/5081
    • EP: Will definitely take a look!
    • NR: Steps to standardise it?
    • TN: Waiting for Zarr-Python refactor to complete and then chunk manifest ZEP accepted
  • Monthly meetings to close issues and merge PRs
  • The tests related to ABSStore are failing with an internal server error azure.core.exceptions.HttpResponseError very frequently these days - any workaround?
    • DB: Pulling it out would be the best idea!
  • https://github.com/zarr-developers/zarr-python/pull/1714 - good to go?
    • Thanks, JM for merging!
  • ALR: shares screen and starts representing
  • DB: shares screen and shows PyDantic-Zarr new features
  • TABLED