2023-08-23
Attending: Davis Bennett (DB), Sanket Verma (SV), Josh Moore (JM), Norman Rzepka (NR), Eric Perlman (EP), Ryan Williams (RW), Frederic Leclercq (FL), Brianna Pagan (BP), Ward Fisher (WF)
TL;DR:
We started the meeting with a round of introductions, and folks shared the last TV show they watched as an ice-breaker. DB has sent a PR to refactor the synchronisation API. JM asked how things were in C land, and WF gave an update. DB presented thoughts on the Zarr group and the representation of Zarr chunks as dirty arrays.
Updates:
- Pizzarr by Mark Keller is listed on our webpage - https://zarr.dev/implementations/ 🎉
- Zarr-Python 2.16.1 is out - https://github.com/zarr-developers/zarr-python/releases/tag/v2.16.1
- New working groups kicked off by Ryan A., ~1 month ago
- Refactoring (syncing v3 impl) and Benchmarking & Performance
- Running for about 6 months.
- One kick off last week.
- Zarr-Python Refactoring meetings on community calendar (starting from September 6th) - https://zarr.dev/community-calls/
- Any updates from the attendees?
- BP: OGC Charter open for comments: https://www.ogc.org/requests/ogc-to-form-geozarr-standards-working-group-public-comment-sought-on-draft-charter/
- Webinar: OGC - Geozarr : Wed Aug 30th 10:00 AM EDT: https://www4.gotomeeting.com/join/884085509
- JM: Zeiss, C++, and other updates
- NR: ZEP2 got some votes! Please vote!
Meeting Minutes:
- Introductions with the last TV show you watched
- SV: Two and a Half Men
- DB: Wolf Hall
- EP: Heartstopper - Season 2
- WF: What We Do in the Shadows
- BP: Seawolves
- RW: Succession
- JM: Heartstopper - Season 2
- FL: The Glory
- NR: Peppa Pig
- DB: Refactor Sync API PR: https://github.com/zarr-developers/zarr-python/pull/1495
- Contributions are welcome!
- JM: lessons from Bio-Formats – I would have started with immutability
- NR: Zarrita is designed with immutability. There are only 2 methods for mutating an array (apart from changing the data): resizing and writing attributes
- DB: Should all stores have caching? https://github.com/zarr-developers/zarr-python/issues/1500
- DB: Improve testing and performance - requires a good amount of work but is useful
- NR: Current design is composable. Testing matrix might be avoided through mocking
- DB: When do you not want a cache? - Turn off cache whenever you want
- NR: Use-case specific - not always useful to turn it off
- BP: https://github.com/NASA-IMPACT/zarr-lakefs
- JM: Using the MOMO card: http://www.meeting-facilitation.co.uk/blog/files/move-on-move-on.html
- JM: How’s the C land?
- WF: Amazon S3 - solving a few issues
- WF: Submitted abstract about NCZarr at AGU w/ Ethan Davis - how can this help the CF community
- WF: Also discussing ZarrCon at Unidata
- JM: Considering good places to host the conference
- WF: NASA (Roses?) can help too!
- BP: Thinking of having a GeoZarr hackathon during the AGU week
- JM: Domain specific conferences or get everyone together?
- NR and DB: Let’s bring everyone together and we can discuss important stuff like multiscales
- DB: Why not allow child nodes of array? https://github.com/zarr-developers/zarr-python/discussions/1501
- JM: Good to have this conversation with Ward and Dennis
- DB: As an abstract question - What would break if we have this? (disruptive for ecosystem)
- JM: Third state apart from ON and OFF could be HAVE BOTH
- DB: Formally represent Zarr as dirty arrays? - Using a compressor and roll it up with another compressor
- JM: IPFS use case
- DB: On-boarding someone to Zarr ecosystem - spending time to think about things - No record when writing Zarr w/ Dask fails
- SV: Good to hear the onboarding process - Is there anything critical which should be brought up to refactor team?
- DB: Convert N5 to Zarr - leave N5 behind - copying array method in Zarr breaks in N5 - any operation which is serial in Zarr is a trap (when data is in TB) - will be good to have a warning for large size
- DB: Lazy representation of array before copying can also help
- DB: Creating a hierarchy - doing it procedurally is not efficient - can write a new ZEP - acronym: ZOM (Zarr Object Model)
- DB: In a perfect world, Zarr-Python would use threads and there would be a no-GIL Python ;)
- SB: Would be good to see what the performance group tackles in the upcoming months!
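Illustration for the caching discussion above (issue 1500): a minimal sketch of what a composable, switchable cache around a store could look like. It assumes only that the store is dict-like, as Zarr-Python 2.x stores are. The class name `CachedStore` and its `max_items` and `enabled` parameters are hypothetical and are not taken from the issue or from Zarr-Python.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


class CachedStore(MutableMapping):
    """Hypothetical read-through LRU cache wrapped around any dict-like store."""

    def __init__(self, store, max_items=128, enabled=True):
        self._store = store            # the wrapped store (any MutableMapping)
        self._cache = OrderedDict()    # key -> value, least recently used first
        self.max_items = max_items
        self.enabled = enabled         # set to False to bypass the cache
        self.hits = 0
        self.misses = 0

    def __getitem__(self, key):
        if self.enabled and key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._cache[key]
        value = self._store[key]          # a KeyError propagates to the caller
        self.misses += 1
        if self.enabled:
            self._cache[key] = value
            if len(self._cache) > self.max_items:
                self._cache.popitem(last=False)  # evict least recently used
        return value

    def __setitem__(self, key, value):
        self._store[key] = value
        self._cache.pop(key, None)        # invalidate any stale cached entry

    def __delitem__(self, key):
        del self._store[key]
        self._cache.pop(key, None)

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


if __name__ == "__main__":
    backing = {"0.0": b"chunk-bytes"}
    store = CachedStore(backing)
    store["0.0"]                      # first read goes to the backing store
    store["0.0"]                      # second read is served from the cache
    print(store.hits, store.misses)
```

Because the wrapper only relies on the mapping interface, it can be tested against a plain `dict` instead of every concrete store, which is one way the testing matrix NR mentioned might be kept small. Zarr-Python 2.x already ships `LRUStoreCache` for the same purpose; the sketch only shows the shape of the idea.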