2023-06-28

Attending: Josh Moore (JM), Sanket Verma (SV), Davis Bennett (DB), Ward Fisher (WF), Patty Fricke (PF), Jeremy Maitin-Shepard (JMS), Norman Rzepka (NR)

TL;DR:

We started the meeting by going around the table and asking what the last movie/TV show everyone watched was! Some excellent recommendations here.

PF asked about the ZEP0003 progress. SV has been working on a blog titled ‘Zarr vs HDF5’ and asked the community what they’d like to see in the blog. DB’s Pydantic-Zarr will be publicised shortly. WF shared the development at Unidata regarding NetCDF, and Zarr is one of the focus in the upcoming months.

Updates:

New blog post: https://zarr.dev/blog/zarr-talks/ 🎉

Meeting Minutes:

Introductions with the recent movie/tv show you watched 🍿🎬🥤
- SV: Black Mirror
- DB: A Room with a View
- WF: RRR
- JM: Heartstopper
- PF: Extraction 2
- NR: Lidia Poet
- JMS: John Wick Chapter 4
PF: ZEP0003 progress
- SV: Not a lot
- PF: reading in CSVs (each file a timepoint) with xarray
- JMS: previous use cases were looking to move chunk size over time (t=1 to begin; then later t>1)
- DB: delicate point – anticipating a change of the metadata. performing a rechunk live could be trickier. code samples would be useful.
- SV: Please ping in the discussion post to get involved
SV: Zarr vs. HDF5 blog post
- What would you like to have in there?
- WF: Audience conflate NetCDF and HDF5
  - They are not same data model
  - Storage layers - cloud ready - big and cool community
  - Zarr doesn’t need a server to run - lot of good momentum
  - There isn’t no implementation and it would be complex to come up with one
  - NetCDF and Zarr has straightforward API
  - HDF5 parallel support with the help of DOE funding
  - Supercomputer - Uses HDF5 based MPI
- DB: HDF5 has C implementation
  - Comparitively cheap to spin up implementations - maybe a bad thing?
  - Much more complicated to implement
  - Difficult to do massively HDF5 parallel writing
  - Zarr could have a HDF5 backend
- JMS: Key point - parallel write support
  - Single threaded - fetching chunks from a single thread
- WF: HPC/Supercomputing community almost exclusively using MPI
SV: DV: Pydantic-Zarr - ready to publicise it?
- DV: Wait, until I polish the readme
- DB: Any apetite for having this in Zarr-Python? May need to change the creation API
- JM: There are some PRs which should be merged first in order to introduce the Pydantic change into Zarr-Python - we don’t want you to wait on another PR
- DB: I think the code is computationaly cheap and it’d be easy to integrate it
JM: Looking forward for SciPy 2023 - if anyone around please come and meet us
WF: NetCDF roadmap
- Working on high-level roadmapping for 5+ year cycle
- Will be adopting V3 very soon