Links:

New in zarr-python v3:

Sharding

https://zarr.readthedocs.io/en/latest/user-guide/arrays.html#sharding

A shard is the unit of writing (one stored object that holds many chunks); a chunk is the unit of reading.
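To make the shard/chunk relationship concrete, here is a small sketch of the arithmetic only; it does not call zarr. The helper name `layout` is made up for illustration, and the three shapes are assumed to correspond to the `shape`, `chunks` and `shards` arguments described in the user guide linked above.

```python
from math import ceil, prod


def layout(shape, chunks, shards):
    """Return (number of shards, chunks per shard) for a sharded array.

    Hypothetical helper for illustration; not part of zarr-python.
    """
    # A shard has to hold a whole number of chunks along every axis.
    if any(s % c for s, c in zip(shards, chunks)):
        raise ValueError("shard shape must be a multiple of chunk shape")
    # Number of shards, i.e. stored objects that get written.
    n_shards = prod(ceil(n / s) for n, s in zip(shape, shards))
    # Number of independently readable chunks inside each shard.
    chunks_per_shard = prod(s // c for s, c in zip(shards, chunks))
    return n_shards, chunks_per_shard


n_shards, chunks_per_shard = layout(
    shape=(10000, 10000), chunks=(100, 100), shards=(1000, 1000)
)
print(n_shards, chunks_per_shard)
```

With these shapes the array is stored as 100 shard objects of 100 chunks each, where an unsharded layout with the same chunk shape would need 10000 separate objects.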

VirtualiZarr

https://virtualizarr.readthedocs.io/en/latest/

Provides an abstraction over formats that are not cloud-native (NetCDF at least), so that they can be read efficiently in a cloud environment.

Other related terms:

Icechunk

Introduces transactions, which solves the problem of reading from and writing to a zarr dataset in parallel.

Replaces fsspec for moving data between object storage and zarr.

Implemented in Rust.

Enables a multiplayer mode for shared zarr datasets in the cloud.

Workflow

  1. create an Icechunk store
  2. modify the zarr dataset
  3. commit the changes
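The three steps above can be illustrated with a toy model in plain Python. This is not Icechunk's API: `ToyRepo`, `ToySession` and their methods are hypothetical names, and the conflict rule (reject a commit if someone else committed first) is an assumed simplification of how transactions keep parallel writers from corrupting each other.

```python
class ToyRepo:
    """Toy versioned key-value store (hypothetical, not Icechunk)."""

    def __init__(self):
        self.snapshots = [{}]   # committed states, oldest first
        self.messages = ["init"]

    def writable_session(self):
        # A session remembers which snapshot it started from.
        return ToySession(self, base=len(self.snapshots) - 1)


class ToySession:
    def __init__(self, repo, base):
        self.repo = repo
        self.base = base
        self.changes = {}       # uncommitted writes, invisible to readers

    def write(self, key, value):
        self.changes[key] = value

    def commit(self, message):
        # Reject the commit if another session committed in the meantime.
        if self.base != len(self.repo.snapshots) - 1:
            raise RuntimeError("conflict: repo changed since session start")
        new_state = {**self.repo.snapshots[-1], **self.changes}
        self.repo.snapshots.append(new_state)
        self.repo.messages.append(message)
        return len(self.repo.snapshots) - 1   # id of the new snapshot


repo = ToyRepo()                               # 1. create the store
session = repo.writable_session()
session.write("temperature/c/0", b"\x01")      # 2. modify the dataset
version = session.commit("add first chunk")    # 3. commit the changes
```

Because every commit appends a new snapshot and leaves the old ones untouched, earlier versions stay readable, which is the idea behind data version control.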

Enables data version control for zarr.

Virtual datasets in Icechunk