The best way to distribute large scientific datasets is via the Cloud, in Cloud-Optimized formats [1]. But often this data is stuck in archival pre-Cloud file formats such as netCDF.
VirtualiZarr [2] makes it easy to create "Virtual" Zarr datacubes, allowing performant access to archival data as if it were in the Cloud-Optimized Zarr format, without duplicating any data.
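
To make the idea of a virtual reference concrete, here is a minimal conceptual sketch in plain Python. It does not use VirtualiZarr's API: the `ChunkRef` class, the `read_chunk` helper, and the toy `archive.bin` file are hypothetical names invented for illustration. The point it tries to show is that a chunk can be described by a file path, a byte offset, and a length, so a reader can fetch exactly those bytes from the original file without copying the data anywhere else.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical reference to one chunk stored inside an existing file."""
    path: str
    offset: int  # byte position where the chunk starts
    length: int  # number of bytes in the chunk


def read_chunk(ref: ChunkRef) -> bytes:
    """Read only the bytes that the reference points at."""
    with open(ref.path, "rb") as f:
        f.seek(ref.offset)
        return f.read(ref.length)


# Stand-in for an archival file: an 8-byte header followed by two 4-byte chunks.
tmpdir = tempfile.mkdtemp()
archive_path = os.path.join(tmpdir, "archive.bin")
with open(archive_path, "wb") as f:
    f.write(b"HEADER__" + b"AAAA" + b"BBBB")

# A toy "manifest": chunk key -> reference into the original file.
manifest = {
    "0": ChunkRef(archive_path, offset=8, length=4),
    "1": ChunkRef(archive_path, offset=12, length=4),
}

# Resolve each reference lazily; the archive itself is never rewritten.
chunks = {key: read_chunk(ref) for key, ref in manifest.items()}
print(chunks)
```

The manifest is tiny compared with the data it describes, which is why references like these can be created and stored without duplicating the underlying file.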
Please see the documentation.
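Create virtual references to the data inside an archival file with open_virtual_dataset.
Read the referenced data through the Zarr-compatible ManifestStore.
Combine virtual datasets from multiple files into one larger datacube with xarray.concat.
Open the resulting virtual datacube as a single Zarr-compatible store with xarray.open_zarr.

VirtualiZarr grew out of discussions on the Kerchunk repository, and is an attempt to provide the game-changing power of Kerchunk in a Zarr-native way, with a familiar array-like API.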
You now have a choice between using VirtualiZarr and Kerchunk: VirtualiZarr provides almost all the same features as Kerchunk.
Development Status and Roadmap

VirtualiZarr version 1 (mostly) achieved feature parity with Kerchunk's logic for combining datasets, providing an easier way to manipulate Kerchunk references in memory and generate Kerchunk reference files on disk.
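
As a rough illustration of what combining references in memory involves, the sketch below concatenates two one-dimensional chunk manifests by shifting the chunk indices of the second one. It is plain Python under simplifying assumptions (1-D arrays, densely numbered integer chunk indices starting at 0, made-up file names and byte ranges, and a hypothetical `concat_manifests` helper), not the actual logic of VirtualiZarr or Kerchunk.

```python
from typing import Dict, Tuple

# Hypothetical manifest type: chunk index -> (path, byte offset, byte length).
Manifest = Dict[int, Tuple[str, int, int]]


def concat_manifests(first: Manifest, second: Manifest) -> Manifest:
    """Concatenate two 1-D manifests along their only axis.

    Assumes the indices in `first` are 0..len(first)-1 with no gaps,
    so the chunks of `second` can be appended by shifting their indices.
    """
    shift = len(first)
    combined = dict(first)  # copy, so the inputs are left untouched
    for index, ref in second.items():
        combined[index + shift] = ref
    return combined


# Made-up references into two separate files.
jan = {0: ("jan.nc", 1024, 4096), 1: ("jan.nc", 5120, 4096)}
feb = {0: ("feb.nc", 1024, 4096)}

combined = concat_manifests(jan, feb)
print(combined)
```

Only the references are rearranged here; no chunk data is read or copied, which is what makes this kind of combination cheap.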
VirtualiZarr version 2 brings:
The ManifestStore abstraction, which allows for loading data without serializing to Kerchunk/Icechunk first.
Integration with obstore.
Future VirtualiZarr development will focus on generalizing and upstreaming useful concepts into the Zarr specification, the Zarr-Python library, Xarray, and possibly some new packages.
We have a lot of ideas, including:
If you see other opportunities then we would love to hear your ideas!
This package was originally developed by Tom Nicholas whilst working at [C]Worthy, who deserve credit for allowing him to prioritise a generalizable open-source solution to the dataset virtualization problem. VirtualiZarr is now a community-owned multi-stakeholder project.
Licence: Apache 2.0
[1] Cloud-Native Repositories for Big Scientific Data, Abernathey et al., Computing in Science & Engineering.
[2] Pronounced "Virtual-Eye-Zarr", like "virtualizer" but more piratey 🦜.