Showing content from https://github.com/fsspec/kerchunk/issues/377 below:

Refactor MultiZarrToZarr into multiple functions · Issue #377 · fsspec/kerchunk · GitHub

Problem

MultiZarrToZarr is extremely powerful but rather hard to use.

This is important - kerchunk has been transformative, so we increasingly recommend it as the best way to ingest large amounts of data into the pangeo ecosystem's tools. However, that means we should make sure the kerchunk user experience is smooth, so that new users don't get stuck early on.

Part of the problem is that this one MultiZarrToZarr function can do many different things. Contrast with xarray - when combining multiple datasets into one, xarray takes some care to distinguish between a few common cases/concepts (we even have a glossary):

  1. Concatenation along a single existing dimension. Achieved by xr.concat where dim is a str
  2. Concatenation along a single new dimension (optionally providing new coordinates to use along that new dimension). Achieved by xr.concat where dim is a set of values
  3. Merging of multiple variables which already share dimensions, first aligned according to their coordinates. Achieved by xr.merge
  4. "Combining" by order given, which means some ordered combination of concatenation along one or more dimensions and/or merging. Achieved by xr.combine_nested
  5. "Combining" by coordinate order, which again means some ordered combination of concatenation along one or more dimensions and/or merging, but the order is specified by information in the datasets' coordinates. Achieved by xr.combine_by_coords
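
For reference, the five xarray cases above can be sketched on toy data as follows. This is only a minimal illustration: the `make` helper, the variable names (`temp`, `precip`) and the `member` dimension are made up for the example, and `pd.Index` is just one of the ways to pass a set of values as `dim`.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make(t0, name="temp"):
    """Hypothetical helper: a tiny (time, x) dataset starting at time t0."""
    return xr.Dataset(
        {name: (("time", "x"), np.arange(6.0).reshape(2, 3))},
        coords={"time": [t0, t0 + 1], "x": [10, 20, 30]},
    )


a, b = make(0), make(2)

# 1. Concatenation along an existing dimension: dim is a str.
c1 = xr.concat([a, b], dim="time")

# 2. Concatenation along a new dimension, supplying its coordinate values.
c2 = xr.concat([a, a], dim=pd.Index([1, 2], name="member"))

# 3. Merging different variables that already share dimensions.
c3 = xr.merge([a, make(0, name="precip")])

# 4. Combining in the order given by the (possibly nested) input list.
c4 = xr.combine_nested([a, b], concat_dim="time")

# 5. Combining in the order implied by the coordinates:
#    the inputs are passed out of order on purpose.
c5 = xr.combine_by_coords([b, a])
```

Each call states which kind of combination is intended, which is the distinction the rest of this issue is about.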

In kerchunk it seems that the recommended way to handle operations resembling all 5 of these cases is through MultiZarrToZarr. It also currently has trouble handling certain types of multi-dimensional concatenation.

Suggestion

Break up MultiZarrToZarr by defining a set of functions similar to xarray's merge/concat/combine/unify_chunks that consume and produce VirtualZarrStore objects (EDIT: see #375).
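
To make the shape of this suggestion concrete, here is a rough, purely hypothetical sketch. Nothing in it is existing kerchunk API: the `VirtualZarrStore` class below is a toy stand-in (the real design is discussed in #375), and `concat` and `merge` are illustrative names. It assumes each store only tracks chunk references and that concatenation happens on whole-chunk boundaries.

```python
from dataclasses import dataclass


@dataclass
class VirtualZarrStore:
    """Toy stand-in (hypothetical, not kerchunk API)."""

    # variable name -> {chunk index tuple: (url, offset, length)}
    chunks: dict
    # variable name -> number of chunks along each axis
    chunk_grid: dict


def concat(stores, axis=0):
    """Concatenate along an existing axis by shifting chunk indices."""
    out_chunks, out_grid = {}, {}
    for store in stores:
        for var, refs in store.chunks.items():
            grid = out_grid.get(var)
            shift = grid[axis] if grid is not None else 0
            target = out_chunks.setdefault(var, {})
            for idx, ref in refs.items():
                new_idx = idx[:axis] + (idx[axis] + shift,) + idx[axis + 1:]
                target[new_idx] = ref
            n = tuple(store.chunk_grid[var])
            if grid is None:
                out_grid[var] = n
            else:
                out_grid[var] = grid[:axis] + (grid[axis] + n[axis],) + grid[axis + 1:]
    return VirtualZarrStore(out_chunks, out_grid)


def merge(stores):
    """Merge stores holding different variables; refuse name clashes."""
    chunks, grid = {}, {}
    for store in stores:
        clash = chunks.keys() & store.chunks.keys()
        if clash:
            raise ValueError(f"variables defined more than once: {sorted(clash)}")
        chunks.update(store.chunks)
        grid.update(store.chunk_grid)
    return VirtualZarrStore(chunks, grid)


# Usage with made-up file names and byte ranges.
s1 = VirtualZarrStore(
    {"temp": {(0, 0): ("a.nc", 0, 100), (1, 0): ("a.nc", 100, 100)}},
    {"temp": (2, 1)},
)
s2 = VirtualZarrStore({"temp": {(0, 0): ("b.nc", 0, 100)}}, {"temp": (1, 1)})
s3 = VirtualZarrStore({"precip": {(0, 0): ("c.nc", 0, 50)}}, {"precip": (1, 1)})

combined = merge([concat([s1, s2], axis=0), s3])
```

Because every function takes stores and returns a store, the calls compose in the same way xr.concat and xr.merge do.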

Advantages Questions
