Master issue to track progress of merging xarray-datatree into xarray `main`. Would close #4118 (and many similar issues), as well as completing one of the goals of our development roadmap.
Also see the project board for DataTree integration.
On calls in the last few dev meetings, we decided to forget about a temporary cross-repo `from xarray import datatree` (so this issue supersedes #7418), and just begin merging datatree into xarray `main` directly.
See #8747
Task list:
To happen in order:
`open_datatree` in xarray. This doesn't need to be performant initially, and it would initially return a `datatree.DataTree` object. EDIT: We decided it should return an `xarray.DataTree` object, or even `xarray.core.datatree.DataTree` object. So we can start by just copying the basic version in `datatree/io.py` right now, which just calls `open_dataset` many times. (add open_datatree to xarray #8697)
Triage and fix issues: figure out which of the issues on xarray-contrib/datatree need to be fixed before the merge (if any).
Merge in code for `DataTree` class. I suggest we do this by making one PR for each module, and ideally discussing and merging each before opening a PR for the next module. (Open to other workflow suggestions though.) The main aims here are to lower the bus factor on the code, confirm high-level design decisions, and improve details of the implementation as it goes in.
Suggested order of modules to merge:
`datatree/treenode.py` - defines the tree structure, without any dimensions/data attached (Migrate treenode module. #8757)
`datatree/datatree.py` - adds data to the tree structure (Migrate datatree.py module into xarray.core. #8789)
`datatree/iterators.py` - iterates over a single tree in various ways, currently copied from anytree (Migrate iterators.py for datatree. #8879)
`datatree/mapping.py` - implements `map_over_subtree` by iterating over N trees at once (Migrate datatree mapping.py #8948)
`datatree/ops.py` - uses `map_over_subtree` to map methods like `.mean` over whole trees (Migration of datatree/ops.py -> datatree_ops.py #8976)
`datatree/formatting_html.py` - HTML repr, works but could do with some optimization (Migrate formatting_html.py into xarray core #8930)
`datatree/{extensions/common}.py` - miscellaneous other features e.g. attribute-like access (Migrate datatreee assertions/extensions/formatting #8967)
Expose datatree API publicly. Actually expose `open_datatree` and `DataTree` in xarray's public API as top-level imports. The full list of things to expose is:
`open_datatree`
`DataTree`
`map_over_subtree`
`assert_isomorphic`
`register_datatree_accessor`
Refactor class inheritance - `Dataset`/`DataArray` share some mixin classes (e.g. `DataWithCoords`), and we could probably refactor `DataTree` to use these too. This is low-priority but would reduce code duplication.
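To make the first few modules in the list above more concrete, here is a toy sketch in plain Python: a node holding data and named children (roughly the roles of `treenode.py` plus `datatree.py`), a depth-first iterator over one tree (the job of `iterators.py`), and a single-tree mapping helper in the spirit of `map_over_subtree`. `ToyNode` and `toy_map_over_subtree` are hypothetical names made up for illustration; they are not xarray or datatree APIs, and the real `map_over_subtree` iterates over N trees at once, which this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, Tuple


@dataclass
class ToyNode:
    """Hypothetical stand-in for a tree node; NOT xarray's DataTree."""

    data: Dict[str, float] = field(default_factory=dict)
    children: Dict[str, "ToyNode"] = field(default_factory=dict)

    def subtree(self, path: str = "/") -> Iterator[Tuple[str, "ToyNode"]]:
        """Yield (path, node) pairs depth-first, parents before children."""
        yield path, self
        for name, child in self.children.items():
            yield from child.subtree(path.rstrip("/") + "/" + name)


def toy_map_over_subtree(
    func: Callable[[Dict[str, float]], Dict[str, float]], tree: ToyNode
) -> ToyNode:
    """Apply func to every node's data and return a new tree of the same shape."""
    return ToyNode(
        data=func(tree.data),
        children={
            name: toy_map_over_subtree(func, child)
            for name, child in tree.children.items()
        },
    )


# Build a three-level tree and double every value in it.
tree = ToyNode({"x": 1.0}, {"a": ToyNode({"x": 2.0}, {"b": ToyNode({"x": 3.0})})})
doubled = toy_map_over_subtree(lambda d: {k: 2 * v for k, v in d.items()}, tree)
paths = [path for path, _ in doubled.subtree()]
```

The mapped tree keeps the node paths of the input and leaves the original tree untouched, which is the property the ops layer relies on when it maps methods like `.mean` over whole trees.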
Can happen basically at any time or maybe in parallel with other efforts:
Once `xr.open_datatree` exists, we can start refactoring xarray's backend classes to support a general `Backend.open_datatree` method for any backend that can open multiple groups. Then we can make sure this is more performant than the naive implementation, i.e. only opening the file once. See also Improving performance of open_datatree #8994 and open_datatree in BackendEntrypoint for preliminary DataTree support #7437.
`.reorder_nodes` and ideas we've only discussed like API for filtering / subsetting xarray-contrib/datatree#79 and Tree-aware dataset handling/selection xarray-contrib/datatree#254 (cc @dcherian who has had useful ideas here)
`datatree` repository
`xarray.tutorial.open_datatree` - I've been meaning to make a tutorial datatree object for ages. There's an issue about it, but actually now I think something close to the CMIP6 ensemble data that @jbusecke and I used in our pangeo blog post would already be pretty good. Once we have this it becomes much easier to write docs about some advanced features.
Anyone is welcome to help with any of this, including but not limited to @owenlittlejohns, @eni-awowale, @flamingbear (@etienneschalk maybe?).
cc also @shoyer @keewis for any thoughts as to the process.
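As a rough illustration of the performance point in the backend item above (the naive implementation opens once per group, while a backend-level `open_datatree` could open the file only once), here is a toy sketch in plain Python. `FakeFile`, `naive_open_tree` and `single_open_tree` are hypothetical names for illustration only; nothing here is an xarray backend API, and the group paths are assumed to be known up front.

```python
from typing import Any, Dict, List


class FakeFile:
    """Hypothetical in-memory 'file' of groups that counts how often it is opened."""

    def __init__(self, groups: Dict[str, Any]) -> None:
        self.groups = groups
        self.open_count = 0

    def open(self) -> Dict[str, Any]:
        self.open_count += 1
        return self.groups


def naive_open_tree(f: FakeFile, group_paths: List[str]) -> Dict[str, Any]:
    """Naive approach: one separate open per group, like calling open_dataset many times."""
    return {path: f.open()[path] for path in group_paths}


def single_open_tree(f: FakeFile) -> Dict[str, Any]:
    """Backend-style approach: open once, then read every group from the same handle."""
    handle = f.open()
    return {path: handle[path] for path in handle}


groups = {"/": {"t": 1}, "/a": {"t": 2}, "/a/b": {"t": 3}}

naive_file = FakeFile(groups)
naive_tree = naive_open_tree(naive_file, list(groups))

single_file = FakeFile(groups)
single_tree = single_open_tree(single_file)
```

Both functions produce the same mapping of group path to contents; the only difference is that the naive version opens the file once per group (three times here) and the other opens it once.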