I tried to set up the problem of calculating the anomaly with respect to the group mean from pangeo-data/distributed-array-examples#4. To make it easier to test, I used fake data with the same chunking scheme instead of the real data.
I was not expecting this plan though 😬
No idea what's going on here, especially as #145 also has a groupby in it.
I set up the problem like this:
```python
from datetime import datetime, timedelta

import numpy as np
import xarray as xr

import cubed
import cubed.array_api
import cubed.random

# Hourly timestamps from 1979 to 2022, on a 0.25-degree global lat/lon grid.
time = np.arange(
    datetime(1979, 1, 1), datetime(2022, 1, 1), timedelta(hours=1)
).astype('datetime64[ns]')
lat = np.linspace(-90.0, 90.0, 721)[::-1].astype(np.float32)
lon = np.linspace(0.0, 359.8, 1440).astype(np.float32)

# `spec` was defined elsewhere; a minimal placeholder might be:
# spec = cubed.Spec(allowed_mem="2GB")

def create_cubed_data(t_length):
    return xr.DataArray(
        name="asn",
        data=cubed.array_api.astype(
            cubed.random.random(
                (t_length, 721, 1440), chunks=(31, -1, -1), spec=spec
            ),
            np.float32,
        ),
        dims=['time', 'latitude', 'longitude'],
        coords={'time': time[:t_length], 'latitude': lat, 'longitude': lon},
    ).to_dataset()

datasets = {
    '1.5GB': create_cubed_data(372),
    '15GB': create_cubed_data(3720),
    '150GB': create_cubed_data(37200),
    '1.5TB': create_cubed_data(372000),
}
```
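With chunks of `(31, -1, -1)`, each chunk holds 31 × 721 × 1440 float32 values, i.e. roughly 129 MB, which is the unit the plan has to move around in the groupby.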
```python
for scale, ds in datasets.items():
    print(f'{ds.nbytes / 1e9:.2} GB dataset, ')
```

```
1.5 GB dataset, 
1.5e+01 GB dataset, 
1.5e+02 GB dataset, 
1.5e+03 GB dataset
```
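The construction of `workloads` (used below) isn't shown above; a minimal sketch of the anomaly-from-group-mean computation it presumably holds, assuming a monthly groupby key (the key isn't confirmed by the snippet), would be:

```python
# Hypothetical reconstruction: map each scale to its anomaly computation.
# The "time.month" groupby key is an assumption.
workloads = {
    scale: ds.groupby("time.month") - ds.groupby("time.month").mean()
    for scale, ds in datasets.items()
}
```

The plan for the smallest workload is then rendered with: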
```python
workloads['1.5GB']['asn'].data.visualize()
```
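Here `.data` is the underlying `cubed.Array`, and `visualize()` renders its computation plan as a graph; that rendering is the plan referred to above.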