As a back-end provider who wants to offer datasets, processes and infrastructure to a broader audience through a standardized interface, you may want to implement a driver for openEO.
First, read the getting started guide for service providers carefully.
Two main components are involved when combining openEO with Xarray:
# Process Graph Parser for Python
The pg-parser parses openEO process graphs from raw JSON into fully traversable networkx graph objects.
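To make "parsing into a traversable graph" more concrete, the following sketch walks a small process graph using only the Python standard library. It is not the pg-parser's API (which builds networkx graphs); the helper names `find_dependencies` and `execution_order`, the node ids and the collection id are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# A small process graph in the openEO JSON layout: every node names a process,
# and arguments point at other nodes through {"from_node": "<node id>"}.
process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "EXAMPLE_COLLECTION",  # hypothetical collection id
            "spatial_extent": None,
            "temporal_extent": None,
        },
    },
    "filter": {
        "process_id": "filter_bands",
        "arguments": {"data": {"from_node": "load"}, "bands": ["B04"]},
    },
    "save": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "filter"}, "format": "GTiff"},
        "result": True,
    },
}


def find_dependencies(value):
    """Collect the node ids referenced via {"from_node": ...} in an argument value."""
    if isinstance(value, dict):
        if "from_node" in value:
            return {value["from_node"]}
        if "process_graph" in value:
            return set()  # child graphs use their own node ids; skipped in this sketch
        deps = set()
        for item in value.values():
            deps |= find_dependencies(item)
        return deps
    if isinstance(value, list):
        deps = set()
        for item in value:
            deps |= find_dependencies(item)
        return deps
    return set()


def execution_order(graph):
    """Return node ids so that every node comes after the nodes it depends on."""
    sorter = TopologicalSorter(
        {node_id: find_dependencies(node["arguments"]) for node_id, node in graph.items()}
    )
    return list(sorter.static_order())


order = execution_order(process_graph)
result_node = next(nid for nid, node in process_graph.items() if node.get("result"))
print(order, result_node)
```

The pg-parser covers considerably more than this sketch, which deliberately ignores callbacks (child process graphs) and parameter references.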
The `ProcessRegistry` can be imported from the pg-parser and holds `Process` objects. It automatically maps from the name of a process to its `spec` and to its `implementation`. Every `Process` in the `ProcessRegistry` requires a `spec`, while `implementation` and `namespace` are optional.
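The relationship between a process name, its spec and its optional implementation can be sketched as follows. This is a simplified stand-in written with the standard library, not the actual `ProcessRegistry` and `Process` classes; the names `SketchProcess`, `registry` and `register` are hypothetical, and the real constructors may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchProcess:
    """Illustrative stand-in for a registry entry (hypothetical name)."""

    spec: dict                                 # required: the process definition
    implementation: Optional[Callable] = None  # optional: the Python function
    namespace: Optional[str] = None            # optional: where the process belongs


# The registry maps a process name to its entry.
registry = {}


def register(process):
    """Store the entry under the id given in its spec."""
    registry[process.spec["id"]] = process


def add(x, y):
    return x + y


# A process with spec and implementation ...
register(SketchProcess(spec={"id": "add", "summary": "Addition of two numbers"}, implementation=add))
# ... and a process that only has a spec, e.g. because the back-end provides it.
register(SketchProcess(spec={"id": "load_collection", "summary": "Load a collection"}))

total = registry["add"].implementation(x=2, y=3)
has_loader = registry["load_collection"].implementation is not None
print(total, has_loader)
```

Keeping the spec mandatory and the implementation optional lets a registry describe processes, such as `load_collection`, whose implementation is supplied separately by each back-end.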
An example of how to use the pg-parser can be found here.
# Python Processes for openEO
This package includes the implementations of openEO processes, using Xarray and Dask. Currently, the `load_collection` and `save_result` processes are not included, as their implementations can differ widely between back-ends.
The specs are included in `openeo-processes-dask` as a submodule, so the specification and the implementation are stored close to each other.
As mentioned before, the `load_collection` and `save_result` processes are back-end-specific and therefore not included in openeo-processes-dask. The `load_collection` process should return a `raster-cube` object. To be compliant with the `openeo-processes-dask` implementations, this should be realized as an `xarray.DataArray` loaded with `dask`.
For testing purposes with `DataArrays` that can be loaded from a single file, the `xarray.open_dataarray()` function can be used to implement a basic version of `load_collection`.
Large data sets can be organised as opendatacube Products or as STAC Collections.
opendatacube Products: The implementation of `load_collection` can use the opendatacube function `datacube.Datacube.load()`. Setting the `dask_chunks` parameter when loading the data is recommended. The function returns an xarray `Dataset`; to be compliant with `openeo-processes-dask`, it can be converted to a `DataArray` using `Dataset.to_array(dim='bands')`. A sample `load_collection` process using OpenDatacube can be found here.
STAC Collections: Alternatively, the `load_collection` process can be implemented using the `odc.stac.load()` function. To make use of `dask`, the `chunks` parameter must be set. As in the previous case, the resulting xarray `Dataset` can be converted to a `DataArray` with `Dataset.to_array(dim='bands')`. A similar implementation is the `load_stac` process available here.
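The conversion step shared by both approaches can be tried out without a datacube index or a STAC catalogue. The sketch below builds a small synthetic `Dataset` as a stand-in for what a loader would return (one variable per band) and converts it with `Dataset.to_array(dim='bands')`. The band names, shapes and values are assumptions, and the data is held in memory with numpy rather than dask.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a loaded Dataset: one data variable per band.
shape = (2, 3, 4)  # time, y, x
dataset = xr.Dataset(
    {
        "red": (("time", "y", "x"), np.ones(shape)),
        "nir": (("time", "y", "x"), np.full(shape, 2.0)),
    },
    coords={
        "time": np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[ns]"),
        "y": np.arange(3),
        "x": np.arange(4),
    },
)

# Stack the band variables along a new "bands" dimension.
cube = dataset.to_array(dim="bands")

print(cube.dims)
print(list(cube["bands"].values))
```

The former variable names become the labels of the new `bands` dimension, so a single band can still be selected with `cube.sel(bands="nir")`.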
The client-side processing functionality allows you to test and use openEO and its processes locally, i.e. without any connection to an openEO back-end. It relies on the projects openeo-pg-parser-networkx, which provides an openEO process graph parsing tool, and openeo-processes-dask, which provides an Xarray and Dask implementation of most openEO processes.
You can find more information and usage examples in the openEO Python client documentation.
# Adding a new process
To add a new process, changes are required in openeo-processes-dask.
The HTTP REST interface should have a `processes` endpoint that reflects the process specs from `openeo-processes-dask`.
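One way to keep such an endpoint in sync with the specs is to build its response body directly from the spec files. The sketch below assumes one JSON file per process in a directory and returns an object with a `processes` list and a `links` list; the function name `build_processes_response`, the directory layout and the tiny stand-in specs are assumptions, and the web framework that would serve the response is left out.

```python
import json
import tempfile
from pathlib import Path


def build_processes_response(specs_dir):
    """Read every *.json process spec in specs_dir and build a response body."""
    processes = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(specs_dir).glob("*.json"))
    ]
    return {"processes": processes, "links": []}


# Demo with two tiny stand-in spec files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for process_id in ("add", "subtract"):
        spec = {
            "id": process_id,
            "description": f"Stand-in spec for {process_id}.",
            "parameters": [],
            "returns": {"schema": {"type": "number"}},
        }
        (Path(tmp) / f"{process_id}.json").write_text(json.dumps(spec), encoding="utf-8")
    response = build_processes_response(tmp)

print([process["id"] for process in response["processes"]])
```

Reading the specs at request time (or once at start-up) means a process added to the specs directory shows up in the endpoint without a second list to maintain.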
Currently, openeo-processes-dask includes the process definitions as a submodule in `openeo-processes-dask/specs`. The submodule can be found under https://github.com/eodcgmbh/openeo-processes, which is a fork of https://github.com/Open-EO/openeo-processes that reflects which processes (with their implementations) are actually available in `openeo-processes-dask`.
Tests belong in `openeo-processes-dask/tests`, ideally using dask. The `create_fake_rastercube` helper from `openeo-processes-dask/tests/mockdata` can be used for testing, with the `backend` parameter set to `numpy` or `dask`.
The next step would be to set up an HTTP REST interface (i.e. an implementation of the openEO HTTP API) for the new openEO environment. It must sit in front of the process implementations to properly answer openEO client requests. Currently, the EODC and Eurac Research back-ends use Xarray and Dask and are therefore the first back-end implementations to look at.
If you have any questions, please contact us.