This project uses a Kubernetes CustomResourceDefinition (CRD) to replace the Airflow Git Sync strategy. The main idea is to start a synchronization service, built with Quarkus Operator, on each Airflow pod to synchronize DAGs and files into the DAG folder.
The project includes the Docker image packaging part (BuildConfig or Quarkus build) and a modified Helm chart template based on the official Airflow chart (https://github.com/apache/airflow/tree/main/chart).
Use kubectl to create the RBAC rules and the CustomResourceDefinition (the operator can also create these resources automatically). For the resource description, see 02-crd.yaml. The CRD has several important attributes, which are described here:
type: Type of CRD. It can be dag_file, file, or dag_yaml. dag_file must be a DAG description file. file can be a Python file or any other text-format file. dag_yaml follows dag-factory, with some changes. Default: dag_file.
path: File path. If the path is empty, it defaults to the root directory of dags; otherwise it is a subdirectory under dags.
file_name: Required if type is file.
dag_name: Required if type is dag_file or dag_yaml. If dag_name does not have a .py suffix, the operator appends it automatically. Default: the CRD name.
content: If type is dag_file or file, this is the content of the file.
paused: If paused is not empty, the operator scans the DAG status and automatically pauses or unpauses the task.
dag_yaml: The description of the DAG in YAML. For details, refer to dag-factory.
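The naming rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the operator's actual code (the operator is a Quarkus application): the function name `resolve_target`, the `spec` dictionary layout, and the `/opt/airflow/dags` root are all assumptions made for the example.

```python
from pathlib import PurePosixPath

VALID_TYPES = {"dag_file", "file", "dag_yaml"}


def resolve_target(spec: dict, dags_root: str = "/opt/airflow/dags") -> PurePosixPath:
    """Return the path a resource would be written to (hypothetical helper)."""
    kind = spec.get("type")
    if kind not in VALID_TYPES:
        raise ValueError(f"unsupported type: {kind!r}")

    # An empty path means the root of the dags folder; otherwise a subdirectory.
    sub_dir = (spec.get("path") or "").strip("/")
    directory = PurePosixPath(dags_root) / sub_dir

    if kind == "file":
        name = spec.get("file_name")
        if not name:
            raise ValueError("file_name is required when type is 'file'")
        return directory / name

    # dag_file and dag_yaml are named by dag_name; '.py' is appended if missing.
    name = spec.get("dag_name")
    if not name:
        raise ValueError("dag_name is required when type is 'dag_file' or 'dag_yaml'")
    if not name.endswith(".py"):
        name += ".py"
    return directory / name
```

For example, `resolve_target({"type": "file", "path": "lib", "file_name": "util.txt"})` yields a path under the `lib` subdirectory of the assumed dags root, while a `dag_file` with `dag_name` set to `hello` resolves to `hello.py` in the root.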
We can run our application in dev mode that enables live coding using:
./mvnw compile quarkus:dev
An example is provided in the /example folder. It includes the RBAC rules, the CRD, some sample cases, and a Deployment for testing.
If we use OpenShift, we can use BuildConfig or Tekton/Pipeline to build a native image. Otherwise, we can create a native executable using:
./mvnw package -Pnative
# On macOS, add -Dquarkus.native.container-build=true to build Quarkus in Docker with a Linux environment
docker build -f src/main/docker/Dockerfile.native -t quarkus/airflow-dag-operator .
Or, if we don't have GraalVM installed, we can run the native executable build in a container using:
./mvnw package -Pnative -Dquarkus.native.container-build=true
docker build -f src/main/docker/Dockerfile.native -t quarkus/airflow-dag-operator .
Run a Helm dependency update to add the postgresql chart, then lint the chart. Helm 3 is required to build.
# dependency update
helm dep update
# lint
helm lint
# debug
helm install --dry-run --debug -f values.yaml airflow -n airflow .
Deploy Chart
# install
helm install -f values.yaml airflow -n airflow .
# upgrade
helm upgrade -f values.yaml airflow -n airflow .
# uninstall
helm uninstall airflow -n airflow
To support pausing, we need to rebuild the image:
# to support pause, build with `quarkus.datasource.jdbc` changed from false to true
./mvnw package -Pnative -Dquarkus.datasource.jdbc=true
Because parsing a DAG's Python code is complex, for now we need to ensure that dag_name and dag_id are consistent.
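Since the operator does not parse the DAG code itself, one option is to check consistency before applying a resource. The sketch below uses Python's standard `ast` module to read the literal `dag_id` from a DAG file and compare it with `dag_name`. The helper names `find_dag_ids` and `is_consistent` are hypothetical, and it is an assumption here that "consistent" means `dag_id` equals `dag_name` without the `.py` suffix. Only literal string ids are detected.

```python
import ast


def find_dag_ids(source: str) -> list[str]:
    """Collect literal dag_id values passed to DAG(...) calls in Python source."""
    ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both DAG(...) and something.DAG(...).
        called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if called != "DAG":
            continue
        # dag_id may be the first positional argument or a keyword argument.
        candidates = node.args[:1] + [kw.value for kw in node.keywords if kw.arg == "dag_id"]
        for value in candidates:
            if isinstance(value, ast.Constant) and isinstance(value.value, str):
                ids.append(value.value)
    return ids


def is_consistent(dag_name: str, source: str) -> bool:
    """True if the only literal dag_id in source equals dag_name without '.py'."""
    expected = dag_name[:-3] if dag_name.endswith(".py") else dag_name
    return find_dag_ids(source) == [expected]
```

The source is only parsed, never executed, so Airflow does not need to be installed to run the check.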
Note that the Helm chart has not been modified for this yet. By design, the operator only turns on pause support on the scheduler node, to avoid repeated executions.