GitHub Actions triggers runs of your CI/CD workflows from your GitHub repositories and lets you automate your build, test, and deployment pipeline.
This article provides information about the GitHub Actions developed by Databricks and examples for common use cases. For information about other CI/CD features and best practices on Databricks, see CI/CD on Databricks and Best practices and recommended CI/CD workflows on Databricks.
Databricks GitHub Actions
Databricks has developed GitHub Actions for your CI/CD workflows on GitHub, such as databricks/setup-cli, which the examples in this article use to set up the Databricks CLI. Add GitHub Actions YAML files to your repo's .github/workflows directory.
Note: This article covers GitHub Actions, which is developed by a third party. To contact the provider, see GitHub Actions Support.
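As a rough illustration of where such a file lives and how databricks/setup-cli fits in, the following minimal sketch installs the Databricks CLI and prints its version. The file name, workflow name, and job name are hypothetical, and the manual workflow_dispatch trigger is an assumption chosen so that the example has no side effects on a workspace.

```yaml
# .github/workflows/databricks-cli-check.yml (hypothetical file name)
name: Databricks CLI check

# Assumption: a manual trigger, so nothing runs until you start it from the Actions tab.
on:
  workflow_dispatch:

jobs:
  cli-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Installs the Databricks CLI on the runner.
      - uses: databricks/setup-cli@main
      # Prints the installed CLI version; no workspace credentials are needed for this.
      - run: databricks -v
```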
Run a CI/CD workflow that updates a Production Git folder
The following example GitHub Actions YAML file updates a workspace Git folder when a remote branch updates. For information about the Production Git folder approach for CI/CD, see Production Git folder.
This example uses workload identity federation for GitHub Actions for enhanced security, and requires that you first follow the steps in Enable workload identity federation for GitHub Actions to create a federation policy.
YAML
name: Sync Git Folder
concurrency: prod_environment
on:
  push:
    branches:
      - git-folder-cicd-example
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    name: 'Update git folder'
    environment: Prod
    env:
      DATABRICKS_AUTH_TYPE: github-oidc
      DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST }}
      DATABRICKS_CLIENT_ID: ${{ secrets.DATABRICKS_CLIENT_ID }}
    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - name: Update git folder
        run: databricks repos update /Workspace/<git-folder-path> --branch git-folder-cicd-example
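If the federation policy or the client ID is misconfigured, the update step would typically fail with an authentication error. One possible way to surface that earlier, offered as a sketch rather than as part of the original workflow, is an extra step that asks the workspace which identity the CLI authenticated as. The step name is hypothetical, and the sketch assumes the job-level env block shown above is in place.

```yaml
# Hypothetical extra step, placed after databricks/setup-cli in the steps list.
- name: Verify authentication
  # Prints the identity the CLI authenticated as; the step fails if authentication does not succeed.
  run: databricks current-user me
```

Placing it before the update step keeps an authentication problem separate from a problem with the Git folder path or branch.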
Run a CI/CD workflow with a bundle that runs a pipeline update
The following example GitHub Actions YAML file triggers a test deployment that validates, deploys, and runs the specified job in the bundle within a pre-production target named "dev" as defined within a bundle configuration file.
This example requires that there is:
A bundle configuration file at the root of the repository, which the GitHub Actions YAML file declares explicitly through its working-directory: . setting. This bundle configuration file should define a Databricks workflow named my-job and a target named dev. See Databricks Asset Bundle configuration.
A GitHub secret named SP_TOKEN, representing the Databricks access token for a Databricks service principal that is associated with the Databricks workspace to which this bundle is being deployed and run. See Encrypted secrets.
YAML
name: 'Dev deployment'
concurrency: 1
on:
  pull_request:
    types:
      - opened
      - synchronize
    branches:
      - main
jobs:
  deploy:
    name: 'Deploy bundle'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: dev
  pipeline_update:
    name: 'Run pipeline update'
    runs-on: ubuntu-latest
    needs:
      - deploy
    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - run: databricks bundle run my-job --refresh-all
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: dev
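For orientation, a bundle configuration file that this workflow could work against might look roughly like the following. This is a minimal sketch, not the configuration used by the original authors: the bundle name, task key, notebook path, and workspace URL are all hypothetical placeholders. The text describes my-job as a Databricks workflow, so the sketch declares it under resources.jobs; if my-job were a pipeline instead, it would be declared under resources.pipelines.

```yaml
# databricks.yml at the repository root (hypothetical minimal example)
bundle:
  name: my-bundle                 # hypothetical bundle name

resources:
  jobs:
    my-job:                       # the key that "databricks bundle run my-job" refers to
      name: my-job
      tasks:
        - task_key: main          # hypothetical task key
          notebook_task:
            notebook_path: <path-to-notebook>   # placeholder, replace with a real path

targets:
  dev:                            # the target the workflow selects through DATABRICKS_BUNDLE_ENV
    mode: development
    workspace:
      host: https://<workspace-url>             # placeholder, replace with your workspace URL
```

Because the workflow above does not set DATABRICKS_HOST, the sketch assumes the workspace host comes from the target definition.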
You may also want to trigger production deployments. The following GitHub Actions YAML file can exist in the same repo as the preceding file. This file validates, deploys, and runs the specified bundle within a production target named "prod" as defined within a bundle configuration file.
YAML
name: 'Production deployment'
concurrency: 1
on:
  push:
    branches:
      - main
jobs:
  deploy:
    name: 'Deploy bundle'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: prod
  pipeline_update:
    name: 'Run pipeline update'
    runs-on: ubuntu-latest
    needs:
      - deploy
    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - run: databricks bundle run my-job --refresh-all
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: prod
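The dev and production workflows are described as validating the bundle, but their steps call only databricks bundle deploy and databricks bundle run. If an explicit validation step is wanted, one option is to run databricks bundle validate before deploying. The following sketch assumes the same SP_TOKEN secret and prod target as the workflow above; the step name is hypothetical.

```yaml
# Hypothetical extra step, placed before the "databricks bundle deploy" step in the deploy job.
- name: Validate bundle
  run: databricks bundle validate
  working-directory: .
  env:
    DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
    DATABRICKS_BUNDLE_ENV: prod
```

Running validation as its own step makes a configuration error show up under a clearly named step instead of inside the deployment log.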
Run a CI/CD workflow that builds a JAR and deploys a bundle
If you have a Java-based ecosystem, your GitHub Action will need to build and upload a JAR before deploying the bundle. The following example GitHub Actions YAML file triggers a deployment that builds and uploads a JAR to a volume, then validates and deploys the bundle to a production target named "prod" as defined within the bundle configuration file. It compiles a Java-based JAR, but the compilation steps for a Scala-based project are similar.
This example requires that there is:
A bundle configuration file at the root of the repository (working-directory: .).
A DATABRICKS_TOKEN environment variable that represents the Databricks access token that is associated with the Databricks workspace to which this bundle is being deployed and run.
A DATABRICKS_HOST environment variable that represents the Databricks host workspace.
YAML
name: Build JAR and deploy with bundles
on:
  pull_request:
    branches:
      - main
  push:
    branches:
      - main
jobs:
  build-test-upload:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set up Java
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-
      - name: Build and test JAR with Maven
        run: mvn clean verify
      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0
      - name: Upload JAR to a volume
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
        run: |
          databricks fs cp target/my-app-1.0.jar dbfs:/Volumes/artifacts/my-app-${{ github.sha }}.jar --overwrite
  validate:
    needs: build-test-upload
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0
      - name: Validate bundle
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
        run: databricks bundle validate
  deploy:
    needs: validate
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0
      - name: Deploy bundle
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
        run: databricks bundle deploy --target prod
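Unity Catalog volume paths generally take the form /Volumes/<catalog>/<schema>/<volume>/<path>, so the destination in the upload step may need a catalog, schema, and volume name before the file name. The following variant of that step is a sketch under that assumption; the three angle-bracket placeholders are hypothetical and must be replaced with names that exist in your workspace.

```yaml
# Variant of the "Upload JAR to a volume" step with a fully qualified volume path.
- name: Upload JAR to a volume
  env:
    DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
    DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
  run: |
    # <catalog>, <schema>, and <volume> are placeholders, not real names.
    databricks fs cp target/my-app-1.0.jar \
      "dbfs:/Volumes/<catalog>/<schema>/<volume>/my-app-${{ github.sha }}.jar" --overwrite
```

Including the commit SHA in the file name, as the original step does, keeps each uploaded build distinguishable from the others.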