This topic provides steps and details for authorizing access to Databricks resources when automating Databricks CLI commands or calling Databricks REST APIs from code that will run from an unattended process.
Databricks uses OAuth as the preferred protocol for user authorization and authentication when interacting with Databricks resources outside of the UI. Databricks also provides the unified client authentication tool to automate the refresh of the access tokens generated as part of OAuth's authentication method. This applies to service principals as well as user accounts, but you must configure a service principal with the appropriate permissions and privileges for the Databricks resources it must access as part of its operations.
For more high-level details, see Authorizing access to Databricks resources.
What are my options for authorization and authentication when using a Databricks service principal?
In this topic, authorization refers to the protocol (OAuth) used to negotiate access to specific Databricks resources through delegation. Authentication refers to the mechanism by which credentials are represented, transmitted, and verified, which in this case is access tokens.
Databricks uses OAuth 2.0-based authorization to enable access to Databricks account and workspace resources from the command line or code on behalf of a service principal with the permissions to access those resources. Once a Databricks service principal is configured and its credentials are verified when it runs a CLI command or calls a REST API, an OAuth token is given to the participating tool or SDK to perform token-based authentication on the service principal's behalf from that time forward. The OAuth access token has a lifespan of one hour, following which the tool or SDK involved will make an automatic background attempt to obtain a new token that is also valid for one hour.
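The one-hour token lifetime and background refresh described above can be pictured with a small cache. This is only a sketch of the general pattern, not how any Databricks tool or SDK implements it. The class name `CachedToken`, the `fetch` callable, and the 60-second refresh margin are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional


class CachedToken:
    """Hold an access token and fetch a new one shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Dict],
        clock: Callable[[], float] = time.monotonic,
        refresh_margin_s: float = 60.0,  # assumption: refresh one minute early
    ) -> None:
        self._fetch = fetch          # returns {"access_token": ..., "expires_in": ...}
        self._clock = clock          # injectable so the logic can be tested
        self._margin = refresh_margin_s
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Fetch on first use, or when the token is within the margin of expiry.
        if self._token is None or now >= self._expires_at - self._margin:
            response = self._fetch()
            self._token = response["access_token"]
            self._expires_at = now + float(response["expires_in"])
        return self._token
```

With `expires_in` set to 3600, repeated calls to `get()` reuse the same token for most of the hour and call `fetch` again only near the end of it.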
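Databricks supports two ways to authorize access for a service principal with OAuth:
Automatically, using Databricks tools and SDKs that implement client unified authentication, which generate and refresh access tokens on your behalf.
Manually, by generating and using OAuth access tokens yourself, as described later in this topic.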
Before you start, you must configure a Databricks service principal and assign it the appropriate permissions to access the resources it must use when your automation code or commands request them.
Prerequisite: Create a service principal
Account admins and workspace admins can create service principals. This step describes creating a service principal in a Databricks workspace. For details on the Databricks account console itself, see Add service principals to your account.
The service principal is added to both your workspace and the Databricks account.
Step 1: Assign permissions to your service principal
Before you can use OAuth to authorize access to your Databricks resources, you must first create an OAuth secret, which can be used to generate OAuth access tokens for authentication. A service principal can have up to five OAuth secrets.
OAuth secrets have a maximum lifetime of two years. Account admins and workspace admins can create an OAuth secret for a service principal.
On your service principal's details page, click the Secrets tab.
Under OAuth secrets, click Generate secret.
Set the secret's lifetime in days. OAuth secrets have a maximum lifetime of 730 days (two years).
Copy the displayed Secret and Client ID, and then click Done.
The secret will only be revealed once during creation. The client ID is the same as the service principal's application ID.
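Because a secret's lifetime is set in days and capped at 730, it can help to record when each secret expires so that rotation is not forgotten. The helper below is a minimal sketch. It assumes the lifetime is counted from the creation date, and the function names and the 30-day warning window are made up for illustration.

```python
from datetime import date, timedelta

MAX_SECRET_LIFETIME_DAYS = 730  # two-year maximum stated above


def secret_expiry(created: date, lifetime_days: int) -> date:
    """Return the expiry date for a secret created on `created`."""
    # Assumption: a lifetime of at least one day is required.
    if not 1 <= lifetime_days <= MAX_SECRET_LIFETIME_DAYS:
        raise ValueError(
            f"lifetime_days must be between 1 and {MAX_SECRET_LIFETIME_DAYS}"
        )
    return created + timedelta(days=lifetime_days)


def rotation_due(created: date, lifetime_days: int, today: date,
                 warn_days: int = 30) -> bool:
    """True once `today` is within `warn_days` of the secret's expiry."""
    return today >= secret_expiry(created, lifetime_days) - timedelta(days=warn_days)
```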
Account admins can also generate an OAuth secret from the service principal details page in the account console.
As an account admin, log in to the account console.
In the sidebar, click User management.
On the Service principals tab, select your service principal.
Under OAuth secrets, click Generate secret.
Set the secret's lifetime in days. OAuth secrets have a maximum lifetime of 730 days (two years).
Copy the displayed Secret and Client ID, and then click Done.
To use OAuth authorization with the unified client authentication tool, you must set the following associated environment variables, .databrickscfg fields, Terraform fields, or Config fields:
The Databricks host, specified as https://accounts.cloud.databricks.com for account operations or the target workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com, for workspace operations.
To perform OAuth service principal authentication, integrate the following within your code, based on the participating tool or SDK:
For the Databricks CLI, do one of the following:
Set the values in your .databrickscfg file as specified in this article's "Profile" section.
Environment variables always take precedence over values in your .databrickscfg file.
See also OAuth machine-to-machine (M2M) authentication.
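The precedence rule for the Databricks CLI (environment variables win over .databrickscfg values) can be sketched as a small lookup. This is an illustration of the rule as stated above, not the CLI's actual implementation. The function name `resolve_setting` is hypothetical, and the environment variable names in `ENV_NAMES` are the ones I understand unified client authentication to read, so verify them against the current documentation.

```python
import configparser
from typing import Mapping, Optional

# .databrickscfg field name -> environment variable (assumed mapping).
ENV_NAMES = {
    "host": "DATABRICKS_HOST",
    "account_id": "DATABRICKS_ACCOUNT_ID",
    "client_id": "DATABRICKS_CLIENT_ID",
    "client_secret": "DATABRICKS_CLIENT_SECRET",
}


def resolve_setting(field: str, environ: Mapping[str, str], cfg_text: str,
                    profile: str = "DEFAULT") -> Optional[str]:
    """Return the environment value if set, else the profile value, else None."""
    env_value = environ.get(ENV_NAMES[field])
    if env_value:
        return env_value
    # interpolation=None keeps '%' characters in secrets from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if profile != parser.default_section and not parser.has_section(profile):
        return None
    return parser[profile].get(field)
```

For example, with a profile that sets `client_id = from-file`, the function returns `from-file` unless `DATABRICKS_CLIENT_ID` is present in the environment, in which case the environment value is returned.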
Note
OAuth service principal authentication is supported on the following Databricks Connect versions:
For Databricks Connect, you can either:
Set the values in your .databrickscfg file as described on the Profile tab. Also set the cluster_id to your workspace instance URL.
Use environment variables. Also set DATABRICKS_CLUSTER_ID to your workspace instance URL.
Values in .databrickscfg take precedence over environment variables.
To initialize Databricks Connect with these settings, see Compute configuration for Databricks Connect.
For the Databricks extension for Visual Studio Code, do the following:
Set the values in your .databrickscfg file for Databricks workspace-level operations as specified in this article's "Profile" section.
Enter your workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com, and then press Enter.
For more details, see Set up authorization for the Databricks extension for Visual Studio Code.
For account-level operations, for default authentication:
provider "databricks" {
  alias = "accounts"
}
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as HashiCorp Vault. See also Vault Provider). In this case, the Databricks account console URL is https://accounts.cloud.databricks.com:
provider "databricks" {
  alias         = "accounts"
  host          = <retrieve-account-console-url>
  account_id    = <retrieve-account-id>
  client_id     = <retrieve-client-id>
  client_secret = <retrieve-client-secret>
}
For workspace-level operations, for default authentication:
provider "databricks" {
  alias = "workspace"
}
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console or some other configuration store, such as HashiCorp Vault. See also Vault Provider). In this case, the host is the Databricks workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com:
provider "databricks" {
  alias         = "workspace"
  host          = <retrieve-workspace-url>
  client_id     = <retrieve-client-id>
  client_secret = <retrieve-client-secret>
}
For more information about authenticating with the Databricks Terraform provider, see Authentication.
For account-level operations, use the following for default authentication:
Python
from databricks.sdk import AccountClient
a = AccountClient()
For direct configuration, use the following, replacing the retrieve placeholders with your own implementation to retrieve the values from the console or other configuration store, such as AWS Systems Manager Parameter Store. In this case, the Databricks account console URL is https://accounts.cloud.databricks.com:
Python
from databricks.sdk import AccountClient

a = AccountClient(
    host=retrieve_account_console_url(),
    account_id=retrieve_account_id(),
    client_id=retrieve_client_id(),
    client_secret=retrieve_client_secret(),
)
For workspace-level operations, use the following for default authentication:
Python
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
For direct configuration, replace the retrieve
placeholders with your own implementation to retrieve the values from the console, or other configuration store, such as AWS Systems Manager Parameter Store. In this case, the host is the Databricks workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com
:
Python
from databricks.sdk import WorkspaceClient
w = WorkspaceClient(
host = retrieve_workspace_url(),
client_id = retrieve_client_id(),
client_secret = retrieve_client_secret()
)
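The retrieve placeholders are left to the reader. One simple way to fill them in, shown here only as a sketch, is to read each value from an environment variable that your deployment system injects. The variable names below (`MY_APP_DATABRICKS_HOST` and so on) are hypothetical. A secrets manager lookup could replace the body of `_require` without changing the callers.

```python
import os


def _require(name: str) -> str:
    """Return the named environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable")
    return value


# Hypothetical variable names; choose whatever your deployment provides.
def retrieve_workspace_url() -> str:
    return _require("MY_APP_DATABRICKS_HOST")


def retrieve_client_id() -> str:
    return _require("MY_APP_DATABRICKS_CLIENT_ID")


def retrieve_client_secret() -> str:
    return _require("MY_APP_DATABRICKS_CLIENT_SECRET")
```

These functions can then be passed to the `WorkspaceClient` constructor exactly as in the direct configuration example above. Failing early with a named variable makes a missing credential easier to diagnose in an unattended process than a later authentication error.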
For more information about authenticating with Databricks tools and SDKs that use Python and that implement Databricks client unified authentication, see:
For account-level operations, for default authentication:
Java
import com.databricks.sdk.AccountClient;
AccountClient a = new AccountClient();
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console, or other configuration store, such as AWS Systems Manager Parameter Store). In this case, the Databricks account console URL is https://accounts.cloud.databricks.com:
Java
import com.databricks.sdk.AccountClient;
import com.databricks.sdk.core.DatabricksConfig;

DatabricksConfig cfg = new DatabricksConfig()
  .setHost(retrieveAccountConsoleUrl())
  .setAccountId(retrieveAccountId())
  .setClientId(retrieveClientId())
  .setClientSecret(retrieveClientSecret());
AccountClient a = new AccountClient(cfg);
For workspace-level operations using default authentication:
Java
import com.databricks.sdk.WorkspaceClient;
WorkspaceClient w = new WorkspaceClient();
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console, or other configuration store, such as AWS Systems Manager Parameter Store). In this case, the host is the Databricks workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com:
Java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.core.DatabricksConfig;

DatabricksConfig cfg = new DatabricksConfig()
  .setHost(retrieveWorkspaceUrl())
  .setClientId(retrieveClientId())
  .setClientSecret(retrieveClientSecret());
WorkspaceClient w = new WorkspaceClient(cfg);
For more information about authenticating with Databricks tools and SDKs that use Java and implement Databricks client unified authentication, see:
For account-level operations using default authentication:
Go
import (
  "github.com/databricks/databricks-sdk-go"
)

// Account-level operations use the account client.
a := databricks.Must(databricks.NewAccountClient())
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console, or other configuration store, such as AWS Systems Manager Parameter Store). In this case, the Databricks account console URL is https://accounts.cloud.databricks.com:
Go
import (
  "github.com/databricks/databricks-sdk-go"
)

// Account-level operations use the account client and require the account ID.
a := databricks.Must(databricks.NewAccountClient(&databricks.Config{
  Host:         retrieveAccountConsoleUrl(),
  AccountID:    retrieveAccountId(),
  ClientID:     retrieveClientId(),
  ClientSecret: retrieveClientSecret(),
}))
For workspace-level operations using default authentication:
Go
import (
  "github.com/databricks/databricks-sdk-go"
)

// Workspace-level operations use the workspace client.
w := databricks.Must(databricks.NewWorkspaceClient())
For direct configuration (replace the retrieve placeholders with your own implementation to retrieve the values from the console, or other configuration store, such as AWS Systems Manager Parameter Store). In this case, the host is the Databricks workspace URL, for example https://dbc-a1b2345c-d6e7.cloud.databricks.com:
Go
import (
  "github.com/databricks/databricks-sdk-go"
)

// Workspace-level operations use the workspace client.
w := databricks.Must(databricks.NewWorkspaceClient(&databricks.Config{
  Host:         retrieveWorkspaceUrl(),
  ClientID:     retrieveClientId(),
  ClientSecret: retrieveClientSecret(),
}))
For more information about authenticating with Databricks tools and SDKs that use Go and that implement Databricks client unified authentication, see Authenticate the Databricks SDK for Go with your Databricks account or workspace.
Manually generate and use access tokens for OAuth service principal authentication
Databricks tools and SDKs that implement the Databricks client unified authentication standard will automatically generate, refresh, and use Databricks OAuth access tokens on your behalf as needed for OAuth service principal authentication.
Databricks recommends using client unified authentication. However, if you must manually generate, refresh, or use Databricks OAuth access tokens, follow the instructions in this section.
Use the service principal's client ID and OAuth secret to request an OAuth access token to authenticate to both account-level REST APIs and workspace-level REST APIs. The access token will expire in one hour. You must request a new OAuth access token after the expiration. The scope of the OAuth access token depends on the level that you create the token from. You can create a token at either the account level or the workspace level, as follows:
An OAuth access token created from the account level can be used against Databricks REST APIs in the account, and in any workspaces the service principal has access to.
As an account admin, log in to the account console.
Click the down arrow next to your username in the upper right corner.
Copy your Account ID.
Construct the token endpoint URL by replacing <my-account-id> in the following URL with the account ID that you copied.
https://accounts.cloud.databricks.com/oidc/accounts/<my-account-id>/v1/token
Use a client such as curl to request an OAuth access token with the token endpoint URL, the service principal's client ID (also known as an application ID), and the service principal's OAuth secret you created. The all-apis scope requests an OAuth access token that can be used to access all Databricks REST APIs that the service principal has been granted access to.
Replace <token-endpoint-URL> with the preceding token endpoint URL.
Replace <client-id> with the service principal's client ID, which is also known as an application ID.
Replace <client-secret> with the service principal's OAuth secret that you created.
Bash
export CLIENT_ID=<client-id>
export CLIENT_SECRET=<client-secret>
curl --request POST \
  --url <token-endpoint-URL> \
  --user "$CLIENT_ID:$CLIENT_SECRET" \
  --data 'grant_type=client_credentials&scope=all-apis'
This generates a response similar to:
JSON
{
  "access_token": "eyJraWQiOiJkYTA4ZTVjZ…",
  "token_type": "Bearer",
  "expires_in": 3600
}
Copy the access_token from the response.
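The curl request above can also be assembled in code. The sketch below builds the same POST request with Python's standard library but does not send it. The function name `build_token_request` is an assumption for illustration. The `--user` flag in curl corresponds to the HTTP Basic `Authorization` header constructed here.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint_url: str, client_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # Equivalent of curl's --user "$CLIENT_ID:$CLIENT_SECRET".
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    # Equivalent of --data 'grant_type=client_credentials&scope=all-apis'.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    return urllib.request.Request(
        token_endpoint_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen` and parse the response body as JSON. The `access_token` and `expires_in` fields are the ones shown in the sample response above.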
An OAuth access token created from the workspace level can only access REST APIs in that workspace, even if the service principal is an account admin or is a member of other workspaces.
Construct the token endpoint URL by replacing https://<databricks-instance> with the workspace URL of your Databricks deployment:
https://<databricks-instance>/oidc/v1/token
Use a client such as curl to request an OAuth access token with the token endpoint URL, the service principal's client ID (also known as an application ID), and the service principal's OAuth secret you created. The all-apis scope requests an OAuth access token that can be used to access all Databricks REST APIs that the service principal has been granted access to within the workspace that you are requesting the token from.
Replace <token-endpoint-URL> with the preceding token endpoint URL.
Replace <client-id> with the service principal's client ID, which is also known as an application ID.
Replace <client-secret> with the service principal's OAuth secret that you created.
Bash
export CLIENT_ID=<client-id>
export CLIENT_SECRET=<client-secret>
curl --request POST \
  --url <token-endpoint-URL> \
  --user "$CLIENT_ID:$CLIENT_SECRET" \
  --data 'grant_type=client_credentials&scope=all-apis'
This generates a response similar to:
JSON
{
  "access_token": "eyJraWQiOiJkYTA4ZTVjZ…",
  "token_type": "Bearer",
  "expires_in": 3600
}
Copy the access_token from the response.
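The two token endpoints differ only in how the URL is formed, and that choice determines the token's scope. A small helper, sketched below with a hypothetical name, makes the choice explicit by requiring exactly one of an account ID or a workspace URL.

```python
from typing import Optional

ACCOUNT_CONSOLE_URL = "https://accounts.cloud.databricks.com"


def token_endpoint_url(workspace_url: Optional[str] = None,
                       account_id: Optional[str] = None) -> str:
    """Return the account-level or workspace-level token endpoint URL."""
    if (workspace_url is None) == (account_id is None):
        raise ValueError("Pass exactly one of workspace_url or account_id")
    if account_id is not None:
        # Account-level: usable against the account and its workspaces.
        return f"{ACCOUNT_CONSOLE_URL}/oidc/accounts/{account_id}/v1/token"
    # Workspace-level: usable only against this workspace.
    return f"{workspace_url.rstrip('/')}/oidc/v1/token"
```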
You can use the OAuth access token to authenticate to Databricks account-level REST APIs and workspace-level REST APIs. The service principal must have account admin privileges to call account-level REST APIs.
Include the access token in the authorization header using Bearer authentication. You can use this approach with curl or any client that you build.
This example uses Bearer authentication to get a list of all workspaces associated with an account.
Replace <oauth-access-token> with the service principal's OAuth access token that you copied in the previous step.
Replace <account-id> with your account ID.
Bash
export OAUTH_TOKEN=<oauth-access-token>
curl --request GET --header "Authorization: Bearer $OAUTH_TOKEN" \
  'https://accounts.cloud.databricks.com/api/2.0/accounts/<account-id>/workspaces'
Example workspace-level REST API request
This example uses Bearer authentication to list all available clusters in the specified workspace.
Replace <oauth-access-token> with the service principal's OAuth access token that you copied in the previous step.
Replace <workspace-URL> with your base workspace URL, which has a form similar to dbc-a1b2345c-d6e7.cloud.databricks.com.
Bash
export OAUTH_TOKEN=<oauth-access-token>
curl --request GET --header "Authorization: Bearer $OAUTH_TOKEN" \
  'https://<workspace-URL>/api/2.0/clusters/list'
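For a client that you build yourself, the same Bearer pattern applies to both the account-level and workspace-level examples. The sketch below only constructs the requests, using Python's standard library. The helper names are hypothetical, and the two URL builders simply mirror the endpoints used in the curl examples above.

```python
import urllib.request


def build_api_request(url: str, oauth_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that uses Bearer authentication."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {oauth_token}"},
        method="GET",
    )


def workspaces_url(account_id: str) -> str:
    """Account-level endpoint that lists workspaces in the account."""
    return (
        "https://accounts.cloud.databricks.com"
        f"/api/2.0/accounts/{account_id}/workspaces"
    )


def clusters_list_url(workspace_host: str) -> str:
    """Workspace-level endpoint that lists clusters in one workspace."""
    return f"https://{workspace_host}/api/2.0/clusters/list"
```

Passing either request to `urllib.request.urlopen` would perform the call. Keep in mind that a token created at the workspace level is only accepted by that workspace's endpoints.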
Additional resources