This article describes how to configure your Git credentials in Databricks so that you can connect a remote repo using Databricks Git folders (formerly Repos).
For a list of supported Git providers (cloud and on-premises), read Supported Git providers.
Configure Git credentials for a service principal
Although this article walks you through configuring Git credentials for a user, you can also configure Git credentials for a service principal. Service principals are a better choice when implementing jobs, CI/CD pipelines, or any other automated workflows that you don't want to associate with a user.
To learn how to authorize a service principal to access your workspace's Git folders, see Use a service principal for automation with Databricks Git folders.
GitHub and GitHub AE
The following information applies to GitHub and GitHub AE users.
Why use the Databricks GitHub App instead of a PAT?
If you use a hosted GitHub account, Databricks Git folders let you choose the Databricks GitHub App for user authentication instead of personal access tokens (PATs). The GitHub App has the following benefits over PATs:
Note: If you are having trouble installing the Databricks GitHub App for your Databricks account or organization, see the GitHub App installation documentation for troubleshooting guidance.
Per standard OAuth 2.0 integration, Databricks stores a user's access and refresh tokens. GitHub manages all other access control. Access and refresh tokens follow GitHub's default expiry rules, with access tokens expiring after 8 hours (which minimizes risk in the event of credentials leaking). Refresh tokens have a 6-month lifetime if unused. Linked credentials expire after 6 months of inactivity, requiring users to reconfigure them.
You can optionally encrypt Databricks tokens using customer-managed keys (CMK).
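The token lifetimes described above can be sketched as simple date arithmetic. This is an illustrative sketch only: the function names are hypothetical, and treating "6 months" as six calendar months is an assumption (GitHub may count the period differently).

```python
import calendar
from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_LIFETIME = timedelta(hours=8)  # access tokens expire after 8 hours
REFRESH_TOKEN_MONTHS = 6  # assumption: "6 months" means six calendar months


def add_months(moment: datetime, months: int) -> datetime:
    """Shift a datetime forward by whole calendar months, clamping the day."""
    month_index = moment.month - 1 + months
    year = moment.year + month_index // 12
    month = month_index % 12 + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def token_deadlines(issued_at: datetime) -> dict:
    """Return the approximate expiry times for tokens issued at `issued_at`."""
    return {
        "access_token_expires": issued_at + ACCESS_TOKEN_LIFETIME,
        "refresh_token_expires": add_months(issued_at, REFRESH_TOKEN_MONTHS),
    }


issued = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
deadlines = token_deadlines(issued)
```

Under these assumptions, a refresh token that goes unused until after `refresh_token_expires` would require the user to link the account again.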
Link your GitHub account using the Databricks GitHub App
In Databricks, link your GitHub account on the User Settings page:
In the upper-right corner of any page, click your username, then select Settings.
Click the Linked accounts tab.
Change your provider to GitHub, select Link Git account, and click Link.
The Databricks GitHub App authorization page appears. Authorize the GitHub App to complete the setup, which allows Databricks to act on your behalf when you perform Git operations in Git folders (such as cloning a repository). See the GitHub documentation for more details on app authorization.
To allow access to GitHub repositories, follow the steps below to install and configure the Databricks GitHub App.
You can install and configure the Databricks GitHub App on GitHub repositories that you want to access from Databricks Git folders. See the GitHub documentation for more details on app installation.
Open the Databricks GitHub App installation page.
Select the account that owns the repositories you want to access.
If you are not an owner of the account, you must have the account owner install and configure the app for you.
If you are the account owner, install the GitHub App. Installing it gives read and write access to code. Code is only accessed on behalf of users (for example, when a user clones a repository in Databricks Git folders).
Optionally, you can give access to only a subset of repositories by selecting the Only select repositories option.
Important limitation for EMU accounts
If you have a GitHub Enterprise Managed User (EMU) account, you cannot install the Databricks GitHub App on your personal repositories. This is a GitHub platform limitation.
Recommended solution: Create a GitHub Personal Access Token (PAT) instead, which works with both organization and personal repositories on EMU accounts.
How to identify an EMU account
Your GitHub account is an EMU account if:
Your username ends with _<enterprise-name> (e.g., john.doe_databricks)
In GitHub, follow these steps to create a personal access token that allows access to your repositories:
To use single sign-on, see Authorizing a personal access token for use with SAML single sign-on.
Connect to a GitHub repo using a fine-grained personal access token
As a best practice, use a fine-grained PAT that only grants access to the resources you will access in your project. In GitHub, follow these steps to create a fine-grained PAT that allows access to your repositories:
In the upper-right corner of any page, click your profile photo, then click Settings.
Click Developer settings.
Click Personal access tokens > Fine-grained tokens.
Click Generate new token.
Configure your new fine-grained token from the following settings:
Token name: Provide a unique token name. Write it down somewhere so you don't forget or lose it!
Description: Add some short text describing the purpose of the token.
Resource owner: The default is your current GitHub ID. Set this to the GitHub organization that owns the repo(s) you will access.
Expiration: Select the time period for token expiry. The default is 30 days.
Under Repository access, choose the access scope for your token. As a best practice, select only those repositories that you will be using for Git folder version control.
Under Permissions, configure the specific access levels granted by this token for the repositories and account you will work with. For more details on the permission groups, read Permissions required for fine-grained personal access tokens in the GitHub documentation.
Set the access permissions for Contents to Read and write. (You find the Contents scope under Repository permissions.) For details on this scope, see the GitHub documentation on the Contents scope.
Click Generate token.
Copy the token to your clipboard. Enter this token in Databricks under User Settings > Linked accounts.
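The steps above end with entering the token in the Databricks UI. For scripted setups, the same registration might be done over HTTP. The sketch below assumes the workspace exposes a Git credentials REST endpoint at `/api/2.0/git-credentials` that accepts `git_provider`, `git_username`, and `personal_access_token` fields; verify this against the API reference for your workspace. The workspace URL, both tokens, and the function name are hypothetical placeholders. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_git_credential_request(
    workspace_url: str,
    databricks_token: str,
    git_username: str,
    git_pat: str,
    git_provider: str = "gitHub",  # assumed provider identifier
) -> urllib.request.Request:
    """Build (but do not send) a request that registers a Git PAT."""
    payload = {
        "git_provider": git_provider,
        "git_username": git_username,
        "personal_access_token": git_pat,
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/git-credentials",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only; never hard-code real tokens.
req = build_git_credential_request(
    "https://example.cloud.databricks.com/",
    "dapi-EXAMPLE",
    "octocat",
    "github_pat_EXAMPLE",
)
```

To send the request, pass `req` to `urllib.request.urlopen`. In practice, read both tokens from a secret store or environment variables rather than from source code.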
In GitLab, follow these steps to create a personal access token that allows access to your repositories:
From GitLab, click your user icon in the upper-left corner of the screen and select Preferences.
Click Access Tokens in the sidebar.
Click Add new token in the Personal Access Tokens section of the page.
Enter a name for the token.
Select the specific scopes to provide access by checking the boxes for your desired permission levels. For more details on the scope options, read the GitLab documentation on PAT scopes.
Click Create personal access token.
Copy the token to your clipboard. Enter this token in Databricks under User Settings > Linked accounts.
See the GitLab documentation to learn more about how to create and manage personal access tokens.
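Before pasting a GitLab token into Databricks, it can help to confirm that it carries the scopes you selected and has not expired. The sketch below works on a token-metadata dictionary. Its field names (`scopes`, `active`, `revoked`, `expires_at`) are assumed to match what GitLab returns for a token lookup (for example, `GET /api/v4/personal_access_tokens/self`). The required scopes shown are illustrative and are not a statement of what Databricks needs.

```python
from datetime import date


def missing_scopes(token_info: dict, required_scopes: set) -> set:
    """Return the required scopes that the token does not carry."""
    return set(required_scopes) - set(token_info.get("scopes", []))


def is_usable(token_info: dict, today: date) -> bool:
    """Return True if the token is active, not revoked, and not yet expired."""
    if token_info.get("revoked", False) or not token_info.get("active", False):
        return False
    expires_at = token_info.get("expires_at")  # ISO date string, or None
    # Assumption: the token stops working at the start of its expiry date.
    return expires_at is None or date.fromisoformat(expires_at) > today


# Hypothetical metadata for illustration.
token_info = {
    "name": "databricks-git-folders",
    "scopes": ["read_repository"],
    "active": True,
    "revoked": False,
    "expires_at": "2025-06-30",
}
gaps = missing_scopes(token_info, {"read_repository", "write_repository"})
```

Here `gaps` contains only `write_repository`, which signals that the token would need to be recreated with the additional scope if write access is required.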
GitLab also provides support for fine-grained access using "Project Access Tokens". You can use Project Access Tokens to scope access to a GitLab project. For more details, read GitLab's documentation on Project Access Tokens.
AWS CodeCommit
In AWS CodeCommit, follow these steps to create an HTTPS Git credential that allows access to your repositories:
The following steps show you how to connect a Databricks repo to an Azure DevOps repo when they aren't in the same Microsoft Entra ID tenancy.
The service endpoint for Microsoft Entra ID must be accessible from both the private and public subnets of the Databricks workspace. For more information, see VPC peering.
Get an access token for the repository in Azure DevOps:
In Azure DevOps, follow these steps to get an access token for the repository. The Azure DevOps documentation contains more information about Azure DevOps personal access tokens.
Note: By default, you cannot use Bitbucket Repository Access Tokens or Project Access Tokens. To override this for specific workspaces, contact support.
In Bitbucket, follow these steps to create an app password that allows access to your repositories:
If your Git provider is not listed, selecting "GitHub" and providing the PAT you obtained from your Git provider often works, but is not guaranteed to work.
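The fallback rule above can be expressed as a small helper. This is a sketch under stated assumptions: the provider names are the ones mentioned in this article (not an authoritative list of supported providers), and the function name is hypothetical.

```python
# Providers named in this article; consult the supported-providers list for the full set.
LISTED_PROVIDERS = {
    "GitHub",
    "GitHub AE",
    "GitLab",
    "AWS CodeCommit",
    "Azure DevOps",
    "Bitbucket",
}


def choose_provider(provider_name: str) -> tuple:
    """Return (provider to select, whether it is a listed provider)."""
    if provider_name in LISTED_PROVIDERS:
        return provider_name, True
    # Fallback: "GitHub" often works for unlisted providers, but is not guaranteed.
    return "GitHub", False


selection, is_listed = choose_provider("Gitea")
```

When `is_listed` is false, treat the connection as best effort and test a clone before relying on it.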