Note: This article covers Databricks Connect for Databricks Runtime 13.3 LTS and above.
This article describes how to install Databricks Connect for Python. For background, see What is Databricks Connect? For the Scala version of this article, see Install Databricks Connect for Scala.
Requirements
To install Databricks Connect for Python, the following requirements must be met:
If you are connecting to serverless compute, your workspace must meet the requirements for serverless compute.
Note: Serverless compute is supported in Databricks Connect version 15.1 and above. In addition, Databricks Connect versions at or below the Databricks Runtime release on serverless are fully compatible. See Release notes. To verify whether your Databricks Connect version is compatible with serverless compute, see Validate the connection to Databricks.
If you are connecting to a cluster, your target cluster must meet the cluster configuration requirements, which include Databricks Runtime version requirements.
You must have Python 3 installed on your development machine, and its minor version must meet the version requirements in the table below.
If you are using user-defined functions (UDFs), your local Python minor version must match the Python minor version of the Databricks Runtime on the cluster or serverless compute. To find the Python minor version for your cluster's Databricks Runtime version, refer to the System environment section of the Databricks Runtime release notes for that version. See Databricks Runtime release notes versions and compatibility and Serverless compute release notes.
The following table shows compatible Databricks Connect and Python versions. Databricks Connect version numbers correspond to Databricks Runtime version numbers.
For UDF support, see Python base environment.
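The minor-version requirement above can be checked locally before installing anything. The following is a minimal sketch, not an official tool: `EXPECTED_PYTHON` is a placeholder value and `python_minor_matches` is a hypothetical helper name, so replace the placeholder with the Python version listed in the release notes for your target compute.

```python
import sys

# Placeholder (an assumption, not taken from this article): set this to the
# Python minor version listed in the release notes for your target
# Databricks Runtime or serverless version.
EXPECTED_PYTHON = (3, 12)


def python_minor_matches(expected, actual=None):
    """Return True when the (major, minor) pair of `actual` equals `expected`.

    `actual` defaults to the version of the running interpreter.
    """
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(expected[:2])


local = "{}.{}".format(*sys.version_info[:2])
wanted = "{}.{}".format(*EXPECTED_PYTHON)
if python_minor_matches(EXPECTED_PYTHON):
    print(f"Local Python {local} matches the expected minor version {wanted}.")
else:
    print(f"Local Python {local} differs from {wanted}; "
          "UDFs require matching minor versions.")
```

Only the major and minor components are compared, because the requirement in this article is stated in terms of the minor version.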
Activate a Python virtual environment
Databricks strongly recommends that you have a Python virtual environment activated for each Python version that you use with Databricks Connect. Python virtual environments help to make sure that you are using the correct versions of Python and Databricks Connect together. For more information about these tools and how to activate them, see venv or Poetry.
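To confirm that a virtual environment is actually active before installing packages, a short check such as the following may help. It is a sketch that relies on the standard library only; `in_virtual_environment` is a hypothetical helper name.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("interpreter:", sys.executable)
```

If the first line prints False, packages installed with pip3 would go into the base Python installation rather than an isolated environment.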
Install the Databricks Connect client
This section describes how to install the Databricks Connect client with venv or Poetry.
Install the Databricks Connect client with venv
With your virtual environment activated, uninstall PySpark, if it is already installed, by running the uninstall command. This is required because the databricks-connect package conflicts with PySpark. For details, see Conflicting PySpark installations. To check whether PySpark is already installed, run the show command.
Bash
pip3 show pyspark
pip3 uninstall pyspark
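The same check can be done from Python, which is convenient in setup scripts. This is a minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name. It looks up the installed distribution by name rather than importing the `pyspark` module.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


pyspark_version = installed_version("pyspark")
if pyspark_version is None:
    print("pyspark is not installed; nothing to uninstall.")
else:
    print(f"pyspark {pyspark_version} is installed; "
          "uninstall it before installing databricks-connect.")
```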
With your virtual environment still activated, install the Databricks Connect client by running the install command. Use the --upgrade option to upgrade any existing client installation to the specified version.
Bash
pip3 install --upgrade "databricks-connect==16.4.*"
Note: Databricks recommends that you use the "dot-asterisk" notation, specifying databricks-connect==X.Y.* instead of databricks-connect==X.Y, so that the most recent patch release of that version is installed. This is not a requirement, but it helps make sure that you can use the latest supported features for that cluster.
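To illustrate why the trailing wildcard matters, the sketch below compares an exact clause with a wildcard clause. It is a deliberately simplified model that handles only purely numeric versions; real installers apply the full version-specifier rules. The helper names and the release numbers in the loop are hypothetical.

```python
def parse(version):
    """Split a purely numeric version such as '16.4.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def pad(parts, width):
    """Right-pad a version tuple with zeros up to `width` components."""
    return parts + (0,) * (width - len(parts))


def matches(specifier, candidate):
    """Simplified check of an '==' clause ('X.Y' or 'X.Y.*') against a version."""
    cand = parse(candidate)
    if specifier.endswith(".*"):
        prefix = parse(specifier[:-2])
        # Wildcard: only the leading components have to agree.
        return cand[:len(prefix)] == prefix
    exact = parse(specifier)
    # Exact clause: compare after padding the shorter tuple with zeros.
    width = max(len(cand), len(exact))
    return pad(cand, width) == pad(exact, width)


# Hypothetical release numbers, used only to illustrate the matching.
for candidate in ["16.4.0", "16.4.7", "16.5.0"]:
    print(candidate,
          "| ==16.4.*:", matches("16.4.*", candidate),
          "| ==16.4:", matches("16.4", candidate))
```

In this model the exact clause accepts only the initial 16.4 release, while the wildcard clause accepts every 16.4 patch release and still rejects 16.5.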
Install the Databricks Connect client with Poetry
With your virtual environment activated, uninstall PySpark, if it is already installed, by running the remove command. This is required because the databricks-connect package conflicts with PySpark. For details, see Conflicting PySpark installations. To check whether PySpark is already installed, run the show command.
Bash
poetry show pyspark
poetry remove pyspark
With your virtual environment still activated, install the Databricks Connect client by running the add command.
Bash
poetry add databricks-connect@~16.4
Note: Databricks recommends that you use the "at-tilde" notation to specify databricks-connect@~16.4 instead of databricks-connect==16.4, to make sure that the most recent package is installed. While this is not a requirement, it helps make sure that you can use the latest supported features for that cluster.
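As a rough illustration of what the tilde constraint allows, the sketch below computes the version range for a tilde requirement, following Poetry's documented behavior that ~X.Y means at least X.Y.0 and below X.(Y+1).0. It handles only purely numeric versions, and the helper names are hypothetical.

```python
def tilde_bounds(requirement):
    """Return (lower, upper) tuples for a tilde requirement such as '16.4'.

    The lower bound is inclusive and the upper bound is exclusive.
    """
    parts = [int(p) for p in requirement.split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    if len(parts) == 1:
        # '~16' allows any 16.x.y release.
        upper = (parts[0] + 1, 0, 0)
    else:
        # '~16.4' and '~16.4.2' both stop before 16.5.0.
        upper = (parts[0], parts[1] + 1, 0)
    return lower, upper


def satisfies_tilde(requirement, candidate):
    """Return True when `candidate` falls inside the tilde range."""
    lower, upper = tilde_bounds(requirement)
    cand = tuple(int(p) for p in candidate.split("."))
    cand = cand + (0,) * (3 - len(cand))
    return lower <= cand < upper


print(tilde_bounds("16.4"))
# Hypothetical release numbers, used only to illustrate the range.
for candidate in ["16.3.9", "16.4.0", "16.4.7", "16.5.0"]:
    print(candidate, satisfies_tilde("16.4", candidate))
```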
After you have installed Databricks Connect, you need to configure a connection to Databricks. See Compute configuration for Databricks Connect.
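Once a connection has been configured, a short smoke test can confirm that the client works end to end. The following is a sketch under the assumption that connection details are already available to the client (for example through a configuration profile or environment variables); `run_smoke_test` is a hypothetical helper name, and the function is only defined here, not called.

```python
def run_smoke_test():
    """Create a session and run a trivial query on the remote compute."""
    # Imported lazily so that this file can be loaded even when the
    # databricks-connect package is not installed yet.
    from databricks.connect import DatabricksSession

    # Assumption: connection details are already configured; this call
    # fails if no usable configuration can be found.
    spark = DatabricksSession.builder.getOrCreate()

    # Count the rows of a five-row range on the remote compute.
    return spark.range(5).count()
```

With a working connection, calling run_smoke_test() should return 5. Any error raised instead usually points at the installation or the connection configuration.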