A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://docs.databricks.com/dev-tools/go-sql-driver.html below:

Databricks SQL Driver for Go

Databricks SQL Driver for Go

The Databricks SQL Driver for Go is a Go library that allows you to use Go code to run SQL commands on Databricks compute resources. This article supplements the Databricks SQL Driver for Go README, API reference, and examples.

Requirements​ Getting started with the Databricks SQL Driver for Go​
  1. On your development machine with Go 1.20 or above already installed and an existing Go code project already created, create a go.mod file to track your Go code's dependencies by running the go mod init command, for example:

  2. Take a dependency on the Databricks SQL Driver for Go package by running the go mod edit -require command, replacing v1.5.2 with the latest version of the Databricks SQL Driver for Go package as listed in the Releases:

    Bash

    go mod edit -require github.com/databricks/databricks-sql-go@v1.5.2

    Your go.mod file should now look like this:

    Go

    module sample

    go 1.20

    require github.com/databricks/databricks-sql-go v1.5.2
  3. In your project, create a Go code file that imports the Databricks SQL Driver for Go. The following example, in a file named main.go with the following contents, lists all the clusters in your Databricks workspace:

    Go

    package main

    import (
    "database/sql"
    "os"
    _ "github.com/databricks/databricks-sql-go"
    )

    func main() {
    dsn := os.Getenv("DATABRICKS_DSN")

    if dsn == "" {
    panic("No connection string found. " +
    "Set the DATABRICKS_DSN environment variable, and try again.")
    }

    db, err := sql.Open("databricks", dsn)
    if err != nil {
    panic(err)
    }
    defer db.Close()

    if err := db.Ping(); err != nil {
    panic(err)
    }
    }
  4. Add any missing module dependencies by running the go mod tidy command:

    note

    If you get the error go: warning: "all" matched no packages, you forgot to add a Go code file that imports the Databricks SQL Driver for Go.

  5. Make copies of all packages needed to support builds and tests of packages in your main module, by running the go mod vendor command:

  6. Modify your code as needed to set the DATABRICKS_DSN environment variable for Databricks authentication. See also Connect with a DSN connection string.

  7. Run your Go code file, assuming a file named main.go, by running the go run command:

  8. If no errors are returned, you have successfully authenticated the Databricks SQL Driver for Go with your Databricks workspace and connected to your running Databricks cluster or SQL warehouse in that workspace.

Connect with a DSN connection string​

To access clusters and SQL warehouses, use sql.Open() to create a database handle through a data source name (DSN) connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN:

Go

package main

import (
"database/sql"
"os"
_ "github.com/databricks/databricks-sql-go"
)

func main() {
dsn := os.Getenv("DATABRICKS_DSN")

if dsn == "" {
panic("No connection string found. " +
"Set the DATABRICKS_DSN environment variable, and try again.")
}

db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()

if err := db.Ping(); err != nil {
panic(err)
}
}

To specify the DSN connection string in the correct format, see the DSN connection string examples in Authentication. For example, for Databricks personal access token authentication, use the following syntax, where:

token:<personal-access-token>@<server-hostname>:<port-number>/<http-path>?<param1=value1>&

For example, for a cluster:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh

For example, for a SQL warehouse:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2

note

As a security best practice, you should not hard-code this DSN connection string into your Go code. Instead, you should retrieve this DSN connection string from a secure location. For example, the code example earlier in this article used an environment variable.

Optional parameters​

For example, for a SQL warehouse:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2?catalog=hive_metastore&schema=example&maxRows=100&timeout=60&timezone=America/Sao_Paulo&ansi_mode=true
Connect with the NewConnector function​

Alternatively, use sql.OpenDB() to create a database handle through a new connector object that is created with dbsql.NewConnector() (connecting to Databricks clusters and SQL warehouses with a new connector object requires v1.0.0 or higher of the Databricks SQL Driver for Go). For example:

Go

package main

import (
"database/sql"
"os"
dbsql "github.com/databricks/databricks-sql-go"
)

func main() {
connector, err := dbsql.NewConnector(
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(443),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
)
if err != nil {
panic(err)
}

db := sql.OpenDB(connector)
defer db.Close()

if err := db.Ping(); err != nil {
panic(err)
}
}

To specify the correct set of NewConnector settings, see the examples in Authentication.

note

As a security best practice, you should not hard-code your NewConnector settings into your Go code. Instead, you should retrieve these values from a secure location. For example, the preceding code uses environment variables.

Some of the more frequently used functional options include:

For example:

Go

connector, err := dbsql.NewConnector(
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(443),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithInitialNamespace("samples", "nyctaxi"),
dbsql.WithMaxRows(100),
dbsql.SessionParams(map[string]string{"timezone": "America/Sao_Paulo", "ansi_mode": "true"}),
dbsql.WithTimeout(time.Minute),
dbsql.WithUserAgentEntry("example-user"),
)
Authentication​

The Databricks SQL Driver for Go supports the following Databricks authentication types:

Databricks personal access token authentication​

To use the Databricks SQL Driver for Go with Databricks personal access token authentication, you must first create a Databricks personal access token. For details on this step, see Databricks personal access tokens for workspace users.

To authenticate the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:

You can also append one or more Optional parameters listed previously in this article.

token:<personal-access-token>@<server-hostname>:<port-number>/<http-path>

To authenticate the Databricks SQL Driver for Go with the NewConnector function, use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:

To set environment variables, see your operating system's documentation.

Go

connector, err := dbsql.NewConnector(
dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithPort(443),
dbsql.WithAccessToken(os.Getenv("DATABRICKS_TOKEN")),
)
OAuth user-to-machine (U2M) authentication​

Databricks SQL Driver for Go versions 1.5.0 and above support OAuth user-to-machine (U2M) authentication.

To use the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:

You can also append one or more Optional parameters listed previously in this article.

<server-hostname>:<port-number>/<http-path>?authType=OauthU2M

To authenticate the Databricks SQL Driver for Go with the NewConnector function, you must first add the following to your import declaration:

Go

"github.com/databricks/databricks-sql-go/auth/oauth/u2m"

Then use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:

To set environment variables, see your operating system's documentation.

Go

authenticator, err := u2m.NewAuthenticator(os.Getenv("DATABRICKS_SERVER_HOSTNAME"), 1*time.Minute)
if err != nil {
panic(err)
}

connector, err := dbsql.NewConnector(
dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithPort(443),
dbsql.WithAuthenticator(authenticator),
)
OAuth machine-to-machine (M2M) authentication​

Databricks SQL Driver for Go versions 1.5.2 and above support OAuth machine-to-machine (M2M) authentication.

To use the Databricks SQL Driver for Go with OAuth M2M authentication, you must do the following:

  1. Create a Databricks service principal in your Databricks workspace, and create an OAuth secret for that service principal.

    To create the service principal and its OAuth secret, see Authorize unattended access to Databricks resources with a service principal using OAuth. Make a note of the service principal's UUID or Application ID value, and the Secret value for the service principal's OAuth secret.

  2. Give that service principal access to your cluster or warehouse.

    To give the service principal access to your cluster or warehouse, see Compute permissions or Manage a SQL warehouse.

To authenticate the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:

You can also append one or more Optional parameters listed previously in this article.

<server-hostname>:<port-number>/<http-path>?authType=OAuthM2M&clientID=<client-id>&clientSecret=<client-secret>

To authenticate the Databricks SQL Driver for Go with the NewConnector function, you must first add the following to your import declaration:

Go

"github.com/databricks/databricks-sql-go/auth/oauth/m2m"

Then use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:

To set environment variables, see your operating system's documentation.

Go

authenticator := m2m.NewAuthenticator(
os.Getenv("DATABRICKS_CLIENT_ID"),
os.Getenv("DATABRICKS_CLIENT_SECRET"),
os.Getenv("DATABRICKS_SERVER_HOSTNAME"),
)

connector, err := dbsql.NewConnector(
dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithPort(443),
dbsql.WithAuthenticator(authenticator),
)
Query data​

The following code example demonstrates how to call the Databricks SQL Driver for Go to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the trips table in the samples catalog's nyctaxi schema.

This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN.

Go

package main

import (
"database/sql"
"fmt"
"os"
"time"

_ "github.com/databricks/databricks-sql-go"
)

func main() {
dsn := os.Getenv("DATABRICKS_DSN")

if dsn == "" {
panic("No connection string found." +
"Set the DATABRICKS_DSN environment variable, and try again.")
}

db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}

defer db.Close()

var (
tpep_pickup_datetime time.Time
tpep_dropoff_datetime time.Time
trip_distance float64
fare_amount float64
pickup_zip int
dropoff_zip int
)

rows, err := db.Query("SELECT * FROM samples.nyctaxi.trips LIMIT 2")
if err != nil {
panic(err)
}

defer rows.Close()

fmt.Print("tpep_pickup_datetime,",
"tpep_dropoff_datetime,",
"trip_distance,",
"fare_amount,",
"pickup_zip,",
"dropoff_zip\n")

for rows.Next() {
err := rows.Scan(&tpep_pickup_datetime,
&tpep_dropoff_datetime,
&trip_distance,
&fare_amount,
&pickup_zip,
&dropoff_zip)
if err != nil {
panic(err)
}

fmt.Print(tpep_pickup_datetime, ",",
tpep_dropoff_datetime, ",",
trip_distance, ",",
fare_amount, ",",
pickup_zip, ",",
dropoff_zip, "\n")
}

err = rows.Err()
if err != nil {
panic(err)
}
}
Manage files in Unity Catalog volumes​

The Databricks SQL Driver enables you to write local files to Unity Catalog volumes, download files from volumes, and delete files from volumes, as shown in the following example:

Go

package main

import (
"context"
"database/sql"
"os"

_ "github.com/databricks/databricks-sql-go"
"github.com/databricks/databricks-sql-go/driverctx"
)

func main() {
dsn := os.Getenv("DATABRICKS_DSN")

if dsn == "" {
panic("No connection string found." +
"Set the DATABRICKS_DSN environment variable, and try again.")
}

db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()








ctx := driverctx.NewContextWithStagingInfo(
context.Background(),
[]string{"/tmp/"},
)



db.ExecContext(ctx, "PUT '/tmp/my-data.csv' INTO '/Volumes/main/default/my-volume/my-data.csv' OVERWRITE")


db.ExecContext(ctx, "GET '/Volumes/main/default/my-volume/my-data.csv' TO '/tmp/my-downloaded-data.csv'")


db.ExecContext(ctx, "REMOVE '/Volumes/main/default/my-volume/my-data.csv'")

db.Close()
}
Logging​

Use github.com/databricks/databricks-sql-go/logger to log messages that the Databricks SQL Driver for Go emits. The following code example uses sql.Open() to create a database handle through a DSN connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN. All log messages that are emitted at the debug level and below are written to the results.log file.

Go

package main

import (
"database/sql"
"io"
"log"
"os"

_ "github.com/databricks/databricks-sql-go"
dbsqllog "github.com/databricks/databricks-sql-go/logger"
)

func main() {
dsn := os.Getenv("DATABRICKS_DSN")


file, err := os.Create("results.log")
if err != nil {
log.Fatal(err)
}
defer file.Close()

writer := io.Writer(file)


if err := dbsqllog.SetLogLevel("debug"); err != nil {
log.Fatal(err)
}


dbsqllog.SetLogOutput(writer)

if dsn == "" {
panic("Error: Cannot connect. No connection string found. " +
"Set the DATABRICKS_DSN environment variable, and try again.")
}

db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()

if err := db.Ping(); err != nil {
panic(err)
}
}
Testing​

To test your code, use Go test frameworks such as the testing standard library. To test your code under simulated conditions without calling Databricks REST API endpoints or changing the state of your Databricks accounts or workspaces, use Go mocking libraries such as testfify.

For example, given the following file named helpers.go containing a GetDBWithDSNPAT function that returns a Databricks workspace connection, a GetNYCTaxiTrips function that returns data from the trips table in the samples catalog's nyctaxi schema, and a PrintNYCTaxiTrips that prints the returned data:

Go

package main

import (
"database/sql"
"fmt"
"strconv"
"time"
)

func GetDBWithDSNPAT(dsn string) (*sql.DB, error) {
db, err := sql.Open("databricks", dsn)
if err != nil {
return nil, err
}
return db, nil
}

func GetNYCTaxiTrips(db *sql.DB, numRows int) (*sql.Rows, error) {
rows, err := db.Query("SELECT * FROM samples.nyctaxi.trips LIMIT " + strconv.Itoa(numRows))
if err != nil {
return nil, err
}
return rows, nil
}

func PrintNYCTaxiTrips(rows *sql.Rows) {
var (
tpep_pickup_datetime time.Time
tpep_dropoff_datetime time.Time
trip_distance float64
fare_amount float64
pickup_zip int
dropoff_zip int
)

fmt.Print(
"tpep_pickup_datetime,",
"tpep_dropoff_datetime,",
"trip_distance,",
"fare_amount,",
"pickup_zip,",
"dropoff_zip\n",
)

for rows.Next() {
err := rows.Scan(
&tpep_pickup_datetime,
&tpep_dropoff_datetime,
&trip_distance,
&fare_amount,
&pickup_zip,
&dropoff_zip,
)
if err != nil {
panic(err)
}

fmt.Print(
tpep_pickup_datetime, ",",
tpep_dropoff_datetime, ",",
trip_distance, ",",
fare_amount, ",",
pickup_zip, ",",
dropoff_zip, "\n",
)
}

err := rows.Err()
if err != nil {
panic(err)
}
}

And given the following file named main.go that calls these functions:

Go

package main

import (
"os"
)

func main() {
db, err := GetDBWithDSNPAT(os.Getenv("DATABRICKS_DSN"))
if err != nil {
panic(err)
}

rows, err := GetNYCTaxiTrips(db, 2)
if err != nil {
panic(err)
}

PrintNYCTaxiTrips(rows)
}

The following file named helpers_test.go tests whether the GetNYCTaxiTrips function returns the expected response. Rather than creating a real connection to the target workspace, this test mocks a sql.DB object. The test also mocks some data that conforms to the schema and values that are in the real data. The test returns the mocked data through the mocked connection and then checks whether one of the mocked data rows' values matches the expected value.

Go

package main

import (
"database/sql"
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/mock"
)



type MockGetNYCTaxiTrips interface {
GetNYCTaxiTrips(db *sql.DB, numRows int) (*sql.Rows, error)
}



type MockGetNYCTaxiTripsObj struct {
mock.Mock
}


func (m *MockGetNYCTaxiTripsObj) GetNYCTaxiTrips(db *sql.DB, numRows int) (*sql.Rows, error) {
args := m.Called(db, numRows)
return args.Get(0).(*sql.Rows), args.Error(1)
}

func TestGetNYCTaxiTrips(t *testing.T) {

mockGetNYCTaxiTripsObj := new(MockGetNYCTaxiTripsObj)




mockGetNYCTaxiTripsObj.On("GetNYCTaxiTrips", mock.Anything, mock.AnythingOfType("int")).Return(&sql.Rows{}, nil)


rows, err := mockGetNYCTaxiTripsObj.GetNYCTaxiTrips(nil, 2)


mockGetNYCTaxiTripsObj.AssertExpectations(t)


assert.NotNil(t, rows)
assert.Nil(t, err)
}

Because the GetNYCTaxiTrips function contains a SELECT statement and therefore does not change the state of the trips table, mocking is not absolutely required in this example. However, mocking enables you to quickly run your tests without waiting for an actual connection to be made with the workspace. Also, mocking enables you to run simulated tests multiple times for functions that might change a table's state, such as INSERT INTO, UPDATE, and DELETE FROM.

Additional resources​

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4