Stay organized with collections Save and categorize content based on your preferences.
Create Spanner external datasetsThis document describes how to create an external dataset (also known as a federated dataset) in BigQuery that's linked to an existing database in Spanner.
An external dataset is a connection between BigQuery and an external data source at the dataset level. It lets you query transactional data in Spanner databases with GoogleSQL without needing to copy or import all of the data from Spanner to BigQuery storage. These query results are stored in BigQuery.
The tables in an external dataset are automatically populated from the tables in the corresponding external data source. You can query these tables directly in BigQuery, but you cannot make modifications, additions, or deletions. However, any updates that you make in the external data source are automatically reflected in BigQuery.
When you query Spanner, query results are by default saved in temporary tables. They can also optionally be saved as a new BigQuery table, joined with other tables, or merged with existing tables using DML.
Required permissionsTo get the permission that you need to create an external dataset, ask your administrator to grant you the BigQuery User (roles/bigquery.user
) IAM role. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the bigquery.datasets.create
permission, which is required to create an external dataset.
You might also be able to get this permission with custom roles or other predefined roles.
For more information about IAM roles and permissions in BigQuery, see Introduction to IAM.
Use aCLOUD_RESOURCE
connection
Optionally, Spanner external datasets can use a CLOUD_RESOURCE
connection to interact with your Spanner database, so that you can provide a user access to Spanner data through BigQuery, without giving them direct access to the Spanner database. Because the service account from CLOUD_RESOURCE
connection handles retrieving data from the Spanner, you only have to grant users access to the Spanner external dataset.
Before you create Spanner external datasets with a CLOUD_RESOURCE
connection, do the following:
You can create or use an existing CLOUD_RESOURCE
connection to connect to Spanner. To create the Cloud resource connection, follow the steps on the Create a Cloud resource connection page.
After you create the connection, open it, and in the Connection info pane, copy the service account ID. You need this ID when you configure permissions for the connection. When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.
Set up accessYou must give the new connection read access to your Spanner instance or database. It is recommended to use Cloud Spanner Database Reader with DataBoost (roles/spanner.databaseReaderWithDataBoost
) predefined IAM role.
Follow these steps to grant access to database-level roles for the service account that you copied earlier from the connection:
Go to the Spanner Instances page.
Click the name of the instance that contains your database to go to the Instance details page.
In the Overview tab, select the checkbox for your database.
The Info panel appears.
Click Add principal.
In the Add principals panel, in New principals, enter the service account ID that you copied earlier.
In the Select a role field, select Cloud Spanner Database Reader with DataBoost role.
Click Save.
To create an external dataset, do the following:
ConsoleOpen the BigQuery page in the Google Cloud console.
In the Explorer panel, select the project where you want to create the dataset.
Expand the more_vert Actions option and click Create dataset.
On the Create dataset page, do the following:
us-central1
or multiregion us
. After you create a dataset, the location can't be changed.For External Dataset, do the following:
Spanner
.projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE
. For example: projects/my_project/instances/my_instance/databases/my_database
.Leave the other default settings as they are.
Click Create dataset.
Use the CREATE EXTERNAL SCHEMA
data definition language (DDL) statement.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
CREATE EXTERNAL SCHEMA DATASET_NAME
OPTIONS (
external_source = 'SPANNER_EXTERNAL_SOURCE',
location = 'LOCATION');
/*
Alternatively, create with a connection:
*/
CREATE EXTERNAL SCHEMA DATASET_NAME
WITH CONNECTION PROJECT_ID.LOCATION.CONNECTION_NAME
OPTIONS (
external_source = 'SPANNER_EXTERNAL_SOURCE',
location = 'LOCATION');
Replace the following:
DATASET_NAME
: the name of your new dataset in BigQuery.SPANNER_EXTERNAL_SOURCE
: the full, qualified Spanner database name, with a prefix identifying the source, in the following format: google-cloudspanner://[DATABASE_ROLE@]/projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE
. For example: google-cloudspanner://admin@/projects/my_project/instances/my_instance/databases/my_database
or google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database
.LOCATION
: the location of your new dataset in BigQuery, for example, us-central1
. After you create a dataset, you can't change its location.CONNECTION_NAME
: the name of your Cloud resource connection.Click play_circle Run.
For more information about how to run queries, see Run an interactive query.
bqIn a command-line environment, create an external dataset by using the bq mk
command:
bq --location=LOCATION mk --dataset \ --external_source SPANNER_EXTERNAL_SOURCE \ DATASET_NAME
Alternatively, create with a connection:
bq --location=LOCATION mk --dataset \ --external_source SPANNER_EXTERNAL_SOURCE \ --connection_id PROJECT_ID.LOCATION.CONNECTION_NAME \ DATASET_NAME
Replace the following:
LOCATION
: the location of your new dataset in BigQuery—for example, us-central1
. After you create a dataset, you can't change its location. You can set a default location value by using the .bigqueryrc
file.SPANNER_EXTERNAL_SOURCE
: the full, qualified Spanner database name, with a prefix identifying the source, in the following format: google-cloudspanner://[DATABASE_ROLE@]/projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE
. For example: google-cloudspanner://admin@/projects/my_project/instances/my_instance/databases/my_database
or google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database
.DATASET_NAME
: the name of your new dataset in BigQuery. To create a dataset in a project other than your default project, add the project ID to the dataset name in the following format: PROJECT_ID
:DATASET_NAME
.CONNECTION_NAME
: the name of your Cloud resource connection.Use the google_bigquery_dataset
resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
The following example creates a Spanner external dataset:
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud ShellSet the default Google Cloud project where you want to apply your Terraform configurations.
You only need to run this command once per project, and you can run it in any directory.
export GOOGLE_CLOUD_PROJECT=PROJECT_ID
Environment variables are overridden if you set explicit values in the Terraform configuration file.
Each Terraform configuration file must have its own directory (also called a root module).
.tf
extension—for example main.tf
. In this tutorial, the file is referred to as main.tf
.
mkdir DIRECTORY && cd DIRECTORY && touch main.tf
If you are following a tutorial, you can copy the sample code in each section or step.
Copy the sample code into the newly created main.tf
.
Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
terraform init
Optionally, to use the latest Google provider version, include the -upgrade
option:
terraform init -upgrade
terraform plan
Make corrections to the configuration as necessary.
yes
at the prompt:
terraform apply
Wait until Terraform displays the "Apply complete!" message.
Call the datasets.insert
method with a defined dataset resource and externalDatasetReference
field for your Spanner database.
Note that names of the tables in the external datasets are case insensitive.
When you create the external datasets with a CLOUD_RESOURCE
connection, you need to have the bigquery.connections.delegate
permission (available from the BigQuery Connection Admin role) on the connection that is used by the external datasets.
Spanner external datasets support end-user credentials (EUC). That means that access to the Spanner tables from external datasets is controlled by Spanner. Users can query these tables only if they have access granted in Spanner.
Spanner external datasets also support access delegation. Access delegation decouples access to the Spanner tables from external datasets and the direct access to the underlying Spanner tables. A Cloud resource connection associated with a service account is used to connect to the Spanner. Users can query these Spanner tables from external datasets even if they don't have access granted in Spanner.
List tables in an external datasetTo list the tables that are available for query in your external dataset, see Listing datasets.
Get table informationTo get information on the tables in your external dataset, such as schema details, see Get table information.
Query Spanner dataQuerying tables in external datasets is the same as querying tables in any other BigQuery dataset. However, data modification operations (DML) aren't supported.
Queries against tables in Spanner external datasets use Data Boost by default and it cannot be changed. Because of that you need additional permissions to run such queries.
Create a view in an external datasetYou can't create a view in a external dataset. However, you can create a view in a standard dataset that's based on a table in an external dataset. For more information, see Create views.
Delete a external datasetDeleting a external dataset is the same as deleting any other BigQuery dataset. Deleting external datasets does not impact tables in the Spanner database. For more information, see Delete datasets.
LimitationsINFORMATION_SCHEMA
views aren't supported.Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-07 UTC.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-07 UTC."],[[["External datasets in BigQuery allow querying of transactional data in Spanner databases using GoogleSQL without data migration, with updates in Spanner automatically reflecting in BigQuery."],["Creating an external dataset requires the `bigquery.datasets.create` permission, typically granted through the BigQuery User IAM role, and can be done via the Google Cloud console, SQL, command-line tool, or Terraform."],["You can list tables and get schema details for tables within external datasets, but direct modifications, additions, or deletions of data within these datasets are not possible, you can however query the data."],["Access to Spanner tables via external datasets is controlled by Spanner's end-user credentials, meaning users must have proper permissions granted in Spanner to query these tables."],["Several limitations apply to Spanner external datasets, including no support for data modification operations, named schemas, certain data types, metadata caching, or creation of new objects."]]],[]]
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4