Stay organized with collections Save and categorize content based on your preferences.
Spanner federated queriesAs a data analyst, you can query data in Spanner from BigQuery using federated queries.
BigQuery Spanner federation enables BigQuery to query data residing in Spanner in real-time, without copying or moving data.
You can query Spanner data in two ways:
EXTERNAL_QUERY
function.The simplest way to query Spanner tables is to create an external dataset. Once you create the external dataset, your tables from the corresponding Spanner database are visible in BigQuery and you can use them in your queries - for example in joins, unions or subqueries. However, no data is moved from Spanner to BigQuery storage.
You don't need to create a connection to query Spanner data if you create an external dataset.
UseEXTERNAL_QUERY
function
Like for other federated databases, you can also query Spanner data with an EXTERNAL_QUERY
function. This may be useful if you want to query Spanner database which uses PostgreSQL dialect or want to have more control over the connection parameters.
roles/bigquery.connectionUser
) Identity and Access Management (IAM) role. You also need to ask your administrator to grant you one of the following:
SELECT
privilege on all Spanner schema objects in your queries.roles/spanner.databaseReader
) IAM role.For information about granting IAM roles, see Manage access to projects, folders, and organizations. For information about fine-grained access control, see About fine-grained access control.
If you are a Spanner fine-grained access control user, when you run a federated query with an EXTERNAL_QUERY
function, you must use a Spanner connection that specifies a database role. Then all queries that you run with this connection use that database role.
If you use a connection that doesn't specify a database role, you must have the IAM roles indicated in Before you begin.
Query dataTo send a federated query to Spanner from a GoogleSQL query, use the EXTERNAL_QUERY
function.
Formulate your Spanner query in either GoogleSQL or PostgreSQL, depending on the specified dialect of the database.
The following example makes a federated query to a Spanner database named orders
and joins the results with a BigQuery table named mydataset.customers
.
SELECT c.customer_id, c.name, rq.first_order_date FROM mydataset.customers AS c LEFT OUTER JOIN EXTERNAL_QUERY( 'my-project.us.example-db', '''SELECT customer_id, MIN(order_date) AS first_order_date FROM orders GROUP BY customer_id''') AS rq ON rq.customer_id = c.customer_id GROUP BY c.customer_id, c.name, rq.first_order_date;Spanner Data Boost
Data Boost is a fully managed, serverless feature that provides independent compute resources for supported Spanner workloads. Data Boost lets you execute analytics queries and data exports with near-zero impact to existing workloads on the provisioned Spanner instance. Data Boost lets you run federated queries with independent compute capacity separate from your provisioned instances to avoid impacting existing workloads on Spanner. Data Boost is most impactful when you run complex ad hoc queries, or when you want to process large amounts of data without impacting the existing Spanner workload. Running federated queries with Data Boost can lead to significantly lower CPU consumption, and in some cases, lower query latency.
Before you beginTo get the permission that you need to enable access to Data Boost, ask your administrator to grant you the Cloud Spanner Database Reader with DataBoost (roles/spanner.databaseReaderWithDataBoost
) IAM role on the Spanner database. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the spanner.databases.useDataBoost
permission, which is required to enable access to Data Boost.
You might also be able to get this permission with custom roles or other predefined roles.
Enable Data BoostWhen using external datasets, Data Boost is always used and you don't have to enable it manually.
If you want to use Data Boost for your EXTERNAL_QUERY
queries, you must enable it when creating a connection that is used by your query.
Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, including a list of limitations, see Read data in parallel in the Spanner documentation.
To view the query execution plan for a Spanner query, see Understand how Spanner executes queries.
When running federated queries with external datasets, the "Read data in parallel" option is always used.
To enable parallel reads when using the EXTERNAL_QUERY
, enable it when you create the Connection.
When you run federated queries with an EXTERNAL_QUERY
function, you can assign priority (high
, medium
, or low
) to individual queries by specifying the query_execution_priority
option:
SELECT * FROM EXTERNAL_QUERY( 'my-project.us.example-db', '''SELECT customer_id, MIN(order_date) AS first_order_date FROM orders GROUP BY customer_id''', '{"query_execution_priority":"high"}');
The default priority is medium
.
Queries with priority high
will compete with transactional traffic. Queries with priority low
are best-effort, and might get preempted by background load, for example scheduled backups.
low
priority fall below queries like backup jobs which might never complete within the timeouts for BigQuery.
When running federated queries with external datasets, all queries have always medium
priority.
If you use external datasets, your Spanner tables are visible directly in BigQuery Studio and you can see their schemas.
However, you can also see the schemas without defining external datasets. You can use EXTERNAL_QUERY
function also to query information_schema views to access database metadata. The following example returns information about the columns in the table MyTable
:
SELECT * FROM EXTERNAL_QUERY( 'my-project.us.example-db', '''SELECT t.column_name, t.spanner_type, t.is_nullable FROM information_schema.columns AS t WHERE t.table_catalog = '' AND t.table_schema = '' AND t.table_name = 'MyTable' ORDER BY t.ordinal_position ''');PostgreSQL database
SELECT * from EXTERNAL_QUERY( 'my-project.us.postgresql.example-db', '''SELECT t.column_name, t.data_type, t.is_nullable FROM information_schema.columns AS t WHERE t.table_schema = 'public' and t.table_name='MyTable' ORDER BY t.ordinal_position ''');
For more information, see the following information schema references in the Spanner documentation:
PricingBigQuery supports federated queries where Spanner instances and BigQuery datasets are in different regions. These queries incur an additional Spanner data transfer charge. For more information see Spanner pricing.
You are charged for the data transfer, based on the following SKUs:
Data transfer is charged based on the BigQuery region you run the query in and the nearest Spanner region that has read-write or read-only replicas.
For BigQuery multi-region configurations (US
or EU
), data transfer costs from Spanner are determined as follows:
US
multi-region: Spanner region us-central1
EU
multi-region: Spanner region europe-west1
For example:
US
multi-region) and Spanner (us-central1
): Costs apply for data transfer within the same region.US
multi-region) and Spanner (us-west4
): Costs apply for Data transfer between regions within the same continent.This section helps you troubleshoot issues you might encounter when sending a federated query to Spanner.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-14 UTC.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-14 UTC."],[[["BigQuery allows data analysts to query data in Spanner in real-time using federated queries, without needing to copy or move the data."],["Spanner data can be queried in BigQuery by creating a Spanner external dataset or by using the `EXTERNAL_QUERY` function."],["Data Boost, a Spanner serverless feature, can be used to run federated queries, providing independent compute capacity and minimizing impact on existing Spanner workloads."],["When using the `EXTERNAL_QUERY` function, users can assign a priority to their queries (high, medium, or low), with 'medium' being the default."],["Data parallel reads are possible under certain conditions, and when using external datasets the \"Read data in parallel\" option is always used."]]],[]]
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4