Loading data from Datastore exports

BigQuery supports loading data from Datastore exports created using the Datastore managed import and export service. You can use the managed import and export service to export Datastore entities into a Cloud Storage bucket. You can then load the export into BigQuery as a table.
To learn how to create a Datastore export file, see Exporting and importing entities in the Datastore documentation. For information on scheduling exports, see Scheduling an export.
Note: If you intend to load a Datastore export into BigQuery, you must specify an entity filter in your export command. Data exported without specifying an entity filter cannot be loaded into BigQuery.

You can control which properties BigQuery should load by setting the projectionFields property in the API or by using the --projection_fields flag in the bq command-line tool.
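For example, the following sketch loads only two top-level properties; the property names title and author are illustrative, and the table and export file are the ones used in the examples later on this page:

bq load \
    --source_format=DATASTORE_BACKUP \
    --projection_fields="title,author" \
    mydataset.book_data \
    gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata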
If you prefer to skip the loading process, you can query the export directly by setting it up as an external data source. For more information, see External data sources.
Limitations

When you load data into BigQuery from a Datastore export, note the following restrictions:

- When you load data from Cloud Storage into a BigQuery table, the dataset that contains the table must be in the same region or multi-region as the Cloud Storage bucket.
- Only one Cloud Storage URI is allowed per load job, and you cannot use a URI wildcard.
- Data exported without specifying an entity filter cannot be loaded into BigQuery.
- Datastore export data cannot be appended to an existing table; you can only create a new table or overwrite an existing one.
- Text string values are truncated to 64 KB.

Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.
Required permissions

To load data into BigQuery, you need IAM permissions to run a load job and load data into BigQuery tables and partitions. If you are loading data from Cloud Storage, you also need IAM permissions to access the bucket that contains your data.

Permissions to load data into BigQuery

To load data into a new BigQuery table or partition or to append or overwrite an existing table or partition, you need the following IAM permissions:
- bigquery.tables.create
- bigquery.tables.updateData
- bigquery.tables.update
- bigquery.jobs.create
Each of the following predefined IAM roles includes the permissions that you need in order to load data into a BigQuery table or partition:

- roles/bigquery.dataEditor
- roles/bigquery.dataOwner
- roles/bigquery.admin (includes the bigquery.jobs.create permission)
- bigquery.user (includes the bigquery.jobs.create permission)
- bigquery.jobUser (includes the bigquery.jobs.create permission)

Additionally, if you have the bigquery.datasets.create permission, you can create and update tables using a load job in the datasets that you create.
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Permissions to load data from Cloud Storage

To get the permissions that you need to load data from a Cloud Storage bucket, ask your administrator to grant you the Storage Admin (roles/storage.admin) IAM role on the bucket. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the following permissions, which are required to load data from a Cloud Storage bucket:
- storage.buckets.get
- storage.objects.get
- storage.objects.list (required if you are using a URI wildcard)
You might also be able to get these permissions with custom roles or other predefined roles.
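As a sketch, an administrator could grant these roles with the gcloud CLI; the project ID, user email, and bucket name below are placeholders:

# Grant a BigQuery data-editing role at the project level (placeholder project and user).
gcloud projects add-iam-policy-binding my-project \
    --member="user:load-user@example.com" \
    --role="roles/bigquery.dataEditor"

# Grant the ability to run load jobs (bigquery.jobs.create).
gcloud projects add-iam-policy-binding my-project \
    --member="user:load-user@example.com" \
    --role="roles/bigquery.jobUser"

# Grant Storage Admin on the bucket that contains the export files (placeholder bucket).
gcloud storage buckets add-iam-policy-binding gs://mybucket \
    --member="user:load-user@example.com" \
    --role="roles/storage.admin"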
Loading Datastore export service data

To load data from a Datastore export metadata file:

Console

In the Google Cloud console, go to the BigQuery page, and create a table that loads from your export file in Cloud Storage. The source file you select must take the form KIND_NAME.export_metadata or export[NUM].export_metadata. For example, in default_namespace_kind_Book.export_metadata, Book is the kind name, and default_namespace_kind_Book is the filename generated by Datastore.

bq

Use the bq load command with source_format set to DATASTORE_BACKUP. Supply the --location flag and set the value to your location.
bq --location=LOCATION load \
--source_format=FORMAT \
DATASET.TABLE \
PATH_TO_SOURCE
Replace the following:

- LOCATION: your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location by using the .bigqueryrc file.
- FORMAT: DATASTORE_BACKUP.
- DATASET: the dataset that contains the table into which you're loading data.
- TABLE: the table into which you're loading data. If the table does not exist, it is created.
- PATH_TO_SOURCE: the Cloud Storage URI.

For example, the following command loads the gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata Datastore export file into a table named book_data. mybucket and mydataset were created in the US multi-region location.
bq --location=US load \
--source_format=DATASTORE_BACKUP \
mydataset.book_data \
gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata
API
Set the following properties to load Datastore export data using the API.

1. Create a load job that points to the source data in Cloud Storage.
2. Specify your location in the location property in the jobReference section of the job resource.
3. The source URIs must be fully qualified, in the format gs://[BUCKET]/[OBJECT]. The file (object) name must end in [KIND_NAME].export_metadata. Only one URI is allowed for Datastore exports, and you cannot use a wildcard.
4. Specify the data format by setting the JobConfigurationLoad.sourceFormat property to DATASTORE_BACKUP.
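Assembled from these properties, a minimal jobs.insert request body might look like the following sketch; the project ID my-project is a placeholder, and the bucket, dataset, and table names are taken from the bq example above:

{
  "jobReference": {
    "projectId": "my-project",
    "location": "US"
  },
  "configuration": {
    "load": {
      "sourceUris": [
        "gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata"
      ],
      "sourceFormat": "DATASTORE_BACKUP",
      "destinationTable": {
        "projectId": "my-project",
        "datasetId": "mydataset",
        "tableId": "book_data"
      }
    }
  }
}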
When you load Datastore export data into BigQuery, you can create a new table to store the data, or you can overwrite an existing table. You cannot append Datastore export data to an existing table.
If you attempt to append Datastore export data to an existing table, the following error results: Cannot append a datastore backup to a table that already has a schema. Try using the WRITE_TRUNCATE write disposition to replace the existing table.
To overwrite an existing table with Datastore export data:
Console

In the Google Cloud console, go to the BigQuery page, and create a table that loads from your export file in Cloud Storage, setting the write preference to overwrite the table. The source file must take the form KIND_NAME.export_metadata or export[NUM].export_metadata. For example, in default_namespace_kind_Book.export_metadata, Book is the kind name, and default_namespace_kind_Book is the filename generated by Datastore.

bq

Use the bq load command with the --replace flag and with source_format set to DATASTORE_BACKUP. Supply the --location flag and set the value to your location.
bq --location=LOCATION load \
--source_format=FORMAT \
--replace \
DATASET.TABLE \
PATH_TO_SOURCE
Replace the following:

- LOCATION: your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location by using the .bigqueryrc file.
- FORMAT: DATASTORE_BACKUP.
- DATASET: the dataset containing the table into which you're loading data.
- TABLE: the table you're overwriting.
- PATH_TO_SOURCE: the Cloud Storage URI.

For example, the following command loads the gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata Datastore export file and overwrites a table named book_data:
bq load --source_format=DATASTORE_BACKUP \
--replace \
mydataset.book_data \
gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata
API
Set the following properties to load data from the API.

1. Create a load job that points to the source data in Cloud Storage.
2. Specify your location in the location property in the jobReference section of the job resource.
3. The source URIs must be fully qualified, in the format gs://[BUCKET]/[OBJECT]. The file (object) name must end in [KIND_NAME].export_metadata. Only one URI is allowed for Datastore exports, and you cannot use a wildcard.
4. Specify the data format by setting the JobConfigurationLoad.sourceFormat property to DATASTORE_BACKUP.
5. Specify the write disposition by setting the JobConfigurationLoad.writeDisposition property to WRITE_TRUNCATE.
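For the overwrite case, the load configuration from the earlier request-body sketch gains the write disposition; the placeholders are the same as before:

{
  "configuration": {
    "load": {
      "sourceUris": [
        "gs://mybucket/20180228T1256/default_namespace/kind_Book/default_namespace_kind_Book.export_metadata"
      ],
      "sourceFormat": "DATASTORE_BACKUP",
      "writeDisposition": "WRITE_TRUNCATE",
      "destinationTable": {
        "projectId": "my-project",
        "datasetId": "mydataset",
        "tableId": "book_data"
      }
    }
  }
}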
Datastore options

To change how BigQuery parses Datastore export data, specify the following option:

- Console option: not available
- bq tool flag: --projection_fields
- BigQuery API property: projectionFields
- Description: a comma-separated list that indicates which entity properties to load into BigQuery from a Datastore export. Property names are case-sensitive and must be top-level properties. If no properties are specified, BigQuery loads all properties. If any named property isn't found in the Datastore export, an invalid error is returned in the job result. The default value is ''.
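In an API load job, the equivalent of the flag is the projectionFields list in the load configuration. A minimal fragment of the earlier request-body sketch, with illustrative property names title and author:

"load": {
  "sourceFormat": "DATASTORE_BACKUP",
  "projectionFields": ["title", "author"]
}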
Data type conversion

BigQuery converts data from each entity in Datastore export files to BigQuery data types. The following table describes the conversion between data types.
- Array: ARRAY
- Blob: BYTES
- Boolean: BOOLEAN
- Date and time: TIMESTAMP
- Embedded entity: RECORD
- Floating-point number: FLOAT
- Geographical point: RECORD [{"lat","DOUBLE"}, {"long","DOUBLE"}]
- Integer: INTEGER
- Key: RECORD
- Null: STRING
- Text string: STRING (truncated to 64 KB)

Datastore key properties
Each entity in Datastore has a unique key that contains information such as the namespace and the path. BigQuery creates a RECORD
data type for the key, with nested fields for each piece of information, as described in the following table.
- __key__.app: the Datastore app name. (STRING)
- __key__.id: the entity's ID, or null if __key__.name is set. (INTEGER)
- __key__.kind: the entity's kind. (STRING)
- __key__.name: the entity's name, or null if __key__.id is set. (STRING)
- __key__.namespace: if the Datastore app uses a custom namespace, the entity's namespace; otherwise, the default namespace is represented by an empty string. (STRING)
- __key__.path: the flattened ancestral path of the entity, consisting of the sequence of kind-identifier pairs from the root entity to the entity itself. For example: "Country", "USA", "PostalCode", 10011, "Route", 1234. (STRING)
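Once loaded, these key fields can be queried like any other nested RECORD. A minimal sketch, assuming the book_data table from the examples above:

bq query --use_legacy_sql=false \
'SELECT __key__.kind, __key__.name, __key__.path
FROM mydataset.book_data
LIMIT 10'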