A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://developers.google.com/bigquery/docs/managing-partitioned-tables below:

Managing partitioned tables | BigQuery

Stay organized with collections Save and categorize content based on your preferences.

Managing partitioned tables

This document describes how to manage partitioned tables in BigQuery.

Note: The information in Managing tables also applies to partitioned tables.

You can get information about partitioned tables in the following ways:

Getting partition metadata using INFORMATION_SCHEMA views

When you query the INFORMATION_SCHEMA.PARTITIONS view, the query results contain one row for each partition. For example, the following query lists all of the table partitions in the dataset named mydataset:

SELECT table_name, partition_id, total_rows
FROM `mydataset.INFORMATION_SCHEMA.PARTITIONS`
WHERE partition_id IS NOT NULL

For more information, see INFORMATION_SCHEMA.PARTITIONS.

Getting partition metadata using meta-tables

In legacy SQL, you can get metadata about table partitions by querying the __PARTITIONS_SUMMARY__ meta-table. Meta-tables are read-only tables that contain metadata.

Query the __PARTITIONS_SUMMARY__ meta-table as follows:

#legacySQL
SELECT
  column
FROM
  [dataset.table$__PARTITIONS_SUMMARY__]
Note: GoogleSQL does not support the partition decorator separator ($), so you cannot query __PARTITIONS_SUMMARY__ in GoogleSQL.

The __PARTITIONS_SUMMARY__ meta-table has the following columns:

Value Description project_id Name of the project. dataset_id Name of the dataset. table_id Name of the time-partitioned table. partition_id Name (date) of the partition. creation_time The time at which the partition was created, in milliseconds since January 1, 1970 UTC. last_modified_time The time at which the partition was last modified, in milliseconds since January 1, 1970 UTC.

At a minimum, to run a query job that uses the __PARTITIONS_SUMMARY__ meta- table, you must be granted bigquery.jobs.create permissions and bigquery.tables.getData permissions.

For more information on IAM roles in BigQuery, see Access control.

Set the partition expiration

When you create a table partitioned by ingestion time or time-unit column, you can specify a partition expiration. This setting specifies how long BigQuery keeps the data in each partition. The setting applies to all partitions in the table, but is calculated independently for each partition based on the partition time.

A partition's expiration time is calculated from the partition boundary in UTC. For example, with daily partitioning, the partition boundary is at midnight (00:00:00 UTC). If the table's partition expiration is 6 hours, then each partition expires at 06:00:00 UTC the following day. When a partition expires, BigQuery deletes the data in that partition.

You can also specify a default partition expiration at the dataset level. If you set the partition expiration on a table, then the value overrides the default partition expiration. If you don't specify any partition expiration (on the table or dataset), then partitions never expire.

Note: Integer-range partitioned tables don't support partition expiration times.

If you set a table expiration, that value takes precedence over the partition expiration. For example, if the table expiration is set to 5 days, and the partition expiration is set to 7 days, then the table and all partitions in it are deleted after 5 days.

At any point after a table is created, you can update the table's partition expiration. The new setting applies to all partitions in that table, regardless of when they were created. Existing partitions expire immediately if they are older than the new expiration time. Similarly, if data is being copied or inserted to a table partitioned by time-unit column, any partitions older than partition expiration configured for the table are expired immediately.

When a partition expires, BigQuery deletes that partition. The partition data is retained in accordance with time travel and fail-safe policies, and can be charged for, depending on your billing model. Until then, the partition counts for purposes of table quotas. To delete a partition immediately, you can manually delete the partition.

Update the partition expiration

To update a partitioned table's partition expiration:

Console

You cannot update the partition expiration in the Google Cloud console.

SQL

Use the ALTER TABLE SET OPTIONS statement. The following example updates the expiration to 5 days. To remove the partition expiration for a table, set partition_expiration_days to NULL.

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    ALTER TABLE mydataset.mytable
      SET OPTIONS (
        -- Sets partition expiration to 5 days
        partition_expiration_days = 5);
  3. Click play_circle Run.

For more information about how to run queries, see Run an interactive query.

bq

Issue the bq update command with the --time_partitioning_expiration flag. If you are updating a partitioned table in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update \
--time_partitioning_expiration integer_in_seconds \
--time_partitioning_type unit_time \
project_id:dataset.table

Where:

Examples:

Enter the following command to update the expiration time of partitions in mydataset.mytable to 5 days (432000 seconds). mydataset is in your default project.

bq update --time_partitioning_expiration 432000 mydataset.mytable

Enter the following command to update the expiration time of partitions in mydataset.mytable to 5 days (432000 seconds). mydataset is in myotherproject, not your default project.

bq update \
--time_partitioning_expiration 432000 \
myotherproject:mydataset.mytable
API

Call the tables.patch method and use the timePartitioning.expirationMs property to update the partition expiration in milliseconds. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.

Set partition filter requirements

When you create a partitioned table, you can require that all queries on the table must include a predicate filter (a WHERE clause) that filters on the partitioning column. This setting can improve performance and reduce costs, because BigQuery can use the filter to prune partitions that don't match the predicate. This requirement also applies to queries on views and materialized views that reference the partitioned table.

For information on adding the Require partition filter option when you create a partitioned table, see Creating partitioned tables.

If a partitioned table has the Require partition filter setting, then every query on that table must include at least one predicate that only references the partitioning column. Queries without such a predicate return the following error:

Cannot query over table 'project_id.dataset.table' without a filter that can be used for partition elimination.

For more information, see Querying partitioned tables.

Update the partition filter requirement

If you don't enable the Require partition filter option when you create the partitioned table, you can update the table to add the option.

Console

You cannot use the Google Cloud console to require partition filters after a partitioned table is created.

SQL

Use the ALTER TABLE SET OPTIONS statement to update the partition filter requirement. The following example updates the requirement to true:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    ALTER TABLE mydataset.mypartitionedtable
      SET OPTIONS (
        require_partition_filter = true);
  3. Click play_circle Run.

For more information about how to run queries, see Run an interactive query.

bq

To update a partitioned table to require partition filters by using the bq command-line tool, enter the bq update command and supply the --require_partition_filter flag.

To update a partitioned table in a project other than your default project, add the project ID to the dataset in the following format: project_id:dataset.

For example:

To update mypartitionedtable in mydataset in your default project, enter:

bq update --require_partition_filter mydataset.mytable

To update mypartitionedtable in mydataset in myotherproject, enter:

bq update --require_partition_filter myotherproject:mydataset.mytable
API

Call the tables.patch method and set the requirePartitionFilter property to true to require partition filters. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Copy a partitioned table

The process for copying a partitioned table is the same as the process for copying a standard table. For more information, see Copying a table.

When you copy a partitioned table, note the following:

When you copy to an existing table, you can specify whether to append or overwrite the destination table.

Copy individual partitions

You can copy the data from one or more partitions to another table.

Note: The required permissions are the same as for copying a table. Console

Copying partitions is not supported by the Google Cloud console.

bq

To copy a partition, use the bq command-line tool's bq cp (copy) command with a partition decorator ($date) such as $20160201.

Optional flags can be used to control the write disposition of the destination partition:

The cp command does not support the --time_partitioning_field or --time_partitioning_type flags. You cannot use a copy job to convert an ingestion-time partitioned table into a partitioned table.

--destination_kms_key is not demonstrated here. See Protecting data with Cloud KMS keys for more information.

If the source or destination dataset is in a project other than your default project, add the project ID to the dataset names in the following format: project_id:dataset.

(Optional) Supply the --location flag and set the value to your location.

bq --location=location cp \
-a -f -n \
project_id:dataset.source_table$source_partition \
project_id:dataset.destination_table$destination_partition

Where:

Examples:

Note: The partition decorator separator ($) is a special variable in the unix shell. You might have to escape the decorator when you use the command- line tool. The following examples escape the partition decorator: mydataset.table\$20160519, 'mydataset.table$20160519'.

Copying a partition to a new table

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to a new table — mydataset.mytable2. mydataset is in your default project.

bq cp -a 'mydataset.mytable$20180130' mydataset.mytable2

Copying a partition to a non-partitioned table

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to a non-partitioned table — mydataset2.mytable2. The -a shortcut is used to append the partition's data to the non-partitioned destination table. Both datasets are in your default project.

bq cp -a 'mydataset.mytable$20180130' mydataset2.mytable2

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to a non-partitioned table — mydataset2.mytable2. The -f shortcut is used to overwrite the non-partitioned destination table without prompting.

bq --location=US cp -f 'mydataset.mytable$20180130' mydataset2.mytable2

Copying a partition to another partitioned table

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to another partitioned table — mydataset2.mytable2. The -a shortcut is used to append the partition's data to the destination table. Since no partition decorator is specified on the destination table, the source partition key is preserved and the data is copied to the January 30, 2018 partition in the destination table. You can also specify a partition decorator on the destination table to copy data to a specific partition. mydataset is in your default project. mydataset2 is in myotherproject, not your default project.

bq --location=US cp \
-a \
'mydataset.mytable$20180130' \
myotherproject:mydataset2.mytable2

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to the January 30, 2018 partition of another partitioned table — mydataset2.mytable2. The -f shortcut is used to overwrite the January 30, 2018 partition in the destination table without prompting. If no partition decorator is used, all data in the destination table is overwritten. mydataset is in your default project. mydataset2 is in myotherproject, not your default project.

bq cp \
-f \
'mydataset.mytable$20180130' \
'myotherproject:mydataset2.mytable2$20180130'

Enter the following command to copy the January 30, 2018 partition from mydataset.mytable to another partitioned table — mydataset2.mytable2. mydataset is in your default project. mydataset2 is in myotherproject, not your default project. If there is data in the destination table, the default behavior is to prompt you to overwrite.

bq cp \
'mydataset.mytable$20180130' \
myotherproject:mydataset2.mytable2
Note: The bq cp command with a partition decorator works on column-based partitions in which the source partition and destination partition are identical. The bq cp command also works on ingestion-time based partitions where the partition represents either the same time unit or a coarser time unit that contains the source partition. For example, if $20180130 is the source partition decorator, valid destination partition decorators include $20180130, $201801, and $2018. To copy a column-based partition to a completely different partition decorator or to a time-unit partition with finer granularity, use an INSERT SELECT statement.

To copy multiple partitions, specify them as a comma-separated list:

bq cp \
'mydataset.mytable$20180130,mydataset.mytable$20180131' \
myotherproject:mydataset.mytable2
API

Call the jobs.insert method, and configure a copy job. (Optional) Specify your region in the location property in the jobReference section of the job resource.

Specify the following properties in your job configuration:

To copy multiple partitions, enter the source partitions (including the dataset and table names) in the sourceTables property.

Delete a partition

You can delete an individual partition from a partitioned table. However, you can't delete the special __NULL__ or __UNPARTITIONED__ partitions.

You can only delete one partition at a time.

Note: The required permissions are the same as for deleting a table.

You can delete a partition by specifying the partition's decorator unless it is one of the two special partitions.

To delete a partition in a partitioned table:

Console

Deleting partitions is not supported by the Google Cloud console.

SQL

If a qualifying DELETE statement covers all rows in a partition, BigQuery removes the entire partition. This removal is done without scanning bytes or consuming slots. The following example of a DELETE statement covers the entire partition of a filter on the _PARTITIONDATE pseudocolumn:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    DELETE mydataset.mytable
    WHERE _PARTITIONDATE IN ('2076-10-07', '2076-03-06');
  3. Click play_circle Run.

For more information about how to run queries, see Run an interactive query.

bq

Use the bq rm command with the --table flag (or -t shortcut) and specify the partition decorator to delete a specific partition.

bq rm --table project_id:dataset.table$partition

Where:

Partition decorators have the following format, depending on the type of partitioning:

The bq command-line tool prompts you to confirm the action. To skip the confirmation, use the --force flag (or -f shortcut).

Note: The partition decorator separator ($) is a special variable in the unix shell. You might have to escape the decorator when you use the command- line tool. The following examples escape the partition decorator: mydataset.table\$20160519, 'mydataset.table$20160519'.

Examples:

Delete the partition for March 1, 2016 in a daily partitioned table named mydataset.mytable in your default project:

bq rm --table 'mydataset.mytable$20160301'

Delete the partition for March, 2016 in a monthly partitioned table:

bq rm --table 'mydataset.mytable$201603'

Delete the integer range starting at 20 in an integer range partitioned table named mydataset.mytable:

bq rm --table 'mydataset.mytable$20'
API

Call the tables.delete method and specify the table and partition decorator using the tableId parameter.

Partitioned table security

Access control for partitioned tables is the same as access control for standard tables. For more information, see Introduction to table access controls.


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4