Source: https://developers.google.com/bigquery/docs/datasets

Create datasets | BigQuery | Google Cloud

This document describes how to create datasets in BigQuery.

To see steps for copying a dataset, including across regions, see Copying datasets.

This document describes how to work with regular datasets that store data in BigQuery. To learn how to work with Spanner external datasets, see Create Spanner external datasets. To learn how to work with AWS Glue federated datasets, see Create AWS Glue federated datasets.

Required permissions

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

To create a dataset, you need the bigquery.datasets.create IAM permission.

Each of the following predefined IAM roles includes the permissions that you need in order to create a dataset:

  * roles/bigquery.dataEditor
  * roles/bigquery.dataOwner
  * roles/bigquery.user
  * roles/bigquery.admin

To create a dataset, select one of the following options:

Console
  1. In the Google Cloud console, go to the BigQuery page.
  2. In the Explorer panel, select the project where you want to create the dataset.
  3. Expand the more_vert View actions option and click Create dataset.
  4. On the Create dataset page:
    1. For Dataset ID, enter a unique dataset name.
    2. For Location type, choose a geographic location for the dataset. After a dataset is created, the location can't be changed.

      Note: If you choose EU or an EU-based region for the dataset location, your Core BigQuery Customer Data resides in the EU. Core BigQuery Customer Data is defined in the Service Specific Terms.

    3. Optional: Select Link to an external dataset if you're creating an external dataset.
    4. If you don't need to configure additional options such as tags and table expirations, click Create dataset. Otherwise, expand the following section to configure the additional dataset options.
    Additional options for datasets
    1. Optional: Expand the Tags section to add tags to your dataset.
    2. To apply an existing tag, do the following:
      1. Click the drop-down arrow beside Select scope and choose Current scope—Select current organization or Select current project.
      2. Alternatively, click Select scope to search for a resource or to see a list of current resources.

      3. For Key 1 and Value 1, choose the appropriate values from the lists.
    3. To manually enter a new tag, do the following:
      1. Click the drop-down arrow beside Select a scope and choose Manually enter IDs > Organization, Project, or Tags.
      2. If you're creating a tag for your project or organization, in the dialog, enter the PROJECT_ID or the ORGANIZATION_ID, and then click Save.
      3. For Key 1 and Value 1, choose the appropriate values from the lists.
      4. To add additional tags to the dataset, click Add tag and follow the previous steps.
    4. Optional: Expand the Advanced options section to configure one or more of the following options.
      1. To change the Encryption option to use your own cryptographic key with the Cloud Key Management Service, select Cloud KMS key.
      2. To use case-insensitive table names, select Enable case insensitive table names.
      3. To change the Default collation specification, choose the collation type from the list.
      4. To set an expiration for tables in the dataset, select Enable table expiration, then specify the Default maximum table age in days.

        Note: If your project is not associated with a billing account, BigQuery automatically sets the default table expiration for datasets that you create in the project. You can specify a shorter default table expiration for a dataset, but you can't specify a longer one.

      5. To set a Default rounding mode, choose the rounding mode from the list.
      6. To change the Storage billing model, choose the billing model from the list.

        When you change a dataset's billing model, it takes 24 hours for the change to take effect. After you change a dataset's storage billing model, you must wait 14 days before you can change it again.

      7. To set the dataset's time travel window, choose the window size from the list.
    5. Click Create dataset.
SQL

Use the CREATE SCHEMA statement.

To create a dataset in a project other than your default project, add the project ID to the dataset ID in the following format: PROJECT_ID.DATASET_ID.

  1. In the Google Cloud console, go to the BigQuery page.

  2. In the query editor, enter the following statement:

    CREATE SCHEMA PROJECT_ID.DATASET_ID
      OPTIONS (
        default_kms_key_name = 'KMS_KEY_NAME',
        default_partition_expiration_days = PARTITION_EXPIRATION,
        default_table_expiration_days = TABLE_EXPIRATION,
        description = 'DESCRIPTION',
        labels = [('KEY_1','VALUE_1'),('KEY_2','VALUE_2')],
        location = 'LOCATION',
        max_time_travel_hours = HOURS,
        storage_billing_model = BILLING_MODEL);

    Replace the following:

      * PROJECT_ID: your project ID
      * DATASET_ID: the ID of the dataset that you're creating
      * KMS_KEY_NAME: the name of the default Cloud KMS key used to protect newly created tables in this dataset
      * PARTITION_EXPIRATION: the default lifetime, in days, for partitions in newly created partitioned tables
      * TABLE_EXPIRATION: the default lifetime, in days, for newly created tables
      * DESCRIPTION: a description of the dataset
      * KEY_1:VALUE_1 and KEY_2:VALUE_2: labels to attach to the dataset as key-value pairs
      * LOCATION: the dataset's location; after a dataset is created, the location can't be changed
      * HOURS: the duration, in hours, of the time travel window for the dataset
      * BILLING_MODEL: the storage billing model, either LOGICAL or PHYSICAL

  3. Click play_circle Run.

For more information about how to run queries, see Run an interactive query.
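For example, the following statement creates a dataset named mydataset in the US multi-region with a description and a one-day default table expiration (the project and dataset names are illustrative):

```sql
CREATE SCHEMA `myproject.mydataset`
  OPTIONS (
    description = 'This is my dataset.',
    location = 'US',
    default_table_expiration_days = 1);
```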

bq

To create a new dataset, use the bq mk command with the --location flag. For a full list of possible parameters, see the bq mk --dataset command reference.

To create a dataset in a project other than your default project, add the project ID to the dataset name in the following format: PROJECT_ID:DATASET_ID.

bq --location=LOCATION mk \
    --dataset \
    --default_kms_key=KMS_KEY_NAME \
    --default_partition_expiration=PARTITION_EXPIRATION \
    --default_table_expiration=TABLE_EXPIRATION \
    --description="DESCRIPTION" \
    --label=KEY_1:VALUE_1 \
    --label=KEY_2:VALUE_2 \
    --add_tags=KEY_3:VALUE_3[,...] \
    --max_time_travel_hours=HOURS \
    --storage_billing_model=BILLING_MODEL \
    PROJECT_ID:DATASET_ID

Replace the following:

  * LOCATION: the dataset's location; after a dataset is created, the location can't be changed
  * KMS_KEY_NAME: the name of the default Cloud KMS key used to protect newly created tables in this dataset
  * PARTITION_EXPIRATION: the default lifetime, in seconds, for partitions in newly created partitioned tables
  * TABLE_EXPIRATION: the default lifetime, in seconds, for newly created tables
  * DESCRIPTION: a description of the dataset
  * KEY_1:VALUE_1 and KEY_2:VALUE_2: labels to attach to the dataset as key-value pairs
  * KEY_3:VALUE_3: tags to attach to the dataset as key-value pairs
  * HOURS: the duration, in hours, of the time travel window for the dataset
  * BILLING_MODEL: the storage billing model, either LOGICAL or PHYSICAL
  * PROJECT_ID: your project ID
  * DATASET_ID: the ID of the dataset that you're creating

For example, the following command creates a dataset named mydataset with data location set to US, a default table expiration of 3600 seconds (1 hour), and a description of This is my dataset. Instead of using the --dataset flag, the command uses the -d shortcut. If you omit -d and --dataset, the command defaults to creating a dataset.

bq --location=US mk -d \
    --default_table_expiration 3600 \
    --description "This is my dataset." \
    mydataset

To confirm that the dataset was created, enter the bq ls command. Also, you can create a table when you create a new dataset using the following format: bq mk -t dataset.table. For more information about creating tables, see Creating a table.

Terraform

Use the google_bigquery_dataset resource.

Note: You must enable the Cloud Resource Manager API in order to use Terraform to create BigQuery objects.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Create a dataset

The following example creates a dataset named mydataset:
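A minimal configuration might look like the following sketch (the Terraform resource name, description, and location are illustrative):

```terraform
resource "google_bigquery_dataset" "default" {
  dataset_id                  = "mydataset"
  location                    = "US"
  description                 = "This is my dataset."
  default_table_expiration_ms = 3600000 # 1 hour
}
```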

When you create a dataset using the google_bigquery_dataset resource, it automatically grants access to the dataset to all accounts that are members of project-level basic roles. If you run the terraform show command after creating the dataset, the access block for the dataset looks similar to the following:
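As a sketch, the generated access entries typically grant the project's basic roles through special groups, similar to the following (an OWNER entry for the account that created the dataset may also appear):

```terraform
access {
  role          = "OWNER"
  special_group = "projectOwners"
}
access {
  role          = "READER"
  special_group = "projectReaders"
}
access {
  role          = "WRITER"
  special_group = "projectWriters"
}
```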

To grant access to the dataset, we recommend that you use one of the google_bigquery_iam resources, as shown in the following example, unless you plan to create authorized objects, such as authorized views, within the dataset. In that case, use the google_bigquery_dataset_access resource. Refer to that documentation for examples.

Create a dataset and grant access to it

The following example creates a dataset named mydataset, then uses the google_bigquery_dataset_iam_policy resource to grant access to it.
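A sketch of this pattern follows; the role binding and member address are illustrative placeholders:

```terraform
resource "google_bigquery_dataset" "default" {
  dataset_id = "mydataset"
  location   = "US"
}

# Illustrative policy: grant one user read access to the dataset.
data "google_iam_policy" "reader" {
  binding {
    role    = "roles/bigquery.dataViewer"
    members = ["user:analyst@example.com"]
  }
}

resource "google_bigquery_dataset_iam_policy" "default" {
  dataset_id  = google_bigquery_dataset.default.dataset_id
  project     = google_bigquery_dataset.default.project
  policy_data = data.google_iam_policy.reader.policy_data
}
```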

Note: Don't use this approach if you want to use authorized objects, such as authorized views, with this dataset. In that case, use the google_bigquery_dataset_access resource. For examples, see google_bigquery_dataset_access.

Create a dataset with a customer-managed encryption key

The following example creates a dataset named mydataset, and also uses the google_kms_crypto_key and google_kms_key_ring resources to specify a Cloud Key Management Service key for the dataset. You must enable the Cloud Key Management Service API before running this example.
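A sketch of this configuration follows; the key ring and key names are illustrative:

```terraform
resource "google_kms_key_ring" "default" {
  name     = "example-keyring"
  location = "us"
}

resource "google_kms_crypto_key" "default" {
  name     = "example-key"
  key_ring = google_kms_key_ring.default.id
}

resource "google_bigquery_dataset" "default" {
  dataset_id = "mydataset"
  location   = "US"

  # Use the customer-managed key for tables created in this dataset.
  default_encryption_configuration {
    kms_key_name = google_kms_crypto_key.default.id
  }
}
```

Note that the BigQuery service account for your project must also be granted the roles/cloudkms.cryptoKeyEncrypterDecrypter role on the key before BigQuery can use it.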

To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.

Prepare Cloud Shell
  1. Launch Cloud Shell.
  2. Set the default Google Cloud project where you want to apply your Terraform configurations.

    You only need to run this command once per project, and you can run it in any directory.

    export GOOGLE_CLOUD_PROJECT=PROJECT_ID

    Environment variables are overridden if you set explicit values in the Terraform configuration file.

Prepare the directory

Each Terraform configuration file must have its own directory (also called a root module).

  1. In Cloud Shell, create a directory and a new file within that directory. The filename must have the .tf extension—for example main.tf. In this tutorial, the file is referred to as main.tf.
    mkdir DIRECTORY && cd DIRECTORY && touch main.tf
  2. If you are following a tutorial, you can copy the sample code in each section or step.

    Copy the sample code into the newly created main.tf.

    Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.

  3. Review and modify the sample parameters to apply to your environment.
  4. Save your changes.
  5. Initialize Terraform. You only need to do this once per directory.
    terraform init

    Optionally, to use the latest Google provider version, include the -upgrade option:

    terraform init -upgrade
Apply the changes
  1. Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:
    terraform plan

    Make corrections to the configuration as necessary.

  2. Apply the Terraform configuration by running the following command and entering yes at the prompt:
    terraform apply

    Wait until Terraform displays the "Apply complete!" message.

  3. Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
Note: Terraform samples typically assume that the required APIs are enabled in your Google Cloud project.

API

Call the datasets.insert method with a defined dataset resource.
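As a sketch, the dataset resource in the request body is a JSON object whose only required field is datasetReference; the project ID, dataset ID, and option values below are illustrative:

```python
import json

# Illustrative dataset resource for a datasets.insert request body.
dataset_resource = {
    "datasetReference": {
        "projectId": "my-project",     # placeholder project ID
        "datasetId": "mydataset",      # placeholder dataset ID
    },
    "location": "US",
    # int64 fields are encoded as strings in the JSON API.
    "defaultTableExpirationMs": str(3600 * 1000),  # 1 hour
    "description": "This is my dataset.",
}

body = json.dumps(dataset_resource)
print(body)
```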

C#

Before trying this sample, follow the C# setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery C# API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

PHP

Before trying this sample, follow the PHP setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery PHP API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Ruby

Before trying this sample, follow the Ruby setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Ruby API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

When you create a dataset in BigQuery, the dataset name must be unique for each project. The dataset name can contain the following:

  * Up to 1,024 characters.
  * Letters (uppercase or lowercase), numbers, and underscores.

Dataset names are case-sensitive by default. mydataset and MyDataset can coexist in the same project, unless one of them has case-sensitivity turned off. For examples, see Creating a case-insensitive dataset and Resource: Dataset.
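The naming rules above can be checked client-side before you attempt to create a dataset. The following sketch validates a name against the documented character set and length limit (the helper function name is our own):

```python
import re

# Per the rules above: letters, numbers, and underscores only,
# from 1 up to 1,024 characters.
_DATASET_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")

def is_valid_dataset_name(name: str) -> bool:
    """Return True if `name` is a legal BigQuery dataset name."""
    return _DATASET_NAME_RE.fullmatch(name) is not None

print(is_valid_dataset_name("mydataset"))   # True
print(is_valid_dataset_name("my-dataset"))  # False: hyphens are not allowed
print(is_valid_dataset_name("_hidden_ds"))  # True: a leading underscore makes it hidden
```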

A hidden dataset is a dataset whose name begins with an underscore. You can query tables and views in hidden datasets the same way you would in any other dataset. Hidden datasets have some restrictions; for example, they are hidden from the Explorer panel in the Google Cloud console.

