
Showing content from https://developers.google.com/bigquery/docs/resource-hierarchy below:

Organizing BigQuery resources | Google Cloud


Organizing BigQuery resources

Like other Google Cloud services, BigQuery resources are organized in a hierarchy. You can use this hierarchy to manage aspects of your BigQuery workloads such as permissions, quotas, slot reservations, and billing.

Resource hierarchy

BigQuery inherits the Google Cloud resource hierarchy and adds a grouping mechanism of its own, called datasets. This section describes the elements of this hierarchy.

Datasets

Datasets are logical containers that are used to organize and control access to your BigQuery resources. Datasets are similar to schemas in other database systems.

Most BigQuery resources that you create (including tables, views, functions, and procedures) are created inside a dataset. Connections and jobs are exceptions; these are associated with projects rather than datasets.

A dataset has a location. When you create a table, the table data is stored in the location of the dataset. Before you create tables for production data, think about your location requirements. You cannot change the location of a dataset after it is created.
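The relationship between a dataset, its tables, and the storage location can be sketched in plain Python. This is only an illustrative model, not the BigQuery client library: the `Dataset` and `Table` classes and the sample IDs are hypothetical. The dataset is modeled as a frozen dataclass to mirror the rule that its location cannot change after creation.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Dataset:
    """Toy model of a dataset (hypothetical class, not a BigQuery API type)."""
    project_id: str
    dataset_id: str
    location: str  # fixed at creation; frozen=True blocks reassignment


@dataclass(frozen=True)
class Table:
    """Toy model of a table that is created inside a dataset."""
    dataset: Dataset
    table_id: str

    @property
    def location(self) -> str:
        # Table data is stored in the location of the containing dataset.
        return self.dataset.location

    @property
    def full_id(self) -> str:
        # project.dataset.table form used to reference a table.
        return f"{self.dataset.project_id}.{self.dataset.dataset_id}.{self.table_id}"


# Hypothetical sample values.
sales = Dataset(project_id="example-project", dataset_id="sales", location="EU")
orders = Table(dataset=sales, table_id="orders")
print(orders.full_id, "is stored in", orders.location)

try:
    sales.location = "US"  # attempting to move the dataset
except FrozenInstanceError:
    print("The location of a dataset cannot be changed after creation.")
```

In this model, moving data to another location means creating a new dataset there, which is why the location is worth deciding before production tables are created.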

Projects

Every dataset is associated with a project. To use Google Cloud, you must create at least one project. Projects form the basis for creating, enabling, and using all Google Cloud services. For more information, see Resource hierarchy. A project can hold multiple datasets, and datasets with different locations can exist in the same project.

When you perform operations on your BigQuery data, such as running a query or ingesting data into a table, you create a job. A job is always associated with a project, but it doesn't have to run in the same project that contains the data. In fact, a job might reference tables from datasets in multiple projects. A query job, load job, or export job always runs in the same location as the tables that it references.
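As a rough illustration of the location rule for jobs, the following sketch derives a job's location from the tables it references. The function name, the table IDs, and the choice to raise an error for mixed locations are assumptions made for this example; it is a simplified model, not how BigQuery itself resolves locations.

```python
def job_location(referenced_tables: dict[str, str]) -> str:
    """Return the location a job would run in.

    referenced_tables maps a table ID (project.dataset.table) to the
    location of the dataset that holds it. The tables may belong to
    different projects, but in this simplified model they must all
    share one location.
    """
    locations = set(referenced_tables.values())
    if len(locations) != 1:
        raise ValueError(
            f"expected exactly one location, got {sorted(locations)}"
        )
    return locations.pop()


# Hypothetical tables in two different projects, both stored in the EU.
tables = {
    "project-a.sales.orders": "EU",
    "project-b.crm.customers": "EU",
}
print(job_location(tables))
```

The project that the job is associated with does not appear in the function at all, which reflects the point above: the job's project and the projects that hold the data are independent of each other.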

Each project has a Cloud Billing account attached to it. The costs accrued to a project are billed to that account. If you use on-demand pricing, your queries are billed to the project that runs the query. If you use capacity-based pricing, your slot reservations are billed to the administration project used to purchase the slots. Storage is charged to the project where the dataset resides.
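The billing rules above can be summarized as a small lookup. This is a sketch for reasoning about cost attribution only; the `Charge` enum, the `billed_project` function, and the project names are hypothetical and are not part of any Google Cloud API.

```python
from enum import Enum


class Charge(Enum):
    ON_DEMAND_QUERY = "on-demand query"
    SLOT_RESERVATION = "slot reservation"
    STORAGE = "storage"


def billed_project(charge: Charge, *, job_project: str,
                   admin_project: str, dataset_project: str) -> str:
    """Return the project whose billing account receives the charge."""
    if charge is Charge.ON_DEMAND_QUERY:
        # On-demand queries are billed to the project that runs the query.
        return job_project
    if charge is Charge.SLOT_RESERVATION:
        # Reservations are billed to the administration project
        # used to purchase the slots.
        return admin_project
    if charge is Charge.STORAGE:
        # Storage is charged to the project where the dataset resides.
        return dataset_project
    raise ValueError(f"unknown charge type: {charge!r}")


# Hypothetical setup: a department project queries data that is
# stored in a separate project.
projects = dict(
    job_project="dept-marketing",
    admin_project="slots-admin",
    dataset_project="central-storage",
)
for charge in Charge:
    print(f"{charge.value}: billed to {billed_project(charge, **projects)}")
```

With this setup, the department that runs a query pays for it under on-demand pricing, while the cost of storing the data stays with the project that owns the dataset.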

Folders

Folders are an additional grouping mechanism above projects. Projects and folders inside a folder automatically inherit the access policies of their parent folder. Folders can be used to model different legal entities, departments, and teams within a company.
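Policy inheritance can be pictured as walking from a resource up to the root and collecting the access bindings found along the way. The sketch below assumes a simplified model in which a policy is a set of (member, role) pairs; the `Node` class, the resource names, and the group addresses are hypothetical, and real IAM policies carry more structure than this.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A folder or project in a toy resource hierarchy."""
    name: str
    parent: Optional["Node"] = None
    bindings: set = field(default_factory=set)  # (member, role) pairs


def effective_bindings(node: Node) -> set:
    """Collect the bindings set on the node and on every ancestor."""
    result = set()
    current: Optional[Node] = node
    while current is not None:
        result |= current.bindings
        current = current.parent
    return result


# Hypothetical hierarchy: a department folder that contains one project.
finance = Node(
    "folders/finance",
    bindings={("group:finance-analysts@example.com", "roles/bigquery.dataViewer")},
)
reporting = Node(
    "projects/finance-reporting",
    parent=finance,
    bindings={("group:finance-analysts@example.com", "roles/bigquery.jobUser")},
)

for member, role in sorted(effective_bindings(reporting)):
    print(member, role)
```

Because the project inherits from its parent folder, a binding granted once on the folder applies to every project placed under it, which is what makes folders useful for modeling departments and teams.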

Organizations

The Organization resource represents an organization (for example, a company) and is the root node in the Google Cloud resource hierarchy.

You don't need an Organization resource to get started using BigQuery, but we recommend creating one. Using an Organization resource allows administrators to centrally control your BigQuery resources, rather than individual users controlling the resources they create.

The following diagram shows an example of the resource hierarchy. In this example, the organization has a project inside a folder. The project is associated with a billing account, and it contains three datasets.

Considerations

When choosing how to organize your BigQuery resources, consider the following points:

Patterns

This section presents two common patterns for organizing BigQuery resources.

There are advantages and tradeoffs to each approach. Many organizations combine elements of both patterns.

Central data lake, department data marts

In this pattern, you create a unified storage project to hold your organization's raw data. Your data ingestion pipeline can also run in this project. The unified storage project acts as a data lake for your organization.

Each department has its own dedicated project, which it uses to query the data, save query results, and create views. These department-level projects act as data marts. They are associated with the department's billing account.

Advantages of this structure include:

When using this structure, the following permissions are typical:

For more information, see Basic roles and permissions.

Department data lakes, central data warehouse

In this pattern, each department creates and manages its own storage project, which holds that department's raw data. A central data warehouse project stores aggregations or transformations of the raw data.

Analysts can query and read the aggregated data from the data warehouse project. The data warehouse project also provides an access layer for business intelligence (BI) tools.

Advantages of this structure include:

When using this structure, the following permissions are typical:

For more information, see Basic roles and permissions.

You can also use security features such as authorized views and authorized user-defined functions (UDFs) to make aggregated data available to certain users without granting them permission to see the raw data in the data mart projects.

This project structure can result in many concurrent queries in the data warehouse project. As a result, you might hit the concurrent query limit. If you adopt this structure, consider raising this quota for the project. Also consider using capacity-based pricing, so that you can purchase a pool of slots to run the queries.

