Monitor health, resource utilization, and jobs
As a BigQuery administrator, you can monitor your organization's health, slot usage, and job performance over time with operational health and resource utilization charts. BigQuery provides configurable charts to help you with the following:
Monitor operational health of BigQuery. BigQuery real-time operational health monitoring is a centralized monitoring system that lets you observe BigQuery usage across the organization in multiple locations.
View BigQuery resource utilization. Use historical data to perform root-cause analysis, plan capacity, and diagnose performance changes.
To get the permissions that you need to view all data in the operational health and resource utilization charts, ask your administrator to grant you the following IAM roles on your organization:
roles/bigquery.resourceViewer
roles/bigquery.metadataViewer
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to view all data in the operational health and resource utilization charts. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to view all data in the operational health and resource utilization charts:
bigquery.jobs.listExecutionMetadata or bigquery.jobs.listAll on the organization
bigquery.reservationAssignments.list on the administration project used to create the reservations
bigquery.capacityCommitments.list on the administration project used to create the reservations
bigquery.jobs.listExecutionMetadata or bigquery.jobs.listAll on the organization
bigquery.tables.get or bigquery.tables.list on the organization
bigquery.reservations.list on the administration project used to create the reservations
bigquery.reservationAssignments.list on the administration project used to create the reservations
bigquery.jobs.listAll on the project
You might also be able to get these permissions with custom roles or other predefined roles.
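As a rough illustration of how the permission list above could be checked, the following Python sketch compares a set of granted permissions against the required ones. The `granted` mapping and the short scope labels are hypothetical; in practice you would populate them from your own IAM tooling. Entries that appear twice in the list above are included once.

```python
# Each entry: (alternative permissions, scope). Holding any one of the
# alternatives on that scope satisfies the entry. Scope labels are
# shorthand chosen for this sketch, not official identifiers.
REQUIRED = [
    (("bigquery.jobs.listExecutionMetadata", "bigquery.jobs.listAll"), "organization"),
    (("bigquery.reservationAssignments.list",), "administration project"),
    (("bigquery.capacityCommitments.list",), "administration project"),
    (("bigquery.tables.get", "bigquery.tables.list"), "organization"),
    (("bigquery.reservations.list",), "administration project"),
    (("bigquery.jobs.listAll",), "project"),
]


def missing_permissions(granted):
    """Return the required entries that `granted` does not satisfy.

    `granted` maps a scope label to the set of permissions held on it
    (a hypothetical structure used only for this example).
    """
    missing = []
    for alternatives, scope in REQUIRED:
        held = granted.get(scope, set())
        if not any(permission in held for permission in alternatives):
            missing.append((" or ".join(alternatives), scope))
    return missing
```

An empty result means every listed requirement is covered by at least one of its alternatives.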
Note: Viewing all data in the operational health and resource utilization charts is only available if you have defined Google Cloud organizations.
Monitor operational health across an organization
The operational health dashboard displays key metrics for your organization and its reservations in all the locations where you have reservations. You can use this dashboard to monitor the following metrics:
To view information about the operational health of your organization, follow these steps:
In the Google Cloud console, go to the BigQuery Monitoring page.
Select the administration project that you used to purchase slots and create reservations.
In the Monitoring page, go to the Operational health tab to view a summary of your organization's key metrics for all locations and reservations.
Optional: To view real-time metrics, where queries run on fresh data every five minutes, click the Live data toggle. By default, this setting is turned off and the maximum staleness of the data is about an hour.
To filter the metrics, configure the following fields:
Optional: To view more details about operational health with a resource utilization chart or jobs explorer, click Explore more.
BigQuery gathers the metrics by querying the following INFORMATION_SCHEMA views:
INFORMATION_SCHEMA.JOBS
INFORMATION_SCHEMA.JOBS_TIMELINE
INFORMATION_SCHEMA.RESERVATIONS
INFORMATION_SCHEMA.TABLE_STORAGE
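If you want to inspect similar data yourself, you can query these views directly. The following Python sketch only builds a SQL string against INFORMATION_SCHEMA.JOBS for the last 30 minutes; it is not the query the dashboard runs, and the function name, column selection, and ordering are choices made for this example. Run the resulting string with whatever BigQuery client you already use.

```python
def recent_jobs_query(region: str, minutes: int = 30) -> str:
    """Build a query listing jobs created in the last `minutes` minutes.

    `region` is a BigQuery location such as "us" or "europe-west1".
    """
    # Reject anything that is not a plain location name, because the
    # value is interpolated into the SQL text.
    if not region.replace("-", "").isalnum():
        raise ValueError(f"unexpected region name: {region!r}")
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (
        "SELECT job_id, state, total_slot_ms, error_result.reason AS error_reason\n"
        f"FROM `region-{region}`.INFORMATION_SCHEMA.JOBS\n"
        "WHERE creation_time >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {int(minutes)} MINUTE)\n"
        "ORDER BY total_slot_ms DESC"
    )


print(recent_jobs_query("us"))
```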
The Operational health tab displays the following summary and detailed views.
Summary view
The summary view shows you the health of your organization's subsystems, including reservations and regions, in the last 30 minutes.
To view the summary view, do the following:
Detailed view
The detailed view shows timeline charts of different metrics at a location or a reservation level.
To view the detailed view, do the following:
You can refine the data displayed in the detailed view using the following optional filters:
BigQuery provides the following table and chart options, which let you explore your operational health metrics in detail.
Summary table
The summary tables display the following metrics:
There are two summary tables presented in the view: Location summary and Reservation summary. Each table row represents usage for one location or one reservation.
The tables display metrics for the 30 minutes prior to the last update. If Live data is enabled, the queried data refreshes every five minutes. If Live data is disabled, then the maximum data staleness is approximately one hour.
A table cell is color coded if the metric is greater than the predefined threshold and if there is an increased number of performance insights for repeated jobs:
All thresholds are predefined and can't be customized. You can check a threshold by clicking info Info.
Note: The Total storage column doesn't support color annotation.
Filter data
You can filter data in charts based on the following values:
This chart shows the top ten jobs with active resources that are sorted in descending order. In the drop-down menu, you can select a sorting option based on slot usage or job duration. The job ID and relevant resource usage number are presented in the bar chart. Select Explore more or the job ID in the top active queries chart to view more details in the jobs explorer. To learn more about the execution details and diagnose performance issues for your BigQuery jobs, see Get query performance insights.
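The ranking this chart describes can be approximated in a few lines of Python. This is a minimal sketch, assuming each job is a dictionary with hypothetical keys `job_id`, `slot_usage`, and `duration_s`; the console's own field names and tie-breaking rules are not documented here.

```python
def top_active_jobs(jobs, sort_by="slot_usage", limit=10):
    """Return up to `limit` jobs sorted in descending order by `sort_by`.

    `sort_by` is either "slot_usage" or "duration_s" (hypothetical keys
    chosen for this example).
    """
    if sort_by not in ("slot_usage", "duration_s"):
        raise ValueError(f"unsupported sort key: {sort_by!r}")
    return sorted(jobs, key=lambda job: job[sort_by], reverse=True)[:limit]
```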
Error donut chart
This chart shows the proportion of the top causes of failure in the selected time period. In the summary view, it defaults to cover the last 30 minutes. In the detailed view, the time range selector controls its coverage. You can group the errors by type, owner project, or reservation. The count of failed jobs is presented in the donut chart.
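To make the grouping concrete, the following Python sketch computes failure counts and proportions for one grouping dimension. The record keys (`error_type`, `owner_project`, `reservation`) are hypothetical names for the three grouping options mentioned above.

```python
from collections import Counter


def failure_proportions(failed_jobs, group_by="error_type"):
    """Map each group to (failed job count, share of all failed jobs)."""
    counts = Counter(job[group_by] for job in failed_jobs)
    total = sum(counts.values())
    # most_common() orders groups from the largest slice to the smallest.
    return {key: (n, n / total) for key, n in counts.most_common()}
```

Passing `group_by="reservation"` or `group_by="owner_project"` regroups the same failed jobs without changing the total.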
Metric timeline charts
The charts display an overview of supported metrics over a maximum of 30 days. The default time period is one hour. As the selected time period changes, the granularity of each data point in the chart is scaled automatically.
These charts display an aggregated value over a region or a reservation. Displaying data for multiple regions or multiple reservations is not supported.
The Metric timeline charts support the following metrics:
Besides the metric trends, the charts display reference lines for the P95 and P99 metric values of the last week's usage for the same day. The Job concurrency chart shows the threshold for the sum of pending and running jobs. These reference values are used as the color-coding thresholds in the summary table.
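The idea of comparing a current value against P95 and P99 reference values can be sketched as follows. The percentile method the console uses is not stated, so this example assumes the nearest-rank method, and the returned labels are hypothetical stand-ins for the console's color coding.

```python
def percentile(values, pct):
    """Nearest-rank percentile; `pct` is an integer from 1 to 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def classify(value, baseline):
    """Compare `value` with the P95 and P99 of `baseline` samples."""
    if value > percentile(baseline, 99):
        return "above_p99"
    if value > percentile(baseline, 95):
        return "above_p95"
    return "normal"
```

Here `baseline` stands for the samples collected on the same day of the previous week.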
To learn more about the chart, click Explore more. You are redirected to the Resource utilization tab.
Insights table
This table aggregates quota errors, access denied errors, and performance insights gathered during job executions within the selected time period. Insights are aggregated at reservation level. Each row provides information about the insight type, location, reservation, insight detail, and sample job IDs. Click job IDs to view more job details in the jobs explorer. In the summary view, the default observation timeframe covers the last 30 minutes. In the detailed view, the time range selector controls the specific time period for which observation data is displayed.
Monitor operational health for a single project
The project operational health dashboard is the default view if you don't have access to the entire organization, or if your project doesn't own any reservations. This view can help project analysts monitor system health for their projects, much like the operational health dashboard at the organization level, but it shows only project-level data in its charts and filters.
View BigQuery resource utilization
BigQuery resource charts help you track past resource use to plan for future needs and troubleshoot performance.
The charts gather metrics by querying the following INFORMATION_SCHEMA views:
The data can be updated in real time, going back a maximum of 30 days.
When you view resource utilization, you can configure the following:
The event timeline chart shows an overview of data over a maximum of 30 days. The default is 6 hours.
The main chart shows chosen metrics, such as slot usage or bytes processed, over time for your organization or administration project. A legend for the chart gives more details about the data shown.
The Chart configuration pane lets you select predefined views of your metrics or customize your own metrics for the views.
The resource utilization chart has the following elements:
The status chips also show you the following:
To view and configure resource utilization charts, follow these steps:
In the Google Cloud console, go to the BigQuery Monitoring page.
Select the project. As an administrator monitoring reservation resource use, choose the administration project used to buy slots and create reservations. As a data analyst monitoring job resource use, choose the corresponding project.
In the Monitoring page, go to the Resource utilization tab to view a summary of the resource usage, broken down by location.
Choose a time period for the metrics in this view, such as 1 day. To view real-time metrics, where queries run on fresh data, click the Live data toggle. This setting is turned off by default to improve performance, and the maximum staleness of the data is about an hour.
In the Chart configuration pane, configure the following fields:
To save the changes you've made to the chart configuration, click Apply.
BigQuery provides pre-configured views of resource utilization metrics. The following sections describe the metrics that you can configure in those views.
Reservation slot usage
This view shows you metrics about the slot usage breakdown for the reservations in the administrative project. Each metric has the following default settings, which you can edit in the Chart configuration pane by clicking the metric's name:
This view shows you metrics about slot usage and capacity for edition resources in the administrative project. Each metric has the following default settings, which you can edit in the Chart configuration pane by clicking the metric's name:
This view shows you metrics about job resources in the project where you run queries. Each metric has the following default settings, which you can edit in the Chart configuration pane by clicking the metric's name:
This view shows you metrics about job activity with reservation resources in the administrative project. Each metric has the following default settings, which you can edit in the Chart configuration pane by clicking the metric's name:
To create a custom metric view, you can add metrics from scratch, or start with one of the predefined metric views (for example, the Reservation overview metric view) and customize the metrics in it.
You can save custom views for future use. Saved views retain the metric, group by, and filter configurations, with the exception of the user email filter. Saved views are stored at the user level. You can create, update, rename, and delete your saved views.
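The rule that saved views keep everything except the user email filter can be illustrated with a small Python sketch. The `ChartView` class and the `user_email` filter key are hypothetical names introduced for this example; they don't correspond to a documented API.

```python
from dataclasses import dataclass, field


@dataclass
class ChartView:
    """Hypothetical in-memory representation of a chart configuration."""
    metrics: list
    group_by: str
    filters: dict = field(default_factory=dict)


def to_saved_view(view: ChartView) -> ChartView:
    """Copy a view, keeping metrics, group by, and all filters except user email."""
    kept = {name: value for name, value in view.filters.items() if name != "user_email"}
    return ChartView(list(view.metrics), view.group_by, kept)
```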
Metrics
When you configure a chart, you can add, edit, or delete the resource utilization metrics it monitors.
To configure resource utilization chart metrics, follow these steps:
In the Google Cloud console, view the BigQuery resource utilization charts.
In the Chart configuration pane, configure the Metrics field:
In the Select a metric dialog, choose the following:
Click Apply.
In the New item dialog, choose the aggregation. For example, to monitor the maximum slot usage in excess of the reservation's capacity in the selected time period, choose Max.
Save the metric by clicking Done.
Optional: To change an existing metric, click the metric's name and edit its settings or delete it.
Resource utilization metrics are categorized by resource type and scope.
Resource types
You can monitor the following resource types:
Job: Metrics about BigQuery job details for a given scope and time period.
Reservation: Metrics about BigQuery reservation usage for a given scope and time period. These metrics aggregate job details by reservation. If you have only partial data access at the reservation level, aggregated reservation usage metrics are available, without access to job-level details.
Edition: Metrics about BigQuery edition capacity for a given scope and time period.
Scope types
After you select a resource for a metric, you select a scope.
To group data in your resource utilization chart, follow these steps:
In the Google Cloud console, view the BigQuery resource utilization charts.
In the Chart configuration pane, configure the Group by field by selecting one of the following options:
When you configure your resource utilization chart, you can apply filters to your data, such as displaying resource usage for Enterprise edition resources, or by a resource ID.
To filter the chart data, apply filters in the filter pane. You can only select a filter if it's supported for the metrics you selected. To view the required permissions, click the Filter menu.
View project-level resource utilization data
You can analyze project-level resource utilization using the same configuration steps that you use to view organization-level data. Charts display only project-level data and configuration options (organization-level options are grayed out). This project-level scope shows the overall resource utilization within the contextual project, regardless of billing mode.
Resource utilization chart limitations
By default, you have access to Edition resource charts when navigating from the reservation administration project. You can toggle between the on-demand resource charts and the Edition resource charts from the reservation administration project (Preview).
To view resource charts, follow these steps:
You can adjust the view of your resource charts by changing the following chart configuration options.
Chart options
BigQuery provides the following metric types to display in the charts:
The table displays metrics that are relevant to the time period and dimension that you selected in the resource chart.
The Slot Usage chart displays the Average slot usage for all of the jobs running during the selected time period. Jobs that didn't finish within the selected time period include only slots used within the time period.
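The averaging described here can be sketched in Python. This example assumes timeline rows shaped like `(period_start, period_slot_ms)`, similar to what INFORMATION_SCHEMA.JOBS_TIMELINE reports; the function name and the tuple layout are assumptions made for illustration. Rows outside the window are ignored, which mirrors how a job that runs past the selected period contributes only the slots it used inside that period.

```python
from datetime import datetime, timedelta


def average_slot_usage(timeline_rows, window_start, window_end):
    """Average slots used within [window_start, window_end).

    `timeline_rows` is an iterable of (period_start, period_slot_ms) tuples.
    """
    window_ms = (window_end - window_start).total_seconds() * 1000
    if window_ms <= 0:
        raise ValueError("window_end must be after window_start")
    slot_ms = sum(
        ms for period_start, ms in timeline_rows
        if window_start <= period_start < window_end
    )
    # Slot-milliseconds divided by elapsed milliseconds gives average slots.
    return slot_ms / window_ms
```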
For the Job Performance chart, the table component displays the following metrics:
For the Failed Jobs chart, the table component displays the following data:
INFORMATION_SCHEMA views.
Group by options
Based on the type of chart, you can group data in the chart view by several dimensions:
You can modify the time period in the following ways:
The alignment period updates automatically as the selected timeframe changes. The smaller the alignment period, the more detailed the view. To better view resources that change frequently, for example the Slot Usage option, reduce the alignment period.
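The effect of the alignment period can be seen in a small Python sketch that buckets raw samples and averages each bucket. Using the mean as the aggregation is an assumption for this example, and the sample layout `(timestamp_seconds, value)` is hypothetical.

```python
def align(samples, alignment_s):
    """Group (timestamp_seconds, value) samples into fixed-width buckets.

    Returns a list of (bucket_start, mean_value) sorted by bucket start.
    """
    if alignment_s <= 0:
        raise ValueError("alignment_s must be positive")
    buckets = {}
    for ts, value in samples:
        bucket_start = ts - ts % alignment_s
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


samples = [(0, 100), (30, 300), (60, 100), (90, 100)]
print(align(samples, 30))   # the spike at t=30 is visible
print(align(samples, 120))  # the spike is averaged into one point
```

A shorter alignment period keeps short spikes visible, while a longer one smooths them into the surrounding values.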
Note: For more granular alignment periods on the slot usage chart, displayed usage might briefly exceed capacity due to data sampling and alignment.
Filter and search
To narrow the chart data, apply filters in the filter panel. Some filters are only available for certain charts. The Reservations, Folders, Projects, and Users filters are populated with the respective resources that have consumed slots in the selected timeframe. For example, if a project hasn't been used in the last 30 days, it does not appear in the project filter list.
The chart refreshes after you apply filters to show data within the selected parameters.
Note: To filter by specific jobs, enter the job ID in the text field without the project prefix.
Troubleshoot slot contention
Slot contention can happen when there aren't enough slots to run all of your jobs, causing performance issues. To troubleshoot slot contention issues, see the following steps and best practices.
If you have tried these best practices but are still experiencing job performance issues, you can request support.
Job concurrency spikes
Use the Detailed view to check for a sudden surge in job runs that coincides with slot usage spikes. This can indicate that too many jobs are contending for the slots available to your reservation.
Use the Detailed view to check for increased job durations, especially if there are jobs that exceed your reservation's maximum capacity. Consistently high slot usage can indicate ongoing slot contention.
If jobs are taking significantly longer to complete, check the Detailed view. High job concurrency and slot usage spikes can indicate slot contention.
The insights table can display messages that indicate slot contention issues, such as "There were NUMBER jobs detected with slot_contention in the reservation." Check the jobs explorer to review details about the specific jobs flagged in these messages.
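The signals described in this section (high job concurrency together with slot usage at the reservation's capacity) can be combined into a simple heuristic. The following Python sketch is only an illustration: the sample keys and the `concurrency_threshold` value are hypothetical, and it is not how BigQuery itself detects slot contention.

```python
def contention_intervals(samples, capacity, concurrency_threshold):
    """Return timestamps where slot usage is at capacity and concurrency is high.

    Each sample is a dict with hypothetical keys "ts", "slots_used",
    "running_jobs", and "pending_jobs".
    """
    flagged = []
    for sample in samples:
        concurrency = sample["running_jobs"] + sample["pending_jobs"]
        if sample["slots_used"] >= capacity and concurrency > concurrency_threshold:
            flagged.append(sample["ts"])
    return flagged
```

Intervals flagged this way are candidates for a closer look in the jobs explorer, not a confirmed diagnosis.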