Showing content from https://cloud.google.com/python/docs/reference/bigframes/0.26.0/bigframes.pandas.ArrowDtype below:

Python client library | Google Cloud

BigQuery DataFrames

BigQuery DataFrames provides a Pythonic DataFrame and machine learning (ML) API powered by the BigQuery engine.

BigQuery DataFrames is an open-source package. You can run pip install --upgrade bigframes to install the latest version.

Quickstart

Code sample

Import bigframes.pandas for a pandas-like interface. The read_gbq method accepts either a fully-qualified table ID or a SQL query.

import bigframes.pandas as bpd

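# Replace the placeholder with your own Google Cloud project ID.
bpd.options.bigquery.project = "your_gcp_project_id"
# Read a table by its fully-qualified ID.
df1 = bpd.read_gbq("project.dataset.table")
# Read the result of a SQL query.
df2 = bpd.read_gbq("SELECT a, b, c FROM `project.dataset.table`")

Locations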

BigQuery DataFrames uses a BigQuery session internally to manage metadata on the service side. This session is tied to a location. BigQuery DataFrames uses the US multi-region as the default location, but you can use session_options.location to set a different location. Every query in a session is executed in the location where the session was created. BigQuery DataFrames auto-populates bigframes.pandas.options.bigquery.location if the user starts with read_gbq(), read_gbq_table(), or read_gbq_query() and specifies a table, either directly or in a SQL statement.

If you want to reset the location of the created DataFrame or Series objects, you can close the session by executing bigframes.pandas.close_session(). After that, you can reuse bigframes.pandas.options.bigquery.location to specify another location.
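The reset procedure above can be sketched as a small helper. This is only an illustration: `switch_location` is a hypothetical name, not part of bigframes, and the only library calls it relies on are `close_session()` and `options.bigquery.location` as named in the text.

```python
def switch_location(new_location: str) -> str:
    """Close the active BigQuery DataFrames session and select a new location.

    Hypothetical helper for illustration; it is not part of bigframes.
    """
    # Imported inside the function so that defining the helper does not
    # require a configured Google Cloud environment.
    import bigframes.pandas as bpd

    # Objects created in the old session should be re-created afterwards.
    bpd.close_session()
    # The next session is created in this location.
    bpd.options.bigquery.location = new_location
    return bpd.options.bigquery.location
```

A call such as `switch_location("EU")` would then be followed by re-running the `read_gbq()` calls that build your DataFrame or Series objects.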

read_gbq() requires you to specify a location if the dataset you are querying is not in the US multi-region. If you try to read a table from another location, you get a NotFound exception.

Project

If bigframes.pandas.options.bigquery.project is not set, the $GOOGLE_CLOUD_PROJECT environment variable is used. This variable is set in the notebook runtimes that serve BigQuery Studio and Vertex Notebooks.
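The fallback order described above can be illustrated in plain Python. This sketch only mirrors the documented behavior; `resolve_project` is a hypothetical helper and does not show how bigframes resolves the project internally.

```python
import os
from typing import Optional


def resolve_project(explicit_project: Optional[str] = None) -> Optional[str]:
    """Return the explicitly configured project, else $GOOGLE_CLOUD_PROJECT.

    Hypothetical helper that mirrors the documented fallback order.
    """
    if explicit_project:
        return explicit_project
    # Outside a managed notebook runtime this variable may be unset,
    # in which case None is returned.
    return os.environ.get("GOOGLE_CLOUD_PROJECT")
```

Checking the result for `None` before starting a session is one way to catch a missing project early.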

ML Capabilities

The ML capabilities in BigQuery DataFrames let you preprocess data, and then train models on that data. You can also chain these actions together to create data pipelines.

Preprocess data

Create transformers to prepare data for use in estimators (models) by using the bigframes.ml.preprocessing module and the bigframes.ml.compose module. BigQuery DataFrames offers the following transformations:

Train models

Create estimators to train models in BigQuery DataFrames.

Clustering models

Create estimators for clustering models by using the bigframes.ml.cluster module.

Decomposition models

Create estimators for decomposition models by using the bigframes.ml.decomposition module.

Ensemble models

Create estimators for ensemble models by using the bigframes.ml.ensemble module.

Forecasting models

Create estimators for forecasting models by using the bigframes.ml.forecasting module.

Imported models

Create estimators for imported models by using the bigframes.ml.imported module.

Linear models

Create estimators for linear models by using the bigframes.ml.linear_model module.
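As a minimal sketch of the estimator workflow, the helper below trains a `LinearRegression` estimator from `bigframes.ml.linear_model` and predicts on new rows. The helper name `fit_and_predict` and its arguments are assumptions for illustration; the inputs are expected to be BigQuery DataFrames objects that you have already created.

```python
def fit_and_predict(X_train, y_train, X_new):
    """Train a linear regression model and return predictions for X_new.

    Hypothetical helper; the arguments are BigQuery DataFrames objects.
    """
    # Imported lazily so the helper can be defined without Google Cloud access.
    from bigframes.ml.linear_model import LinearRegression

    model = LinearRegression()
    # Training runs on the BigQuery service, not on the client machine.
    model.fit(X_train, y_train)
    return model.predict(X_new)
```

The estimators in the other bigframes.ml modules listed in this section are intended to follow the same scikit-learn-like fit and predict pattern.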

Large language models

Create estimators for LLMs by using the bigframes.ml.llm module.

Create pipelines

Create ML pipelines by using the bigframes.ml.pipeline module. Pipelines let you assemble several ML steps to be cross-validated together while setting different parameters. This simplifies your code and lets you deploy data preprocessing steps and an estimator together.
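A pipeline that combines preprocessing with an estimator might look like the following sketch. The step names and the column names (`species`, `culmen_length_mm`, `flipper_length_mm`) are hypothetical and should be replaced with columns from your own table.

```python
def build_pipeline():
    """Assemble a preprocessing step and a linear model into one pipeline.

    Hypothetical helper; column and step names are placeholders.
    """
    # Imported lazily so the helper can be defined without Google Cloud access.
    from bigframes.ml.compose import ColumnTransformer
    from bigframes.ml.linear_model import LinearRegression
    from bigframes.ml.pipeline import Pipeline
    from bigframes.ml.preprocessing import OneHotEncoder, StandardScaler

    # Apply a different transformer to each group of columns.
    preprocessing = ColumnTransformer(
        [
            ("onehot", OneHotEncoder(), ["species"]),
            ("scale", StandardScaler(), ["culmen_length_mm", "flipper_length_mm"]),
        ]
    )
    # The preprocessing step and the estimator are fitted together.
    return Pipeline(
        [
            ("preprocess", preprocessing),
            ("linreg", LinearRegression()),
        ]
    )
```

Calling `fit(X_train, y_train)` on the returned pipeline would run the preprocessing and the training as one unit.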

ML remote models

Requirements

To use BigQuery DataFrames ML remote models (bigframes.ml.remote or bigframes.ml.llm), you must enable the following APIs:

You must also be granted the following IAM roles:

ML locations

bigframes.ml supports the same locations as BigQuery ML. BigQuery ML model prediction and other ML functions are supported in all BigQuery regions. Support for model training varies by region. For more information, see BigQuery ML locations.

Data types

BigQuery DataFrames supports the following numpy and pandas dtypes:

BigQuery DataFrames doesn’t support the following BigQuery data types:

All other BigQuery data types display as the object type.

Remote functions

BigQuery DataFrames lets you turn your custom scalar functions into BigQuery remote functions. Creating a remote function in BigQuery DataFrames (see code samples) creates a BigQuery remote function, a BigQuery connection, and a Cloud Functions (2nd gen) function.
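The following sketch shows one way this could look for a simple scalar function. It assumes the 0.26.0-era form of `bigframes.pandas.remote_function`, which takes a list of input types followed by the output type; later releases may require different or additional arguments. The names `get_bucket` and `register_remote` and the threshold value are hypothetical.

```python
def get_bucket(num: float) -> str:
    """Plain Python scalar function: label a number by size."""
    boundary = 4000
    return "at_or_above_4000" if num >= boundary else "below_4000"


def register_remote(func):
    """Deploy func as a BigQuery remote function and return the wrapper.

    Hypothetical helper. Calling it creates cloud resources: a remote
    function, a BigQuery connection, and a Cloud Functions function.
    """
    # Imported lazily so the helper can be defined without Google Cloud access.
    import bigframes.pandas as bpd

    # Assumed signature: input types as a list, then the output type.
    return bpd.remote_function([float], str)(func)
```

Keeping the scalar logic in an ordinary function makes it easy to test locally before deploying it with `register_remote(get_bucket)`.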

BigQuery connections are created in the same location as the BigQuery DataFrames session, using the name you provide in the custom function definition. To view and manage connections, do the following:

  1. Go to BigQuery in the Google Cloud Console.

  2. Select the project in which you created the remote function.

  3. In the Explorer pane, expand that project and then expand External connections.

BigQuery remote functions are created in the dataset you specify, or in a special type of hidden dataset referred to as an anonymous dataset. To view and manage remote functions created in a user-provided dataset, do the following:

  1. Go to BigQuery in the Google Cloud Console.

  2. Select the project in which you created the remote function.

  3. In the Explorer pane, expand that project, expand the dataset in which you created the remote function, and then expand Routines.

To view and manage Cloud Functions functions, use the Functions page and use the project picker to select the project in which you created the function. For easy identification, the names of the functions created by BigQuery DataFrames are prefixed by bigframes.

Requirements

To use BigQuery DataFrames remote functions, you must enable the following APIs:

To use BigQuery DataFrames remote functions, you must be granted the following IAM roles:

Limitations

Quotas and limits

BigQuery DataFrames is subject to BigQuery quotas, including those for hardware, software, and network components.

Session termination

Each BigQuery DataFrames DataFrame or Series object is tied to a BigQuery DataFrames session, which is in turn based on a BigQuery session. BigQuery sessions auto-terminate. When this happens, you can’t use previously created DataFrame or Series objects and must re-create them using a new BigQuery DataFrames session. You can do this by running bigframes.pandas.close_session() and then re-running the BigQuery DataFrames expressions.

Data processing location

BigQuery DataFrames is designed for scale, which it achieves by keeping data and processing on the BigQuery service. However, you can bring data into the memory of your client machine by calling .to_pandas() on a DataFrame or Series object. If you choose to do this, the memory limitation of your client machine applies.
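One way to stay within client memory is to cap the number of rows before downloading. The sketch below assumes the object supports `head()` and `to_pandas()`; `preview_locally` and its default row cap are hypothetical.

```python
def preview_locally(bf_obj, max_rows: int = 1000):
    """Download at most max_rows rows of a DataFrame or Series.

    Hypothetical helper; the default cap is an arbitrary example value.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # head() narrows the data first, so only the first max_rows rows
    # are passed to to_pandas() and loaded into client memory.
    return bf_obj.head(max_rows).to_pandas()
```

Aggregating or filtering on the BigQuery side before calling `.to_pandas()` is another way to keep the downloaded result small.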

License

BigQuery DataFrames is distributed with the Apache-2.0 license.

It also contains code derived from the following third-party packages:

For details, see the third_party directory.

For further help or to provide feedback, email us at bigframes-feedback@google.com.

