Showing content from https://cloud.google.com/python/docs/reference/bigframes/1.2.0/bigframes.session.Session below:

Class Session (1.2.0) | Python client library

Session(
    context: typing.Optional[bigframes._config.bigquery_options.BigQueryOptions] = None,
    clients_provider: typing.Optional[bigframes.session.clients.ClientsProvider] = None,
)

Establishes a BigQuery connection to capture a group of job activities related to DataFrames.

Parameters

context bigframes._config.bigquery_options.BigQueryOptions

Configuration adjusting how to connect to BigQuery and related APIs. Note that some options are ignored if clients_provider is set.

clients_provider bigframes.session.clients.ClientsProvider

An object providing client library objects.
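As a minimal sketch of how the two constructor arguments fit together, the helper below builds a BigQueryOptions object and passes it as context. The helper name open_session and the project and location values are hypothetical, and it assumes BigQueryOptions accepts project and location keyword arguments. The import is done lazily so the helper can be defined without bigframes installed; calling it requires bigframes and valid Google Cloud credentials.

```python
def open_session(project: str, location: str = "US"):
    """Create a Session for the given project (hypothetical helper)."""
    # Imported lazily: only calling this function requires bigframes.
    from bigframes._config.bigquery_options import BigQueryOptions
    from bigframes.session import Session

    # Assumption: project and location are accepted as keyword arguments.
    options = BigQueryOptions(project=project, location=location)
    # clients_provider is left at its default of None.
    return Session(context=options)
```

A call such as open_session("my-project") would then return a Session whose read_* methods are described below.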

Properties

bqclient

API documentation for bqclient property.

bqconnectionclient

API documentation for bqconnectionclient property.

bqconnectionmanager

API documentation for bqconnectionmanager property.

bqstoragereadclient

API documentation for bqstoragereadclient property.

cloudfunctionsclient

API documentation for cloudfunctionsclient property.

resourcemanagerclient

API documentation for resourcemanagerclient property.

Methods

close

No-op. Temporary resources are deleted after 7 days.

read_csv

read_csv(
    filepath_or_buffer: str | IO["bytes"],
    *,
    sep: Optional[str] = ",",
    header: Optional[int] = 0,
    names: Optional[
        Union[MutableSequence[Any], np.ndarray[Any, Any], Tuple[Any, ...], range]
    ] = None,
    index_col: Optional[
        Union[int, str, Sequence[Union[str, int]], Literal[False]]
    ] = None,
    usecols: Optional[
        Union[
            MutableSequence[str],
            Tuple[str, ...],
            Sequence[int],
            pandas.Series,
            pandas.Index,
            np.ndarray[Any, Any],
            Callable[[Any], bool],
        ]
    ] = None,
    dtype: Optional[Dict] = None,
    engine: Optional[
        Literal["c", "python", "pyarrow", "python-fwf", "bigquery"]
    ] = None,
    encoding: Optional[str] = None,
    **kwargs
) -> dataframe.DataFrame

Loads DataFrame from comma-separated values (csv) file locally or from Cloud Storage.

The CSV file data will be persisted as a temporary BigQuery table, which can be automatically recycled after the Session is closed.

Note: using engine="bigquery" will not guarantee the same ordering as the file. Instead, set a serialized index column as the index and sort by that in the resulting DataFrame.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> gcs_path = "gs://cloud-samples-data/bigquery/us-states/us-states.csv"
>>> df = bpd.read_csv(filepath_or_buffer=gcs_path)
>>> df.head(2)
      name post_abbr
0  Alabama        AL
1   Alaska        AK
<BLANKLINE>
[2 rows x 2 columns]

Parameters

filepath_or_buffer str

A local or Google Cloud Storage (gs://) path when engine="bigquery"; otherwise the value is passed to pandas.read_csv.

sep Optional[str], default ","

The separator for fields in a CSV file. For the BigQuery engine, the separator can be any ISO-8859-1 single-byte character. To use a character in the range 128-255, you must encode the character as UTF-8. Both engines support sep="\t" to specify the tab character as separator. The default engine supports any number of spaces as separator by specifying sep="\s+". Separators longer than 1 character are interpreted as regular expressions by the default engine. The BigQuery engine only supports single-character separators.

header Optional[int], default 0

Row number to use as the column names.

- None: Instructs autodetect that there are no headers and data should be read starting from the first row.
- 0: If using engine="bigquery", autodetect tries to detect headers in the first row. If they are not detected, the row is read as data. Otherwise data is read starting from the second row. When using the default engine, pandas assumes the first row contains column names unless the names argument is specified. If names is provided, then the first row is ignored, the second row is read as data, and column names are inferred from names.
- N > 0: If using engine="bigquery", autodetect skips N rows and tries to detect headers in row N+1. If headers are not detected, row N+1 is just skipped. Otherwise row N+1 is used to extract column names for the detected schema. When using the default engine, pandas skips N rows and assumes row N+1 contains column names unless the names argument is specified. If names is provided, row N+1 is ignored, row N+2 is read as data, and column names are inferred from names.

names default None

a list of column names to use. If the file contains a header row and you want to pass this parameter, then header=0 should be passed as well so the first (header) row is ignored. Only to be used with default engine.

index_col default None

column(s) to use as the row labels of the DataFrame, either given as string name or column index. index_col=False can be used with the default engine only to enforce that the first column is not used as the index. Using column index instead of column name is only supported with the default engine. The BigQuery engine only supports having a single column name as the index_col. Neither engine supports having a multi-column index.

usecols default None

List of column names to use. The BigQuery engine only supports a list of string column names. Column indices and callable functions are only supported with the default engine. Using the default engine, the column names in usecols can be defined to correspond to column names provided with the names parameter (ignoring the document's header row of column names). The order of the column indices/names in usecols is ignored with the default engine. The order of the column names provided with the BigQuery engine will be consistent in the resulting DataFrame. If using a callable function with the default engine, only column names for which the callable returns True will be in the resulting DataFrame.

dtype Optional[Dict], default None

Data type for data or columns. Only to be used with default engine.

engine Optional[Literal["c", "python", "pyarrow", "python-fwf", "bigquery"]], default None

Type of engine to use. If engine="bigquery" is specified, then BigQuery's load API will be used. Otherwise, the engine will be passed to pandas.read_csv.

encoding Optional[str], default None

The character encoding of the data. The default encoding is UTF-8 for both engines. The default engine accepts a wide range of encodings; refer to the Python documentation for a comprehensive list: https://docs.python.org/3/library/codecs.html#standard-encodings. The BigQuery engine only supports UTF-8 and ISO-8859-1.
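The parameter descriptions above place several restrictions on engine="bigquery". The sketch below collects them into one pre-flight check. It is not part of bigframes: the function name and the message strings are hypothetical, and the library may report these problems differently.

```python
from typing import Any, List, Optional

# Encodings the parameter description lists for engine="bigquery".
BIGQUERY_ENCODINGS = {"UTF-8", "ISO-8859-1"}


def bigquery_csv_arg_problems(
    sep: str = ",",
    index_col: Any = None,
    usecols: Any = None,
    names: Any = None,
    dtype: Any = None,
    encoding: Optional[str] = None,
) -> List[str]:
    """Return the documented engine="bigquery" restrictions the arguments violate."""
    problems = []
    if len(sep) != 1:
        problems.append("sep must be a single character")
    if index_col is not None and not isinstance(index_col, str):
        problems.append("index_col must be a single column name")
    if usecols is not None and not (
        isinstance(usecols, (list, tuple))
        and all(isinstance(col, str) for col in usecols)
    ):
        problems.append("usecols must be a list of column names")
    if names is not None:
        problems.append("names is only for the default engine")
    if dtype is not None:
        problems.append("dtype is only for the default engine")
    if encoding is not None and encoding.upper() not in BIGQUERY_ENCODINGS:
        problems.append("encoding must be UTF-8 or ISO-8859-1")
    return problems
```

An empty list means the arguments fit the documented limits; for example, a tab separator with a named index column passes, while sep=r"\s+" or index_col=0 is flagged.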

read_gbq

read_gbq(
    query_or_table: str,
    *,
    index_col: Iterable[str] | str = (),
    columns: Iterable[str] = (),
    configuration: Optional[Dict] = None,
    max_results: Optional[int] = None,
    filters: third_party_pandas_gbq.FiltersType = (),
    use_cache: Optional[bool] = None,
    col_order: Iterable[str] = ()
) -> dataframe.DataFrame

Loads a DataFrame from BigQuery.

BigQuery tables are an unordered, unindexed data source. By default, the DataFrame will have an arbitrary index and ordering.

Set the index_col argument to one or more columns to choose an index. The resulting DataFrame is sorted by the index columns. For the best performance, ensure the index columns don't contain duplicate values.

Note: By default, even SQL query inputs with an ORDER BY clause create a DataFrame with an arbitrary ordering. Use row_number() OVER (ORDER BY ...) AS rowindex in your SQL query and set index_col='rowindex' to preserve the desired ordering.

If your query doesn't have an ordering, select GENERATE_UUID() AS rowindex in your SQL and set index_col='rowindex' for the best performance.
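The ROW_NUMBER() advice above can be applied mechanically by wrapping an existing query. The following is a minimal sketch: the helper name with_row_index and the alias t are hypothetical, and the wrapper assumes the inner query has no column already named rowindex.

```python
def with_row_index(sql: str, order_by: str, index_name: str = "rowindex") -> str:
    """Wrap a query so its intended ordering is stored in an index column."""
    # Drop surrounding whitespace and a trailing semicolon so the query
    # can be used as a subquery.
    inner = sql.strip().rstrip(";")
    return (
        f"SELECT ROW_NUMBER() OVER (ORDER BY {order_by}) AS {index_name}, t.*\n"
        f"FROM (\n{inner}\n) AS t"
    )
```

The wrapped SQL would then be passed to read_gbq together with index_col="rowindex". The order_by expression must refer to columns produced by the inner query.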

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

If the input is a table ID:

>>> df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")

Read table path with wildcard suffix and filters:

>>> df = bpd.read_gbq_table("bigquery-public-data.noaa_gsod.gsod19*", filters=[("_table_suffix", ">=", "30"), ("_table_suffix", "<=", "39")])

Preserve ordering in a query input.

>>> df = bpd.read_gbq('''
...    SELECT
...       -- Instead of an ORDER BY clause on the query, use
...       -- ROW_NUMBER() to create an ordered DataFrame.
...       ROW_NUMBER() OVER (ORDER BY AVG(pitchSpeed) DESC)
...         AS rowindex,
...
...       pitcherFirstName,
...       pitcherLastName,
...       AVG(pitchSpeed) AS averagePitchSpeed
...     FROM `bigquery-public-data.baseball.games_wide`
...     WHERE year = 2016
...     GROUP BY pitcherFirstName, pitcherLastName
... ''', index_col="rowindex")
>>> df.head(2)
         pitcherFirstName pitcherLastName  averagePitchSpeed
rowindex
1                Albertin         Chapman          96.514113
2                 Zachary         Britton          94.591039
<BLANKLINE>
[2 rows x 3 columns]

Reading data with columns and filters parameters:

>>> columns = ['pitcherFirstName', 'pitcherLastName', 'year', 'pitchSpeed']
>>> filters = [('year', '==', 2016), ('pitcherFirstName', 'in', ['John', 'Doe']), ('pitcherLastName', 'in', ['Gant'])]
>>> df = bpd.read_gbq(
...             "bigquery-public-data.baseball.games_wide",
...             columns=columns,
...             filters=filters,
...         )
>>> df.head(1)
         pitcherFirstName   pitcherLastName     year        pitchSpeed
0                    John              Gant     2016            82
<BLANKLINE>
[1 rows x 4 columns]

Parameters

query_or_table str

A SQL string to be executed or a BigQuery table to be read. The table must be specified in the format project.dataset.tablename or dataset.tablename. A wildcard table name, such as project.dataset.table_prefix*, is also accepted; in that case, all matching tables are read as one DataFrame.

index_col Iterable[str] or str

Name of result column(s) to use for index in results DataFrame.

columns Iterable[str]

List of BigQuery column names in the desired order for results DataFrame.

configuration dict, optional

Query config parameters for job processing. For example: configuration = {'query': {'useQueryCache': False}}. For more information see BigQuery REST API Reference https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query__.

max_results Optional[int], default None

If set, limit the maximum number of rows to fetch from the query results.

filters Union[Iterable[FilterType], Iterable[Iterable[FilterType]]], default ()

Filters that select the rows to read. Filter syntax: [[(column, op, val), …],…] where op is one of [==, >, >=, <, <=, !=, in, not in, LIKE]. The innermost tuples form a set of filters that are combined through an AND operation. The outer Iterable combines these sets of filters through an OR operation. A single Iterable of tuples can also be used, meaning that no OR operation between sets of filters is applied. If using a wildcard table suffix in query_or_table, you can specify the '_table_suffix' pseudo column to filter the tables to be read into the DataFrame.
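To make the AND/OR semantics of filters concrete, the sketch below renders a filters value as a SQL-style predicate. It is an illustration only, not how bigframes builds its queries: the function names are hypothetical and the literal rendering is deliberately simplified.

```python
from typing import Any, Sequence


def _literal(value: Any) -> str:
    """Render a Python value as a SQL literal (simplified for illustration)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "\\'") + "'"
    if isinstance(value, (list, tuple)):
        return "(" + ", ".join(_literal(item) for item in value) + ")"
    return str(value)


def filters_to_where(filters: Sequence[Any]) -> str:
    """Combine filters: AND within a group, OR between groups."""
    if not filters:
        return ""
    # A flat sequence of (column, op, val) tuples is a single AND group.
    groups = [filters] if isinstance(filters[0][0], str) else filters
    rendered = []
    for group in groups:
        terms = []
        for column, op, value in group:
            # "==" becomes "="; word operators such as "in" are upper-cased.
            sql_op = "=" if op == "==" else op.upper()
            terms.append(f"`{column}` {sql_op} {_literal(value)}")
        rendered.append("(" + " AND ".join(terms) + ")")
    return " OR ".join(rendered)
```

With a flat list such as [('year', '==', 2016), ('pitcherLastName', 'in', ['Gant'])] both conditions must hold; wrapping each tuple in its own inner list would instead match rows that satisfy either one.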

use_cache Optional[bool], default None

Caches query results if set to True. When None, it behaves as True, but should not be combined with useQueryCache in configuration to avoid conflicts.

col_order Iterable[str]

Alias for columns, retained for backwards compatibility.

read_gbq_function

read_gbq_function(function_name: str)

Parameter

function_name str

the function's name in BigQuery in the format project_id.dataset_id.function_name, or dataset_id.function_name to load from the default project, or function_name to load from the default project and the dataset associated with the current session.
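The three accepted name formats can be pictured as filling in missing parts from defaults. The sketch below shows that resolution under the assumption that project and dataset IDs contain no dots; the helper and its default_project and session_dataset arguments are hypothetical and not part of the bigframes API.

```python
from typing import Tuple


def resolve_function_name(
    function_name: str, default_project: str, session_dataset: str
) -> Tuple[str, str, str]:
    """Split a function name into (project_id, dataset_id, function_name)."""
    parts = function_name.split(".")
    if len(parts) == 3:
        # Fully qualified: project_id.dataset_id.function_name
        return parts[0], parts[1], parts[2]
    if len(parts) == 2:
        # dataset_id.function_name: project comes from the default.
        return default_project, parts[0], parts[1]
    if len(parts) == 1:
        # Bare name: project and dataset both come from the session.
        return default_project, session_dataset, parts[0]
    raise ValueError(f"Unrecognized function name: {function_name!r}")
```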

Returns

callable

A function object pointing to the BigQuery function read from BigQuery. The object is similar to the one created by the remote_function decorator, including the bigframes_remote_function property, but not including the bigframes_cloud_function property.

read_gbq_model

read_gbq_model(model_name: str)

Loads a BigQuery ML model from BigQuery.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

Read an existing BigQuery ML model.

>>> model_name = "bigframes-dev.bqml_tutorial.penguins_model"
>>> model = bpd.read_gbq_model(model_name)

Parameter

model_name str

the model's name in BigQuery in the format project_id.dataset_id.model_id, or just dataset_id.model_id to load from the default project.

read_gbq_query

read_gbq_query(
    query: str,
    *,
    index_col: Iterable[str] | str = (),
    columns: Iterable[str] = (),
    configuration: Optional[Dict] = None,
    max_results: Optional[int] = None,
    use_cache: Optional[bool] = None,
    col_order: Iterable[str] = ()
) -> dataframe.DataFrame

Turn a SQL query into a DataFrame.

Note: Because the results are written to a temporary table, ordering by ORDER BY is not preserved. A unique index_col is recommended. Use row_number() over () if there is no natural unique index or you want to preserve ordering.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

Simple query input:

>>> df = bpd.read_gbq_query('''
...    SELECT
...       pitcherFirstName,
...       pitcherLastName,
...       pitchSpeed,
...    FROM `bigquery-public-data.baseball.games_wide`
... ''')

Preserve ordering in a query input.

>>> df = bpd.read_gbq_query('''
...    SELECT
...       -- Instead of an ORDER BY clause on the query, use
...       -- ROW_NUMBER() to create an ordered DataFrame.
...       ROW_NUMBER() OVER (ORDER BY AVG(pitchSpeed) DESC)
...         AS rowindex,
...
...       pitcherFirstName,
...       pitcherLastName,
...       AVG(pitchSpeed) AS averagePitchSpeed
...     FROM `bigquery-public-data.baseball.games_wide`
...     WHERE year = 2016
...     GROUP BY pitcherFirstName, pitcherLastName
... ''', index_col="rowindex")
>>> df.head(2)
         pitcherFirstName pitcherLastName  averagePitchSpeed
rowindex
1                Albertin         Chapman          96.514113
2                 Zachary         Britton          94.591039
<BLANKLINE>
[2 rows x 3 columns]

See also: Session.read_gbq.

read_json

read_json(
    path_or_buf: str | IO["bytes"],
    *,
    orient: Literal[
        "split", "records", "index", "columns", "values", "table"
    ] = "columns",
    dtype: Optional[Dict] = None,
    encoding: Optional[str] = None,
    lines: bool = False,
    engine: Literal["ujson", "pyarrow", "bigquery"] = "ujson",
    **kwargs
) -> dataframe.DataFrame

Convert a JSON string to DataFrame object.

Note: using engine="bigquery" will not guarantee the same ordering as the file. Instead, set a serialized index column as the index and sort by that in the resulting DataFrame.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> gcs_path = "gs://bigframes-dev-testing/sample1.json"
>>> df = bpd.read_json(path_or_buf=gcs_path, lines=True, orient="records")
>>> df.head(2)
   id   name
0   1  Alice
1   2    Bob
<BLANKLINE>
[2 rows x 2 columns]

Parameters

path_or_buf a valid JSON str, path object or file-like object

A local or Google Cloud Storage (gs://) path when engine="bigquery"; otherwise the value is passed to pandas.read_json.

orient str, optional

Indication of expected JSON string format. If engine="bigquery", orient only supports "records". Compatible JSON strings can be produced by to_json() with a corresponding orient value. The set of possible orients is:

- 'split' : dict like {index -> [index], columns -> [columns], data -> [values]}
- 'records' : list like [{column -> value}, ... , {column -> value}]
- 'index' : dict like {index -> {column -> value}}
- 'columns' : dict like {column -> {index -> value}}
- 'values' : just the values array
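The orient layouts described above are easier to compare side by side on one small table. The sketch below builds each structure with plain Python; the helper name is hypothetical, and index labels are rendered as strings because JSON object keys are always strings.

```python
from typing import Any, Dict, List


def orient_layouts(
    index: List[Any], columns: List[str], rows: List[List[Any]]
) -> Dict[str, Any]:
    """Build the JSON-compatible structure for each orient value."""
    return {
        "split": {"index": index, "columns": columns, "data": rows},
        "records": [dict(zip(columns, row)) for row in rows],
        "index": {
            str(label): dict(zip(columns, row)) for label, row in zip(index, rows)
        },
        "columns": {
            col: {str(label): row[pos] for label, row in zip(index, rows)}
            for pos, col in enumerate(columns)
        },
        "values": rows,
    }
```

For a table with columns id and name and two rows, the "records" entry is a list of two dicts, which is the only layout the BigQuery engine accepts.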

dtype bool or dict, default None

If True, infer dtypes; if a dict of column to dtype, then use those; if False, then don't infer dtypes at all, applies only to the data. For all orient values except 'table', default is True.

encoding str, default is 'utf-8'

The encoding to use to decode py3 bytes.

lines bool, default False

Read the file as a json object per line. If using engine="bigquery" lines only supports True.

engine {"ujson", "pyarrow", "bigquery"}, default "ujson"

Type of engine to use. If engine="bigquery" is specified, then BigQuery's load API will be used. Otherwise, the engine will be passed to pandas.read_json.
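Because engine="bigquery" requires orient="records" together with lines=True, the input has to be newline-delimited JSON: one object per line. The sketch below, using only the standard library, shows one way such a file body could be produced; the helper name is hypothetical.

```python
import json
from typing import Any, Dict, Iterable


def to_json_lines(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)
```

The returned text can be written to a local file or uploaded to Cloud Storage and then read with read_json(path, lines=True, orient="records", engine="bigquery").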

read_pandas

Loads DataFrame from a pandas DataFrame.

The pandas DataFrame will be persisted as a temporary BigQuery table, which can be automatically recycled after the Session is closed.

Examples:

>>> import bigframes.pandas as bpd
>>> import pandas as pd
>>> bpd.options.display.progress_bar = None

>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> pandas_df = pd.DataFrame(data=d)
>>> df = bpd.read_pandas(pandas_df)
>>> df
   col1  col2
0     1     3
1     2     4
<BLANKLINE>
[2 rows x 2 columns]

Parameter

pandas_dataframe pandas.DataFrame, pandas.Series, or pandas.Index

a pandas DataFrame/Series/Index object to be loaded.

read_parquet

read_parquet(
    path: str | IO["bytes"], *, engine: str = "auto"
) -> dataframe.DataFrame

Load a Parquet object from the file path (local or Cloud Storage), returning a DataFrame.

Note: This method will not guarantee the same ordering as the file. Instead, set a serialized index column as the index and sort by that in the resulting DataFrame.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> gcs_path = "gs://cloud-samples-data/bigquery/us-states/us-states.parquet"
>>> df = bpd.read_parquet(path=gcs_path, engine="bigquery")

Parameters

path str

Local or Cloud Storage path to Parquet file.

engine str

One of 'auto', 'pyarrow', 'fastparquet', or 'bigquery'. Parquet library to parse the file. If set to 'bigquery', order is not preserved. Default, 'auto'.

read_pickle

read_pickle(
    filepath_or_buffer: FilePath | ReadPickleBuffer,
    compression: CompressionOptions = "infer",
    storage_options: StorageOptions = None,
)

Load pickled BigFrames object (or any object) from file.

Note: If the content of the pickle file is a Series and its name attribute is None, the name will be set to '0' by default.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> gcs_path = "gs://bigframes-dev-testing/test_pickle.pkl"
>>> df = bpd.read_pickle(filepath_or_buffer=gcs_path)

Parameters

filepath_or_buffer str, path object, or file-like object

String, path object (implementing os.PathLike[str]), or file-like object implementing a binary readlines() function. Also accepts URL. URL is not limited to S3 and GCS.

compression str or dict, default 'infer'

For on-the-fly decompression of on-disk data. If 'infer' and 'filepath_or_buffer' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). If using 'zip' or 'tar', the ZIP file must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdDecompressor or tarfile.TarFile, respectively. As an example, the following could be passed for Zstandard decompression using a custom compression dictionary compression={'method': 'zstd', 'dict_data': my_compression_dict}.
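The 'infer' behaviour described above amounts to a lookup on the file extension. The sketch below illustrates that lookup; it is not the library's code, the helper name is hypothetical, and the method names returned for '.xz' and '.zst' ('xz' and 'zstd') are assumptions.

```python
from typing import Optional

# Longer extensions come first so ".tar.gz" is matched before ".gz".
_EXTENSION_TO_METHOD = [
    (".tar.gz", "tar"),
    (".tar.xz", "tar"),
    (".tar.bz2", "tar"),
    (".tar", "tar"),
    (".gz", "gzip"),
    (".bz2", "bz2"),
    (".zip", "zip"),
    (".xz", "xz"),
    (".zst", "zstd"),
]


def infer_compression(path: str) -> Optional[str]:
    """Return the compression method implied by the extension, or None."""
    lowered = path.lower()
    for extension, method in _EXTENSION_TO_METHOD:
        if lowered.endswith(extension):
            return method
    return None
```

A path with none of the listed extensions yields None, which corresponds to the "otherwise no compression" case.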

storage_options dict, default None

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer here.

remote_function

remote_function(
    input_types: typing.List[type],
    output_type: type,
    dataset: typing.Optional[str] = None,
    bigquery_connection: typing.Optional[str] = None,
    reuse: bool = True,
    name: typing.Optional[str] = None,
    packages: typing.Optional[typing.Sequence[str]] = None,
    cloud_function_service_account: typing.Optional[str] = None,
    cloud_function_kms_key_name: typing.Optional[str] = None,
    cloud_function_docker_repository: typing.Optional[str] = None,
)

Parameters

input_types list(type)

List of input data types in the user defined function.

output_type type

Data type of the output in the user defined function.

dataset str, Optional

Dataset in which to create a BigQuery remote function. It should be in <project_id>.<dataset_name> or <dataset_name> format. If this parameter is not provided then session dataset id is used.

bigquery_connection str, Optional

Name of the BigQuery connection. You should either have the connection already created in the location you have chosen, or you should have the Project IAM Admin role to enable the service to create the connection for you if you need it. If this parameter is not provided then the BigQuery connection from the session is used.

reuse bool, Optional

Reuse the remote function if already exists. True by default, which will result in reusing an existing remote function and corresponding cloud function (if any) that was previously created for the same udf. Setting it to False would force creating a unique remote function. If the required remote function does not exist then it would be created irrespective of this param.

name str, Optional

Explicit name of the persisted BigQuery remote function. Use it with caution, because two users working in the same project and dataset could overwrite each other's remote functions if they use the same persistent name.

packages str[], Optional

Explicit name of the external package dependencies. Each dependency is added to the requirements.txt as is, and can be of the form supported in https://pip.pypa.io/en/stable/reference/requirements-file-format/.

cloud_function_service_account str, Optional

Service account to use for the cloud functions. If not provided then the default service account would be used. See https://cloud.google.com/functions/docs/securing/function-identity for more details. Please make sure the service account has the necessary IAM permissions configured as described in https://cloud.google.com/functions/docs/reference/iam/roles#additional-configuration.

cloud_function_kms_key_name str, Optional

Customer managed encryption key to protect cloud functions and related data at rest. This is of the format projects/PROJECT_ID/locations/LOCATION/keyRings/KEYRING/cryptoKeys/KEY. Read https://cloud.google.com/functions/docs/securing/cmek for more details including granting necessary service accounts access to the key.

cloud_function_docker_repository str, Optional

Docker repository created with the same encryption key as cloud_function_kms_key_name to store encrypted artifacts created to support the cloud function. This is of the format projects/PROJECT_ID/locations/LOCATION/repositories/REPOSITORY_NAME. For more details see https://cloud.google.com/functions/docs/securing/cmek#before_you_begin.
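The resource-name formats given for cloud_function_kms_key_name and cloud_function_docker_repository can be checked before calling remote_function. The sketch below does a shape check only: the helper names are hypothetical, and it assumes each placeholder is any non-empty string without a slash. It does not verify that the key or repository exists.

```python
import re

# Assumption: each placeholder is a non-empty segment containing no "/".
_SEGMENT = r"[^/]+"

KMS_KEY_PATTERN = re.compile(
    rf"projects/{_SEGMENT}/locations/{_SEGMENT}"
    rf"/keyRings/{_SEGMENT}/cryptoKeys/{_SEGMENT}"
)
DOCKER_REPOSITORY_PATTERN = re.compile(
    rf"projects/{_SEGMENT}/locations/{_SEGMENT}/repositories/{_SEGMENT}"
)


def is_kms_key_name(value: str) -> bool:
    """True if value has the documented KMS key resource-name shape."""
    return KMS_KEY_PATTERN.fullmatch(value) is not None


def is_docker_repository(value: str) -> bool:
    """True if value has the documented Docker repository resource-name shape."""
    return DOCKER_REPOSITORY_PATTERN.fullmatch(value) is not None
```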

Returns

callable

A remote function object pointing to the cloud assets created in the background to support the remote execution. The cloud assets can be located through the following properties set in the object:

- bigframes_cloud_function: The Google Cloud function deployed for the user defined code.
- bigframes_remote_function: The BigQuery remote function capable of calling into bigframes_cloud_function.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-12 UTC.
