DataFrame(
data=None,
index: vendored_pandas_typing.Axes | None = None,
columns: vendored_pandas_typing.Axes | None = None,
dtype: typing.Optional[
bigframes.dtypes.DtypeString | bigframes.dtypes.Dtype
] = None,
copy: typing.Optional[bool] = None,
*,
session: typing.Optional[bigframes.session.Session] = None
)
Two-dimensional, size-mutable, potentially heterogeneous tabular data.
Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
Properties
axes
Return a list representing the axes of the DataFrame.
It has the row axis labels and column axis labels as the only members. They are returned in that order.
Examples
df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df.axes
[RangeIndex(start=0, stop=2, step=1), Index(['col1', 'col2'], dtype='object')]
columns
The column labels of the DataFrame.
dtypes
Return the dtypes in the DataFrame.
This returns a Series with the data type of each column. The result's index is the original DataFrame's columns. Columns with mixed types aren't supported yet in BigQuery DataFrames.
empty
Indicates whether Series/DataFrame is empty.
True if Series/DataFrame is entirely empty (no items), meaning any of the axes are of length 0.
Note: If Series/DataFrame contains only NA values, it is still not considered empty.
Returns
bool
If Series/DataFrame is empty, return True; if not, return False.
iloc
Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
Allowed inputs are:
- An integer, e.g. 5.
- A list or array of integers, e.g. [4, 3, 0].
- A slice object with ints, e.g. 1:7.
- A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above). This is useful in method chains, when you don't have a reference to the calling object, but would like to base your selection on some value.
- A tuple of row and column indexes, e.g. (0, 1).
.iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers, which allow out-of-bounds indexing (this conforms with Python/NumPy slice semantics).
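The following is a minimal sketch of the positional inputs described for .iloc. It uses plain pandas so that it runs without a BigQuery session; that BigQuery DataFrames behaves the same way for each input is an assumption, and the column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; plain pandas stands in for BigQuery DataFrames.
df = pd.DataFrame({"a": [10, 20, 30, 40], "b": [1, 2, 3, 4]})

first_row = df.iloc[0]     # single integer: one row, returned as a Series
picked = df.iloc[[2, 0]]   # list of integers: rows in the requested order
window = df.iloc[1:3]      # slice: positions 1 and 2 (stop is exclusive)
cell = df.iloc[0, 1]       # (row, column) tuple: a single value
beyond = df.iloc[2:100]    # out-of-bounds slice is clipped instead of failing

try:
    df.iloc[100]           # out-of-bounds integer position
    raised = False
except IndexError:
    raised = True
```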
index
The index (row labels) of the DataFrame.
The index of a DataFrame is a series of labels that identify each row. The labels can be integers, strings, or any other hashable type. The index is used for label-based access and alignment, and can be accessed or modified using this attribute.
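As a rough illustration of label-based access and alignment through the index, the sketch below uses plain pandas (an assumption made so the code runs locally; the frames and labels are hypothetical).

```python
import pandas as pd

# Two frames with the same row labels in a different order.
df = pd.DataFrame({"qty": [1, 2, 3]}, index=["x", "y", "z"])
other = pd.DataFrame({"qty": [10, 20, 30]}, index=["z", "y", "x"])

labels = list(df.index)       # the row labels
y_qty = df.loc["y", "qty"]    # access a value by row label
total = df + other            # rows are matched by label, not by position
```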
loc
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
Allowed inputs are:
- A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
- A list or array of labels, e.g. ['a', 'b', 'c'].
- A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
- A slice object with labels, e.g. 'a':'f'. Note: contrary to usual Python slices, both the start and the stop are included.
- A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above).
Raises NotImplementedError if the inputs are not supported.
ndim
Return an int representing the number of axes / array dimensions.
Returns
int
Return 1 if Series. Otherwise return 2 if DataFrame.
query_job
BigQuery job metadata for the most recent query.
shape
Return a tuple representing the dimensionality of the DataFrame.
size
Return an int representing the number of elements in this object.
Returns
int
Return the number of rows if Series. Otherwise return the number of rows times number of columns if DataFrame.
sql
Compiles this DataFrame's expression tree to SQL.
values
Return the values of DataFrame in the form of a NumPy array.
Methods
__array_ufunc__
__array_ufunc__(
ufunc: numpy.ufunc, method: str, *inputs, **kwargs
) -> bigframes.dataframe.DataFrame
Used to support numpy ufuncs. See: https://numpy.org/doc/stable/reference/ufuncs.html
__getitem__
__getitem__(
key: typing.Union[
typing.Hashable,
typing.Sequence[typing.Hashable],
pandas.core.indexes.base.Index,
bigframes.series.Series,
]
)
Gets the specified column(s) from the DataFrame.
__repr__
Converts a DataFrame to a string. Calls compute.
Only represents the first bigframes.options.display.max_rows rows.
__setitem__
__setitem__(
key: str, value: typing.Union[bigframes.series.Series, int, float, typing.Callable]
)
Modify or insert a column into the DataFrame.
Note: This does not modify the original table the DataFrame was derived from.
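A minimal sketch of column assignment, written against plain pandas as a stand-in (an assumption; the column names are hypothetical). It shows both inserting a new column and overwriting an existing one.

```python
import pandas as pd

df = pd.DataFrame({"price": [2.0, 3.5], "qty": [4, 2]})

df["total"] = df["price"] * df["qty"]   # insert a new column from a Series
df["qty"] = 0                           # overwrite an existing column with a scalar
```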
abs
abs() -> bigframes.dataframe.DataFrame
Return a Series/DataFrame with absolute numeric value of each element.
This function only applies to elements that are all numeric.
add
add(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get addition of DataFrame and other, element-wise (binary operator +).
Equivalent to dataframe + other. With reverse version, radd.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
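The sketch below illustrates how the flexible wrapper relates to the + operator and how axis selects which labels a Series is matched against. It runs on plain pandas; treating BigQuery DataFrames as equivalent here is an assumption, and all data is made up.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
per_column = pd.Series({"a": 10, "b": 100})   # indexed by column label
per_row = pd.Series([1000, 2000])             # indexed by row label (0, 1)

plus_scalar = df.add(1)                          # same result as df + 1
plus_cols = df.add(per_column, axis="columns")   # Series matched to column labels
plus_rows = df.add(per_row, axis="index")        # Series matched to row labels
```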
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
add_prefix
add_prefix(prefix: str, axis: int | str | None = None) -> DataFrame
Prefix labels with string prefix.
For Series, the row labels are prefixed. For DataFrame, the column labels are prefixed.
Parameters
prefix
str
The string to add before each label.
axis
int or str or None, default None
{0 or 'index', 1 or 'columns', None}, default None. Axis to add prefix on.
add_suffix
add_suffix(suffix: str, axis: int | str | None = None) -> DataFrame
Suffix labels with string suffix.
For Series, the row labels are suffixed. For DataFrame, the column labels are suffixed.
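For illustration, a short sketch of both relabeling methods on a DataFrame, using plain pandas as a stand-in (an assumption) and hypothetical label strings.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

prefixed = df.add_prefix("col_")   # column labels become col_a, col_b
suffixed = df.add_suffix("_raw")   # column labels become a_raw, b_raw
```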
agg
agg(func: str | typing.Sequence[str]) -> DataFrame | bigframes.series.Series
Aggregate using one or more operations over the specified axis.
Parameter
func
function
Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g. ['sum', 'mean'].
aggregate
aggregate(func: str | typing.Sequence[str]) -> DataFrame | bigframes.series.Series
Aggregate using one or more operations over the specified axis.
Parameter
func
function
Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g. ['sum', 'mean'].
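The two accepted forms of func can be sketched as follows. The example runs on plain pandas (an assumption made so it needs no BigQuery session); the return types match the signature above, a Series for a single name and a DataFrame for a list.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

totals = df.agg("sum")                 # one function name: one value per column
summary = df.agg(["sum", "mean"])      # list of names: one row per function
same = df.aggregate(["sum", "mean"])   # aggregate gives the same result as agg
```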
all
all(*, bool_only: bool = False) -> bigframes.series.Series
Return whether all elements are True, potentially over an axis.
Returns True unless there is at least one element within a Series or along a DataFrame axis that is False or equivalent (e.g. zero or empty).
Parameter
bool_only
bool, default False
Include only boolean columns.
any
any(*, bool_only: bool = False) -> bigframes.series.Series
Return whether any element is True, potentially over an axis.
Returns False unless there is at least one element within a Series or along a DataFrame axis that is True or equivalent (e.g. non-zero or non-empty).
Parameter
bool_only
bool, default False
Include only boolean columns.
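A minimal sketch of the two reductions and of bool_only, using plain pandas in place of BigQuery DataFrames (an assumption) and made-up column names.

```python
import pandas as pd

df = pd.DataFrame({
    "flags": [True, True],
    "counts": [0, 3],
    "mixed": [False, True],
})

every = df.all()                    # per column: False if any element is False or zero
some = df.any()                     # per column: True if any element is True or non-zero
bools = df.all(bool_only=True)      # only the boolean columns are considered
```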
applymap
applymap(
func, na_action: typing.Optional[str] = None
) -> bigframes.dataframe.DataFrame
Apply a function to a DataFrame elementwise.
This method applies a function that accepts and returns a scalar to every element of a DataFrame.
Note: In pandas 2.1.0, DataFrame.applymap is deprecated and renamed to DataFrame.map.
Parameter
na_action
Optional[str], default None
{None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.
Returns
bigframes.dataframe.DataFrame
Transformed DataFrame.
assign
assign(**kwargs) -> bigframes.dataframe.DataFrame
Assign new columns to a DataFrame.
Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten.
Note: Assigning multiple columns within the same assign is possible. Later items in '**kwargs' may refer to newly created or modified columns in 'df'; items are computed and assigned into 'df' in order.
Returns
bigframes.dataframe.DataFrame
A new DataFrame with the new columns in addition to all the existing columns.
astype
astype(
dtype: typing.Union[
typing.Literal[
"boolean",
"Float64",
"Int64",
"string",
"string[pyarrow]",
"timestamp[us, tz=UTC][pyarrow]",
"timestamp[us][pyarrow]",
"date32[day][pyarrow]",
"time64[us][pyarrow]",
],
pandas.core.arrays.boolean.BooleanDtype,
pandas.core.arrays.floating.Float64Dtype,
pandas.core.arrays.integer.Int64Dtype,
pandas.core.arrays.string_.StringDtype,
pandas.core.dtypes.dtypes.ArrowDtype,
]
) -> bigframes.dataframe.DataFrame
Cast a pandas object to the specified dtype.
Parameter
dtype
str or pandas.ExtensionDtype
A dtype supported by BigQuery DataFrames. Supported strings include 'boolean', 'Float64', 'Int64', 'string', 'string[pyarrow]', 'timestamp[us, tz=UTC][pyarrow]', 'timestamp[us][pyarrow]', 'date32[day][pyarrow]', 'time64[us][pyarrow]'. Supported pandas.ExtensionDtype values include pandas.BooleanDtype(), pandas.Float64Dtype(), pandas.Int64Dtype(), pandas.StringDtype(storage="pyarrow"), pd.ArrowDtype(pa.date32()), pd.ArrowDtype(pa.time64("us")), pd.ArrowDtype(pa.timestamp("us")), pd.ArrowDtype(pa.timestamp("us", tz="UTC")).
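The sketch below casts with a few of the dtype names listed above, once as a string and once as the matching pandas.ExtensionDtype. It runs on plain pandas and deliberately avoids the pyarrow-backed names so that it has no extra dependency; that the BigQuery DataFrames results match is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"n": [1, 2, 3], "x": [0.5, 1.5, 2.5]})

as_float = df.astype("Float64")               # dtype given by name
same_float = df.astype(pd.Float64Dtype())     # same cast via the ExtensionDtype object
as_int = df[["n"]].astype("Int64")            # nullable integer dtype
as_text = df.astype("string")                 # string dtype
```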
copy
copy() -> bigframes.dataframe.DataFrame
Make a copy of this object's indices and data.
A new object will be created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy will not be reflected in the original object.
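A short sketch of the independence between a copy and its source, shown with plain pandas as a stand-in (an assumption) and hypothetical data.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

dup = df.copy()
dup["a"] = dup["a"] * 10   # change data in the copy
dup["b"] = 0               # add a column to the copy only
```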
count
count(*, numeric_only: bool = False) -> bigframes.series.Series
Count non-NA cells for each column or row.
The values None, NaN, NaT, and optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na) are considered NA.
Parameter
numeric_only
bool, default False
Include only float, int or boolean data.
Returns
bigframes.series.Series
For each column/row the number of non-NA/null entries. If level is specified returns a DataFrame.
cummax
cummax() -> bigframes.dataframe.DataFrame
Return cumulative maximum over a DataFrame axis.
Returns a DataFrame of the same size containing the cumulative maximum.
Returns
bigframes.dataframe.DataFrame
Return cumulative maximum of DataFrame.
cummin
cummin() -> bigframes.dataframe.DataFrame
Return cumulative minimum over a DataFrame axis.
Returns a DataFrame of the same size containing the cumulative minimum.
Returns
bigframes.dataframe.DataFrame
Return cumulative minimum of DataFrame.
cumprod
cumprod() -> bigframes.dataframe.DataFrame
Return cumulative product over a DataFrame axis.
Returns a DataFrame of the same size containing the cumulative product.
Returns
bigframes.dataframe.DataFrame
Return cumulative product of DataFrame.
cumsum
Return cumulative sum over a DataFrame axis.
Returns a DataFrame of the same size containing the cumulative sum.
Returns
bigframes.dataframe.DataFrame
Return cumulative sum of DataFrame.
describe
describe() -> bigframes.dataframe.DataFrame
Generate descriptive statistics.
Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values.
Only supports numeric columns.
Note: Percentile values are approximate only.
Note: For numeric data, the result's index will include count, mean, std, min, max as well as lower, 50 and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75. The 50 percentile is the same as the median.
Returns
bigframes.dataframe.DataFrame
Summary statistics of the Series or DataFrame provided.
div
div(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get floating division of DataFrame and other, element-wise (binary operator /).
Equivalent to dataframe / other. With reverse version, rtruediv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
divide
divide(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get floating division of DataFrame and other, element-wise (binary operator /).
Equivalent to dataframe / other. With reverse version, rtruediv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
drop
drop(
labels: typing.Optional[typing.Any] = None,
*,
axis: typing.Union[int, str] = 0,
index: typing.Optional[typing.Any] = None,
columns: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
level: typing.Optional[typing.Union[str, int]] = None
) -> bigframes.dataframe.DataFrame
Drop specified labels from columns.
Remove columns by directly specifying column names.
Exceptions
KeyError
If any of the labels is not found in the selected axis.
Returns
bigframes.dataframe.DataFrame
DataFrame without the removed column labels.
drop_duplicates
drop_duplicates(
subset: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
*,
keep: str = "first"
) -> bigframes.dataframe.DataFrame
Return DataFrame with duplicate rows removed.
Considering certain columns is optional. Indexes, including time indexes, are ignored.
Parameters
subset
column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by default use all of the columns.
keep
{'first', 'last', False}, default 'first'
Determines which duplicates (if any) to keep.
- 'first' : Drop duplicates except for the first occurrence.
- 'last' : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
Returns
bigframes.dataframe.DataFrame
DataFrame with duplicates removed.
droplevel
droplevel(level: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]])
Return DataFrame with requested index / column level(s) removed.
Parameter
level
int, str, or list-like
If a string is given, must be the name of a level. If list-like, elements must be names or positional indexes of levels.
Returns
DataFrame
DataFrame with requested index / column level(s) removed.
dropna
dropna(
*, axis: int | str = 0, inplace: bool = False, how: str = "any", ignore_index=False
) -> DataFrame
Remove missing values.
Parameters
axis
{0 or 'index', 1 or 'columns'}, default 0
Determine if rows or columns which contain missing values are removed.
* 0, or 'index' : Drop rows which contain missing values.
* 1, or 'columns' : Drop columns which contain missing values.
how
{'any', 'all'}, default 'any'
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
* 'any' : If any NA values are present, drop that row or column.
* 'all' : If all values are NA, drop that row or column.
ignore_index
bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
Returns
bigframes.dataframe.DataFrame
DataFrame with NA entries dropped from it.
duplicated
duplicated(subset=None, keep: str = "first") -> bigframes.series.Series
Return boolean Series denoting duplicate rows.
Considering certain columns is optional.
Parameters
subset
column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by default use all of the columns.
keep
{'first', 'last', False}, default 'first'
Determines which duplicates (if any) to mark.
- first : Mark duplicates as True except for the first occurrence.
- last : Mark duplicates as True except for the last occurrence.
- False : Mark all duplicates as True.
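The effect of each keep option can be sketched on a small frame with one repeated row. The example uses plain pandas as a stand-in for BigQuery DataFrames (an assumption); the data is hypothetical, with rows 0, 2 and 3 identical.

```python
import pandas as pd

df = pd.DataFrame({"k": ["a", "b", "a", "a"], "v": [1, 2, 1, 1]})

first = df.duplicated(keep="first")   # repeats after the first occurrence are marked
last = df.duplicated(keep="last")     # repeats before the last occurrence are marked
every = df.duplicated(keep=False)     # every member of a duplicate group is marked
deduped = df.drop_duplicates(keep="first")   # rows marked by keep="first" are dropped
```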
eq
eq(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get equal to of DataFrame and other, element-wise (binary operator eq).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
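As an illustration of the comparison wrapper, the sketch below compares against a scalar and against a Series matched to column labels. It runs on plain pandas, which is assumed to behave like BigQuery DataFrames for this case; the data is made up.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [2, 2]})
targets = pd.Series({"a": 1, "b": 2})   # one target value per column label

by_scalar = df.eq(2)                        # same result as df == 2
by_column = df.eq(targets, axis="columns")  # each column compared to its own target
```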
fillna
fillna(value=None) -> bigframes.dataframe.DataFrame
Fill NA/NaN values using the specified method.
Parameter
value
scalar, Series
Value to use to fill holes (e.g. 0), alternately a Series of values specifying which value to use for each index (for a Series) or column (for a DataFrame). Values not in the Series will not be filled. This value cannot be a list.
Returns
DataFrame
Object with missing values filled.
floordiv
floordiv(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get integer division of DataFrame and other, element-wise (binary operator //).
Equivalent to dataframe // other. With reverse version, rfloordiv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
ge
ge(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get 'greater than or equal to' of DataFrame and other, element-wise (binary operator >=).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
NaN values in floating point columns are considered different (i.e. NaN != NaN).
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
Returns
DataFrame
DataFrame of bool. The result of the comparison.
get
Get item from object for given key (ex: DataFrame column).
Returns default value if not found.
groupby
groupby(
by: typing.Optional[
typing.Union[
typing.Hashable,
bigframes.series.Series,
typing.Sequence[typing.Union[typing.Hashable, bigframes.series.Series]],
]
] = None,
*,
level: typing.Optional[
typing.Union[str, int, typing.Sequence[typing.Union[str, int]]]
] = None,
as_index: bool = True,
dropna: bool = True
) -> bigframes.core.groupby.DataFrameGroupBy
Group DataFrame by columns.
A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
Parameters
by
str, Sequence[str]
A label or list of labels may be passed to group by the columns in self. Notice that a tuple is interpreted as a (single) key.
level
int, level name, or sequence of such, default None
If the axis is a MultiIndex (hierarchical), group by a particular level or levels. Do not specify both by and level.
as_index
bool, default True
Default True. Return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively "SQL-style" grouped output. This argument has no effect on filtrations such as head(), tail(), nth() and in transformations.
dropna
bool, default True
Default True. If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.
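The as_index and dropna options can be sketched as follows. The example runs on plain pandas so that it needs no BigQuery session; treating the BigQuery DataFrames results as identical is an assumption, and the team and pts columns are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"team": ["x", "y", "x", None], "pts": [1, 2, 3, 4]})

# Default: group labels become the index and the NA key is dropped.
by_team = df.groupby("team")["pts"].sum()

# as_index=False: "SQL-style" output, the key stays a regular column.
flat = df.groupby("team", as_index=False)["pts"].sum()

# dropna=False: the NA key forms its own group.
with_na = df.groupby("team", dropna=False)["pts"].sum()
```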
gt
gt(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get 'greater than' of DataFrame and other, element-wise (binary operator >).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
NaN values in floating point columns are considered different (i.e. NaN != NaN).
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
Returns
DataFrame
DataFrame of bool. The result of the comparison.
head
head(n: int = 5) -> bigframes.dataframe.DataFrame
Return the first n rows.
This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.
Not yet supported: for negative values of n, this function returns all rows except the last |n| rows, equivalent to df[:n].
If n is larger than the number of rows, this function returns all rows.
Parameter
n
int, default 5
Default 5. Number of rows to select.
isin
isin(values) -> bigframes.dataframe.DataFrame
Whether each element in the DataFrame is contained in values.
Parameter
values
iterable, or dict
The result will only be true at a location if all the labels match. If values is a dict, the keys must be the column names, which must match.
Returns
DataFrame
DataFrame of booleans showing whether each element in the DataFrame is contained in values.
isna
isna() -> bigframes.dataframe.DataFrame
Detect missing values.
Return a boolean same-sized object indicating if the values are NA. NA values get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings '' or numpy.inf are not considered NA values.
isnull
isnull() -> bigframes.dataframe.DataFrame
Detect missing values.
Return a boolean same-sized object indicating if the values are NA. NA values get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings '' or numpy.inf are not considered NA values.
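A minimal sketch of what does and does not count as NA, using plain pandas in place of BigQuery DataFrames (an assumption) and made-up values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [1.0, None, np.inf],   # None is missing; infinity is not
    "s": ["a", "", None],       # the empty string is not missing
})

mask = df.isna()
same = df.isnull()              # isnull gives the same result as isna
```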
join
join(
other: bigframes.dataframe.DataFrame,
*,
on: typing.Optional[str] = None,
how: str = "left"
) -> bigframes.dataframe.DataFrame
Join columns of another DataFrame.
Join columns with other DataFrame on index.
Parameter
how
{'left', 'right', 'outer', 'inner'}, default 'left'
How to handle the operation of the two objects.
- left: use calling frame's index (or column if on is specified).
- right: use other's index.
- outer: form union of calling frame's index (or column if on is specified) with other's index, and sort it lexicographically.
- inner: form intersection of calling frame's index (or column if on is specified) with other's index, preserving the order of the calling frame's index.
Returns
bigframes.dataframe.DataFrame
A DataFrame containing columns from both the caller and other.
le
le(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get 'less than or equal to' of DataFrame and other, element-wise (binary operator <=).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
NaN values in floating point columns are considered different (i.e. NaN != NaN).
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
Returns
DataFrame
DataFrame of bool. The result of the comparison.
lt
lt(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get 'less than' of DataFrame and other, element-wise (binary operator <).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
NaN values in floating point columns are considered different (i.e. NaN != NaN).
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
Returns
DataFrame
DataFrame of bool. The result of the comparison.
map
map(func, na_action: typing.Optional[str] = None) -> bigframes.dataframe.DataFrame
Apply a function to a DataFrame elementwise.
This method applies a function that accepts and returns a scalar to every element of a DataFrame.
Note: In pandas 2.1.0, DataFrame.applymap is deprecated and renamed to DataFrame.map.
Parameter
na_action
Optional[str], default None
{None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.
Returns
bigframes.dataframe.DataFrame
Transformed DataFrame.
max
max(*, numeric_only: bool = False) -> bigframes.series.Series
Return the maximum of the values over the requested axis.
If you want the index of the maximum, use idxmax. This is the equivalent of the numpy.ndarray method argmax.
Parameter
numeric_only
bool, default False
Default False. Include only float, int, boolean columns.
mean
mean(*, numeric_only: bool = False) -> bigframes.series.Series
Return the mean of the values over the requested axis.
Parameter
numeric_only
bool, default False
Default False. Include only float, int, boolean columns.
median
median(
*, numeric_only: bool = False, exact: bool = False
) -> bigframes.series.Series
Return the median of the values over the requested axis.
Parameters
numeric_only
bool, default False
Default False. Include only float, int, boolean columns.
exact
bool, default False
Default False. Get the exact median instead of an approximate one. Note: exact=True is not yet supported.
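The three reductions above can be sketched with numeric_only=True on a frame that also holds a text column. The example uses plain pandas, which is an assumption: pandas computes an exact median and has no exact parameter, whereas the text above describes an approximate median in BigQuery DataFrames. The data is made up.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 6],
    "b": [1.0, 3.0, 5.0],
    "name": ["p", "q", "r"],   # non-numeric, excluded by numeric_only=True
})

highest = df.max(numeric_only=True)
average = df.mean(numeric_only=True)
middle = df.median(numeric_only=True)   # exact in pandas
```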
merge
merge(
right: bigframes.dataframe.DataFrame,
how: typing.Literal["inner", "left", "outer", "right"] = "inner",
on: typing.Optional[str] = None,
*,
left_on: typing.Optional[str] = None,
right_on: typing.Optional[str] = None,
sort: bool = False,
suffixes: tuple[str, str] = ("_x", "_y")
) -> bigframes.dataframe.DataFrame
Merge DataFrame objects with a database-style join.
The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes will be ignored. Otherwise if joining indexes on indexes or indexes on a column or columns, the index will be passed on. When performing a cross merge, no column specifications to merge on are allowed.
Warning: If both key columns contain rows where the key is a null value, those rows will be matched against each other. This is different from usual SQL join behaviour and can lead to unexpected results.
Returns
bigframes.dataframe.DataFrame
A DataFrame of the two merged objects.
min
min(*, numeric_only: bool = False) -> bigframes.series.Series
Return the minimum of the values over the requested axis.
If you want the index of the minimum, use idxmin. This is the equivalent of the numpy.ndarray method argmin.
Parameter
numeric_only
bool, default False
Default False. Include only float, int, boolean columns.
mod
mod(
other: int | bigframes.series.Series | DataFrame, axis: str | int = "columns"
) -> DataFrame
Get modulo of DataFrame and other, element-wise (binary operator %).
Equivalent to dataframe % other. With reverse version, rmod.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameter
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
mul
mul(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get multiplication of DataFrame and other, element-wise (binary operator *).
Equivalent to dataframe * other. With reverse version, rmul.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
multiply
multiply(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get multiplication of DataFrame and other, element-wise (binary operator *).
Equivalent to dataframe * other. With reverse version, rmul.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
Parameters
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns
DataFrame
DataFrame result of the arithmetic operation.
ne
ne(other: typing.Any, axis: str | int = "columns") -> DataFrame
Get not equal to of DataFrame and other, element-wise (binary operator ne).
Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.
Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.
Parameters
other
scalar, sequence, Series, or DataFrame
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}, default 'columns'
Whether to compare by the index (0 or 'index') or columns (1 or 'columns').
Returns
DataFrame
Result of the comparison.
notna
notna() -> bigframes.dataframe.DataFrame
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values. NA values get mapped to False values.
Returns
NDFrame
Mask of bool values for each element that indicates whether an element is not an NA value.
notnull
notnull() -> bigframes.dataframe.DataFrame
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values. NA values get mapped to False values.
Returns
NDFrame
Mask of bool values for each element that indicates whether an element is not an NA value.
nunique
nunique() -> bigframes.series.Series
Count number of distinct elements in specified axis.
pivot
pivot(
*,
columns: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]],
index: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
values: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None
) -> bigframes.dataframe.DataFrame
Return reshaped DataFrame organized by given index / column values.
Reshape data (produce a "pivot" table) based on column values. Uses unique values from specified index / columns to form axes of the resulting DataFrame. This function does not support data aggregation; multiple values will result in a MultiIndex in the columns.
columns
str or object or a list of str
Column to use to make new frame's columns.
index
str or object or a list of str, optional
Column to use to make new frame's index. If not given, uses existing index.
values
str, object or a list of the previous, optional
Column(s) to use for populating new frame's values. If not specified, all remaining columns will be used and the result will have hierarchically indexed columns.
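Example (a sketch in plain pandas, assuming BigQuery DataFrames reshapes the same way; the day/city/temp data is made up for illustration):

```python
import pandas as pd

long_df = pd.DataFrame(
    {
        "day": ["mon", "mon", "tue", "tue"],
        "city": ["A", "B", "A", "B"],
        "temp": [10, 12, 11, 13],
    }
)

# Unique "day" values become the index, unique "city" values become the
# columns, and "temp" fills the cells. Each (day, city) pair appears once,
# so no aggregation is needed.
wide = long_df.pivot(index="day", columns="city", values="temp")
print(wide)
```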
pow
pow(other: int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame
Get exponential power of dataframe and other, element-wise (binary operator pow).
Equivalent to dataframe ** other, but with support to substitute a fill_value for missing data in one of the inputs. With reverse version, rpow.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. prod
prod(*, numeric_only: bool = False) -> bigframes.series.Series
Return the product of the values over the requested axis.
Parameter Name Descriptionnumeric_only
bool, default False
Include only float, int, boolean columns.
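Example (a sketch in plain pandas, assuming BigQuery DataFrames returns one product per column in the same way):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [2.0, 0.5, 4.0]})

# One product per column: a -> 1*2*3, b -> 2.0*0.5*4.0.
products = df.prod(numeric_only=True)
print(products)
```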
product
product(*, numeric_only: bool = False) -> bigframes.series.Series
Return the product of the values over the requested axis.
Parameter Name Descriptionnumeric_only
bool, default False
Include only float, int, boolean columns.
radd
radd(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get addition of DataFrame and other, element-wise (binary operator +).
Equivalent to other + dataframe. With reverse version, add.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rank
rank(
axis=0,
method: str = "average",
numeric_only=False,
na_option: str = "keep",
ascending=True,
) -> bigframes.dataframe.DataFrame
Compute numerical data ranks (1 through n) along axis.
By default, equal values are assigned a rank that is the average of the ranks of those values.
Parameters Name Descriptionmethod
{'average', 'min', 'max', 'first', 'dense'}, default 'average'
How to rank the group of records that have the same value (i.e. ties): average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in the order they appear in the array; dense: like min, but rank always increases by 1 between groups.
numeric_only
bool, default False
For DataFrame objects, rank only numeric columns if set to True.
na_option
{'keep', 'top', 'bottom'}, default 'keep'
How to rank NaN values: keep: assign NaN rank to NaN values; top: assign lowest rank to NaN values; bottom: assign highest rank to NaN values.
ascending
bool, default True
Whether or not the elements should be ranked in ascending order.
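Example (a sketch in plain pandas that compares the tie-handling methods on one tied pair; it assumes BigQuery DataFrames assigns ranks the same way, and the score column is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"score": [10, 20, 20, 30]})

# The two 20s tie for positions 2 and 3; each method resolves the tie
# differently.
ranks = {
    method: df.rank(method=method)["score"].tolist()
    for method in ["average", "min", "max", "first", "dense"]
}
for method, values in ranks.items():
    print(method, values)
```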
Returns Type Descriptionsame type as caller
Return a Series or DataFrame with data ranks as values. rdiv
rdiv(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get floating division of DataFrame and other, element-wise (binary operator /).
Equivalent to other / dataframe. With reverse version, truediv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
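Example (a sketch in plain pandas contrasting the reflected and forward forms of division; it assumes BigQuery DataFrames follows the same operand order):

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [4.0, 8.0]})

reflected = df.rdiv(8)  # computes 8 / df
forward = df.div(8)     # computes df / 8
print(reflected)
print(forward)
```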
rename
rename(
*, columns: typing.Mapping[typing.Hashable, typing.Hashable]
) -> bigframes.dataframe.DataFrame
Rename columns.
Dict values must be unique (1-to-1). Labels not contained in a dict will be left as-is. Extra labels listed don't throw an error.
Parameter Name Descriptioncolumns
Mapping
Dict-like from old column labels to new column labels.
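Example (a sketch in plain pandas, assuming BigQuery DataFrames likewise returns a new frame and leaves unlisted labels untouched):

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2]})

# Only "a" is in the mapping, so "b" keeps its label.
renamed = df.rename(columns={"a": "alpha"})
print(list(renamed.columns))
```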
Exceptions Type DescriptionKeyError
If any of the labels is not found. Returns Type Description bigframes.dataframe.DataFrame
DataFrame with the renamed axis labels. rename_axis
rename_axis(
mapper: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]], **kwargs
) -> bigframes.dataframe.DataFrame
Set the name of the axis for the index.
Note: Currently only accepts a single string parameter (the new name of the index). Returns Type Descriptionbigframes.dataframe.DataFrame
DataFrame with the new index name reorder_levels
reorder_levels(
order: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]]
)
Rearrange index levels using input order. May not drop or duplicate levels.
Parameter Name Descriptionorder
list of int or list of str
List representing new level order. Reference level by number (position) or by key (label).
Returns Type DescriptionDataFrame
DataFrame of rearranged index. reset_index
reset_index(*, drop: bool = False) -> bigframes.dataframe.DataFrame
Reset the index.
Reset the index of the DataFrame, and use the default one instead.
Parameter Name Descriptiondrop
bool, default False
Do not try to insert index into dataframe columns. This resets the index to the default integer index.
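Example (a sketch in plain pandas showing the effect of drop; it assumes BigQuery DataFrames behaves the same way, and the index name key is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"v": [10, 20]}, index=pd.Index(["x", "y"], name="key"))

kept = df.reset_index()              # old index becomes the "key" column
dropped = df.reset_index(drop=True)  # old index is discarded
print(kept)
print(dropped)
```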
Returns Type Descriptionbigframes.dataframe.DataFrame
DataFrame with the new index. rfloordiv
rfloordiv(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get integer division of DataFrame and other, element-wise (binary operator //).
Equivalent to other // dataframe. With reverse version, floordiv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rmod
rmod(
other: int | bigframes.series.Series | DataFrame, axis: str | int = "columns"
) -> DataFrame
Get modulo of DataFrame and other, element-wise (binary operator %).
Equivalent to other % dataframe. With reverse version, mod.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rmul
rmul(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get multiplication of DataFrame and other, element-wise (binary operator *).
Equivalent to other * dataframe. With reverse version, mul.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rpow
rpow(
other: int | bigframes.series.Series, axis: str | int = "columns"
) -> DataFrame
Get exponential power of dataframe and other, element-wise (binary operator rpow).
Equivalent to other ** dataframe, but with support to substitute a fill_value for missing data in one of the inputs. With reverse version, pow.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
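Example (a sketch in plain pandas showing that rpow puts the DataFrame in the exponent; it assumes BigQuery DataFrames uses the same operand order):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

reflected = df.rpow(2)  # computes 2 ** df
forward = df.pow(2)     # computes df ** 2
print(reflected["a"].tolist(), forward["a"].tolist())
```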
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rsub
rsub(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get subtraction of DataFrame and other, element-wise (binary operator -).
Equivalent to other - dataframe. With reverse version, sub.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. rtruediv
rtruediv(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get floating division of DataFrame and other, element-wise (binary operator /).
Equivalent to other / dataframe. With reverse version, truediv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
sample
sample(
n: typing.Optional[int] = None,
frac: typing.Optional[float] = None,
*,
random_state: typing.Optional[int] = None
) -> bigframes.dataframe.DataFrame
Return a random sample of items from an axis of object.
You can use random_state for reproducibility.
n
Optional[int], default None
Number of items from axis to return. Cannot be used with frac. Default = 1 if frac = None.
frac
Optional[float], default None
Fraction of axis items to return. Cannot be used with n.
random_state
Optional[int], default None
Seed for random number generator.
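Example (a sketch in plain pandas, assuming BigQuery DataFrames handles n, frac, and random_state analogously; the actual rows drawn may differ between the two libraries):

```python
import pandas as pd

df = pd.DataFrame({"v": range(10)})

# The same seed yields the same sample on repeated calls.
first_draw = df.sample(n=3, random_state=42)
second_draw = df.sample(n=3, random_state=42)

# frac selects a fraction of the rows instead of a fixed count.
half = df.sample(frac=0.5, random_state=0)
print(len(first_draw), len(half))
```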
set_index
set_index(
keys: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]],
append: bool = False,
drop: bool = True,
) -> bigframes.dataframe.DataFrame
Set the DataFrame index using existing columns.
Set the DataFrame index (row labels) using one existing column. The index can replace the existing index.
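Example (a sketch in plain pandas showing the drop flag; it assumes BigQuery DataFrames behaves the same way, and the id/name columns are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"id": [101, 102], "name": ["ann", "bob"]})

indexed = df.set_index("id")           # drop=True: "id" leaves the columns
kept = df.set_index("id", drop=False)  # "id" is both index and column
print(indexed)
```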
Returns Type DescriptionDataFrame
Changed row labels. shift
shift(periods: int = 1) -> bigframes.dataframe.DataFrame
Shift index by desired number of periods.
Shifts the index without realigning the data.
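Example (a sketch in plain pandas, assuming BigQuery DataFrames shifts values relative to the row labels in the same way; vacated positions become missing values):

```python
import pandas as pd

df = pd.DataFrame({"v": [1, 2, 3]})

down = df.shift(1)  # values move one row down; the first row becomes NaN
up = df.shift(-1)   # values move one row up; the last row becomes NaN
print(down)
print(up)
```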
Returns Type DescriptionNDFrame
Copy of input object, shifted. sort_index
sort_index(
ascending: bool = True, na_position: typing.Literal["first", "last"] = "last"
) -> bigframes.dataframe.DataFrame
Sort object by labels (along an axis).
sort_values
sort_values(
by: str | typing.Sequence[str],
*,
ascending: bool | typing.Sequence[bool] = True,
kind: str = "quicksort",
na_position: typing.Literal["first", "last"] = "last"
) -> DataFrame
Sort by the values along row axis.
Parameters Name Descriptionby
str or Sequence[str]
Name or list of names to sort by.
ascending
bool or Sequence[bool], default True
Sort ascending vs. descending. Specify a list for multiple sort orders. If this is a list of bools, it must match the length of by.
kind
str, default quicksort
Choice of sorting algorithm. Accepts 'quicksort', 'mergesort', 'heapsort', 'stable'. Ignored except when determining whether to sort stably. 'mergesort' or 'stable' will result in stable reorder.
na_position
{'first', 'last'}, default 'last'
Puts NaNs at the beginning if first; last puts NaNs at the end.
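Example (a sketch in plain pandas covering per-column sort order and na_position; it assumes BigQuery DataFrames orders rows the same way, and the data is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"g": ["b", "a", "a", "b"], "v": [1, 4, 2, 3]})

# Sort by "g" ascending, then by "v" descending within each group.
multi = df.sort_values(by=["g", "v"], ascending=[True, False])

nan_df = pd.DataFrame({"v": [2.0, np.nan, 1.0]})
nan_first = nan_df.sort_values(by="v", na_position="first")
nan_last = nan_df.sort_values(by="v")  # default: NaNs at the end
print(multi)
```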
stack
Stack the prescribed level(s) from columns to index.
Return a reshaped DataFrame or Series having a multi-level index with one or more new inner-most levels compared to the current DataFrame. The new inner-most levels are created by pivoting the columns of the current dataframe:
DataFrame or Series
Stacked dataframe or series. std
std(*, numeric_only: bool = False) -> bigframes.series.Series
Return sample standard deviation over requested axis.
Normalized by N-1 by default.
Parameter Name Descriptionnumeric_only
bool, default False
Include only float, int, boolean columns.
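Example (a sketch in plain pandas plus the standard library, illustrating the N-1 normalization; it assumes BigQuery DataFrames computes the same sample statistic):

```python
import math
import statistics

import pandas as pd

values = [2.0, 4.0, 4.0, 6.0]
df = pd.DataFrame({"v": values})

sample_std = df.std()["v"]                  # divides by N-1
population_std = statistics.pstdev(values)  # divides by N, for contrast

# Manual N-1 computation for comparison.
mean = sum(values) / len(values)
manual = math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))
print(sample_std, manual, population_std)
```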
sub
sub(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get subtraction of DataFrame and other, element-wise (binary operator -).
Equivalent to dataframe - other. With reverse version, rsub.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. subtract
subtract(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get subtraction of DataFrame and other, element-wise (binary operator -).
Equivalent to dataframe - other. With reverse version, rsub.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. sum
sum(*, numeric_only: bool = False) -> bigframes.series.Series
Return the sum of the values over the requested axis.
This is equivalent to the method numpy.sum.
numeric_only
bool, default False
Include only float, int, boolean columns.
tail
tail(n: int = 5) -> bigframes.dataframe.DataFrame
Return the last n rows.
This function returns the last n rows from the object based on position. It is useful for quickly verifying data, for example, after sorting or appending rows.
For negative values of n, this function returns all rows except the first |n| rows, equivalent to df[|n|:].
If n is larger than the number of rows, this function returns all rows.
Parameter Name Descriptionn
int, default 5
Number of rows to select.
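Example (a sketch in plain pandas covering positive, negative, and oversized n; it assumes BigQuery DataFrames selects rows the same way):

```python
import pandas as pd

df = pd.DataFrame({"v": [0, 1, 2, 3, 4, 5]})

last_two = df.tail(2)         # the last 2 rows
skip_first_two = df.tail(-2)  # everything except the first 2 rows
everything = df.tail(100)     # n larger than the row count returns all rows
print(last_two["v"].tolist(), skip_first_two["v"].tolist())
```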
to_csv
to_csv(
path_or_buf: str, sep=",", *, header: bool = True, index: bool = True
) -> None
Write object to a comma-separated values (csv) file on Cloud Storage.
Parameters Name Descriptionpath_or_buf
str
A destination URI of the Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files, and the size of the files varies. None, file-like objects, and local file paths are not yet supported.
index
bool, default True
If True, write row names (index).
Returns Type DescriptionNone
String output not yet supported. to_gbq
to_gbq(
destination_table: str,
*,
if_exists: typing.Optional[typing.Literal["fail", "replace", "append"]] = "fail",
index: bool = True,
ordering_id: typing.Optional[str] = None
) -> None
Write a DataFrame to a BigQuery table.
Parameters Name Descriptiondestination_table
str
Name of table to be written, in the form dataset.tablename or project.dataset.tablename.
if_exists
str, default 'fail'
Behavior when the destination table exists. Value can be one of: 'fail': if the table exists, raise pandas_gbq.gbq.TableCreationError; 'replace': if the table exists, drop it, recreate it, and insert data; 'append': if the table exists, insert data, and create the table if it does not exist.
index
bool, default True
Whether to write row names (index).
ordering_id
Optional[str], default None
If set, write the ordering of the DataFrame as a column in the result table with this name.
to_json
to_json(
path_or_buf: str,
orient: typing.Literal[
"split", "records", "index", "columns", "values", "table"
] = "columns",
*,
lines: bool = False,
index: bool = True
) -> None
Convert the object to a JSON string, written to Cloud Storage.
Note: NaN and None values are converted to null, and datetime objects are converted to UNIX timestamps.
Note: Only orient='records' and lines=True are supported so far.
Parameters Name Description path_or_buf
str
A destination URI of the Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. Must contain a wildcard * character. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files, and the size of the files varies. None, file-like objects, and local file paths are not yet supported.
orient
{'split', 'records', 'index', 'columns', 'values', 'table'}, default 'columns'
Indication of expected JSON string format.
Series: default is 'index'; allowed values are {'split', 'records', 'index', 'table'}.
DataFrame: default is 'columns'; allowed values are {'split', 'records', 'index', 'columns', 'values', 'table'}.
The format of the JSON string:
'split': dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
'records': list like [{column -> value}, ... , {column -> value}]
'index': dict like {index -> {column -> value}}
'columns': dict like {column -> {index -> value}}
'values': just the values array
'table': dict like {'schema': {schema}, 'data': {data}}, describing the data, where the data component is like orient='records'.
index
bool, default True
If True, write row names (index).
lines
bool, default False
If 'orient' is 'records' write out line-delimited json format. Will throw ValueError if incorrect 'orient' since others are not list-like.
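Example (a sketch of the line-delimited layout produced by orient='records' with lines=True. It uses plain pandas, which returns a string locally; BigQuery DataFrames writes to Cloud Storage instead, and the assumption here is only that the record layout is the same):

```python
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# One JSON object per line, one line per row.
text = df.to_json(orient="records", lines=True)
rows = [json.loads(line) for line in text.splitlines() if line]
print(text)
```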
Returns Type DescriptionNone
String output not yet supported. to_numpy
to_numpy(dtype=None, copy=False, na_value=None, **kwargs) -> numpy.ndarray
Convert the DataFrame to a NumPy array.
Parameters Name Descriptiondtype
None
The dtype to pass to numpy.asarray().
copy
bool, default False
Whether to ensure that the returned value is not a view on another array.
na_value
Any, default None
The value to use for missing values. The default value depends on dtype and the dtypes of the DataFrame columns.
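Example (a sketch in plain pandas showing na_value replacing a missing entry; it assumes BigQuery DataFrames fills missing values the same way after downloading the data):

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None], "b": [3, 4]})

# The missing entry in column "a" is replaced by 0.0 in the output array.
arr = df.to_numpy(na_value=0.0)
print(arr)
```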
Returns Type Descriptionnumpy.ndarray
The converted NumPy array. to_pandas
to_pandas(
max_download_size: typing.Optional[int] = None,
sampling_method: typing.Optional[str] = None,
random_state: typing.Optional[int] = None,
) -> pandas.core.frame.DataFrame
Write DataFrame to pandas DataFrame.
Parameters Name Descriptionmax_download_size
int, default None
Download size threshold in MB. If max_download_size is exceeded when downloading data (e.g., to_pandas()), the data will be downsampled if bigframes.options.sampling.enable_downsampling is True, otherwise, an error will be raised. If set to a value other than None, this will supersede the global config.
sampling_method
str, default None
Downsampling algorithms to be chosen from, the choices are: "head": This algorithm returns a portion of the data from the beginning. It is fast and requires minimal computations to perform the downsampling; "uniform": This algorithm returns uniform random samples of the data. If set to a value other than None, this will supersede the global config.
random_state
int, default None
The seed for the uniform downsampling algorithm. If provided, the uniform method may take longer to execute and require more computation. If set to a value other than None, this will supersede the global config.
Returns Type Descriptionpandas.DataFrame
A pandas DataFrame with all rows and columns of this DataFrame if the data_sampling_threshold_mb is not exceeded; otherwise, a pandas DataFrame with downsampled rows and all columns of this DataFrame. to_parquet
to_parquet(path: str, *, index: bool = True) -> None
Write a DataFrame to the binary Parquet format.
This function writes the dataframe as a parquet file (https://parquet.apache.org/) to Cloud Storage.
path
str
Destination URI(s) of the Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files, and the size of the files varies.
index
bool, default True
If True, include the dataframe's index(es) in the file output. If False, they will not be written to the file.
truediv(
other: float | int | bigframes.series.Series | DataFrame,
axis: str | int = "columns",
) -> DataFrame
Get floating division of DataFrame and other, element-wise (binary operator /).
Equivalent to dataframe / other. With reverse version, rtruediv.
Among flexible wrappers (add, sub, mul, div, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.
other
float, int, or Series
Any single or multiple element data structure, or list-like object.
axis
{0 or 'index', 1 or 'columns'}
Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.
Returns Type DescriptionDataFrame
DataFrame result of the arithmetic operation. value_counts
value_counts(
subset: typing.Optional[
typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]
] = None,
normalize: bool = False,
sort: bool = True,
ascending: bool = False,
dropna: bool = True,
)
Return a Series containing counts of unique rows in the DataFrame.
Parameters Name Descriptionsubset
label or list of labels, optional
Columns to use when counting unique combinations.
normalize
bool, default False
Return proportions rather than frequencies.
sort
bool, default True
Sort by frequencies.
ascending
bool, default False
Sort in ascending order.
dropna
bool, default True
Don’t include counts of rows that contain NA values.
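Example (a sketch in plain pandas showing dropna and normalize; it assumes BigQuery DataFrames counts rows the same way, and the data is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"x": ["a", "a", "b", None], "y": [1, 1, 2, 3]})

# dropna=True (the default) excludes the row that contains a missing value.
counts = df.value_counts()

# normalize=True returns proportions of the counted rows instead.
shares = df.value_counts(normalize=True)
print(counts)
```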
Returns Type DescriptionSeries
Series containing counts of unique rows in the DataFrame var
var(*, numeric_only: bool = False) -> bigframes.series.Series
Return unbiased variance over requested axis.
Normalized by N-1 by default.
Parameter Name Descriptionnumeric_only
bool, default False
Include only float, int, boolean columns.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-12 UTC.