Showing content from https://github.com/pandas-dev/pandas/issues/54938 below:

Series.struct accessor with Series.struct.field("sub-column name") for ArrowDtype · Issue #54938 · pandas-dev/pandas · GitHub

Feature Type

Problem Description

When I have a Series of type ArrowDtype(struct(...)), I'd like to be able to extract sub-fields from it.

For example, I have a pandas Series with dtype ArrowDtype(pyarrow.struct([("int_col", pyarrow.int64()), ("string_col", pyarrow.string())])). I'd like to extract just the int_col field from this Series as another Series.

Feature Description

Add a struct accessor which is accessible from Series with ArrowDtype(struct(...)). This struct accessor provides a field() method which returns a Series containing only the specified sub-field.

series = pandas.Series(struct_array, dtype=pandas.ArrowDtype(struct_type))

int_series = series.struct.field("int_col")

Alternative Solutions

I can currently do this via pyarrow.compute.struct_field on the underlying pyarrow array:

import pyarrow
import pyarrow.compute  # struct_field lives in the compute submodule, so import it explicitly
struct_type = pyarrow.struct([
    ("int_col", pyarrow.int64()),
    ("string_col", pyarrow.string()),
])
struct_array = pyarrow.array([
    {"int_col": 1, "string_col": "a"},
    {"int_col": 2, "string_col": "b"},
    {"int_col": 3, "string_col": "c"},
], type=struct_type)

import pandas
series = pandas.Series(struct_array, dtype=pandas.ArrowDtype(struct_type))

int_col_index = struct_array.type.get_field_index("int_col")
int_col_series = pandas.Series(
    pyarrow.compute.struct_field(struct_array, [int_col_index]),
    dtype=pandas.ArrowDtype(struct_array.type[int_col_index].type))

Additional Context

This issue is particularly relevant when working with data sources that support struct fields, such as BigQuery or Parquet.
