Showing content from https://stackoverflow.com/questions/79720251/polars-schema-override-for-datetimes-as-string below:

python - Polars schema_override for Datetimes as string

Issue

I have data in the form of a list of dicts (see the MRE below). To keep everything strictly typed, I would always like to pass in the expected schema (dtypes) when I read in this data. The pl.DataFrame constructor offers this through either schema or schema_overrides. However, I frequently run into trouble with the Datetime columns in the schema, especially when they are presented as strings in the dictionaries.

Traceback

polars.exceptions.ComputeError: could not append value: "2020-02-11" of type: str to the builder; make sure that all rows have the same schema or consider increasing `infer_schema_length`

Question

Is there a way to "automatically" parse datetime strings when I construct the DataFrame (or use the pl.from_dicts() method)? Something comparable to the solution implemented in early 2024 for data that is present as integer timestamps in the dictionaries (github issue)?

Is there something similar for date information present as string (e.g. 2022-01-01)?

Or do I have to drop every pl.Datetime key from my schema_override and then convert those columns manually afterwards via

with_columns(pl.col(list_dropped_datetime_cols).cast(pl.Datetime))

MRE

import polars as pl

schema_override = {
    "some_int_override": pl.Int8,
    "some_date_override": pl.Datetime,
}

dict_data = [
    {
        "some_int_override": 1,
        "some_date_override": "2020-02-11",
        "some_date": "2025-02-11",
    }
]


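# Without a schema, both date columns are inferred as strings.
df_naive = pl.DataFrame(dict_data)
print(df_naive)

# Raises the ComputeError shown above: "2020-02-11" is a str, not a Datetime.
df_schema_override = pl.DataFrame(dict_data, schema_overrides=schema_override)
print(df_schema_override)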