
Showing content from https://github.com/pandas-dev/pandas/issues/49631 below:

Series[dt64].astype(int64) vs Series[Sparse[dt64]].astype(int64) · Issue #49631 · pandas-dev/pandas

dti = pd.date_range("2016-01-01", periods=3)
ser = pd.Series(dti)
ser[0] = pd.NaT

dense = ser._values
sparse = pd.core.arrays.SparseArray(ser.values)

>>> dense.astype("int64")
array([-9223372036854775808,  1451692800000000000,  1451779200000000000])

>>> sparse.astype("int64")
[...]
ValueError: Cannot convert NaT values to integer

>>> sparse.astype("Sparse[int64]")
[0, 1451692800000000000, 1451779200000000000]
Fill: 0
IntIndex
Indices: array([1, 2], dtype=int32)

The dense version goes through DatetimeArray.astype, for which .astype to int64 is basically a view (xref #45034), so NaT comes out as the int64 minimum. The Sparse version goes through astype_nansafe, which specifically checks for NaTs when going from dt64->int64 and raises. I expected this to match the non-sparse behavior.
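To illustrate the "basically a view" point with plain NumPy (an illustrative sketch, not the pandas code path itself): reinterpreting datetime64[ns] data as int64 exposes NaT as the smallest int64 value, which is the first element in the dense output above. The names arr, as_ints and nat_sentinel are chosen here for illustration only.

```python
import numpy as np

# Same three values as in the report, built directly in NumPy.
arr = np.array(["NaT", "2016-01-02", "2016-01-03"], dtype="datetime64[ns]")

# A view reinterprets the underlying 8-byte integers without any NaT check.
as_ints = arr.view("int64")

# NaT is stored as the minimum int64 value.
nat_sentinel = np.iinfo(np.int64).min

print(as_ints)
print(as_ints[0] == nat_sentinel)
```

Because nothing inspects the values during a view, the NaT sentinel passes straight through as an ordinary integer.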

When converting to Sparse[int64], we only call astype_nansafe on the non-NaT elements, so it doesn't raise. But when converting the fill_value from NaT, it somehow gets 0, whereas I expected that to raise.
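A rough sketch of why the Sparse[int64] path avoids the error: if the NaT check only ever sees the stored (non-NaT) values, it has nothing to object to. The helper nansafe_to_int64 below is a hypothetical stand-in written for this example; it is not pandas' actual astype_nansafe, and it does not model how the fill_value is converted.

```python
import numpy as np

def nansafe_to_int64(values):
    """Hypothetical NaT-checking cast, assumed for illustration only."""
    if np.isnat(values).any():
        raise ValueError("Cannot convert NaT values to integer")
    return values.view("int64")

arr = np.array(["NaT", "2016-01-02", "2016-01-03"], dtype="datetime64[ns]")
mask = np.isnat(arr)

# Only the non-NaT entries are passed through the check, so it succeeds.
sp_values = nansafe_to_int64(arr[~mask])

# Passing the full array, NaT included, trips the check.
try:
    nansafe_to_int64(arr)
    raised = False
except ValueError:
    raised = True

print(sp_values, raised)
```

Under this reading, the fill_value is the only place where NaT still has to be converted, and that conversion is the step producing 0 in the report.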

Side-notes:
ser.astype(pd.SparseDtype(ser.dtype)) raises, as does dense.astype("Sparse[int64]")

