Sparse appears to handle missing values (NaN) and fill_value confusingly. Based on the docs, I understand that fill_value is a user-specified value that is omitted from the sparse internal representation, and that it may differ from missing (NaN).
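To make the intended semantics concrete, here is a minimal pure-Python sketch of a sparse container that omits only fill_value and stores NaN like any other value. It is not how pandas implements SparseArray; the helper names to_sparse and to_dense and the dict layout are hypothetical, chosen only for illustration.

```python
import math


def to_sparse(values, fill_value):
    """Hypothetical helper: keep only (position, value) pairs that differ from fill_value."""
    def is_fill(v):
        # NaN never compares equal to itself, so a NaN fill_value needs an explicit check.
        if isinstance(fill_value, float) and math.isnan(fill_value):
            return isinstance(v, float) and math.isnan(v)
        return v == fill_value

    stored = [(i, v) for i, v in enumerate(values) if not is_fill(v)]
    return {"length": len(values), "fill_value": fill_value, "stored": stored}


def to_dense(sparse):
    """Hypothetical helper: rebuild the dense list, filling omitted positions with fill_value."""
    dense = [sparse["fill_value"]] * sparse["length"]
    for i, v in sparse["stored"]:
        dense[i] = v
    return dense


nan = float("nan")
sparse = to_sparse([1.0, nan, 0.0, 3.0, nan], fill_value=0.0)
dense = to_dense(sparse)
# Positions 1 and 4 stay NaN; only position 2 (the fill_value) was omitted.
```

Under this reading, a round trip through the sparse form never turns NaN into fill_value, which is the behaviour this report expects.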
import numpy as np
import pandas as pd

# NG: the 2nd and last elements must be NaN
pd.SparseArray([1, np.nan, 0, 3, np.nan], fill_value=0).to_dense()
# array([ 1., 0., 0., 3., 0.])
# NG: the 2nd element must be NaN
orig = pd.Series([1, np.nan, 0, 3, np.nan], index=list('ABCDE'))
sparse = orig.to_sparse(fill_value=0)
sparse.reindex(['A', 'B', 'C'])
# A 1.0
# B 0.0
# C 0.0
# dtype: float64
# BlockIndex
# Block locations: array([0], dtype=int32)
# Block lengths: array([1], dtype=int32)
Expected Output
pd.SparseArray([1, np.nan, 0, 3, np.nan], fill_value=0).to_dense()
# array([ 1., np.nan, 0., 3., np.nan])
sparse = orig.to_sparse(fill_value=0)
sparse.reindex(['A', 'B', 'C'])
# A 1.0
# B NaN
# C 0.0
# dtype: float64
# BlockIndex
# Block locations: array([0], dtype=int32)
# Block lengths: array([1], dtype=int32)
output of pd.show_versions()
Current master.
The fix itself looks straightforward, but it breaks some tests that use dubious comparisons.