This was fun to debug.
In [1]: import pandas as pd In [2]: 0 in pd.Int64Index([0, 0, 1]) Out[2]: True In [3]: 0 in pd.Int64Index([0, 1, 0]) Out[3]: True In [4]: 0 in pd.Int64Index([0, 0, -1]) Out[4]: True In [5]: pd.Timestamp(0) in pd.DatetimeIndex([0, 1, -1]) Out[5]: True In [6]: pd.Timestamp(0) in pd.DatetimeIndex([0, 1, 0]) Out[6]: False # BAD In [7]: pd.Timestamp(0) in pd.DatetimeIndex([0, 0, 1]) Out[7]: True In [8]: pd.Timestamp(0) in pd.DatetimeIndex([0, 0, -1]) Out[8]: False # BAD
TimedeltaIndex is also broken.
The problem is in DatetimeIndexOpsMixin.__contains__
, which checks the type of idx.get_loc(key)
to determine whether the key was found in the index. If the index contains duplicate entries and is not monotonic increasing (for some reason, monotonic decreasing doesn't cut it), get_loc
eventually falls back to Int64Engine._maybe_get_bool_indexer
, which returns an ndarray of bools if the key is duplicated. Since the original __contains__
method is looking for scalars or slices, it reports that the duplicated entry is not present.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4