Showing content from https://github.com/pandas-dev/pandas/issues/9512 below:

Non-monotonic-increasing DatetimeIndex claims not to __contain__ duplicate entries · Issue #9512 · pandas-dev/pandas · GitHub

This was fun to debug.

In [1]: import pandas as pd

In [2]: 0 in pd.Int64Index([0, 0, 1])
Out[2]: True

In [3]: 0 in pd.Int64Index([0, 1, 0])
Out[3]: True

In [4]: 0 in pd.Int64Index([0, 0, -1])
Out[4]: True

In [5]: pd.Timestamp(0) in pd.DatetimeIndex([0, 1, -1])
Out[5]: True

In [6]: pd.Timestamp(0) in pd.DatetimeIndex([0, 1, 0])
Out[6]: False   # BAD

In [7]: pd.Timestamp(0) in pd.DatetimeIndex([0, 0, 1])
Out[7]: True

In [8]: pd.Timestamp(0) in pd.DatetimeIndex([0, 0, -1])
Out[8]: False   # BAD

TimedeltaIndex is also broken.

The problem is in DatetimeIndexOpsMixin.__contains__, which checks the type of idx.get_loc(key) to determine whether the key was found in the index. If the index contains duplicate entries and is not monotonic increasing (for some reason, monotonic decreasing doesn't cut it), get_loc eventually falls back to Int64Engine._maybe_get_bool_indexer, which returns an ndarray of bools if the key is duplicated. Since the original __contains__ method is looking for scalars or slices, it reports that the duplicated entry is not present.
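To make the failure mode concrete, here is a minimal pure-Python sketch of the logic described above. It is not the pandas source: `toy_get_loc`, `buggy_contains` and `tolerant_contains` are hypothetical names invented for illustration, and the toy lookup only imitates the three result shapes mentioned here (a scalar position, a slice, or a boolean mask).

```python
def toy_get_loc(values, key):
    """Hypothetical stand-in for get_loc (an assumption, not pandas code).

    Returns an int for a single match, a slice when the values are
    monotonic increasing, and a list of bools otherwise.
    """
    positions = [i for i, v in enumerate(values) if v == key]
    if not positions:
        raise KeyError(key)
    if len(positions) == 1:
        return positions[0]
    is_increasing = all(a <= b for a, b in zip(values, values[1:]))
    if is_increasing:
        # Duplicates are contiguous in a sorted sequence, so a slice suffices.
        return slice(positions[0], positions[-1] + 1)
    # Duplicated key in an unsorted sequence: fall back to a boolean mask.
    return [v == key for v in values]


def buggy_contains(values, key):
    """Imitates the reported check: only scalars or slices count as found."""
    try:
        result = toy_get_loc(values, key)
    except KeyError:
        return False
    return isinstance(result, (int, slice))


def tolerant_contains(values, key):
    """One possible alternative: any successful lookup counts as found."""
    try:
        toy_get_loc(values, key)
    except KeyError:
        return False
    return True


for values in ([0, 1, -1], [0, 1, 0], [0, 0, 1], [0, 0, -1]):
    print(values, buggy_contains(values, 0), tolerant_contains(values, 0))
```

Under these assumptions the buggy check reproduces the pattern in the session above: it reports False for `[0, 1, 0]` and `[0, 0, -1]`, where the lookup returns a mask, and True for the other two. Whether treating every non-raising lookup as a hit is the right fix for pandas itself is a separate question; the sketch only shows why a type check on the lookup result misses the mask case.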

