

Slowness in multi-level indexes with datetime levels · Issue #8543 · pandas-dev/pandas · GitHub

A MultiIndex with a DatetimeIndex level is slower than a similar index with numeric levels:

import pandas as pd
lev1 = range(10000)
lev2 = range(100)
mi = pd.MultiIndex.from_product([lev1, lev2])
%time mi.values

CPU times: user 571 ms, sys: 41 ms, total: 612 ms
Wall time: 612 ms

lev1 = range(10000)
lev2 = pd.date_range('1/1/2014', periods=100)
mi = pd.MultiIndex.from_product([lev1, lev2])
%time mi.values

CPU times: user 2.51 s, sys: 68 ms, total: 2.58 s
Wall time: 2.58 s

The overhead comes from boxing the level values (converting each datetime64 value into a Timestamp object) when generating the tuples for the values property. It can be minimized by boxing once per distinct level value rather than once per occurrence of that value in the tuples.
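To illustrate the idea, here is a minimal sketch of building the tuples while boxing each distinct level value only once. It is not the eventual PR. The helper names `boxed_level_values` and `tuples_boxing_once` are hypothetical, the sketch assumes the index has no missing entries (no code of -1), and it uses the `codes` attribute, which is the name in recent pandas releases (older releases called it `labels`).

```python
import numpy as np
import pandas as pd


def boxed_level_values(mi, level):
    """Hypothetical helper: object array of one level's values,
    boxing each distinct value only once."""
    # Box the distinct values of the level a single time
    # (for a datetime level this yields one Timestamp per distinct date).
    boxed = np.array(list(mi.levels[level]), dtype=object)
    # Repeat the already-boxed objects using the integer codes.
    # Assumption: no missing entries, i.e. no code equals -1.
    return boxed[mi.codes[level]]


def tuples_boxing_once(mi):
    """Hypothetical helper: list of tuples equivalent to mi.values."""
    columns = [boxed_level_values(mi, i) for i in range(mi.nlevels)]
    return list(zip(*columns))


lev1 = range(1000)
lev2 = pd.date_range('1/1/2014', periods=100)
mi = pd.MultiIndex.from_product([lev1, lev2])

tuples = tuples_boxing_once(mi)
```

With 1000 x 100 entries, the datetime level is boxed 100 times here instead of 100,000 times; the rest of the work is integer indexing into an object array.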

I can send in a PR shortly.
