In 0.13 I could pass a window length greater than the length of the Series passed to rolling_var (or, of course, rolling_std). In 0.14 that raises an error. Behavior is unchanged from 0.13 for other rolling functions:
    data = """
    x
    0.1
    0.5
    0.3
    0.2
    0.7
    """
    df = pd.read_csv(StringIO(data),header=True)

    >>> pd.rolling_mean(df['x'],window=6,min_periods=2)
    0      NaN
    1    0.300
    2    0.300
    3    0.275
    4    0.360
    dtype: float64

    >>> pd.rolling_skew(df['x'],window=6,min_periods=2)
    0             NaN
    1             NaN
    2    3.903128e-15
    3    7.528372e-01
    4    6.013638e-01
    dtype: float64

    >>> pd.rolling_skew(df['x'],window=6,min_periods=6)
    0   NaN
    1   NaN
    2   NaN
    3   NaN
    4   NaN
    dtype: float64
Those work, but not rolling_var:
    >>> pd.rolling_var(df['x'],window=6,min_periods=2)
    Traceback (most recent call last):
      File "./foo.py", line 187, in <module>
        print pd.rolling_var(df['x'],window=6,min_periods=2)
      File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 594, in f
        center=center, how=how, **kwargs)
      File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 346, in _rolling_moment
        result = calc(values)
      File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 340, in <lambda>
        **kwds)
      File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 592, in call_cython
        return func(arg, window, minp, **kwds)
      File "algos.pyx", line 1177, in pandas.algos.roll_var (pandas/algos.c:28449)
    IndexError: Out of bounds on buffer access (axis 0)
If this is the new desired default behavior for the rolling functions, I can work around it. I do like the behavior of rolling_skew and rolling_mean better. It was nice default behavior for me when I was doing rolling standard deviations for reasonably large financial data panels.
It looks to me like the issue is caused by how the 0.14 algo for rolling variance is implemented: the initial loop in roll_var (algos.pyx) runs all the way to win, even when win > N.
It looks to me like the other rolling functions implement their algos so that the first loop counts over the following:
for i from 0 <= i < minp - 1:
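To make the bounds problem concrete, here is a minimal sketch in plain Python. It is not the Cython code from algos.pyx; the function name rolling_var_sketch and its parameters are hypothetical, and it assumes the input has no NaNs and that minp <= win. The only point is that clamping the first loop to min(win, N) lets a window longer than the Series behave like an expanding window instead of indexing past the end.

```python
import math


def rolling_var_sketch(values, win, minp, ddof=1):
    """Rolling sample variance that tolerates win > len(values).

    Hypothetical illustration only: assumes no NaNs and minp <= win.
    """
    n = len(values)
    out = [math.nan] * n
    nobs = 0
    mean = 0.0
    ssqdm = 0.0  # running sum of squared differences from the mean

    # Growing phase: observations are only added (Welford update).
    # Clamping the bound to n is what prevents the out-of-bounds access
    # when win > n; an unclamped range(win) would read values[n].
    for i in range(min(win, n)):
        x = values[i]
        nobs += 1
        delta = x - mean
        mean += delta / nobs
        ssqdm += delta * (x - mean)
        if nobs >= minp and nobs > ddof:
            out[i] = ssqdm / (nobs - ddof)

    # Sliding phase: add values[i] and drop values[i - win].
    # This range is empty whenever win >= n.
    for i in range(win, n):
        new, old = values[i], values[i - win]
        delta = new - old
        old_mean = mean
        mean += delta / nobs
        ssqdm += delta * (new - mean + old - old_mean)
        if nobs >= minp and nobs > ddof:
            out[i] = max(ssqdm, 0.0) / (nobs - ddof)

    return out


# Same data as above, with a window longer than the input.
x = [0.1, 0.5, 0.3, 0.2, 0.7]
result = rolling_var_sketch(x, win=6, minp=2)
# result is NaN followed by approximately 0.08, 0.04, 0.0292, 0.058
```

With win=6 and five observations the second loop never runs, so every output is just the variance of all data seen so far, which matches how rolling_mean and rolling_skew treat an oversized window in the examples above.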
    >>> pd.show_versions()

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 2.7.5.final.0
    python-bits: 64
    OS: Linux
    OS-release: 3.13.10-200.fc20.x86_64
    machine: x86_64
    processor: x86_64
    byteorder: little
    LC_ALL: None
    LANG: en_US.UTF-8

    pandas: 0.14.0
    nose: 1.3.1
    Cython: 0.20.1
    numpy: 1.8.1
    scipy: 0.13.3
    statsmodels: 0.6.0.dev-b52bc09
    IPython: 2.0.0
    sphinx: 1.2.2
    patsy: 0.2.1
    scikits.timeseries: None
    dateutil: 2.2
    pytz: 2014.3
    bottleneck: 0.8.0
    tables: None
    numexpr: 2.4
    matplotlib: 1.3.1
    openpyxl: 1.8.5
    xlrd: 0.9.3
    xlwt: 0.7.5
    xlsxwriter: 0.5.3
    lxml: 3.3.5
    bs4: 4.3.2
    html5lib: 0.999
    bq: None
    apiclient: None
    rpy2: None
    sqlalchemy: 0.9.4
    pymysql: None
    psycopg2: None
    Non
Karl D.