Showing content from https://github.com/pandas-dev/pandas/issues/7297 below:

Change in behavior for rolling_var when win

In 0.13 I could pass a window length greater than the length of the Series passed to rolling_var (or, of course, rolling_std). In 0.14 that raises an error. Behavior is unchanged from 0.13 for other rolling functions:

from StringIO import StringIO  # Python 2.7, as in the versions listed below
import pandas as pd

data = """
x
0.1
0.5
0.3
0.2
0.7
"""

df = pd.read_csv(StringIO(data),header=True)

>>> pd.rolling_mean(df['x'],window=6,min_periods=2)

0      NaN
1    0.300
2    0.300
3    0.275
4    0.360
dtype: float64

>>> pd.rolling_skew(df['x'],window=6,min_periods=2)

0             NaN
1             NaN
2    3.903128e-15
3    7.528372e-01
4    6.013638e-01
dtype: float64

>>> pd.rolling_skew(df['x'],window=6,min_periods=6)

0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
dtype: float64

Those work, but not rolling_var:

>>> pd.rolling_var(df['x'],window=6,min_periods=2)

Traceback (most recent call last):
  File "./foo.py", line 187, in <module>
    print pd.rolling_var(df['x'],window=6,min_periods=2)
  File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 594, in f
    center=center, how=how, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 346, in _rolling_moment
    result = calc(values)
  File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 340, in <lambda>
    **kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/stats/moments.py", line 592, in call_cython
    return func(arg, window, minp, **kwds)
  File "algos.pyx", line 1177, in pandas.algos.roll_var (pandas/algos.c:28449)
IndexError: Out of bounds on buffer access (axis 0)

If this is the new desired default behavior for the rolling functions, I can work around it. I do like the behavior of rolling_skew and rolling_mean better. It was nice default behavior for me when I was doing rolling standard deviations for reasonably large financial data panels.
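One possible shape for such a workaround is sketched here. It rests on the observation that a window longer than the data never drops an observation, so clamping the window to the series length should give the same result. The helper name `safe_rolling` and the `func(values, window, min_periods)` calling convention are assumptions for illustration, not part of pandas; `func` stands in for a rolling function such as `pd.rolling_var`.

```python
import math


def safe_rolling(func, values, window, min_periods):
    """Call a rolling function with the window clamped to the data length.

    `func` is a hypothetical stand-in for a rolling function that takes
    (values, window, min_periods).
    """
    n = len(values)
    # If min_periods exceeds the data length, no position can ever have
    # enough observations, so the result is all NaN.
    if min_periods > n:
        return [math.nan] * n
    # A window longer than the data behaves like a window of length n.
    return func(values, min(window, n), min_periods)
```

The all-NaN branch mirrors what rolling_skew returns above for window=6, min_periods=6 on five observations. It returns a plain list, so a real wrapper would need to build a Series with the original index instead.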

It looks to me like the issue is caused by the fact that the 0.14 algo for rolling variance is implemented such that the initial loop (roll_var (algos.pyx)) is the following:

So it loops to win even when win > N.

It looks to me like the other rolling functions implement their algorithms so that the first loop runs over the following range:

for i from 0 <= i < minp - 1:
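To illustrate the loop-bound point, here is a minimal pure-Python sketch of a two-phase rolling variance whose first loop is clamped to the data length. It is not the pandas implementation: the function names, the running-sums formula, and the `ddof` parameter are assumptions chosen for brevity, and running sums are less numerically stable than an online update.

```python
import math


def _var(nobs, sx, sxx, minp, ddof):
    # Variance from running sums; NaN until enough observations are seen.
    if nobs < minp or nobs <= ddof:
        return math.nan
    return max(0.0, (sxx - sx * sx / nobs) / (nobs - ddof))


def rolling_var(values, win, minp, ddof=1):
    n = len(values)
    out = [math.nan] * n
    nobs, sx, sxx = 0, 0.0, 0.0

    # Phase 1: the window is still filling. Clamping the bound to n keeps
    # the loop inside the data when win > n.
    for i in range(min(win, n)):
        v = values[i]
        if not math.isnan(v):
            nobs += 1
            sx += v
            sxx += v * v
        out[i] = _var(nobs, sx, sxx, minp, ddof)

    # Phase 2: the window is full; add the newest value, drop the oldest.
    # This loop is empty when win >= n.
    for i in range(win, n):
        new, old = values[i], values[i - win]
        if not math.isnan(new):
            nobs += 1
            sx += new
            sxx += new * new
        if not math.isnan(old):
            nobs -= 1
            sx -= old
            sxx -= old * old
        out[i] = _var(nobs, sx, sxx, minp, ddof)

    return out
```

With the five values from the report, `rolling_var([0.1, 0.5, 0.3, 0.2, 0.7], 6, 2)` returns NaN at position 0 and a variance at every later position instead of running past the end of the data.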

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.5.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.10-200.fc20.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.14.0
nose: 1.3.1
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.13.3
statsmodels: 0.6.0.dev-b52bc09
IPython: 2.0.0
sphinx: 1.2.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.3
bottleneck: 0.8.0
tables: None
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.5.3
lxml: 3.3.5
bs4: 4.3.2
html5lib: 0.999
bq: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.4
pymysql: None
psycopg2: None
Non

Karl D.
