I am trying to calculate Volume Weighted Average Price on a rolling basis.
To do this, I have a function vwap that does this for me, like so:
def vwap(bars):
return ((bars.Close*bars.Volume).sum()/bars.Volume.sum()).round(2)
When I try to use this function with rolling_apply, as shown, I get an error:
import pandas.io.data as web
bars = web.DataReader('AAPL','yahoo')
print pandas.rolling_apply(bars,30,vwap)
AttributeError: 'numpy.ndarray' object has no attribute 'Close'
The error makes sense to me because the rolling_apply requires not DataSeries or a ndarray as an input and not a dataFrame.. the way I am doing it.
Is there a way to use rolling_apply to a DataFrame to solve my problem?
asked Oct 1, 2013 at 16:56
nitinnitin7,4081111 gold badges4242 silver badges5353 bronze badges
1This is not directly enabled, but you can do it like this
In [29]: bars
Out[29]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 942 entries, 2010-01-04 00:00:00 to 2013-09-30 00:00:00
Data columns (total 6 columns):
Open 942 non-null values
High 942 non-null values
Low 942 non-null values
Close 942 non-null values
Volume 942 non-null values
Adj Close 942 non-null values
dtypes: float64(5), int64(1)
window=30
In [30]: concat([ (Series(vwap(bars.iloc[i:i+window]),
index=[bars.index[i+window]])) for i in xrange(len(df)-window) ])
Out[30]:
2010-02-17 203.21
2010-02-18 202.95
2010-02-19 202.64
2010-02-22 202.41
2010-02-23 202.19
2010-02-24 201.85
2010-02-25 201.65
2010-02-26 201.50
2010-03-01 201.31
2010-03-02 201.35
2010-03-03 201.42
2010-03-04 201.09
2010-03-05 200.95
2010-03-08 201.50
2010-03-09 202.02
...
2013-09-10 485.94
2013-09-11 487.38
2013-09-12 486.77
2013-09-13 487.23
2013-09-16 487.20
2013-09-17 486.09
2013-09-18 485.52
2013-09-19 485.30
2013-09-20 485.37
2013-09-23 484.87
2013-09-24 485.81
2013-09-25 486.41
2013-09-26 486.07
2013-09-27 485.30
2013-09-30 484.74
Length: 912
answered Oct 1, 2013 at 17:27
JeffJeff129k2121 gold badges223223 silver badges189189 bronze badges
1A cleaned up version for reference, hopefully got the indexing correct:
def myrolling_apply(df, N, f, nn=1):
ii = [int(x) for x in arange(0, df.shape[0] - N + 1, nn)]
out = [f(df.iloc[i:(i + N)]) for i in ii]
out = pandas.Series(out)
out.index = df.index[N-1::nn]
return(out)
answered Sep 2, 2014 at 21:22
safetyducksafetyduck6,9001515 gold badges6363 silver badges115115 bronze badges
0Modified @mathtick's answer to include na_fill
. Also note that your function f
needs to return a single value, this can't return a dataframe with multiple columns.
def rolling_apply_df(dfg, N, f, nn=1, na_fill=True):
ii = [int(x) for x in np.arange(0, dfg.shape[0] - N + 1, nn)]
out = [f(dfg.iloc[i:(i + N)]) for i in ii]
if(na_fill):
out = pd.Series(np.concatenate([np.repeat(np.nan, N-1),np.array(out)]))
out.index = dfg.index[::nn]
else:
out = pd.Series(out)
out.index = dfg.index[N-1::nn]
return(out)
answered Mar 15, 2017 at 2:26
citynormancitynorman5,36244 gold badges4747 silver badges4141 bronze badges
Start asking to get answers
Find the answer to your question by asking.
Ask questionExplore related questions
See similar questions with these tags.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4