We have previously seen how to use the function numpy.linalg.lstsq(...) to solve an over-determined system. This time, we'll use it to estimate the parameters of a regression line.
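A linear regression line is of the form $w_1 x + w_2 = y$, and it is the line that minimizes the sum of the squares of the vertical distances from each data point to the line. So, given $n$ pairs of data $(x_i, y_i)$, the parameters that we are looking for are $w_1$ and $w_2$ which minimize the error

$$E = \sum_{i=1}^{n} \left( y_i - (w_1 x_i + w_2) \right)^2$$

and we can compute the parameter vector $w = (w_1, w_2)^T$ as the least-squares solution of the following over-determined system

$$\begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$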
Let's use numpy to compute the regression line:
    from numpy import arange, array, ones, linalg
    from pylab import plot, show

    xi = arange(0, 9)  # linearly generated sequence
    A = array([xi, ones(9)])  # rows: x values and ones; A.T is the system matrix
    y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

    # obtaining the parameters (rcond=None avoids the FutureWarning in recent NumPy)
    w = linalg.lstsq(A.T, y, rcond=None)[0]

    # plotting the line
    line = w[0] * xi + w[1]  # regression line
    plot(xi, line, 'r-', xi, y, 'o')
    show()
We can see the result in the plot below.
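As a sanity check, the parameters returned by lstsq can be compared with the ones obtained from the normal equations, $A^T A w = A^T y$. This is only a minimal sketch on the same data as above: the normal equations are not used in the original example, and the helper name squared_error is made up for illustration.

```python
import numpy as np

# Same data as in the example above.
xi = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# System matrix: one row (x_i, 1) per data point.
A = np.column_stack([xi, np.ones(len(xi))])

# Least-squares solution computed by numpy.
w_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

# Same solution from the normal equations (A^T A) w = A^T y.
w_normal = np.linalg.solve(A.T @ A, A.T @ y)


def squared_error(w):
    """Sum of squared vertical distances between the data and the line (hypothetical helper)."""
    residuals = y - (w[0] * xi + w[1])
    return float(np.sum(residuals ** 2))


print("lstsq:           ", w_lstsq)
print("normal equations:", w_normal)
print("squared error:   ", squared_error(w_lstsq))
```

Moving either parameter away from the computed values should make squared_error larger, which is what "least squares" means here. For badly conditioned problems lstsq is generally the safer choice, since forming $A^T A$ squares the condition number.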
Update: the same result could be achieved using the function scipy.stats.linregress (thanks ianalis!):

    from numpy import arange
    from pylab import plot, show
    from scipy import stats

    xi = arange(0, 9)  # linearly generated sequence
    y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

    slope, intercept, r_value, p_value, std_err = stats.linregress(xi, y)
    print('r value', r_value)
    print('p_value', p_value)
    print('standard error', std_err)  # standard error of the estimated slope

    line = slope * xi + intercept
    plot(xi, line, 'r-', xi, y, 'o')
    show()
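To check that the two approaches really agree, one could run both on the same data and compare the parameters. The snippet below is a small sketch of that comparison, assuming numpy and scipy are both installed; the variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

xi = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Parameters from the least-squares solver.
A = np.column_stack([xi, np.ones(len(xi))])
w = np.linalg.lstsq(A, y, rcond=None)[0]

# Parameters from scipy.stats.linregress.
slope, intercept, r_value, p_value, std_err = stats.linregress(xi, y)

print("lstsq:      slope=%.4f intercept=%.4f" % (w[0], w[1]))
print("linregress: slope=%.4f intercept=%.4f" % (slope, intercept))
print("agree:", np.allclose(w, [slope, intercept]))
```

linregress also reports the correlation coefficient, the p-value and the standard error of the slope, which lstsq does not compute, so it may be the more convenient choice when only a single straight line is being fitted.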