Regression is a mathematical process used to model data by identifying a function that best represents its patterns. In machine learning, regression functions are used for predictive analysis.
There are various regression techniques, and the choice among them depends on factors such as the distribution of the data. A simple form is linear regression, represented by the equation:
y = a\*x + b
Visualizing this equation as a straight line on a 2D graph:

- `y`: The dependent (outcome) variable, plotted on the y-axis (vertical).
- `x`: The independent (predictor) variable, plotted on the x-axis (horizontal).
- `b`: The intercept, representing the value of `y` when `x = 0`.
- `a`: The slope, indicating how `y` changes when `x` increases by one unit.

The following code predicts a person’s weight based on a person’s height:
```python
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Sample data: heights in cm and weights in kg
heights = [150, 152, 160, 172, 176, 176, 180, 189]
weights = [50, 65, 65, 70, 80, 90, 90, 89]
measurements = pd.DataFrame({'height': heights, 'weight': weights})

# Fit an ordinary least squares model: weight = a * height + b
model = sm.OLS.from_formula("weight ~ height", data=measurements)
results = model.fit()
print(results.summary())

# Plot the raw data and the fitted regression line
plt.scatter(measurements['height'], measurements['weight'], label='Data')
plt.plot(measurements['height'], results.predict(measurements), color='red', label='Regression Line')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title('Height vs Weight with Regression Line')
plt.legend()
plt.savefig('height-vs-weight-plot.png')
plt.show()
```
This code performs linear regression using `statsmodels` to analyze the relationship between height and weight. It fits a model of the form `weight = a * height + b`, prints the regression summary, and visualizes the data with a scatter plot and a best-fit line.
Running this code prints the regression summary to the console, saves the plot as `height-vs-weight-plot.png`, and displays it.
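To see where the slope and intercept of such a fit come from, the closed-form least-squares estimates for a single predictor can be computed by hand. The following is a minimal sketch using only the Python standard library and the same height and weight data. The helper `predict_weight` and the 170 cm input are hypothetical, added for illustration. The resulting `a` and `b` should agree with the `height` and `Intercept` coefficients in the regression summary up to rounding.

```python
# Closed-form least-squares fit of weight = a * height + b (standard library only).
heights = [150, 152, 160, 172, 176, 176, 180, 189]
weights = [50, 65, 65, 70, 80, 90, 90, 89]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Slope: sum of cross-deviations divided by the sum of squared x-deviations.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
s_xx = sum((x - mean_x) ** 2 for x in heights)
a = s_xy / s_xx

# Intercept: chosen so the line passes through the point (mean_x, mean_y).
b = mean_y - a * mean_x


def predict_weight(height_cm):
    """Evaluate the fitted line at a given height (hypothetical helper)."""
    return a * height_cm + b


# 170 cm is an arbitrary example input, not a value from the dataset.
print(f"slope a = {a:.3f}, intercept b = {b:.3f}")
print(f"predicted weight at 170 cm: {predict_weight(170):.1f} kg")
```

Because the slope is the change in `y` per unit of `x`, `predict_weight(171) - predict_weight(170)` equals `a`, and `predict_weight(0)` equals `b`.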