Showing content from https://www.geeksforgeeks.org/python/predicting-air-quality-index-using-python/ below:

Predicting Air Quality Index using Python

Last Updated : 27 May, 2025

Air pollution is a growing global concern, and with increasing industrialization and urbanization it has become crucial to monitor and predict air quality in real time. A standard way to quantify air pollution is the Air Quality Index (AQI). In this article, we will explore how to predict AQI using Python, leveraging data science tools and machine learning algorithms.

What is AQI?

The Air Quality Index (AQI) is a standardized indicator used to communicate how polluted the air currently is or how polluted it is forecast to become. The AQI is calculated based on pollutants such as:

PM2.5
PM10
NO2
SO2
CO
O3

Each pollutant has a sub-index, and the highest sub-index among them becomes the AQI. A pollutant's sub-index is obtained by linear interpolation between two breakpoints:

I = \frac{I_{HI} - I_{LO}}{BP_{HI} - BP_{LO}} \times (C - BP_{LO}) + I_{LO}

Where:

I is the sub-index (AQI value) for the pollutant
C is the concentration of the pollutant
BP_{HI}, BP_{LO} are the breakpoint concentrations just above and just below C
I_{HI}, I_{LO} are the AQI values corresponding to those breakpoints
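
The interpolation formula and the "highest sub-index wins" rule can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation: the function names `sub_index` and `overall_aqi` are hypothetical, and the breakpoint band in the example uses made-up numbers rather than values from any regulatory table.

```python
def sub_index(c, bp_lo, bp_hi, i_lo, i_hi):
    """Linearly interpolate a pollutant sub-index between two breakpoints."""
    if not bp_lo <= c <= bp_hi:
        raise ValueError("concentration lies outside the given breakpoint band")
    return (i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo


def overall_aqi(sub_indices):
    """Return the highest sub-index from a {pollutant: sub-index} mapping."""
    return max(sub_indices.values())


# Hypothetical band: concentrations 50..100 map to index values 51..100.
example = sub_index(c=75, bp_lo=50, bp_hi=100, i_lo=51, i_hi=100)
print(example)  # 75.5

# The overall AQI is the largest of the per-pollutant sub-indices.
print(overall_aqi({"pollutant_a": example, "pollutant_b": 42}))  # 75.5
```

In practice the breakpoints come from the table published by the relevant agency, and the band is chosen so that the measured concentration falls between BP_{LO} and BP_{HI}.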

The AQI value indicates how polluted the air is. The categories below follow the US EPA scale:

AQI Level                        AQI Range
Good                             0 - 50
Moderate                         51 - 100
Unhealthy for Sensitive Groups   101 - 150
Unhealthy                        151 - 200
Very Unhealthy                   201 - 300
Hazardous                        301+

Let's predict the AQI from the pollutant values using machine learning.

Data Set Description

The dataset contains 7 numeric attributes: AQI Value, CO AQI Value, Ozone AQI Value, NO2 AQI Value, PM2.5 AQI Value, lat, and lng. The four pollutant columns (CO AQI Value, Ozone AQI Value, NO2 AQI Value, and PM2.5 AQI Value) are the independent attributes used as features in this article. AQI Value is the dependent attribute, since the overall AQI is calculated from the pollutant values. The coordinate columns lat and lng are not used as features in the code below.

Because the data is numeric and has no missing values, little preprocessing is needed. Our goal is to predict the AQI, and since this target is a continuous value, the task is a regression problem rather than a classification problem.

Step-by-Step Process to Predict AQI

1. Importing Libraries

Python


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

2. Loading the Dataset

We’ll use a dataset with pollutant concentration levels and corresponding AQI values.

Python


# Load the dataset; adjust the path to wherever the CSV file is stored
data = pd.read_csv('air_quality_data.csv')
print(data.head())

3. Data Preprocessing

Drop any rows with missing values and normalize the column names (trimmed and lowercased) so they are easier to reference.

Python


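# Drop rows that contain missing values
data = data.dropna()
# Normalize column names: trim whitespace and convert to lowercase
data.columns = [col.strip().lower() for col in data.columns]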
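
To confirm that the cleaning step behaves as expected, it can help to count missing values and inspect the data types before and after. The sketch below runs the same two cleaning lines on a tiny, made-up frame; the values and the stray spaces in the first column name are illustrative assumptions, not taken from the real dataset.

```python
import pandas as pd

# Toy frame with made-up values; column names mimic the article's dataset.
toy = pd.DataFrame({
    " AQI Value ": [51, 41, None],
    "CO AQI Value": [1, 1, 2],
    "PM2.5 AQI Value": [51, 41, 66],
})

# Missing values per column, before cleaning
print(toy.isna().sum())

# Same cleaning steps as in the article
clean = toy.dropna()
clean.columns = [col.strip().lower() for col in clean.columns]

print(clean.columns.tolist())
# Data type of each column after cleaning
print(clean.dtypes)
```

Running the same `isna().sum()` and `dtypes` checks on the real `data` frame is a quick way to verify the claim that the dataset is numeric and complete.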

4. Exploratory Data Analysis (EDA)

Visualizing relationships between variables.

Python


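# Pairwise scatter plots of the columns
sns.pairplot(data)
plt.show()

# Correlation heatmap, restricted to numeric columns
corr = data.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()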

5. Feature Selection

Choose relevant features for training.

Python


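# Features: the four pollutant columns (names were lowercased in step 3)
X = data[['co aqi value', 'ozone aqi value', 'no2 aqi value', 'pm2.5 aqi value']]
# Target: the overall AQI value
y = data['aqi value']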

6. Train-Test Split

Python


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

7. Model Training (Random Forest)

Python


# Random forest with 100 trees; random_state makes the run reproducible
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
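
Once a random forest is fitted, its `feature_importances_` attribute gives a rough view of which pollutant columns drive the predictions. The sketch below is illustrative only: it trains on synthetic sub-indices generated with NumPy (an assumption, not the article's dataset), and the names `X_demo`, `y_demo`, and `demo_model` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

cols = ['co aqi value', 'ozone aqi value', 'no2 aqi value', 'pm2.5 aqi value']

# Synthetic sub-indices for illustration; the target is their row-wise maximum.
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(rng.integers(0, 200, size=(300, 4)), columns=cols)
y_demo = X_demo.max(axis=1)

demo_model = RandomForestRegressor(n_estimators=50, random_state=42)
demo_model.fit(X_demo, y_demo)

# Impurity-based importances, one per feature, normalized to sum to 1
importances = pd.Series(demo_model.feature_importances_, index=cols)
print(importances.sort_values(ascending=False))
```

With the real data, the same two lines applied to `model` and `X.columns` would show which pollutant dominates the AQI in that dataset.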

8. Model Evaluation

Python


# Predict AQI for the held-out test set
y_pred = model.predict(X_test)

print("Mean Absolute Error:", mean_absolute_error(y_test, y_pred))
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))
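
Since the AQI is described above as the highest pollutant sub-index, a useful sanity check is a baseline that takes the row-wise maximum of the feature columns. If that baseline already scores near-perfectly, a very high R2 from the model is expected rather than surprising. The rows below are made up for illustration; with the real data, `X_test` and `y_test` from the split would be used instead.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error

# Made-up rows for illustration only (not from the article's dataset).
X_demo = pd.DataFrame({
    'co aqi value': [1, 2, 1],
    'ozone aqi value': [36, 5, 120],
    'no2 aqi value': [0, 1, 9],
    'pm2.5 aqi value': [51, 41, 66],
})
y_demo = pd.Series([51, 41, 122])

# Baseline prediction: the largest sub-index in each row
baseline_pred = X_demo.max(axis=1)
print(baseline_pred.tolist())  # [51, 41, 120]
print("Baseline MAE:", mean_absolute_error(y_demo, baseline_pred))
```

Comparing the model's error against this baseline shows how much the random forest adds beyond the simple maximum rule.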

9. Plotting Results

Python


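# Overlay actual and predicted AQI for each test sample
plt.figure(figsize=(10, 6))
plt.plot(y_test.values, label='Actual AQI')
plt.plot(y_pred, label='Predicted AQI', alpha=0.7)
plt.xlabel('Test sample')
plt.ylabel('AQI')
plt.title('Actual vs Predicted AQI')
plt.legend()
plt.show()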

Output:

[Figure: Feature Correlation Map]

Model Evaluation Metrics:

Mean Absolute Error: 0.09
Mean Squared Error: 2.59
R2 Score: 1.00

[Figure: Predicted AQI]

Real-world Applications

Smart cities to monitor pollution in real-time.
Healthcare apps to warn sensitive populations.
Environmental agencies for policy formulation.

