Last Updated : 11 Jul, 2025
Feature Scaling is a technique to standardize the independent features present in the data. It is performed during data pre-processing to handle features with highly varying magnitudes. If feature scaling is not done, a machine learning algorithm tends to treat larger values as more important and smaller values as less important, regardless of their units. For example, it would treat 10 m and 10 cm as equivalent because it sees only the number 10 and ignores the unit. In this article we will learn about different techniques used to perform feature scaling.
1. Absolute Maximum Scaling

This method of scaling involves two steps: first, find the absolute maximum value of each column in the dataset; second, subtract that maximum from every entry and divide the result by it.
X_{\rm {scaled }}=\frac{X_{i}-\rm{max}\left(|X|\right)}{\rm{max}\left(|X|\right)}
After performing the above-mentioned two steps, each entry of the column is expressed relative to the column's absolute maximum (for the non-negative values used here, the results lie between -1 and 0). However, this method is not used very often because it is too sensitive to outliers, and outliers are very common in real-world data.
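Before applying the method to a real dataset, the two steps can be sketched on a small made-up column (the values below are purely illustrative):

```python
import numpy as np

# Toy column; the values are made up for illustration
x = np.array([10.0, 25.0, 50.0, 100.0])

# Step 1: find the absolute maximum of the column
max_abs = np.max(np.abs(x))          # 100.0

# Step 2: subtract the maximum and divide by it
x_scaled = (x - max_abs) / max_abs
print(x_scaled.tolist())             # [-0.9, -0.75, -0.5, 0.0]
```

Note how the largest value maps to 0 and smaller values move toward -1, matching the formula above.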
For demonstration purposes we will use a dataset which you can download from here. This dataset is a simplified version of the original house price prediction dataset, containing only two of its columns. The first five rows of the data are shown below:
Python
import pandas as pd
df = pd.read_csv('SampleFile.csv')
print(df.head())
Output:
   LotArea  MSSubClass
0     8450          60
1     9600          20
2    11250          60
3     9550          70
4    14260          60
Now let's apply the first method, absolute maximum scaling. First, we evaluate the absolute maximum value of each column.
Python
import numpy as np
max_vals = np.max(np.abs(df))
max_vals
Output:
LotArea       215245
MSSubClass       190
dtype: int64
Now we subtract these maximum values from the data and then divide the results by them.
Python
print((df - max_vals) / max_vals)
Output:
       LotArea  MSSubClass
0    -0.960742   -0.999721
1    -0.955400   -0.999907
2    -0.947734   -0.999721
3    -0.955632   -0.999675
4    -0.933750   -0.999721
...        ...         ...
1455 -0.963219   -0.999721
1456 -0.938791   -0.999907
1457 -0.957992   -0.999675
1458 -0.954856   -0.999907
1459 -0.953834   -0.999907

[1460 rows x 2 columns]

2. Min-Max Scaling
This method of scaling also involves two steps: first, find the minimum and maximum values of each column; second, subtract the minimum from every entry and divide the result by the range (maximum minus minimum).
X_{\rm {scaled }}=\frac{X_{i}-X_{\text {min}}}{X_{\rm{max}} - X_{\rm{min}}}
Since it uses the maximum and minimum values, this method is also sensitive to outliers, but after performing the two steps above every value lies in the range 0 to 1.
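Before using scikit-learn, the formula can be applied directly with plain pandas; here is a minimal sketch on a small made-up frame (the column names mirror the dataset but the values are illustrative):

```python
import pandas as pd

# Small made-up frame; values are illustrative, not from the dataset
df_toy = pd.DataFrame({'LotArea': [8450, 9600, 11250],
                       'MSSubClass': [60, 20, 70]})

# Apply (x - min) / (max - min) column-wise
scaled = (df_toy - df_toy.min()) / (df_toy.max() - df_toy.min())
print(scaled)
# Each column now has minimum 0 and maximum 1
```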
Python
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,
columns=df.columns)
scaled_df.head()
Output:
    LotArea  MSSubClass
0  0.033420    0.235294
1  0.038795    0.000000
2  0.046507    0.235294
3  0.038561    0.294118
4  0.060576    0.235294

3. Normalization
Normalization is the process of adjusting the values of each data point so that it has a length of 1. This is done by dividing each data point by its "length" (known as the Euclidean norm). Think of it as resizing a vector so that it fits within a standard length of 1.
The formula for Normalization looks like this:
X_{\text{scaled}} = \frac{X_i}{\| X \|}
Where ‖X‖ is the Euclidean (L2) norm of the data point X, i.e. the square root of the sum of the squares of its components.
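The row-wise division can be sketched by hand with NumPy on two made-up data points, which also lets us verify that every resulting row has length 1:

```python
import numpy as np

# Two made-up data points (rows); each is divided by its own L2 norm
X = np.array([[3.0, 4.0],
              [6.0, 8.0]])

norms = np.linalg.norm(X, axis=1, keepdims=True)   # [[5.], [10.]]
X_scaled = X / norms

print(X_scaled.tolist())                 # [[0.6, 0.8], [0.6, 0.8]]
print(np.linalg.norm(X_scaled, axis=1))  # every row now has length 1
```

Notice that both rows map to the same unit vector: normalization keeps only the direction of each data point, not its magnitude.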
from sklearn.preprocessing import Normalizer
scaler = Normalizer()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,
columns=df.columns)
print(scaled_df.head())
Output:
    LotArea  MSSubClass
0  0.999975    0.007100
1  0.999998    0.002083
2  0.999986    0.005333
3  0.999973    0.007330
4  0.999991    0.004208

4. Standardization
This method of scaling is based on the mean and standard deviation of the data. It rescales each feature to have a mean of 0 and a standard deviation of 1 (note that it changes only the location and scale of the data, not the shape of its distribution).
X_{\rm {scaled }}=\frac{X_{i}-X_{\text {mean }}}{\sigma}
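The formula can be checked by hand on a made-up column; note that scikit-learn's StandardScaler uses the population standard deviation, which is what NumPy's `std` computes by default:

```python
import numpy as np

# Made-up column for illustration
x = np.array([2.0, 4.0, 6.0, 8.0])

# Subtract the mean, divide by the (population) standard deviation
x_scaled = (x - x.mean()) / x.std()

print(round(x_scaled.mean(), 10))  # 0.0
print(round(x_scaled.std(), 10))   # 1.0
```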
Python
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,
columns=df.columns)
print(scaled_df.head())
Output:
    LotArea  MSSubClass
0 -0.207142    0.073375
1 -0.091886   -0.872563
2  0.073480    0.073375
3 -0.096897    0.309859
4  0.375148    0.073375

5. Robust Scaling
In this method of scaling, we use two statistical measures that are robust to outliers: the median and the interquartile range (IQR). After calculating these two values, we subtract the median from each entry and then divide the result by the IQR.
X_{\rm {scaled }}=\frac{X_{i}-X_{\text {median }}}{IQR}
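A minimal by-hand sketch on a made-up column containing an outlier shows why this method is robust: the median and IQR ignore the extreme value, so the bulk of the data stays on a small scale. (Scikit-learn's RobustScaler uses the 25th to 75th percentile range by default, which this sketch mirrors.)

```python
import numpy as np

# Made-up column with one large outlier (illustrative)
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

median = np.median(x)                  # 3.0 - unaffected by the outlier
q1, q3 = np.percentile(x, [25, 75])    # 2.0, 4.0
iqr = q3 - q1                          # 2.0

x_scaled = (x - median) / iqr
print(x_scaled.tolist())               # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The inliers land in a narrow band around 0 while the outlier stays clearly visible, unlike min-max scaling, where the outlier would compress all other values toward 0.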
Python
from sklearn.preprocessing import RobustScaler
scaler = RobustScaler()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,
columns=df.columns)
print(scaled_df.head())
Output:
    LotArea  MSSubClass
0 -0.254076         0.2
1  0.030015        -0.6
2  0.437624         0.2
3  0.017663         0.4
4  1.181201         0.2
In conclusion, scaling, normalization and standardization are essential feature engineering techniques that ensure data is well prepared for machine learning models. They help improve model performance, speed up convergence and reduce bias. Choosing the right method depends on your data and algorithm.