Scaling and Descaling Data in Data Science Problems


In data science problems, it is often necessary to scale data so that the algorithms fit well with the learning goals.

How do we scale efficiently? Data can be scaled manually or with the help of libraries.
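For intuition, here is what manual min-max scaling looks like before we hand the job to a library. This is a minimal sketch; the helper names and sample values are my own, not from the article:

```python
import numpy as np

def minmax_scale(x, new_min=0.0, new_max=1.0):
    """Linearly map a 1-D array into [new_min, new_max]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    scaled = (x - x_min) / (x_max - x_min) * (new_max - new_min) + new_min
    return scaled, x_min, x_max

def minmax_descale(scaled, x_min, x_max, new_min=0.0, new_max=1.0):
    """Invert minmax_scale using the stored original min and max."""
    scaled = np.asarray(scaled, dtype=float)
    return (scaled - new_min) / (new_max - new_min) * (x_max - x_min) + x_min

values = [4.9, 5.1, 6.3, 7.0]
scaled, lo, hi = minmax_scale(values)
restored = minmax_descale(scaled, lo, hi)
```

Note that descaling needs the original min and max to be kept around; this is exactly the bookkeeping that sklearn's MinMaxScaler does for us internally.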

In this article, I present a simple, quick and quite accurate way to scale data to a new range and then to descale it back.

Descaling is mostly required to return the output in the form it was given as input, such as a target class, and this is what I cover in this article.

Further, we check this small dataset for scaling and descaling errors: we compute the error between the original values and the scaled-then-descaled values, because we do not want this preprocessing step to add errors on top of the occasional algorithmic errors such as misclassification.

The illustration is performed on the popular IRIS dataset.

We shall use MinMaxScaler from sklearn.

Let's import the required libraries:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

Read the data using pandas; the data is in Excel format:

df = pd.read_excel('/content/IRIS_Edited.xlsx')

Select the features and target, then split the data into train and test sets, so the MinMaxScaler is fitted on a representative distribution of the data:

feature_names = [ 'sepal_length','sepal_width', 'petal_length','petal_width']
feature = df[feature_names]

target_names = ['species']
target = df[target_names]

x = feature.values
y = target.values

train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.20)

Scale the features and the target class using two instances of the scaler class:

scaler1 = MinMaxScaler()
x_train_ = scaler1.fit_transform(train_x)
x_test_ = scaler1.transform(test_x)


scaler2 = MinMaxScaler()
y_train_ = scaler2.fit_transform(train_y)
y_test_ = scaler2.transform(test_y)
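As a quick sanity check (my own addition, with a small hypothetical array standing in for train_x), every column fitted by MinMaxScaler should now span exactly [0, 1] on the training data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for train_x; in the article it comes from the IRIS split.
train_x = np.array([[5.1, 3.5], [6.2, 2.9], [4.7, 3.2]])

scaler1 = MinMaxScaler()
x_train_ = scaler1.fit_transform(train_x)

# Per-column minima are 0 and maxima are 1 after fitting.
print(x_train_.min(axis=0), x_train_.max(axis=0))
```

The test split, transformed with the same fitted scaler, may fall slightly outside [0, 1] if it contains values beyond the training min or max; that is expected behaviour.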


Let's check the test-set target class for scaling and descaling errors:

inverse_y_test_ = scaler2.inverse_transform(y_test_)
print(inverse_y_test_[0], test_y[0], test_x[0])

Compute the error between the original and descaled values:

def computeError():
    error = 0
    for i in range(len(test_y)):
        error = error + (test_y[i] - inverse_y_test_[i]) * (test_y[i] - inverse_y_test_[i])
    print("error is", error)

computeError()


mse = mean_squared_error(test_y, inverse_y_test_)
print("mse is", mse)

Plot the original and descaled values against each other:

def plotResults():
    plt.plot(inverse_y_test_, label='descaled')
    plt.plot(test_y, label='original')
    plt.legend()
    plt.show()

plotResults()



The round-trip error was nil in these computations.

The spread of the graph is not uniform because of the train-test split: the test samples are no longer in the sequential order of the input Excel sheet.
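If a smoother plot is preferred, both series can be sorted by the true label before plotting. This is a sketch with small hypothetical arrays standing in for test_y and inverse_y_test_:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical arrays standing in for test_y and inverse_y_test_.
test_y = np.array([[2.0], [0.0], [1.0], [0.0], [2.0]])
inverse_y_test_ = test_y.copy()  # the descaled values match the originals

# Sort both series by the true label so the plot rises monotonically.
order = np.argsort(test_y.ravel())
plt.plot(inverse_y_test_[order].ravel(), label='descaled')
plt.plot(test_y[order].ravel(), label='original')
plt.legend()
```

Since both curves coincide when the round-trip error is nil, only one line will be visible.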

Published by Nidhika
