First the code then a description of the key points concerning Linear Regression.
Let us consider the colon cancer data. The data has been taken in csv form. Here is a code in Python to compute Linear Regression based computations.
The code wraps up all the internal processing behind Linear Regression.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
#read data
dataset = pd.read_csv('/content/colon_cancer.csv')
#size of data
numRows, numCols = np.shape(dataset)
#read the target class and input features
X = dataset.iloc[0:numRows, 0:numCols-2].values
Y = dataset.iloc[0:numRows, numCols-1].values
#split data in testing and training
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3)
#make linear regression object
regObject = LinearRegression()
#execute the data with Linear Regression
regObject.fit(X_train, Y_train)
#predicted values
X_Predict = regObject.predict(X_test)
#compute accuracy
predictedValue = accuracy_score(X_Predict, Y_test)
— Regression is the way to predict the numerical value of the target class, and fit the data, unlike in classification problems wherein target values are computed.
— Linear word in linear regression emphasize that the learned function is linear in terms of input features.
— Let us understand what Linear Regression is all about.
— A linear function is learned to simulate the input data and then the continuous value of the output target class is predicted.
— Suppose we have n variables and features, say 1,…k, and say x1, x2…., xn be data points for training
— And let t1,….tn be the target values.
— The aim is to learn a linear regressor that depends on input
— The equations can be written as a linear combination of input variables
— Here w1, ……,wk are weights to be learned and g(x) is a linear function in terms of input features.
— The weights are learned based on minimizing mean square error between the predicted values and the target values.
— The means square error is computed to find the weights
— This can be solved for the weights either analytically or computationally.
— With these newly found weights, we write the equation of linear regressor and this can be used to compute the predicted value of any input data.