
Least Squares Regression (OLS)


Code_From_Scratch

Hey everyone, let me tell you the story of Ian, whose boss has challenged him to predict brain weight given head size.

Photo by Vadim Bogulov on Unsplash

Let’s help Ian solve this problem with the following steps:

Our friend #Kaggle turns out to be useful; here is what our data looks like:

Dataset Overview

After understanding the medical terms and the pattern in the data, Ian concludes that OLS could be an effective method to predict the values. Let’s see if it really works.

He first jots down the concept behind OLS

The main aim of OLS is to minimize the error between the actual and predicted values by choosing the best estimates for the slope (m) and the constant (c), i.e., to minimize

∑(yᵢ - ŷᵢ)²

where yᵢ is the actual value and ŷᵢ is the predicted value.
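To make that objective concrete, here is a minimal sketch that computes the sum of squared errors for a candidate line. The helper name `sum_squared_error` and the toy numbers are made up for illustration; they are not Ian's data.

```python
import numpy as np

def sum_squared_error(x, y, m, c):
    """Sum of squared differences between actual y and the line m*x + c."""
    y_pred = m * x + c
    return float(np.sum((y - y_pred) ** 2))

# Toy data (illustrative only): points lying exactly on y = 2x + 1
x_toy = np.array([1.0, 2.0, 3.0, 4.0])
y_toy = np.array([3.0, 5.0, 7.0, 9.0])

sse_good = sum_squared_error(x_toy, y_toy, m=2.0, c=1.0)  # the true line
sse_bad = sum_squared_error(x_toy, y_toy, m=1.0, c=1.0)   # a wrong slope
print(sse_good, sse_bad)  # 0.0 30.0
```

The line with the smaller value fits better; OLS picks the m and c that make this number as small as possible.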

The regression equation of y on x is

y - ȳ = m_yx (x - x̄)

where x̄ is the mean of x, ȳ is the mean of y, and m_yx is the slope of y on x.

We use the following formula to find the slope (m):

m = ∑(xᵢ - x̄)(yᵢ - ȳ) / ∑(xᵢ - x̄)²

To find c,

c = ȳ - (m * x̄)
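Before moving to real data, a small worked example on toy numbers may help. This sketch applies the slope and constant formulas with NumPy and cross-checks the result against `np.polyfit`. The function name `fit_ols` and the five data points are assumptions chosen for illustration.

```python
import numpy as np

def fit_ols(x, y):
    """Return slope m and constant c of the least squares line y = m*x + c."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                       # deviations from the mean of x
    dy = y - y.mean()                       # deviations from the mean of y
    m = np.sum(dx * dy) / np.sum(dx ** 2)   # slope formula
    c = y.mean() - m * x.mean()             # constant formula
    return m, c

# Toy data (illustrative only)
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_toy = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

m_toy, c_toy = fit_ols(x_toy, y_toy)
print(m_toy, c_toy)  # approximately 0.6 and 2.2

# np.polyfit with degree 1 returns [slope, constant] and should agree
m_ref, c_ref = np.polyfit(x_toy, y_toy, 1)
```

By hand: the means are 3 and 4, the numerator is 6, the denominator is 10, so m = 0.6 and c = 4 - 0.6 * 3 = 2.2.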

To predict the brain weight as in the example, let’s follow the same procedure in Python:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the Kaggle dataset (the file name is an assumption; adjust to your local path)
df = pd.read_csv('headbrain.csv')

# Define the independent (x) and dependent (y) variables
x = df['Head Size(cm^3)'].values
y = df['Brain Weight(grams)'].values
n = len(x)
mean_x = np.mean(x)  # mean of x
mean_y = np.mean(y)  # mean of y

# Calculate the slope (m)
num = 0
den = 0
for i in range(n):
    num += (x[i] - mean_x) * (y[i] - mean_y)
    den += (x[i] - mean_x) ** 2
m = num / den

# Calculate the constant (c)
c = mean_y - (m * mean_x)
print(m, c)

These are the values Ian got as output

# Extend the x range slightly beyond the data so the line spans the plot
max_x = np.max(x) + 100
min_x = np.min(x) - 100

# Points along the fitted line
w = np.linspace(min_x, max_x, 1000)
v = c + m * w

# Plot the regression line
plt.plot(w, v, color='#58b970', label='Regression Line')
# Plot the actual data points
plt.scatter(x, y, c='#ef5423', label='Scatter Plot')

plt.xlabel('Head Size (cm^3)')
plt.ylabel('Brain Weight (grams)')
plt.legend()
plt.show()

Plot showing the predicted line with the actual values

Now Ian can easily predict the brain weight for a given head size, just by substituting the head size (x) into y = m * x + c.
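That substitution can be wrapped in a small helper. This is only a sketch: the function name `predict_brain_weight` is hypothetical, and the slope and constant below are placeholder numbers, not the values Ian fitted. In practice you would pass in the `m` and `c` computed from the data.

```python
import numpy as np

def predict_brain_weight(head_size, m, c):
    """Predict brain weight (grams) from head size (cm^3) using y = m*x + c."""
    return m * np.asarray(head_size, dtype=float) + c

# Placeholder coefficients for illustration only (not fitted values)
m_demo = 0.25
c_demo = 300.0

single = predict_brain_weight(4000, m_demo, c_demo)
several = predict_brain_weight([3000, 3500, 4000], m_demo, c_demo)
print(float(single))  # 1300.0
print(several)        # one prediction per head size
```

Because the helper converts its input with NumPy, it works for a single head size or a whole array of them.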

OLS relies on the following assumptions:

  1. Linearity: There should be a linear relationship between the dependent variable and the independent variables.
  2. Independence: The observations should be independent of one another.
  3. Homoscedasticity: The variance of the residuals must be constant across all levels of the independent variables.
  4. Normality: The residuals (errors) must be normally distributed.
  5. No multicollinearity: The independent variables must not be highly correlated with one another.
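A few of these conditions can be eyeballed from the residuals. The sketch below uses simulated data (not the Kaggle dataset) and some rough, informal checks chosen for illustration; they are not formal statistical tests. With a single predictor, as in Ian's case, multicollinearity does not arise.

```python
import numpy as np

# Simulated data (assumption for illustration): linear trend plus constant-variance noise
rng = np.random.default_rng(0)
x_sim = rng.uniform(3000, 4500, size=200)
y_sim = 300 + 0.25 * x_sim + rng.normal(0, 50, size=200)

# Fit the line; np.polyfit with degree 1 returns [slope, constant]
m_sim, c_sim = np.polyfit(x_sim, y_sim, 1)
residuals = y_sim - (m_sim * x_sim + c_sim)

# With a constant term, OLS residuals average to about zero by construction
mean_resid = residuals.mean()

# Rough homoscedasticity check: residual spread for small x vs large x
order = np.argsort(x_sim)
low_half = residuals[order[:100]]
high_half = residuals[order[100:]]
var_ratio = high_half.var() / low_half.var()  # near 1 suggests similar spread

# Rough normality check: sample skewness, near 0 for symmetric residuals
skew = np.mean((residuals - mean_resid) ** 3) / residuals.std() ** 3

print(f"mean residual: {mean_resid:.2e}")
print(f"variance ratio (high x / low x): {var_ratio:.2f}")
print(f"skewness: {skew:.2f}")
```

A residuals-versus-x scatter plot or a histogram of the residuals gives the same information visually and is often easier to read than the numbers.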
