Unveiling the Power of Principal Component Analysis (PCA): A Mathematical Journey

Photo by Thomas T on Unsplash

INTRODUCTION

In this post, we dive into the depths of Principal Component Analysis (PCA), a powerful mathematical technique for dimensionality reduction and data visualization. We demystify the underlying mathematical concepts and walk you through the step-by-step process of implementing PCA. From eigenvectors to eigenvalues, covariance matrices to singular value decomposition, we will equip you with the knowledge and tools to confidently apply PCA in your own data analysis projects. Get ready to unlock the true potential of PCA and transform your understanding of data exploration and feature extraction.

NOTE: PCA is a dimensionality reduction or compression technique, not a data removal technique, which means we get a compressed representation of the original data. (No attribute of the data is discarded outright; every attribute contributes some part to the final data.)

PCA in a nutshell. Source: Lavrenko and Sutton 2011, slide 13.
WORKING OF PCA (image by Dr. Aditya Nigam)

BASIC CONCEPT OF PCA:

Suppose there is a dataset consisting of N tuples (data vectors), each having d attributes, or, as we can say, d dimensions.

REPRESENTATION OF THE DATA
DATA SHOWING THE TUPLES

Let q_i, where i = 1, 2, 3, …, d, be unit vectors in the d-dimensional space, q_i ∈ R^d. Each of these unit vectors points in a direction perpendicular to the others. They are also known as the directions of projection.

Now, what PCA does is search for l orthonormal vectors that can best be used to represent the data, where l ≤ d.

The original data (each of the tuples, X_n) is then projected onto each of the l orthonormal vectors to get the principal components.

A_ni is the i-th principal component of X_n.
Diagram showing the principal components.

This transforms each of the d-dimensional vectors into an l-dimensional vector.

Final vector of l dimensions.
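To make the idea concrete, here is a minimal NumPy sketch; the toy matrix X and the stand-in directions Q below are illustrative assumptions, not something computed by PCA itself:

import numpy as np

# A toy data matrix: N = 4 tuples, d = 3 attributes (made up for illustration)
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 1.0],
              [3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])

# Q: d x l matrix whose columns are l = 2 orthonormal directions of projection.
# Here we simply pick two coordinate axes as a stand-in for the vectors PCA finds.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# A[n, i] is the i-th principal component A_ni of tuple X_n:
# each d-dimensional tuple is turned into an l-dimensional vector.
A = X @ Q
print(A.shape)   # (4, 2)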

PERFORMING PCA STEP BY STEP:

STEP 1: MEAN SUBTRACTION

First, subtract the mean of each attribute (dimension) from the data samples (tuples) to obtain the mean-subtracted data (MSD).

IMAGE SHOWING DATASET ALONG WITH MEAN (image by Dr. Aditya Nigam)
GRAPH SHOWING DATA BEFORE AND AFTER APPLYING MEAN SUBTRACTION (image by Dr. Aditya Nigam)
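A minimal NumPy sketch of this step, using a small made-up data matrix X (5 tuples, 3 attributes) purely for illustration:

import numpy as np

# Illustrative data: 5 tuples, 3 attributes
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.1],
              [2.2, 2.9, 0.9],
              [1.9, 2.2, 2.3],
              [3.1, 3.0, 2.7]])

# STEP 1: subtract the per-attribute (column-wise) mean
X_msd = X - X.mean(axis=0)
print(X_msd.mean(axis=0))   # ~[0, 0, 0]: each attribute now has zero mean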

STEP 2: STANDARDIZATION

Divide the data points by the standard deviation of the dataset along each dimension 1, 2, 3, …, d. Now the data is unit-free, and it has variance 1 along each axis.

STANDARDIZATION FORMULA
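Continuing the sketch from Step 1, standardization just divides each mean-subtracted column by its standard deviation:

# STEP 2: divide by the per-attribute standard deviation
X_std = X_msd / X_msd.std(axis=0)
print(X_std.std(axis=0))    # ~[1, 1, 1]: unit variance along each axis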

STEP 3: CALCULATE THE COVARIANCE MATRIX

After standardization, compute the correlation matrix from the mean-subtracted data matrix (equivalently, the covariance matrix of the original data matrix). How do we calculate this correlation matrix? It is the matrix product of the transpose of the standardized matrix with the standardized matrix itself, up to a scaling factor of 1/(N − 1) that does not affect the eigenvectors. (In Professor Strang's linear algebra lectures this product is written XᵀX, with this nomenclature as opposed to X′X, for instance.)

Formula to find the correlation matrix.
final correlation matrix (image by Dr. Aditya Nigam)
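Continuing the same sketch, the matrix can be formed directly from the standardized data; the 1/(N − 1) scaling is included here, though it does not change the eigenvectors:

# STEP 3: covariance / correlation matrix of the standardized data
N = X_std.shape[0]
C = (X_std.T @ X_std) / (N - 1)
print(C.shape)    # (3, 3): a d x d symmetric matrix
# Sanity check against NumPy's built-in estimator
print(np.allclose(C, np.cov(X_std, rowvar=False)))   # True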

STEP 4: COMPUTE THE EIGENVALUES AND EIGENVECTORS

To understand this step, we first have to know the meaning of eigenvalues and eigenvectors.

Eigenvalues represent the scaling factor by which a vector is transformed when a linear transformation is applied, while eigenvectors represent the directions in which the transformation occurs.

Let A be a square matrix (in our case, the covariance matrix), ν a vector, and λ a scalar satisfying Aν = λν; then λ is called the eigenvalue associated with the eigenvector ν of A.

Rearranging the above equation,

Aν − λν = 0;  (A − λI)ν = 0

Since we already know that ν is a non-zero vector, the only way this equation can be satisfied is if

det(A-λI) = 0

image showing eigenvalues and eigenvectors (image by Dr. Aditya Nigam)
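In the running sketch, NumPy's eigh routine (suited to symmetric matrices such as C) gives the eigenvalues and eigenvectors:

# STEP 4: eigenvalues and eigenvectors of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)     # eigh: for symmetric matrices
# Each column eigvecs[:, i] satisfies C @ v = eigvals[i] * v
v = eigvecs[:, 0]
print(np.allclose(C @ v, eigvals[0] * v))   # True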

STEP 5: SELECT THE PRINCIPAL COMPONENTS AND PROJECT THE DATA

Arrange the eigenvectors in descending order of their corresponding eigenvalues. Consider the two leading (most significant) eigenvalues and their corresponding eigenvectors.

Project the mean-subtracted data matrix onto the two chosen eigenvectors corresponding to the leading eigenvalues.

Final data after projection onto the leading eigenvectors (data with reduced dimension, image by Dr. Aditya Nigam)
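Continuing the sketch, we sort the eigenvalues in descending order (eigh returns them in ascending order), keep the two leading eigenvectors, and project:

# STEP 5: sort eigenvectors by descending eigenvalue, keep the top two,
# and project the standardized (mean-subtracted) data onto them
order = np.argsort(eigvals)[::-1]    # eigenvalue indices, largest first
top2 = eigvecs[:, order[:2]]         # d x 2 matrix of leading eigenvectors
X_reduced = X_std @ top2             # N x 2: the data in reduced dimension
print(X_reduced.shape)               # (5, 2)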

We can reconstruct the whole original data (approximately) from the reduced data, but there is some loss in this process. That is why we say PCA is a lossy process, and this is one of the drawbacks of PCA.

An approximation of the mean-subtracted data, x_n, is obtained as a linear combination of the directions of projection (the strongest eigenvectors), q_i, weighted by the principal components A_ni.

Error in reconstruction: the Euclidean distance between the original and approximated tuples.

Error in reconstruction (image by Dr. Aditya Nigam)
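A sketch of the reconstruction and its error, continuing from the projection above; it is only an approximation because we kept two directions out of three:

# Approximate reconstruction: linear combination of the kept eigenvectors
# weighted by the principal components
X_approx = X_reduced @ top2.T        # back to N x d, but lossy
# Reconstruction error: Euclidean distance between the (standardized)
# original tuples and their approximations
errors = np.linalg.norm(X_std - X_approx, axis=1)
print(errors)                        # one error value per tuple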

NOTE: I am going to explain the reconstruction process in detail in another blog.

SAMPLE CODE FOR PCA ON IRIS DATASET:

# Adapted from source: https://towardsdatascience.com/pca-using-python-scikit-learn-e653f8989e60
# Accessed 2019-01-12.

# Do PCA on iris dataset and plot
# ---------------------------------------------------------------------
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Download and load iris dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal length','sepal width','petal length','petal width','target']
df = pd.read_csv(url, names=names)

# Standardize data to 0 mean and 1 variance
features = names[:-1]
x = df.loc[:, features].values
y = df.loc[:,['target']].values
x = StandardScaler().fit_transform(x)

# Perform PCA
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['PC1', 'PC2'])
finalDf = pd.concat([principalDf, df[['target']]], axis = 1)

# Plot
fig = plt.figure(figsize = (8,8))
ax = fig.add_subplot(1,1,1)
explained = np.around(pca.explained_variance_ratio_*100, 2)
ax.set_xlabel('PC1 ({}%)'.format(explained[0]), fontsize = 15)
ax.set_ylabel('PC2 ({}%)'.format(explained[1]), fontsize = 15)
ax.set_title('Two-Component PCA', fontsize = 20)
targets = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
colors = ['r', 'g', 'b']
for target, color in zip(targets, colors):
    indicesToKeep = finalDf['target'] == target
    ax.scatter(finalDf.loc[indicesToKeep, 'PC1'],
               finalDf.loc[indicesToKeep, 'PC2'],
               c=color,
               s=50)
ax.legend(targets)
ax.grid()
plt.show()

CONCLUSION:

In conclusion, we have explored the fascinating world of Principal Component Analysis (PCA) and gained a deeper understanding of its mathematical aspects, the underlying process, and its practical implementation using the Iris dataset. Through this blog, we have delved into the essence of PCA, a powerful dimensionality reduction technique that helps uncover meaningful patterns and reduce complexity in high-dimensional data.

Starting with the mathematical foundation of PCA, we learned about eigenvectors and eigenvalues, which play an important role in transforming the original feature space into a new set of orthogonal axes known as principal components. These components capture the maximum variance in the data and enable us to visualize and analyze the data in a reduced-dimensional space.

We then explored the step-by-step process of PCA, from standardizing the data to calculating the covariance matrix, obtaining the eigenvectors and eigenvalues, and choosing the principal components. By sorting the eigenvalues in descending order, we can determine the most informative principal components and discard the less important ones. This helps us retain the essential features of the dataset while discarding the noise or redundancy present in the data.

To solidify our understanding, we implemented PCA on the Iris dataset, a popular benchmark dataset in machine learning. By leveraging the scikit-learn library in Python, we were able to effortlessly apply PCA to the dataset, extract the principal components, and visualize the transformed data. Through these visualizations, we observed the power of PCA in reducing the dimensionality of the Iris dataset while preserving its inherent structure and patterns.

The source code provided in this blog serves as a helpful resource for anyone interested in exploring PCA further. By studying the code, readers can gain insights into the practical implementation of PCA and adapt it to their own datasets or problem domains. The Iris dataset serves as an excellent starting point for understanding PCA due to its simplicity and well-defined class separability.

In conclusion, PCA offers a powerful tool for dimensionality reduction and data exploration. It allows us to gain helpful insights from complex datasets by capturing the most significant features while discarding noise and redundancy. By mastering the mathematical foundations and practical implementation of PCA, we can unleash its potential in various fields, from data analysis and visualization to machine learning and pattern recognition. So dive into the world of PCA and unlock the hidden dimensions within your data!
