Comparison of LDA and PCA 2D projection of the Iris dataset
The Iris dataset represents 3 kinds of Iris flowers (Setosa, Versicolour and Virginica) with 4 attributes: sepal length, sepal width, petal length and petal width.
Principal Component Analysis (PCA) applied to this data identifies the combination of attributes (principal components, or directions in the feature space) that account for the most variance in the data. Here we plot the different samples on the 2 first principal components.
Linear Discriminant Analysis (LDA) tries to identify attributes that account for the most variance between classes. In particular, LDA, in contrast to PCA, is a supervised method, using known class labels.
Imports for PCA vs LDA Projection Comparison
PCA: unsupervised variance maximization: PCA(n_components=2) projects the 4D iris features onto the two directions that capture the most total variance in the data, without any knowledge of class labels. The explained_variance_ratio_ attribute reveals what fraction of total variance each component retains. While PCA often produces useful visualizations, it can miss directions that separate classes if those directions have low overall variance compared to within-class spread.
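The last point can be illustrated with a minimal synthetic sketch (the data here is hypothetical, not the iris set): one high-variance feature carries no class information, while a low-variance feature separates the classes. PCA's first component then aligns with the uninformative feature.

```python
# Hypothetical synthetic sketch: PCA can discard a low-variance direction
# that nonetheless separates the classes.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 200
labels = np.repeat([0, 1], n)
# Feature 0: large variance, identically distributed for both classes.
noise = rng.normal(scale=10.0, size=(2 * n, 1))
# Feature 1: small variance, but its mean differs by class (informative).
signal = labels[:, None] * 1.0 + rng.normal(scale=0.1, size=(2 * n, 1))
X = np.hstack([noise, signal])

pca = PCA(n_components=1)
X_p = pca.fit_transform(X)
# The first principal component aligns with the noisy feature 0, so the
# 1D projection mixes the classes even though feature 1 separates them.
print(pca.explained_variance_ratio_)
```

Because nearly all of the total variance lives in the uninformative feature, the unsupervised projection is dominated by it; LDA, given `labels`, would instead pick the separating direction.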
LDA: supervised between-class separation: LinearDiscriminantAnalysis(n_components=2) uses the class labels y during fitting to find projections that maximize the ratio of between-class scatter to within-class scatter (Fisher's criterion). For k classes, LDA produces at most k-1 discriminant components. On the iris dataset with 3 classes, LDA's 2 components are specifically optimized to separate species, often producing cleaner class clusters than PCA's 2 components. The comparison demonstrates that when labeled data is available and the goal is classification rather than general-purpose dimensionality reduction, LDA is the more appropriate choice.
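The k-1 limit can be checked directly: with n_components left at its default, scikit-learn's LinearDiscriminantAnalysis caps the projection at min(n_classes - 1, n_features), so the 4D iris data with 3 classes transforms to 2 columns.

```python
# Sketch: LDA yields at most (n_classes - 1) discriminant components,
# regardless of the input dimensionality.
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = datasets.load_iris(return_X_y=True)  # 4 features, 3 classes

# n_components defaults to min(n_classes - 1, n_features) = min(2, 4) = 2
lda = LinearDiscriminantAnalysis()
X_t = lda.fit(X, y).transform(X)
print(X_t.shape)
```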
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
iris = datasets.load_iris()
X = iris.data
y = iris.target
target_names = iris.target_names
pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)
lda = LinearDiscriminantAnalysis(n_components=2)
X_r2 = lda.fit(X, y).transform(X)
# Percentage of variance explained for each component
print(
    "explained variance ratio (first two components): %s"
    % str(pca.explained_variance_ratio_)
)
plt.figure()
colors = ["navy", "turquoise", "darkorange"]
lw = 2
for color, i, target_name in zip(colors, [0, 1, 2], target_names):
    plt.scatter(
        X_r[y == i, 0], X_r[y == i, 1], color=color, alpha=0.8, lw=lw, label=target_name
    )
plt.legend(loc="best", shadow=False, scatterpoints=1)
plt.title("PCA of Iris dataset")
plt.figure()
for color, i, target_name in zip(colors, [0, 1, 2], target_names):
    plt.scatter(
        X_r2[y == i, 0], X_r2[y == i, 1], alpha=0.8, color=color, label=target_name
    )
plt.legend(loc="best", shadow=False, scatterpoints=1)
plt.title("LDA of Iris dataset")
plt.show()