Plot different SVM classifiers in the iris dataset
==================================================
Comparison of different linear SVM classifiers on a 2D projection of the iris dataset. We only consider the first 2 features of this dataset:

- Sepal length
- Sepal width
This example shows how to plot the decision surface for four SVM classifiers with different kernels.
The linear models LinearSVC() and SVC(kernel='linear') yield slightly
different decision boundaries. This can be a consequence of the following
differences:
- LinearSVC minimizes the squared hinge loss while SVC minimizes the regular hinge loss.
- LinearSVC uses the One-vs-All (also known as One-vs-Rest) multiclass reduction while SVC uses the One-vs-One multiclass reduction.
Both linear models have linear decision boundaries (intersecting hyperplanes) while the non-linear kernel models (polynomial or Gaussian RBF) have more flexible non-linear decision boundaries with shapes that depend on the kind of kernel and its parameters.
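The first difference, hinge versus squared hinge loss, can be illustrated numerically. A minimal sketch (the margin values are arbitrary, chosen only to show the three regimes):

```python
import numpy as np

# Margins y * f(x) for three hypothetical samples: one misclassified,
# one inside the margin, one comfortably correct.
margins = np.array([-0.5, 0.2, 1.5])

hinge = np.maximum(0, 1 - margins)  # loss minimized by SVC
squared_hinge = hinge ** 2          # loss minimized by LinearSVC (its default)

print(hinge)          # [1.5 0.8 0. ]
print(squared_hinge)  # [2.25 0.64 0.  ]
```

Squaring penalizes large margin violations more heavily, which is one reason the two fitted boundaries can differ slightly.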
.. note:: While plotting the decision function of classifiers for toy 2D datasets can help get an intuitive understanding of their respective expressive power, be aware that those intuitions don't always generalize to more realistic high-dimensional problems.
Imports for Comparing SVM Classifiers on the Iris Dataset
Different SVM implementations and kernels produce different decision boundaries even on the same data. SVC(kernel='linear') and LinearSVC both learn linear boundaries, but they differ subtly: SVC minimizes the hinge loss and uses one-vs-one multi-class decomposition (training K*(K-1)/2 binary classifiers), while LinearSVC minimizes the squared hinge loss and uses one-vs-rest decomposition (K binary classifiers). These differences can produce slightly different boundaries.
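As a quick check of the decomposition counts above, here is a small arithmetic sketch (the helper name is illustrative, not part of scikit-learn). For iris, K = 3, where the two schemes happen to train the same number of classifiers:

```python
# Number of underlying binary classifiers trained for a K-class problem.
def n_binary_classifiers(K):
    return {
        "one-vs-one (SVC)": K * (K - 1) // 2,
        "one-vs-rest (LinearSVC)": K,
    }

print(n_binary_classifiers(3))   # {'one-vs-one (SVC)': 3, 'one-vs-rest (LinearSVC)': 3}
print(n_binary_classifiers(10))  # {'one-vs-one (SVC)': 45, 'one-vs-rest (LinearSVC)': 10}
```

The gap between the two schemes only becomes pronounced as the number of classes grows.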
Linear vs. non-linear kernels: On the 2D projection of iris data, linear kernels produce straight-line decision boundaries that may not capture the true class structure. The RBF kernel (gamma=0.7) creates flexible curved boundaries that can wrap around class clusters, while the polynomial kernel (degree=3) produces smoother curves. In practice, the kernel choice should be guided by cross-validation rather than visual inspection, especially since 2D projections can be misleading about the structure in higher dimensions.
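Selecting the kernel by cross-validation could look like the following sketch; the parameter grid here is illustrative and not part of the original example:

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target  # same 2-feature projection as below

# Illustrative grid: kernel choice and regularization strength.
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1, 10],
}
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

This scores each kernel/C combination by 5-fold cross-validated accuracy instead of relying on the visual impression of the 2D decision surfaces.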
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
import matplotlib.pyplot as plt
from sklearn import datasets, svm
from sklearn.inspection import DecisionBoundaryDisplay
# import some data to play with
iris = datasets.load_iris()
# Take the first two features. We could avoid this by using a two-dim dataset
X = iris.data[:, :2]
y = iris.target
# we create an instance of SVM and fit our data. We do not scale our
# data since we want to plot the support vectors
C = 1.0 # SVM regularization parameter
models = (
    svm.SVC(kernel="linear", C=C),
    svm.LinearSVC(C=C, max_iter=10000),
    svm.SVC(kernel="rbf", gamma=0.7, C=C),
    svm.SVC(kernel="poly", degree=3, gamma="auto", C=C),
)
models = (clf.fit(X, y) for clf in models)
# title for the plots
titles = (
    "SVC with linear kernel",
    "LinearSVC (linear kernel)",
    "SVC with RBF kernel",
    "SVC with polynomial (degree 3) kernel",
)
# Set-up 2x2 grid for plotting.
fig, sub = plt.subplots(2, 2)
plt.subplots_adjust(wspace=0.4, hspace=0.4)
X0, X1 = X[:, 0], X[:, 1]
for clf, title, ax in zip(models, titles, sub.flatten()):
    disp = DecisionBoundaryDisplay.from_estimator(
        clf,
        X,
        response_method="predict",
        cmap=plt.cm.coolwarm,
        alpha=0.8,
        ax=ax,
        xlabel=iris.feature_names[0],
        ylabel=iris.feature_names[1],
    )
    ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors="k")
    ax.set_xticks(())
    ax.set_yticks(())
    ax.set_title(title)
plt.show()