
Gaussian process classification (GPC) on iris dataset
======================================================

This example illustrates the predicted probability of GPC for an isotropic and an anisotropic RBF kernel on a two-dimensional version of the iris dataset. The anisotropic RBF kernel obtains a slightly higher log-marginal-likelihood by assigning different length scales to the two feature dimensions.

Imports for GPC on iris with isotropic vs anisotropic kernels
-------------------------------------------------------------

Isotropic vs anisotropic RBF kernels control per-feature sensitivity: an isotropic RBF kernel with a single length_scale parameter (RBF([1.0])) uses the same length scale for all input features, assuming both features contribute equally to similarity. An anisotropic kernel with per-feature length scales (RBF([1.0, 1.0])) lets the GP learn a different smoothness scale along each feature dimension, effectively performing automatic relevance determination (ARD): features whose optimized length scales are large matter little for classification, while features with small length scales strongly shape the decision boundary.
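As a minimal sketch of this difference (not part of the original example), the two kernel variants can be evaluated directly. With per-feature length scales, identical displacements along different features yield different similarities; the isotropic kernel treats them identically:

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Isotropic: one shared length scale; anisotropic: one per feature.
iso = RBF(length_scale=1.0)
aniso = RBF(length_scale=[2.0, 0.5])  # smoother along feature 0

a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 0.0]])  # unit displacement along feature 0 only
c = np.array([[0.0, 1.0]])  # unit displacement along feature 1 only

# Isotropic kernel: both displacements give the same similarity.
print(iso(a, b)[0, 0], iso(a, c)[0, 0])

# Anisotropic kernel: the larger length scale along feature 0 keeps
# similarity higher there, while the small scale along feature 1
# makes the kernel drop off quickly.
print(aniso(a, b)[0, 0], aniso(a, c)[0, 0])
```

After hyperparameter optimization, reading the fitted length scales this way is exactly how ARD importance is interpreted.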

Multiclass GPC via one-vs-rest: GaussianProcessClassifier handles the three iris classes by training multiple binary GP classifiers internally using a one-vs-rest strategy, then combining their outputs into class probabilities by normalizing the binary estimates so that they sum to one. The predict_proba method returns a 3-column probability matrix that is visualized as an RGB color image over the 2D feature mesh, where each color channel encodes one class's predicted probability. The log_marginal_likelihood reported for each model serves as a model selection criterion: the anisotropic kernel typically achieves a higher LML because it can assign different importance to sepal length versus sepal width, adapting to the actual discriminative structure of the iris features.
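A small sketch, separate from the plotting code below, of what the multiclass probabilities look like; the shape and the row normalization follow from the one-vs-rest combination described above:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

X, y = load_iris(return_X_y=True)
X = X[:, :2]  # sepal length and sepal width, as in the example below

clf = GaussianProcessClassifier(kernel=1.0 * RBF([1.0, 1.0])).fit(X, y)

proba = clf.predict_proba(X[:5])
print(proba.shape)        # (5, 3): one probability column per class
print(proba.sum(axis=1))  # each row is normalized to sum to 1

# For multiclass problems this is the mean LML of the one-vs-rest
# binary classifiers, used below to compare kernels.
print(clf.log_marginal_likelihood(clf.kernel_.theta))
```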

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

import matplotlib.pyplot as plt
import numpy as np

from sklearn import datasets
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
y = np.array(iris.target, dtype=int)

h = 0.02  # step size in the mesh

kernel = 1.0 * RBF([1.0])
gpc_rbf_isotropic = GaussianProcessClassifier(kernel=kernel).fit(X, y)
kernel = 1.0 * RBF([1.0, 1.0])
gpc_rbf_anisotropic = GaussianProcessClassifier(kernel=kernel).fit(X, y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

titles = ["Isotropic RBF", "Anisotropic RBF"]
plt.figure(figsize=(10, 5))
for i, clf in enumerate((gpc_rbf_isotropic, gpc_rbf_anisotropic)):
    # Plot the predicted probabilities. For that, we will assign a color to
    # each point in the mesh [x_min, x_max]x[y_min, y_max].
    plt.subplot(1, 2, i + 1)

    Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape((xx.shape[0], xx.shape[1], 3))
    plt.imshow(Z, extent=(x_min, x_max, y_min, y_max), origin="lower")

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=np.array(["r", "g", "b"])[y], edgecolors=(0, 0, 0))
    plt.xlabel("Sepal length")
    plt.ylabel("Sepal width")
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    plt.title(
        "%s, LML: %.3f" % (titles[i], clf.log_marginal_likelihood(clf.kernel_.theta))
    )

plt.tight_layout()
plt.show()