Plot SGD Loss Functions

SGD: convex loss functions
==========================

A plot that compares the various convex loss functions supported by
:class:`~sklearn.linear_model.SGDClassifier`.

Imports for Visualizing SGD Loss Functions

The choice of loss function fundamentally determines what a classifier optimizes for and how it treats misclassified or borderline samples. SGDClassifier supports several convex surrogate loss functions that approximate the ideal but non-differentiable zero-one loss (which simply counts errors).

Loss functions compared:

- The hinge loss (used by SVMs) is zero for correctly classified samples with margin >= 1 and linear otherwise; it only penalizes samples that violate or come close to the margin.
- The log loss (logistic regression) is always positive, penalizing even well-classified samples slightly, which is what produces calibrated probability estimates.
- The modified Huber loss is quadratic near the decision boundary but becomes linear for large margin violations, offering robustness to outliers.
- The perceptron loss is zero for any correctly classified sample, regardless of margin.

Understanding these shapes explains why different classifiers behave differently on noisy or overlapping data.
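The qualitative claims above can be checked numerically. Below is a minimal sketch (plain NumPy, using the same closed-form expressions as the plotting code in this example) that evaluates three of the losses at a few margins z = y * f(x):

```python
import numpy as np

z = np.array([-2.0, 0.0, 0.5, 1.0, 3.0])  # margins y * f(x)

hinge = np.maximum(0, 1 - z)          # zero once the margin reaches 1
log_loss = np.log2(1 + np.exp(-z))    # strictly positive at every finite margin
perceptron = np.maximum(0, -z)        # zero for any correct classification (z > 0)

print(hinge)       # hinge vanishes for z >= 1
print(log_loss)    # still positive even at z = 3
print(perceptron)  # already zero at z = 0.5, despite the small margin
```

Note the base-2 logarithm: it matches the scaling used in the plot, so the log loss passes through 1 at z = 0 like the zero-one loss.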

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

import matplotlib.pyplot as plt
import numpy as np


def modified_huber_loss(y_true, y_pred):
    """Modified Huber loss as a function of the margin z = y_true * y_pred:
    linear (-4z) for z < -1, quadratic (1 - z)^2 for -1 <= z < 1, zero for z >= 1."""
    z = y_pred * y_true
    loss = -4 * z
    loss[z >= -1] = (1 - z[z >= -1]) ** 2
    loss[z >= 1.0] = 0
    return loss
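As a quick sanity check (an illustrative aside, not part of the original example), the two branches of this loss join smoothly at the breakpoint z = -1: both the quadratic and the linear piece evaluate to 4 there, and their slopes agree as well, which is why the dashed curve in the plot has no kink:

```python
# Branch values of the modified Huber loss at the breakpoint z = -1.
quadratic = (1 - (-1.0)) ** 2  # (1 - z)^2 branch -> 4
linear = -4 * (-1.0)           # -4z branch       -> 4
print(quadratic, linear)

# Slopes also match: d/dz (1 - z)^2 = -2(1 - z) = -4 at z = -1,
# equal to d/dz (-4z) = -4, so the loss is continuously differentiable there.
slope_quadratic = -2 * (1 - (-1.0))
slope_linear = -4.0
print(slope_quadratic, slope_linear)
```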


xmin, xmax = -4, 4
xx = np.linspace(xmin, xmax, 100)
lw = 2
plt.plot([xmin, 0, 0, xmax], [1, 1, 0, 0], color="gold", lw=lw, label="Zero-one loss")
plt.plot(xx, np.where(xx < 1, 1 - xx, 0), color="teal", lw=lw, label="Hinge loss")
plt.plot(xx, -np.minimum(xx, 0), color="yellowgreen", lw=lw, label="Perceptron loss")
plt.plot(xx, np.log2(1 + np.exp(-xx)), color="cornflowerblue", lw=lw, label="Log loss")
plt.plot(
    xx,
    np.where(xx < 1, 1 - xx, 0) ** 2,
    color="orange",
    lw=lw,
    label="Squared hinge loss",
)
plt.plot(
    xx,
    modified_huber_loss(xx, 1),
    color="darkorchid",
    lw=lw,
    linestyle="--",
    label="Modified Huber loss",
)
plt.ylim((0, 8))
plt.legend(loc="upper right")
plt.xlabel(r"Decision function $f(x)$")
plt.ylabel("$L(y=1, f(x))$")
plt.show()