
Plot Rfe Digits

Recursive feature elimination

This example demonstrates how Recursive Feature Elimination (:class:`~sklearn.feature_selection.RFE`) can be used to determine the importance of individual pixels for classifying handwritten digits.

:class:`~sklearn.feature_selection.RFE` recursively removes the least significant features and assigns each feature a rank based on its importance: higher ranking_ values denote lower importance. The ranking is visualized using both shades of blue and per-pixel annotations for clarity. As expected, pixels near the center of the image tend to be more predictive than those near the edges.
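To make the meaning of ranking_ concrete, here is a minimal sketch on a small synthetic problem. The dataset from make_classification and the names `X_demo`, `y_demo`, and `rfe_demo` are illustrative assumptions and are not part of the digits example. It keeps three features so that the relation between ranking_ and support_ is visible.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Hypothetical toy problem (not the digits data): 8 features, keep 3.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)
rfe_demo = RFE(estimator=LogisticRegression(), n_features_to_select=3, step=1)
rfe_demo.fit(X_demo, y_demo)

# Selected features all share rank 1; each eliminated feature gets a
# higher rank the earlier it was removed.
print("ranking_:", rfe_demo.ranking_)
print("support_:", rfe_demo.support_)
```

Every feature flagged in support_ has rank 1, and the feature removed first carries the largest rank.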

.. note::

    See also :ref:`sphx_glr_auto_examples_feature_selection_plot_rfe_with_cross_validation.py`

Imports for Recursive Feature Elimination on Digit Images

RFE iteratively removes the least important features based on the estimator’s learned coefficients. Starting with all 64 pixel features of the 8x8 digit images, RFE fits a LogisticRegression model, uses the coefficient magnitudes (aggregated across the ten digit classes) to identify the least important feature, removes it, and repeats. With n_features_to_select=1 and step=1, this process runs 63 elimination rounds, assigning each pixel a rank from 1 (most important, the single pixel that survives) to 64 (least important, eliminated first). The MinMaxScaler in the Pipeline rescales pixel values to the [0, 1] range before logistic regression, which matters because RFE’s feature ranking depends on coefficient magnitudes, and those are sensitive to feature scale.
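The elimination loop described above can be sketched by hand. This is a simplified illustration and not scikit-learn's actual implementation: the helper `manual_rfe_ranking`, the toy data, and the choice to score each feature by its squared coefficients summed over classes are all assumptions made for the sketch.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def manual_rfe_ranking(estimator, X, y):
    """Rank features by greedy backward elimination, one feature per round."""
    n_features = X.shape[1]
    remaining = list(range(n_features))       # column indices still in play
    ranking = np.ones(n_features, dtype=int)  # the survivor keeps rank 1
    n_fits = 0
    while len(remaining) > 1:
        model = clone(estimator).fit(X[:, remaining], y)
        n_fits += 1
        # Assumed importance score: squared coefficients summed over classes.
        coef = np.atleast_2d(model.coef_)
        importance = (coef ** 2).sum(axis=0)
        weakest = int(np.argmin(importance))
        # The feature dropped in this round is ranked by how many were left.
        ranking[remaining[weakest]] = len(remaining)
        del remaining[weakest]
    return ranking, n_fits


# Hypothetical toy data with 8 features (not the digits data).
X_toy, y_toy = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)
toy_ranking, toy_fits = manual_rfe_ranking(LogisticRegression(), X_toy, y_toy)
print("ranking:", toy_ranking)
print("elimination fits:", toy_fits)
```

With 8 features the loop performs 7 elimination fits, mirroring the 63 rounds needed for the 64 pixels.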

The resulting pixel ranking map reveals the spatial structure of the discriminative information. Reshaping the ranking_ array to the 8x8 image grid and visualizing it with a colormap shows that center pixels (which contain stroke information) receive low ranks (high importance), while edge and corner pixels (which are typically blank) receive high ranks (low importance). This spatial pattern is consistent with the intuition that handwritten digit classification relies primarily on the central pixel region. RFE’s greedy backward elimination is computationally expensive, requiring O(n_features) model fits, but it provides a complete feature ordering rather than just a selected subset, which makes it useful for understanding feature importance hierarchies.


# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

import matplotlib.pyplot as plt

from sklearn.datasets import load_digits
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Load the digits dataset
digits = load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target

pipe = Pipeline(
    [
        ("scaler", MinMaxScaler()),
        ("rfe", RFE(estimator=LogisticRegression(), n_features_to_select=1, step=1)),
    ]
)

pipe.fit(X, y)
ranking = pipe.named_steps["rfe"].ranking_.reshape(digits.images[0].shape)

# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)

# Annotate each pixel with its rank
for i in range(ranking.shape[0]):
    for j in range(ranking.shape[1]):
        plt.text(j, i, str(ranking[i, j]), ha="center", va="center", color="black")

plt.colorbar()
plt.title("Ranking of pixels with RFE\n(Logistic Regression)")
plt.show()
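The center-versus-edge pattern described earlier can also be checked numerically instead of by eye. The sketch below refits the same pipeline and compares the mean rank of the outermost ring of pixels with the mean rank of the interior. Treating the outer ring as "the edge" is an assumption made here for illustration, as are the names `rank_pipe`, `rank_grid`, and `border`.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

digits_data = load_digits()
X_pix = digits_data.images.reshape((len(digits_data.images), -1))

# Same pipeline as in the example: scale pixels, then rank them with RFE.
rank_pipe = Pipeline(
    [
        ("scaler", MinMaxScaler()),
        ("rfe", RFE(estimator=LogisticRegression(), n_features_to_select=1, step=1)),
    ]
)
rank_pipe.fit(X_pix, digits_data.target)
rank_grid = rank_pipe.named_steps["rfe"].ranking_.reshape(8, 8)

# Assumed definition of "edge": the outermost ring of the 8x8 grid (28 pixels).
border = np.ones((8, 8), dtype=bool)
border[1:-1, 1:-1] = False

border_mean = rank_grid[border].mean()
interior_mean = rank_grid[~border].mean()
print(f"mean rank, border pixels:   {border_mean:.1f}")
print(f"mean rank, interior pixels: {interior_mean:.1f}")
```

A lower mean rank for the interior than for the border would support the pattern seen in the plot. The exact values depend on the fitted model, so none are quoted here.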