Plot Estimator Representation
===========================================
Displaying estimators and complex pipelines
===========================================
This example illustrates different ways estimators and pipelines can be displayed.
Imports for Displaying Estimator and Pipeline Representations
Scikit-learn provides two complementary representations for inspecting estimators: a compact text form and an interactive HTML diagram. The text representation (__repr__) shows only the parameters that differ from their defaults, which reduces visual clutter when comparing estimator configurations. For example, printing LogisticRegression(l1_ratio=1) shows l1_ratio=1 and omits every parameter left at its default. The HTML representation, rendered automatically in Jupyter notebooks, displays pipelines and composite estimators as interactive diagrams in which each step can be expanded to reveal its parameters, making it easy to verify the structure of complex preprocessing chains.
make_column_transformer and make_pipeline compose heterogeneous preprocessing steps into a single estimator object that can be visualized as one diagram. The example builds a ColumnTransformer that routes the numeric features through SimpleImputer(strategy="median") followed by StandardScaler, and the categorical features through SimpleImputer(strategy="constant") followed by OneHotEncoder; the combined output feeds a LogisticRegression classifier. This nested structure (a pipeline containing a column transformer, which in turn contains two sub-pipelines) is shown explicitly in the HTML diagram, where clicking a component reveals its configuration. The visualization is especially useful for debugging pipelines in which a preprocessing misconfiguration could silently degrade model performance.
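The same nesting can also be checked programmatically, which may help when no notebook is available. The following is a minimal sketch, assuming the column names `feat0` to `feat3` from the example; the variables `step_names` and `routes` are illustrative and not part of scikit-learn.

```python
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Rebuild the nested estimator described above.
num_proc = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
cat_proc = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(handle_unknown="ignore"),
)
preprocessor = make_column_transformer(
    (num_proc, ("feat1", "feat3")), (cat_proc, ("feat0", "feat2"))
)
clf = make_pipeline(preprocessor, LogisticRegression())

# Top level: names that make_pipeline generated for each step.
step_names = [name for name, _ in clf.steps]
print(step_names)

# One level down: which columns go to which sub-pipeline, and what
# estimators that sub-pipeline chains together.
routes = [
    (name, columns, [type(est).__name__ for _, est in sub_pipe.steps])
    for name, sub_pipe, columns in preprocessor.transformers
]
for name, columns, estimators in routes:
    print(name, columns, estimators)
```

The step names are derived automatically from the lowercased class names, so they change if a step is swapped for a different estimator.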
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
# %%
# Compact text representation
# ---------------------------
#
# Estimators will only show the parameters that have been set to non-default
# values when displayed as a string. This reduces the visual noise and makes it
# easier to spot what the differences are when comparing instances.
lr = LogisticRegression(l1_ratio=1)
print(lr)
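
When the full parameter list is needed, the compact behaviour can be switched off temporarily. This is a minimal sketch using `sklearn.config_context` with its `print_changed_only` option; the variable names `compact` and `full` are illustrative only.

```python
from sklearn import config_context
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(l1_ratio=1)

# Default behaviour: only parameters changed from their defaults appear.
compact = repr(lr)

# Inside the context manager every parameter is printed, including the
# ones left at their default values.
with config_context(print_changed_only=False):
    full = repr(lr)

print(compact)
print(full)
```

Using the context manager rather than `sklearn.set_config` keeps the change local, so the compact form is restored as soon as the block exits.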
# %%
# Rich HTML representation
# ------------------------
# In notebooks, estimators and pipelines use a rich HTML representation.
# This is particularly useful to summarise the
# structure of pipelines and other composite estimators, with interactivity to
# provide detail. Click on the example image below to expand Pipeline
# elements. See :ref:`visualizing_composite_estimators` for how you can use
# this feature.
num_proc = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
cat_proc = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(handle_unknown="ignore"),
)
preprocessor = make_column_transformer(
    (num_proc, ("feat1", "feat3")), (cat_proc, ("feat0", "feat2"))
)
clf = make_pipeline(preprocessor, LogisticRegression())
clf
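
Outside a notebook the diagram is not rendered automatically, but the same HTML can be produced as a string. This is a minimal sketch using `sklearn.utils.estimator_html_repr`; the small pipeline `pipe` is an illustrative stand-in for `clf`, not the example's estimator.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

# A small stand-in pipeline; any estimator can be passed instead.
pipe = make_pipeline(StandardScaler(), LogisticRegression())

# Returns the diagram as an HTML string, which can then be written to a
# file or embedded in a report.
html = estimator_html_repr(pipe)
print(len(html))
```

The returned string is intended to be viewed in a browser or embedded in another HTML page.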