Run this notebook: Open in Colab Open in Kaggle

Plot Omp¶

=========================== Orthogonal Matching Pursuit¶

Using orthogonal matching pursuit for recovering a sparse signal from a noisy measurement encoded with a dictionary

Imports for Orthogonal Matching Pursuit (OMP)¶

Orthogonal Matching Pursuit is a greedy algorithm for sparse signal recovery that iteratively selects the dictionary atom (feature) most correlated with the current residual, then projects the signal onto all selected atoms simultaneously. Unlike Lasso, which solves a continuous optimization problem, OMP directly targets the L0 “norm” (number of non-zero coefficients) by adding exactly one feature per iteration.

How OMP works: At each step, OMP finds the column of the design matrix X that has the largest inner product with the residual r = y - X_selected * w, adds it to the active set, and recomputes the weights via least squares on the active set. OrthogonalMatchingPursuit requires specifying the number of non-zero coefficients, while OrthogonalMatchingPursuitCV selects this automatically via cross-validation. OMP is particularly useful in compressed sensing and signal processing where the sparsity level is known or can be estimated, and it is often faster than Lasso for very sparse problems.

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

import matplotlib.pyplot as plt
import numpy as np

from sklearn.datasets import make_sparse_coded_signal
from sklearn.linear_model import OrthogonalMatchingPursuit, OrthogonalMatchingPursuitCV

n_components, n_features = 512, 100
n_nonzero_coefs = 17

# generate the data

# y = Xw
# |x|_0 = n_nonzero_coefs

y, X, w = make_sparse_coded_signal(
    n_samples=1,
    n_components=n_components,
    n_features=n_features,
    n_nonzero_coefs=n_nonzero_coefs,
    random_state=0,
)
X = X.T

(idx,) = w.nonzero()

# distort the clean signal
y_noisy = y + 0.05 * np.random.randn(len(y))

# plot the sparse signal
plt.figure(figsize=(7, 7))
plt.subplot(4, 1, 1)
plt.xlim(0, 512)
plt.title("Sparse signal")
plt.stem(idx, w[idx])

# plot the noise-free reconstruction
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero_coefs)
omp.fit(X, y)
coef = omp.coef_
(idx_r,) = coef.nonzero()
plt.subplot(4, 1, 2)
plt.xlim(0, 512)
plt.title("Recovered signal from noise-free measurements")
plt.stem(idx_r, coef[idx_r])

# plot the noisy reconstruction
omp.fit(X, y_noisy)
coef = omp.coef_
(idx_r,) = coef.nonzero()
plt.subplot(4, 1, 3)
plt.xlim(0, 512)
plt.title("Recovered signal from noisy measurements")
plt.stem(idx_r, coef[idx_r])

# plot the noisy reconstruction with number of non-zeros set by CV
omp_cv = OrthogonalMatchingPursuitCV()
omp_cv.fit(X, y_noisy)
coef = omp_cv.coef_
(idx_r,) = coef.nonzero()
plt.subplot(4, 1, 4)
plt.xlim(0, 512)
plt.title("Recovered signal from noisy measurements with CV")
plt.stem(idx_r, coef[idx_r])

plt.subplots_adjust(0.06, 0.04, 0.94, 0.90, 0.20, 0.38)
plt.suptitle("Sparse signal recovery with Orthogonal Matching Pursuit", fontsize=16)
plt.show()