Chapter 9: Diffusion Models

Generating Images by Learning to Reverse Noise

Diffusion models represent a fundamentally different approach to generative AI compared to autoregressive models like GPT. Instead of generating outputs token-by-token, diffusion models start with pure random noise and gradually refine it into a coherent image through a sequence of small denoising steps. The mathematical framework is elegant: define a forward process that progressively adds Gaussian noise to a real image until it becomes indistinguishable from pure noise, then train a neural network to learn the reverse process – predicting and removing the noise at each step.

The forward process is a fixed Markov chain: \(x_t = \sqrt{\alpha_t} x_{t-1} + \sqrt{1 - \alpha_t} \epsilon\) where \(\epsilon \sim \mathcal{N}(0, I)\). After \(T\) steps (typically 1000), the original image \(x_0\) has been completely destroyed. The neural network learns to approximate the reverse: given a noisy image \(x_t\) and the timestep \(t\), predict the noise \(\epsilon\) that was added. At generation time, you sample random noise and iteratively denoise it, producing a realistic image from nothing. This is the foundation of DALL-E 2, Stable Diffusion, and Midjourney.
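Composing the forward steps gives a useful closed form, \(x_t = \sqrt{\bar{\alpha}_t} x_0 + \sqrt{1 - \bar{\alpha}_t} \epsilon\) with \(\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s\), so a noisy sample at any timestep can be drawn in one shot. A minimal NumPy sketch of the forward process on a toy "image" (the linear beta schedule from 1e-4 to 0.02 is an assumption, matching common DDPM setups):

```python
import numpy as np

# Assumed linear noise schedule over T steps (beta_t from 1e-4 to 0.02)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative product: alpha_bar_t

# A toy 8x8 gradient standing in for a real image x_0
x0 = np.linspace(-1, 1, 64).reshape(8, 8)

def forward_sample(x0, t, rng):
    """Sample x_t directly via the closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
for t in [0, 249, 499, 999]:
    xt, _ = forward_sample(x0, t, rng)
    print(f"step {t+1:4d}: signal weight = {np.sqrt(alpha_bars[t]):.3f}")
```

By the last step the signal weight is essentially zero: the image contributes almost nothing to \(x_T\), which is why \(x_T\) is treated as pure Gaussian noise.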

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.patches import Circle

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (16, 10)
np.random.seed(42)

How Diffusion Works

Training

  1. Take a real image

  2. Add noise gradually

  3. Train a network to predict the noise
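The three training steps above can be sketched in NumPy. The "network" here is a zero-returning placeholder (a real system would use a U-Net), and the linear beta schedule is an assumption; what the sketch shows is the objective itself, a mean-squared error between the true and predicted noise:

```python
import numpy as np

rng = np.random.default_rng(42)

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)

def dummy_eps_model(x_t, t):
    """Placeholder for the noise-prediction network eps_theta(x_t, t).
    A real model would be a trained U-Net; this just returns zeros."""
    return np.zeros_like(x_t)

# One conceptual training step:
x0 = rng.standard_normal((4, 8, 8))    # step 1: a batch of "real images"
t = rng.integers(0, T, size=4)         # random timestep for each example
eps = rng.standard_normal(x0.shape)    # the noise the model must predict
ab = alpha_bars[t][:, None, None]
x_t = np.sqrt(ab) * x0 + np.sqrt(1 - ab) * eps   # step 2: add noise (closed form)

# step 3: MSE between predicted and actual noise (the training loss)
loss = np.mean((dummy_eps_model(x_t, t) - eps) ** 2)
print(f"noise-prediction MSE: {loss:.3f}")
```

With the zero-returning placeholder the loss is roughly the variance of the noise itself (about 1.0); training drives it down by making the model's prediction match `eps`.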

Generation

  1. Start with pure noise

  2. Denoise step by step

  3. Get realistic image!
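The generation steps above follow the standard DDPM reverse update. This sketch uses a zero-returning stand-in for the trained noise predictor, so it demonstrates only the mechanics of the loop, not real image synthesis; the beta schedule is again an assumption:

```python
import numpy as np

rng = np.random.default_rng(7)

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # assumed linear schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t):
    """Placeholder for a trained noise predictor eps_theta(x_t, t)."""
    return np.zeros_like(x_t)

x = rng.standard_normal((8, 8))        # step 1: start from pure noise
for t in reversed(range(T)):           # step 2: denoise from t = T down to 1
    z = rng.standard_normal(x.shape) if t > 0 else 0.0
    eps_hat = eps_model(x, t)
    # DDPM reverse update: subtract the predicted noise contribution,
    # rescale, then inject a small amount of fresh noise (except at t = 0)
    x = (x - (1 - alphas[t]) / np.sqrt(1 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    x = x + np.sqrt(betas[t]) * z

print("final sample shape:", x.shape)  # step 3: with a trained model, x is an image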

Applications: DALL-E, Stable Diffusion, Midjourney

# Diffusion concept: two opposite processes
print("Forward: Image -> Noise")
print("Reverse: Noise -> Image")
print("\nTrained on millions of images!")