# Chapter 5: Eigenvectors and eigenvalues¶
1. The Big Idea¶
Typical Behavior¶
Most vectors change direction when transformed: \(A\vec{v} \neq \lambda \vec{v}\)
Special Vectors¶
But eigenvectors stay on their own span! They just get scaled: \(A\vec{v} = \lambda \vec{v}\)
where:
\(\vec{v}\) is an eigenvector (special direction)
\(\lambda\) is an eigenvalue (scale factor)
Interpretation¶
“If \(\vec{v}\) is an eigenvector with eigenvalue \(\lambda\), then transforming \(\vec{v}\) just scales it by \(\lambda\).”
Why It Matters¶
Eigenvectors reveal the natural axes of the transformation!
In those directions, the transformation is simple (just scaling)
Everything else is messy (rotation + scaling)
# Third-party imports: NumPy for the linear algebra, Matplotlib/Seaborn for
# plotting, IPython.display for rendering animations inline in the notebook.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import FancyArrowPatch, Circle
from matplotlib.animation import FuncAnimation
from IPython.display import HTML
import seaborn as sns

# Notebook-wide defaults: seaborn grid style, large default figure,
# and compact float printing (3 decimals, no scientific notation).
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 10)
np.set_printoptions(precision=3, suppress=True)
def draw_vector(ax, start, end, color='blue', label='', width=0.015):
    """Draw a vector as an annotated arrow on *ax*.

    Parameters
    ----------
    ax : matplotlib Axes to draw on.
    start, end : (x, y) pairs giving the arrow's tail and tip.
    color : arrow (and label) color.
    label : optional text placed at the arrow's midpoint.
    width : visual thickness; the default (0.015) maps to the original
        2.5-point line width, smaller values draw thinner arrows.
    """
    # BUG FIX: `width` was accepted but silently ignored, so callers that
    # passed width=0.01 for thin "before" reference arrows got the same
    # full-thickness arrow. Scale the line width so that the default
    # width reproduces the previous 2.5-point look exactly.
    arrow = FancyArrowPatch(start, end, arrowstyle='->',
                            mutation_scale=20,
                            linewidth=2.5 * (width / 0.015),
                            color=color, label=label)
    ax.add_patch(arrow)
    if label:
        # Place the label halfway along the arrow, boxed for readability.
        mid = ((start[0] + end[0])/2, (start[1] + end[1])/2)
        ax.text(mid[0], mid[1], label, fontsize=12,
                color=color, fontweight='bold',
                bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))
def setup_ax(ax, xlim=(-3, 3), ylim=(-3, 3), title=''):
    """Configure *ax* for a 2-D vector diagram.

    Applies the view limits, forces an equal aspect ratio, draws the
    coordinate axes through the origin, enables a faint grid, and sets
    a bold *title* when one is supplied.
    """
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    ax.set_aspect('equal')
    # Emphasize the x- and y-axes so the origin is easy to locate.
    ax.axhline(y=0, color='k', linewidth=0.8)
    ax.axvline(x=0, color='k', linewidth=0.8)
    ax.grid(True, alpha=0.3)
    if not title:
        return
    ax.set_title(title, fontsize=14, fontweight='bold')
# Example: Show eigenvectors vs regular vectors
# Side-by-side figure: the same three vectors before and after applying A.
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
# Transformation matrix (shear + scale)
A = np.array([[3, 1], [0, 2]])
# Compute eigenvectors and eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)
v1 = eigenvectors[:, 0]  # np.linalg.eig returns eigenvectors as COLUMNS
v2 = eigenvectors[:, 1]
λ1, λ2 = eigenvalues
# Normalize for visualization
# (eigenvectors are only defined up to scale, so rescaling is harmless)
v1 = v1 / np.linalg.norm(v1) * 2
v2 = v2 / np.linalg.norm(v2) * 2
# Regular vector (not an eigenvector)
v_regular = np.array([1.5, 1.5])

# LEFT: Before transformation
setup_ax(axes[0], xlim=(-1, 4), ylim=(-1, 4), title='Before Transformation')
# Draw eigenvectors
draw_vector(axes[0], [0, 0], v1, 'green', 'eigenvector v₁')
draw_vector(axes[0], [0, 0], v2, 'red', 'eigenvector v₂')
# Draw regular vector
draw_vector(axes[0], [0, 0], v_regular, 'blue', 'regular vector')
# Draw their spans (lines)
t = np.linspace(-3, 3, 100)
axes[0].plot(t * v1[0], t * v1[1], 'g--', alpha=0.3, linewidth=2, label='span(v₁)')
axes[0].plot(t * v2[0], t * v2[1], 'r--', alpha=0.3, linewidth=2, label='span(v₂)')
axes[0].legend(loc='upper left')

# RIGHT: After transformation
setup_ax(axes[1], xlim=(-1, 8), ylim=(-1, 6), title='After Transformation')
# Transform all vectors
v1_transformed = A @ v1
v2_transformed = A @ v2
v_regular_transformed = A @ v_regular
# Draw transformed eigenvectors (still on same span!)
# Light arrows show the originals for reference; dark arrows their images.
draw_vector(axes[1], [0, 0], v1, 'lightgreen', '', width=0.01)
draw_vector(axes[1], [0, 0], v1_transformed, 'darkgreen',
            f'λ₁v₁ (λ₁={λ1:.2f})')
draw_vector(axes[1], [0, 0], v2, 'lightcoral', '', width=0.01)
draw_vector(axes[1], [0, 0], v2_transformed, 'darkred',
            f'λ₂v₂ (λ₂={λ2:.2f})')
# Draw transformed regular vector (changed direction!)
draw_vector(axes[1], [0, 0], v_regular, 'lightblue', '', width=0.01)
draw_vector(axes[1], [0, 0], v_regular_transformed, 'darkblue', 'rotated!')
# Draw the eigenspaces
# (reuses t from the left panel; the spans are unchanged by A)
axes[1].plot(t * v1[0], t * v1[1], 'g--', alpha=0.3, linewidth=2)
axes[1].plot(t * v2[0], t * v2[1], 'r--', alpha=0.3, linewidth=2)
plt.tight_layout()
plt.show()

print("Key Observation:")
print(f"Eigenvector v₁: Stays on its span, just scaled by λ₁ = {λ1:.3f}")
print(f"Eigenvector v₂: Stays on its span, just scaled by λ₂ = {λ2:.3f}")
print(f"Regular vector: CHANGES DIRECTION (leaves its span)")
print("\n💡 Eigenvectors reveal the 'natural axes' of the transformation!")
2. Geometric Intuition¶
Finding Eigenvectors¶
Question: Which vectors stay on their span during transformation \(A\)?
Answer: Look for lines through the origin that:
Stay lines after transformation
Vectors on that line only get scaled (not rotated)
Examples¶
Diagonal Matrix¶
Eigenvectors: \(\hat{i}\) and \(\hat{j}\) (basis vectors)
Eigenvalues: 3 and 2
Why: Pure scaling along axes!
Rotation Matrix¶
No real eigenvectors!
Why: Every vector rotates (no vector stays on its span)
Has complex eigenvalues: \(i\) and \(-i\)
Shear Matrix¶
Eigenvector: \(\hat{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\)
Eigenvalue: 1
Why: Horizontal vectors stay horizontal!
# Visualize different types of matrices and their eigenvectors
# One subplot per transformation type: transformed grid + real eigenspaces.
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.flatten()

# (label, matrix) pairs covering the classic 2×2 transformation types.
matrices = [
    ('Diagonal\n(scale axes)', np.array([[3, 0], [0, 2]])),
    ('Rotation 90°\n(no real eigen)', np.array([[0, -1], [1, 0]])),
    ('Horizontal Shear', np.array([[1, 1], [0, 1]])),
    ('Symmetric', np.array([[2, 1], [1, 2]])),
    ('Projection', np.array([[1, 0], [0, 0]])),
    ('Reflection', np.array([[-1, 0], [0, 1]]))
]

for idx, (name, M) in enumerate(matrices):
    ax = axes[idx]
    setup_ax(ax, xlim=(-3, 3), ylim=(-3, 3), title=name)
    try:
        # Compute eigenvectors and eigenvalues
        eigenvalues, eigenvectors = np.linalg.eig(M)
        # Draw grid transformed
        # (sample each vertical and horizontal grid line, then map it by M)
        for i in range(-2, 3):
            line = np.array([[i, y] for y in np.linspace(-2, 2, 20)])
            transformed = line @ M.T  # rows are points p, so p ↦ M p
            ax.plot(transformed[:, 0], transformed[:, 1],
                    'gray', alpha=0.2, linewidth=0.5)
            line = np.array([[x, i] for x in np.linspace(-2, 2, 20)])
            transformed = line @ M.T
            ax.plot(transformed[:, 0], transformed[:, 1],
                    'gray', alpha=0.2, linewidth=0.5)
        # Draw eigenvectors if real
        colors = ['green', 'red', 'purple', 'orange']
        real_eigen_count = 0
        for i, (λ, v) in enumerate(zip(eigenvalues, eigenvectors.T)):
            if np.isreal(λ) and np.abs(λ.imag) < 1e-10:
                λ = λ.real
                v = v.real
                # Rescale to a fixed display length; the +1e-10 guards
                # against division by a zero-norm vector.
                v_norm = v / (np.linalg.norm(v) + 1e-10) * 1.5
                # Draw the eigenspace (line)
                t = np.linspace(-3, 3, 100)
                ax.plot(t * v_norm[0], t * v_norm[1],
                        colors[i], alpha=0.3, linewidth=3)
                # Draw the eigenvector
                ax.arrow(0, 0, v_norm[0], v_norm[1],
                         head_width=0.15, head_length=0.15,
                         fc=colors[i], ec=colors[i], width=0.05)
                ax.text(v_norm[0]*1.3, v_norm[1]*1.3,
                        f'λ={λ:.2f}', fontsize=11,
                        color=colors[i], fontweight='bold',
                        bbox=dict(boxstyle='round', facecolor='white', alpha=0.7))
                real_eigen_count += 1
        if real_eigen_count == 0:
            # e.g. the 90° rotation: both eigenvalues are purely imaginary.
            ax.text(0.5, 0.5, 'No real\neigenvectors!\n(Complex eigenvalues)',
                    transform=ax.transAxes, fontsize=12, ha='center', va='center',
                    bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.7))
    except Exception as e:
        # Defensive: surface any numerical failure on the subplot itself
        # rather than aborting the whole figure.
        ax.text(0.5, 0.5, f'Error:\n{str(e)}',
                transform=ax.transAxes, ha='center', va='center')
    # Show matrix
    ax.text(0.02, 0.02, f'{M.tolist()}',
            transform=ax.transAxes, fontsize=9, va='bottom',
            family='monospace',
            bbox=dict(boxstyle='round', facecolor='lightgray', alpha=0.5))

plt.tight_layout()
plt.show()

print("Observations:")
print("• Diagonal matrices: Basis vectors are eigenvectors")
print("• Rotations: No real eigenvectors (all vectors rotate)")
print("• Shears: One eigenvector (direction that doesn't rotate)")
print("• Symmetric matrices: Always have real eigenvalues")
print("• Projections: Eigenvalue 0 (squishing dimension)")
3. Finding Eigenvectors Algebraically¶
The Equation¶
We want: \(A\vec{v} = \lambda \vec{v}\)
Rearrange: \(A\vec{v} - \lambda \vec{v} = \vec{0}\), i.e. \((A - \lambda I)\vec{v} = \vec{0}\)
Key Insight¶
This has a non-zero solution \(\vec{v}\) only if: \(\det(A - \lambda I) = 0\)
Why? Because det = 0 means the matrix squishes space, so there’s a non-zero vector that gets mapped to zero!
The Process¶
Find eigenvalues: Solve \(\det(A - \lambda I) = 0\)
Find eigenvectors: For each \(\lambda\), solve \((A - \lambda I)\vec{v} = \vec{0}\)
# Detailed example: Finding eigenvalues and eigenvectors
# Walks through solving det(A - λI) = 0 by hand for an upper-triangular
# 2×2 matrix, then confirms every step numerically with NumPy.
print("Example: Find eigenvalues and eigenvectors of A")
print("="*60)
A = np.array([[3, 1],
              [0, 2]])
print(f"\nMatrix A:")
print(A)

# Step 1: Find eigenvalues
print("\n" + "="*60)
print("STEP 1: Find eigenvalues")
print("="*60)
print("\nWe need to solve: det(A - λI) = 0")
print(f"\nA - λI = [[3-λ, 1 ],")
print(f" [ 0, 2-λ]]")
print(f"\ndet(A - λI) = (3-λ)(2-λ) - (1)(0)")
print(f" = (3-λ)(2-λ)")
print(f" = 6 - 5λ + λ²")
print(f"\nSet equal to 0: λ² - 5λ + 6 = 0")
print(f"Factor: (λ - 3)(λ - 2) = 0")
print(f"\nSolutions: λ₁ = 3, λ₂ = 2")
# Verify with numpy
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"\n✓ NumPy confirms: λ = {eigenvalues}")

# Step 2: Find eigenvectors
print("\n" + "="*60)
print("STEP 2: Find eigenvectors")
print("="*60)
for i, λ in enumerate(eigenvalues):
    print(f"\nFor λ = {λ}:")
    print(f"Solve (A - {λ}I)v = 0")
    A_lambda_I = A - λ * np.eye(2)
    print(f"\nA - {λ}I = {A_lambda_I}")
    print(f"\nThis gives us: {A_lambda_I[0]} @ [v₁, v₂]ᵀ = 0")
    print(f" {A_lambda_I[1]} @ [v₁, v₂]ᵀ = 0")
    v = eigenvectors[:, i]
    # Rescale so the first component is 1 (eigenvectors are defined only
    # up to a scalar multiple); fall back to NumPy's unit vector if v₁ = 0.
    print(f"\nEigenvector: {v / v[0] if v[0] != 0 else v} (can scale by any constant)")
    print(f"NumPy gives: {v}")
    # Verify
    # Check the defining property Av = λv numerically.
    Av = A @ v
    λv = λ * v
    print(f"\nVerify: Av = {Av}")
    print(f" λv = {λv}")
    print(f" Match? {np.allclose(Av, λv)} ✓")

print("\n" + "="*60)
print("SUMMARY")
print("="*60)
print(f"Eigenvalue λ₁ = {eigenvalues[0]:.3f}, eigenvector v₁ = {eigenvectors[:, 0]}")
print(f"Eigenvalue λ₂ = {eigenvalues[1]:.3f}, eigenvector v₂ = {eigenvectors[:, 1]}")
4. The Characteristic Equation¶
Definition¶
The equation \(\det(A - \lambda I) = 0\) is called the characteristic equation; its left-hand side, expanded as a polynomial in \(\lambda\), is the characteristic polynomial.
For 2×2 Matrices¶
For \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), it expands to: \(\lambda^2 - (a+d)\lambda + (ad-bc) = 0\), i.e. \(\lambda^2 - \text{tr}(A)\lambda + \det(A) = 0\)
where \(\text{tr}(A) = a + d\) is the trace (sum of diagonal elements).
Key Facts¶
Product of eigenvalues = \(\det(A)\)
Sum of eigenvalues = \(\text{tr}(A)\)
Eigenvalues can be real or complex
\(n \times n\) matrix has \(n\) eigenvalues (counting multiplicities)
# Verify eigenvalue properties
# Empirically checks, for several matrices: sum(λᵢ) = tr(A) and
# prod(λᵢ) = det(A). Works even when eigenvalues are complex (rotation),
# since complex f-string formatting and np.allclose both handle them.
print("Eigenvalue Properties\n" + "="*60)
# Test with several matrices
test_matrices = [
    ('Example 1', np.array([[3, 1], [0, 2]])),
    ('Example 2', np.array([[4, 2], [1, 3]])),
    ('Symmetric', np.array([[5, 2], [2, 5]])),
    ('Rotation', np.array([[0, -1], [1, 0]]))
]
for name, M in test_matrices:
    # eigvals suffices: the eigenvectors are not needed for these identities.
    eigenvalues = np.linalg.eigvals(M)
    trace = np.trace(M)
    det = np.linalg.det(M)
    print(f"\n{name}:")
    print(f"Matrix: {M.tolist()}")
    print(f"Eigenvalues: {eigenvalues}")
    print(f"\nSum of eigenvalues: {np.sum(eigenvalues):.6f}")
    print(f"Trace (a + d): {trace:.6f}")
    print(f"Match? {np.allclose(np.sum(eigenvalues), trace)} ✓")
    print(f"\nProduct of eigenvalues: {np.prod(eigenvalues):.6f}")
    print(f"Determinant: {det:.6f}")
    print(f"Match? {np.allclose(np.prod(eigenvalues), det)} ✓")

print("\n" + "="*60)
print("BEAUTIFUL FACTS:")
print(" λ₁ + λ₂ + ... + λₙ = tr(A) (sum of diagonal)")
print(" λ₁ × λ₂ × ... × λₙ = det(A) (area scaling factor)")
print("="*60)
5. Eigendecomposition (Diagonalization)¶
The Magic Formula¶
If \(A\) has \(n\) linearly independent eigenvectors, then: \(A = PDP^{-1}\)
where:
\(P\) = matrix with eigenvectors as columns
\(D\) = diagonal matrix with eigenvalues
\(P^{-1}\) = inverse of \(P\)
Why This Is Magical¶
Computing \(A^n\) becomes easy: \(A^n = PD^nP^{-1}\)
And \(D^n\) is trivial (just raise diagonal elements to power \(n\))!
Interpretation¶
\(P^{-1}\): Change to eigenvector basis
\(D\): Apply scaling in each direction
\(P\): Change back to standard basis
In the eigenvector basis, the transformation is just scaling!
# Demonstrate eigendecomposition
# Builds A = P D P⁻¹, verifies the reconstruction, then computes A^n two
# ways (matrix_power vs. powering only the diagonal D) and compares.
print("Eigendecomposition Example\n" + "="*60)
A = np.array([[3, 1],
              [0, 2]])
print("Original matrix A:")
print(A)
# Compute eigendecomposition: eigenvectors form the columns of P,
# eigenvalues sit on the diagonal of D.
eigenvalues, eigenvectors = np.linalg.eig(A)
P = eigenvectors
D = np.diag(eigenvalues)
P_inv = np.linalg.inv(P)
print("\nEigenvector matrix P (eigenvectors as columns):")
print(P)
print("\nDiagonal matrix D (eigenvalues on diagonal):")
print(D)
print("\nInverse matrix P⁻¹:")
print(P_inv)
# Verify decomposition: A should equal P D P⁻¹ up to rounding.
A_reconstructed = P @ D @ P_inv
print("\nReconstruct: A = P D P⁻¹ =")
print(A_reconstructed)
print(f"\nMatch original? {np.allclose(A, A_reconstructed)} ✓")
# Show power of this
print("\n" + "="*60)
print("Power of Eigendecomposition")
print("="*60)
n = 10
print(f"\nCompute A^{n} two ways:")
# Method 1: Direct (repeated multiplication inside matrix_power)
import time
# perf_counter is the right clock for short intervals (monotonic,
# highest available resolution); time.time() is a wall clock.
start = time.perf_counter()
A_n_direct = np.linalg.matrix_power(A, n)
time_direct = time.perf_counter() - start
# Method 2: Using eigendecomposition — only D needs to be powered,
# which is just an elementwise power of the eigenvalues.
start = time.perf_counter()
D_n = np.diag(eigenvalues ** n)
A_n_eigen = P @ D_n @ P_inv
time_eigen = time.perf_counter() - start
print(f"\nDirect computation: {A_n_direct}")
print(f"Via eigendecomposition: {A_n_eigen}")
print(f"\nMatch? {np.allclose(A_n_direct, A_n_eigen)} ✓")
# BUG FIX: time_direct/time_eigen were measured but never reported,
# leaving the "faster!" claim unsupported. Show them (single-run, so noisy).
print(f"\nTimings (one run, indicative only): "
      f"direct {time_direct:.6f}s, eigendecomposition {time_eigen:.6f}s")
print(f"\nFor large powers, eigendecomposition is faster!")
print(f"Just compute D^n (raise each eigenvalue to power n)")
6. Applications¶
Real-World Uses¶
Principal Component Analysis (PCA)
Find directions of maximum variance in data
Eigenvectors = principal components
Eigenvalues = amount of variance explained
Google PageRank
Find the dominant eigenvector of the web link matrix
Eigenvalue = 1 (steady state)
Differential Equations
Solutions involve \(e^{\lambda t}\)
Eigenvalues determine stability
Vibration Analysis
Natural frequencies = eigenvalues
Mode shapes = eigenvectors
Quantum Mechanics
Observable properties = eigenvalues
States = eigenvectors
Markov Chains
Long-term behavior = dominant eigenvector
Convergence rate = second eigenvalue
# Example: Population dynamics with eigenvalues
# A 2×2 Leslie matrix: the dominant eigenvalue gives the asymptotic
# growth rate, its eigenvector the stable age distribution.
print("Application: Population Growth Model\n" + "="*60)
# Leslie matrix for population with two age groups
# [juveniles, adults]ᵀ
L = np.array([[0.0, 2.0],  # Juveniles: 0 stay juvenile, adults produce 2 juveniles
              [0.5, 0.8]])  # Adults: 50% of juveniles survive to adult, 80% of adults survive
print("Leslie Matrix L (population transition):")
print(L)
print("\nInterpretation:")
print(" L[0,1] = 2.0: Each adult produces 2 juveniles per time step")
print(" L[1,0] = 0.5: 50% of juveniles survive to become adults")
print(" L[1,1] = 0.8: 80% of adults survive to next time step")
# Find eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(L)
# The largest-magnitude (dominant) eigenvalue drives long-run behavior.
dominant_idx = np.argmax(np.abs(eigenvalues))
growth_rate = eigenvalues[dominant_idx].real
stable_dist = eigenvectors[:, dominant_idx].real
stable_dist = stable_dist / np.sum(stable_dist)  # normalize to proportions
print(f"\nEigenvalues: {eigenvalues}")
print(f"\nDominant eigenvalue (growth rate): {growth_rate:.3f}")
if growth_rate > 1:
    print(f" → Population GROWS by {(growth_rate-1)*100:.1f}% per time step")
elif growth_rate < 1:
    print(f" → Population SHRINKS by {(1-growth_rate)*100:.1f}% per time step")
else:
    print(" → Population is STABLE")
print(f"\nStable age distribution (eigenvector):")
print(f" Juveniles: {stable_dist[0]*100:.1f}%")
print(f" Adults: {stable_dist[1]*100:.1f}%")
# Simulate
# Iterate population ← L @ population and watch growth converge to the
# dominant eigenvalue.
print("\n" + "="*60)
print("Simulation (starting with 100 juveniles, 50 adults):")
print("="*60)
population = np.array([100.0, 50.0])
print(f"\nYear Juveniles Adults Total Growth")
print("-" * 50)
for year in range(10):
    total = np.sum(population)
    # prev_total is assigned at the end of the previous iteration; the
    # year == 0 branch short-circuits before it is ever read.
    growth = "" if year == 0 else f"{(total/prev_total - 1)*100:+.1f}%"
    print(f"{year:3d} {population[0]:8.0f} {population[1]:6.0f} {total:7.0f} {growth}")
    prev_total = total
    population = L @ population  # advance one time step
print(f"\n→ Converges to growth rate of {growth_rate:.3f} ≈ {(growth_rate-1)*100:.1f}% per year")
print(f"→ Age distribution converges to {stable_dist[0]*100:.1f}%/{stable_dist[1]*100:.1f}%")
Summary¶
The Essence¶
Eigenvectors = Special directions where transformation just scales
Eigenvalues = The scaling factors
Key Equations¶
Definition: \(A\vec{v} = \lambda \vec{v}\)
Finding them: \(\det(A - \lambda I) = 0\)
Decomposition: \(A = PDP^{-1}\)
Beautiful Facts¶
Sum of eigenvalues = trace(A)
Product of eigenvalues = det(A)
Eigenvectors of symmetric matrices are orthogonal
Real matrices can have complex eigenvalues
Geometric Intuition¶
Think of eigenvectors as the natural coordinate system for understanding a transformation:
In that basis, the matrix is diagonal
Each direction just gets scaled
No rotation, no shearing - just pure scaling!
Why They Matter¶
Eigenvectors reveal:
Stability of systems (eigenvalue < 1 → decay)
Resonance frequencies (vibration modes)
Principal directions (PCA, data analysis)
Long-term behavior (Markov chains, PageRank)
Exercises¶
Find eigenvalues and eigenvectors of \(\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\)
Show that rotation matrices (except ±I) have no real eigenvectors
Prove that trace = sum of eigenvalues for a 2×2 matrix
Compute \(A^{100}\) using eigendecomposition for \(A = \begin{bmatrix} 0.8 & 0.3 \\ 0.2 & 0.7 \end{bmatrix}\)
Explain geometrically why det(A) = product of eigenvalues