Phase 6: Neural NetworksΒΆ
This module is where the repo shifts from classical ML intuition into modern deep learning. The goal is not just to run PyTorch code, but to understand why gradient-based learning, attention, and transformers work well enough that later LLM modules feel connected instead of magical.
Recommended OrderΒΆ
Companion reading:
What You Should Be Able To ExplainΒΆ
Why nonlinear activations are needed
How backpropagation moves signal through a network
Why PyTorch autograd matters in practice
What attention is computing and why scaling matters
How transformer blocks combine attention, MLPs, residual paths, and normalization
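The first point above can be demonstrated in a few lines. This is a minimal NumPy sketch (illustrative, not from the repo's notebooks): stacking linear layers with no activation between them collapses into a single linear layer, so depth buys nothing without a nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked linear layers with no activation in between.
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 3))
x = rng.standard_normal((5, 4))

two_layers = x @ W1 @ W2    # a "deep" network with no nonlinearity
one_layer = x @ (W1 @ W2)   # a single linear layer with merged weights

# Identical by associativity: without activations, depth adds no expressive power.
print(np.allclose(two_layers, one_layer))  # True
```

Inserting even a simple `np.maximum(·, 0)` (ReLU) between the two matmuls breaks this equivalence, which is the whole point of activation functions.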
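For the backpropagation and autograd points, a toy scalar autograd engine (micrograd-style, written here for illustration; this is not the PyTorch API) shows the core mechanic: each operation records its local derivatives, and `backward` multiplies upstream gradients by local gradients along every path, summing contributions.

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value depends on
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self, upstream=1.0):
        # chain rule: accumulate upstream * local gradient into each parent
        self.grad += upstream
        for parent, local in zip(self._parents, self._local_grads):
            parent.backward(upstream * local)

x = Value(3.0)
y = Value(2.0)
z = x * y + x          # z = x*y + x, so dz/dx = y + 1 = 3, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 3.0 3.0
```

PyTorch's autograd does the same bookkeeping over tensors (with a proper topological sort for efficiency), which is why you can write the forward pass and get gradients for free.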
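For the attention point, here is a shape-annotated NumPy sketch of scaled dot-product attention (my own minimal version, not code from the notebooks). The division by the square root of the key dimension is the "scaling" referred to above: without it, dot products grow with dimension and push the softmax toward one-hot weights with vanishing gradients.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, dv) -> output: (n, dv)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, m) similarity scores
    # row-wise softmax (shifted by the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1
    return weights @ V                               # (n, dv) weighted mix of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 queries of dimension 8
K = rng.standard_normal((6, 8))   # 6 keys of dimension 8
V = rng.standard_normal((6, 8))   # 6 values of dimension 8
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Writing out these shapes by hand, as the practice section below suggests, is the fastest way to make attention stop feeling magical.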
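And for the last point, the four ingredients compose into one block like this. A minimal single-head, pre-norm sketch (simplifications assumed: no learned LayerNorm scale/shift, ReLU instead of GELU, no masking or batching):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean, unit variance."""
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def transformer_block(x, Wq, Wk, Wv, Wo, W1, W2):
    # 1) attention sublayer with a residual connection (pre-norm variant)
    h = layer_norm(x)
    Q, K, V = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V
    x = x + attn @ Wo                    # residual path: add, don't replace
    # 2) MLP sublayer with a residual connection
    h = layer_norm(x)
    x = x + np.maximum(h @ W1, 0) @ W2   # ReLU MLP (GELU in real transformers)
    return x

rng = np.random.default_rng(0)
d, seq = 8, 5
shapes = [(d, d), (d, d), (d, d), (d, d), (d, 4 * d), (4 * d, d)]
params = [rng.standard_normal(s) * 0.1 for s in shapes]
x = rng.standard_normal((seq, d))
print(transformer_block(x, *params).shape)  # (5, 8): shape is preserved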
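```

Because the block maps `(seq, d)` to `(seq, d)`, identical blocks can be stacked dozens of times, and the residual paths keep gradients flowing through that depth.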
How To Study This ModuleΒΆ
Spend more time on 02_backpropagation_explained.ipynb than on framework syntax.
Treat 04_attention_mechanism.ipynb as the bridge into LLM architecture.
Revisit 03-maths/foundational/07_neural_network_math.ipynb if gradients feel mechanical instead of intuitive.
Suggested PracticeΒΆ
Implement a tiny MLP from scratch with NumPy
Rebuild the same idea in PyTorch
Write down tensor shapes at each step of attention
Explain a transformer block without using the phrase "it just learns it"
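The first practice item can start from a sketch like this: a tiny MLP trained on XOR with hand-written backprop (my own illustrative version; architecture, learning rate, and step count are arbitrary choices). Rebuilding it in PyTorch afterwards makes it concrete what `nn.Linear` and autograd are doing for you.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic problem a single linear layer cannot solve.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)
lr = 0.5

def forward(X):
    h = np.tanh(X @ W1 + b1)              # hidden activations
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))  # sigmoid output probability
    return h, p

for _ in range(2000):
    h, p = forward(X)
    # backprop for sigmoid + binary cross-entropy: d(loss)/d(logits) = p - y
    d_logits = (p - y) / len(X)
    dW2, db2 = h.T @ d_logits, d_logits.sum(0)
    d_h = (d_logits @ W2.T) * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ d_h, d_h.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1          # plain gradient descent step
    W2 -= lr * dW2; b2 -= lr * db2

_, p = forward(X)
print(np.round(p.ravel(), 2))  # predictions should approach [0, 1, 1, 0]
```

Each line of the backward pass is one application of the chain rule; matching it against 02_backpropagation_explained.ipynb line by line is the exercise.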
Why This Module MattersΒΆ
If this phase is weak, later phases on fine-tuning, local LLMs, evaluation, and agents become tool memorization. If this phase is strong, the rest of the repo becomes a connected system.