Phase 12: LLM Fine-TuningΒΆ
This module is strongest when approached as decision-making, not just training mechanics. The real question is when fine-tuning is the right tool, how to prepare data well enough for it to matter, and how to evaluate whether the resulting model is actually better than prompting or RAG.
Actual Module ContentsΒΆ
Recommended OrderΒΆ
First pass:
00 -> 01 -> 02 -> 03 -> 04 -> 06 -> 07

Second pass for alignment:
05 -> 08 -> 11

Deployment and efficiency depth:
09 -> 10
What To Learn HereΒΆ
When fine-tuning beats prompting
Why dataset quality dominates training quality
How LoRA and QLoRA reduce hardware needs
Why evaluation must be task-specific
The distinction between SFT, preference optimization, and RL-style alignment
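The hardware claim about LoRA and QLoRA comes down to simple parameter arithmetic: the base weight matrix stays frozen and only two low-rank factors train. A back-of-envelope sketch, using illustrative sizes (the 4096x4096 projection and rank 8 below are assumptions, not any specific model's config):

```python
# Why LoRA shrinks the trainable footprint: instead of updating the full
# weight matrix W (d_out x d_in), train only low-rank factors A and B such
# that the effective update is B @ A.

def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when fine-tuning the full matrix W."""
    return d_in * d_out

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter:
    A is (r x d_in), B is (d_out x r); W itself stays frozen."""
    return r * d_in + d_out * r

# Hypothetical 4096x4096 attention projection with a rank-8 adapter:
full = full_params(4096, 4096)      # 16,777,216 trainable params
lora = lora_params(4096, 4096, 8)   # 65,536 trainable params
print(f"LoRA trains {lora / full:.2%} of the full matrix")
```

QLoRA pushes further by also quantizing the frozen base weights (typically to 4 bits), so the same arithmetic applies to trainable parameters while the memory for frozen weights shrinks roughly fourfold versus fp16.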
Study AdviceΒΆ
Do not start with RLHF terminology if SFT data formatting is still fuzzy.
Treat 06_evaluation.ipynb as a required notebook, not an optional one.
Compare every fine-tuning idea against a prompting baseline and a RAG baseline.
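The baseline comparison above can be made concrete with a tiny task-specific harness. A minimal sketch: the system functions named in the commented loop (`call_prompted_baseline`, `call_rag_baseline`, `call_finetuned`) are hypothetical stand-ins for whatever inference calls your stack provides, and exact match is just one metric; pick one that fits your task.

```python
# Score any black-box system function on a held-out eval set and compare
# prompting, RAG, and fine-tuned variants on identical examples.

def exact_match(prediction: str, reference: str) -> bool:
    """Case- and whitespace-insensitive exact match."""
    return prediction.strip().lower() == reference.strip().lower()

def score(system_fn, eval_set) -> float:
    """Fraction of eval examples the system gets exactly right."""
    hits = sum(exact_match(system_fn(ex["input"]), ex["target"]) for ex in eval_set)
    return hits / len(eval_set)

# eval_set = load_task_examples(...)   # must be held out from training data
# for name, fn in [("prompting", call_prompted_baseline),
#                  ("rag", call_rag_baseline),
#                  ("fine-tuned", call_finetuned)]:
#     print(name, score(fn, eval_set))
```

The point of the identical eval set is that a fine-tune only "wins" if it beats both baselines on the same examples, not on a different slice of data.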
Practical OutcomesΒΆ
After this module, you should be able to:
Prepare instruction-format data
Run an adapter-based fine-tune
Evaluate whether the tuned model improved on a concrete task
Package or deploy the result without confusing training success for production readiness
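For the first outcome, instruction-format data preparation usually means turning raw (question, answer) records into the chat-message JSONL shape most SFT trainers accept. A minimal sketch; the exact schema varies by framework, and the "messages" layout and default system prompt below are assumptions modeled on the common OpenAI-style format:

```python
import json

def to_sft_record(question: str, answer: str,
                  system: str = "You are a helpful assistant.") -> dict:
    """Wrap one (question, answer) pair as a chat-style training example."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

raw = [("What does LoRA stand for?", "Low-Rank Adaptation.")]

# Each line of the resulting JSONL file is one training example.
lines = [json.dumps(to_sft_record(q, a)) for q, a in raw]
print(lines[0])
```

Whatever schema your trainer expects, the quality lever is the same: deduplicate, spot-check answers by hand, and keep a held-out slice for evaluation before any training run.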