# Phase 17: Debugging & Troubleshooting - Start Here

Diagnose and fix AI system failures systematically: from data issues to slow inference to hallucinating models.

## Why This Phase Matters

90% of AI project failures are not model failures: they stem from data issues, evaluation mistakes, or infrastructure problems. This phase teaches a systematic debugging mindset.

## Notebooks in This Phase
| Notebook | Topic |
|---|---|
| 01_debugging_workflow.ipynb | Systematic AI debugging methodology |
| 02_data_issues.ipynb | Data leakage, class imbalance, drift detection |
| 03_performance_profiling.ipynb | Profiling slow code, CUDA bottlenecks, memory |
| 04_model_debugging.ipynb | Overfitting, underfitting, gradient issues |
| 05_error_analysis.ipynb | Confusion matrices, failure mode analysis |
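As a taste of the profiling workflow covered in 03_performance_profiling.ipynb, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules; the `slow_sum` function is a hypothetical stand-in for whatever code you are investigating:

```python
# Minimal CPU profiling sketch: find where time goes in a slow function.
import cProfile
import io
import pstats

def slow_sum(n: int) -> int:
    # Deliberately naive loop so it shows up clearly in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(1_000_000)
profiler.disable()

# Print the five most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Reading the cumulative-time column top-down is usually enough to locate the hot spot before reaching for GPU-specific tools.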
## Common AI Bugs Taxonomy
| Category | Examples |
|---|---|
| Data bugs | Train/test leakage, label noise, class imbalance |
| Training bugs | Wrong loss function, LR too high/low, bad batch size |
| Evaluation bugs | Wrong metric, leaky evaluation, benchmark overfitting |
| Inference bugs | Wrong preprocessing, tokenization mismatch |
| LLM-specific | Hallucination, context overflow, prompt injection |
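To make the first category concrete, here is a minimal sketch of one train/test leakage check: flagging exact-duplicate rows shared between splits. It assumes pandas DataFrames and uses a hypothetical `leaked_rows` helper; real leakage detection (near-duplicates, temporal leakage) needs more than this:

```python
# Sketch: detect exact-duplicate rows that appear in both train and test.
import pandas as pd

def leaked_rows(train: pd.DataFrame, test: pd.DataFrame) -> pd.DataFrame:
    """Return test rows that also appear verbatim in the training set."""
    # Inner merge on all shared columns keeps only rows present in both splits.
    return test.merge(train.drop_duplicates(), how="inner")

train = pd.DataFrame({"x": [1, 2, 3], "y": [0, 1, 0]})
test = pd.DataFrame({"x": [3, 4], "y": [0, 1]})
print(leaked_rows(train, test))  # the (x=3, y=0) row leaked into the test split
```

If this returns a non-empty frame, test metrics are optimistic and the split must be rebuilt before any model debugging.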
## Prerequisites

- Machine learning basics
- Model evaluation (Phase 16)
## Learning Path

1. 01_debugging_workflow.ipynb (start here)
2. 02_data_issues.ipynb
3. 03_performance_profiling.ipynb
4. 04_model_debugging.ipynb
5. 05_error_analysis.ipynb