Assignment: Build a Neural Network from Scratch¶

🎯 Objective¶

Build a complete neural network from scratch (without PyTorch/TensorFlow) to classify the MNIST handwritten digits dataset. This assignment will solidify your understanding of how neural networks actually work under the hood.

Estimated Time: 6-8 hours
Difficulty: ⭐⭐⭐ Intermediate
Due Date: 2 weeks from the date the assignment is issued

📊 Grading Rubric¶

Criteria	Exemplary (A: 90-100%)	Proficient (B: 80-89%)	Adequate (C: 70-79%)	Needs Work (D/F: <70%)
Implementation	Clean, efficient, well-commented code; all functions work correctly	Mostly correct, minor bugs; adequate comments	Basic implementation with several bugs	Broken or incomplete code
Architecture	Proper layer sizes, activations, and initialization; optimized design	Correct structure with minor inefficiencies	Basic structure but suboptimal choices	Incorrect architecture
Training	Smooth convergence, proper validation, excellent learning curves	Good training process with minor issues	Training works but inefficient	Poor training or doesn’t converge
Evaluation	Comprehensive analysis, insightful visualizations, >92% accuracy	Good analysis, clear results, >90% accuracy	Basic evaluation, >85% accuracy	Incomplete evaluation or <85% accuracy
Experiments	4+ experiments, thorough analysis, clear insights	3-4 experiments with good documentation	2-3 experiments, basic documentation	<2 experiments or poor analysis
Documentation	Exceptionally clear, professional, insightful	Well-written and organized	Adequate but could be clearer	Poor or missing documentation

Grade Breakdown¶

A (90-100): All requirements met + bonus challenges + exceptional documentation
B (80-89): All core requirements met with good quality
C (70-79): Most requirements met, basic functionality
D/F (<70): Major requirements missing or not working

📦 Submission Requirements¶

What to Submit¶

Code Files:
- neural_network.py - Your NN class implementation
- train.py - Training script
- evaluate.py - Evaluation script
- requirements.txt - Dependencies
Jupyter Notebook:
- analysis.ipynb - Complete analysis with:
  - Training process
  - Visualizations
  - Experiments
  - Results discussion
Report:
- REPORT.md - Markdown report with:
  - Methodology
  - Results tables
  - Conclusions
  - Lessons learned
Assets:
- models/ - Saved model weights
- plots/ - All generated visualizations
- results/ - Experiment results (CSV/JSON)
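
One way to keep the experiment results file consistent is to append one row per run. The sketch below is only an illustration: the helper name `append_result` and the column names are assumptions, not a required schema, and the values written are placeholders rather than real results. It writes to a temporary directory so it can be run anywhere; in your repository the path would be `results/experiments.csv`.

```python
import csv
import os
import tempfile

def append_result(path, row, fieldnames):
    """Append one experiment row to a CSV file, writing the header once.

    Hypothetical helper: the name and column layout are illustrative only.
    """
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)

# Assumed column names; pick whatever matches your experiments.
fields = ["experiment", "learning_rate", "hidden_units", "val_accuracy"]
path = os.path.join(tempfile.mkdtemp(), "results", "experiments.csv")

# Placeholder values only, not measured results.
append_result(path, {"experiment": "run_a", "learning_rate": 0.01,
                     "hidden_units": 128, "val_accuracy": 0.0}, fields)
append_result(path, {"experiment": "run_b", "learning_rate": 0.1,
                     "hidden_units": 128, "val_accuracy": 0.0}, fields)

with open(path, newline="") as f:
    rows = list(csv.DictReader(f))
print(len(rows))  # 2
```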

Submission Format¶

GitHub Repository:

your-name-mnist-nn/
├── README.md              # Setup and run instructions
├── requirements.txt       # Dependencies
├── neural_network.py      # Core implementation
├── train.py              # Training script
├── evaluate.py           # Evaluation script
├── analysis.ipynb        # Analysis notebook
├── REPORT.md             # Written report
├── models/
│   └── best_model.npz    # Saved weights
├── plots/
│   ├── learning_curves.png
│   ├── confusion_matrix.png
│   └── ...
└── results/
    └── experiments.csv
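
The `.npz` file in the layout above can be produced with NumPy's own `np.savez` and read back with `np.load`. This is a minimal sketch, assuming the parameters are kept in a dictionary keyed by names such as `W1` and `b1`; the helper names `save_model` and `load_model` and that dictionary layout are assumptions for illustration. The example writes to a temporary directory; in your repository the path would be `models/best_model.npz`.

```python
import os
import tempfile
import numpy as np

def save_model(path, params):
    """Save a dict of named NumPy arrays to an .npz archive."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    np.savez(path, **params)

def load_model(path):
    """Load an .npz archive back into a plain dict of arrays."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}

# Hypothetical parameter layout for a single hidden layer.
params = {"W1": np.zeros((784, 128)), "b1": np.zeros(128)}
path = os.path.join(tempfile.mkdtemp(), "models", "best_model.npz")

save_model(path, params)
restored = load_model(path)
print(sorted(restored))  # ['W1', 'b1']
```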

Submit:

- GitHub repository link (make it public)
- Include all files listed above
- Ensure code runs with: pip install -r requirements.txt && python train.py

💡 Hints & Tips¶

Hint 1: Weight Initialization

Use Xavier or He initialization to help keep gradients from vanishing or exploding as they pass through the layers:

import numpy as np

n_in, n_out = 784, 128  # example layer sizes

# Xavier (Glorot) initialization for layers with sigmoid/tanh
W = np.random.randn(n_in, n_out) * np.sqrt(2.0 / (n_in + n_out))

# He initialization for layers with ReLU
W = np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in)
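
The two rules above can be wrapped in a small helper so each layer picks the scale that matches its activation. This is a sketch, not a required interface: the name `init_weights`, its `activation` argument, and the choice of a zero bias are assumptions made for illustration.

```python
import numpy as np

def init_weights(n_in, n_out, activation="relu", rng=None):
    """Return an (n_in, n_out) weight matrix and a zero bias vector.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    if activation == "relu":
        # He initialization: standard deviation sqrt(2 / n_in)
        std = np.sqrt(2.0 / n_in)
    else:
        # Xavier (Glorot) initialization: standard deviation sqrt(2 / (n_in + n_out))
        std = np.sqrt(2.0 / (n_in + n_out))
    W = rng.standard_normal((n_in, n_out)) * std
    b = np.zeros(n_out)
    return W, b

W1, b1 = init_weights(784, 128, activation="relu", rng=np.random.default_rng(0))
print(W1.shape, b1.shape)  # (784, 128) (128,)
print(W1.std())            # should land near sqrt(2 / 784), roughly 0.0505
```

Passing a seeded generator, as in the usage lines, makes runs reproducible, which helps when comparing experiments.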

Hint 2: Debugging Gradients

Implement gradient checking to verify backpropagation:

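import numpy as np

def numerical_gradient(f, x, eps=1e-5):
    """Compute gradient numerically for verification.

    f maps an array to a scalar loss; x must be a float array (it is
    perturbed in place and restored).
    """
    grad = np.zeros_like(x)
    for i in range(x.size):
        old_val = x.flat[i]
        x.flat[i] = old_val + eps   # step forward
        pos = f(x)
        x.flat[i] = old_val - eps   # step backward
        neg = f(x)
        x.flat[i] = old_val         # restore the original value
        grad.flat[i] = (pos - neg) / (2 * eps)  # central difference
    return grad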
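
To use the check, compare the numerical gradient with the analytic one and look at their relative error. The sketch below does this on a toy least-squares loss whose gradient is known in closed form; the `relative_error` helper and the toy loss are assumptions chosen for illustration, and in the assignment you would substitute your network's loss and backpropagated gradients. As a rough rule of thumb, a relative error far below 1e-6 suggests the analytic gradient is right, while values near 1e-2 or larger usually point to a bug.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        old_val = x.flat[i]
        x.flat[i] = old_val + eps
        pos = f(x)
        x.flat[i] = old_val - eps
        neg = f(x)
        x.flat[i] = old_val
        grad.flat[i] = (pos - neg) / (2 * eps)
    return grad

def relative_error(a, b):
    """Norm of the difference, scaled by the size of both gradients."""
    return np.linalg.norm(a - b) / max(np.linalg.norm(a) + np.linalg.norm(b), 1e-12)

# Toy loss: f(w) = 0.5 * ||A w - y||^2, with analytic gradient A^T (A w - y)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
y = rng.standard_normal(5)
w = rng.standard_normal(3)

def loss(w_):
    return 0.5 * np.sum((A @ w_ - y) ** 2)

analytic = A.T @ (A @ w - y)
numeric = numerical_gradient(loss, w)
err = relative_error(analytic, numeric)
print(f"relative error: {err:.2e}")
```

Gradient checking is slow because it evaluates the loss twice per parameter, so run it on a tiny network and a handful of samples rather than on the full model.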

Hint 3: Vectorization

Avoid loops! Process entire batches at once:

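# Here X has shape (batch_size, n_in), W has shape (n_in, n_out), and b has shape (n_out,)

# Bad: Loop through samples
for i in range(batch_size):
    output[i] = np.dot(X[i], W) + b

# Good: Vectorized
output = np.dot(X, W) + b  # Entire batch at once; b broadcasts across rows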
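
A quick way to convince yourself that the vectorized form computes the same thing is to run both on random data and compare. This sketch assumes `W` is stored with shape `(n_in, n_out)`, matching the initialization code in Hint 1; if you store weights as `(n_out, n_in)` instead, transpose accordingly. The sizes are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_in, n_out = 64, 784, 10   # arbitrary example sizes
X = rng.standard_normal((batch_size, n_in))
W = rng.standard_normal((n_in, n_out))  # assumed layout: (n_in, n_out)
b = rng.standard_normal(n_out)

# Loop version: one sample at a time
looped = np.empty((batch_size, n_out))
for i in range(batch_size):
    looped[i] = X[i] @ W + b

# Vectorized version: one matrix product, b broadcasts across the rows
vectorized = X @ W + b

print(np.allclose(looped, vectorized))  # True
```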

Hint 4: Debugging Low Accuracy

If accuracy is low, check:

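- Data normalization (scale pixel values to the range 0-1, e.g. divide by 255)
- Learning rate (try 0.001, 0.01, 0.1)
- Weight initialization (see Hint 1)
- Gradient flow (print gradient magnitudes for each layer)
- Loss curve (plot the loss and confirm it is decreasing)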
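
Two of the checks above are easy to automate. The sketch below reports the input value range and the norm of each gradient array. The helper names, the dictionary-of-gradients layout, and the 1e-6 warning threshold are all assumptions for illustration; adapt them to however your network stores its gradients.

```python
import numpy as np

def input_range(X):
    """Return (min, max) of the inputs; both should lie in [0, 1] after scaling."""
    return float(X.min()), float(X.max())

def gradient_norms(grads):
    """Map each gradient name to its Euclidean norm.

    grads is assumed to be a dict of name -> array (hypothetical layout).
    """
    return {name: float(np.linalg.norm(g)) for name, g in grads.items()}

# Raw 8-bit pixel values scaled to [0, 1]
raw = np.array([[0, 128, 255]], dtype=np.uint8)
X = raw.astype(np.float64) / 255.0
print(input_range(X))  # (0.0, 1.0)

# Made-up gradients: one vanishingly small, one healthy
grads = {"W1": np.full((2, 2), 1e-9), "W2": np.ones((2, 2))}
norms = gradient_norms(grads)
for name, norm in norms.items():
    flag = "  <- suspiciously small" if norm < 1e-6 else ""  # arbitrary threshold
    print(f"{name}: {norm:.2e}{flag}")
```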

📚 Resources¶

Essential Reading¶

Code References¶

Optional Deep Dives¶

❓ FAQ¶

Q: Can I use PyTorch/TensorFlow for parts of it?
A: No - the point is to implement from scratch. You can use NumPy, but not ML frameworks.

Q: What if I can’t reach 90% accuracy?
A: 85-89% is still acceptable for a passing grade. Document what you tried and why you think it didn’t work better.

Q: Can I work with a partner?
A: Discuss concepts together, but write your own code. No shared code submissions.

Q: How long should the report be?
A: Quality over quantity. 2-4 pages of clear analysis is better than 10 pages of fluff.

Q: Can I use a different dataset?
A: No - use MNIST so we can fairly compare submissions.

🎓 Learning Objectives¶

After completing this assignment, you will be able to:

✅ Implement forward and backward propagation from scratch
✅ Understand the mathematical foundations of neural networks
✅ Debug gradient computation issues
✅ Choose appropriate hyperparameters
✅ Evaluate model performance comprehensively
✅ Communicate technical results clearly

🚀 Getting Started¶

Fork the starter repository: github.com/zero-to-ai/nn-assignment-starter

Set up your environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install numpy matplotlib scikit-learn jupyter

Download MNIST:

from sklearn.datasets import fetch_openml

# as_frame=False returns NumPy arrays rather than a pandas DataFrame
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
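
Once the data is loaded, it typically needs three preparation steps: scaling pixels to the range 0-1, converting labels to integers (the OpenML copy of MNIST usually delivers them as strings), and holding out a validation set. The sketch below shows one-hot encoding and a shuffled split on stand-in random data with MNIST's feature width, so it runs without a download. The helper names, the 10% validation fraction, and the fixed seed are assumptions for illustration.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Convert integer labels of shape (n,) to one-hot rows of shape (n, num_classes)."""
    encoded = np.zeros((labels.size, num_classes))
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

def train_val_split(X, y, val_fraction=0.1, seed=0):
    """Shuffle the samples and hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Stand-in data: 100 random "images" of 784 pixels, scaled to [0, 1]
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(100, 784)).astype(np.float64) / 255.0
y = rng.integers(0, 10, size=100)

X_train, y_train, X_val, y_val = train_val_split(X, y)
Y_train = one_hot(y_train)
print(X_train.shape, X_val.shape, Y_train.shape)  # (90, 784) (10, 784) (90, 10)
```

With the real dataset, `X` would come from `mnist.data / 255.0` and `y` from `mnist.target.astype(int)`.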

Start coding! Begin with the neural_network.py skeleton.

💬 Questions & Support¶

Office Hours: Tuesdays 2-4 PM, Thursdays 3-5 PM
Discussion Forum: GitHub Discussions
Email: instructor@zero-to-ai.com (response within 24 hours)
Stuck? Post your question in Discussions - help others by answering too!

Good luck! You’ve got this! 🚀

Remember: This assignment is designed to be challenging but doable. Start early, test often, and don’t hesitate to ask for help.

📋 Requirements¶

Part 1: Network Architecture (25 points)¶

Part 2: Training Loop (25 points)¶

Part 3: Evaluation & Analysis (25 points)¶

Part 4: Experimentation & Documentation (25 points)¶