Challenges: Neural Networks

Hands-on challenges to deepen your understanding of neural networks

🚀 Challenge 1: Gradient Vanishing Detective

Difficulty: ⭐⭐ Beginner-Intermediate
Time: 30-45 minutes
Concepts: Activation functions, gradient flow, deep networks

The Problem

Build a very deep neural network (10+ layers) and observe the gradient vanishing problem in action.

Your Task

  1. Create a 15-layer network with sigmoid activations

  2. Train on the MNIST dataset

  3. Track gradient magnitudes at each layer during training

  4. Visualize the gradients: notice how they shrink in the early layers

  5. Fix it by switching to ReLU and observe the difference!

Starter Code

class DeepNetwork:
    def __init__(self, n_layers=15, activation='sigmoid'):
        self.n_layers = n_layers
        self.activation = activation
        # TODO: Initialize layers
    
    def track_gradients(self):
        """Return gradient magnitudes for each layer."""
        # TODO: Track gradient norms
        pass

Success Criteria

  • Demonstrate gradient vanishing with sigmoid

  • Show improvement with ReLU

  • Create visualization comparing both

  • Explain why this happens mathematically

💡 Hint The sigmoid derivative peaks at 0.25. Chain 15 layers and the gradient scale is at most 0.25^15 ≈ 9×10⁻¹⁰ (vanishing!). The ReLU derivative is either 0 or 1, so gradients flow much better.
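The hint's arithmetic is easy to verify directly. A minimal sketch in plain Python (no framework assumed) comparing the worst-case gradient scale after 15 layers for each activation:

```python
# Worst-case gradient attenuation through n chained layers:
# each backward step multiplies by (at most) the activation's derivative.

def max_gradient_product(derivative_max, n_layers):
    """Upper bound on the gradient scale after n_layers of backprop."""
    return derivative_max ** n_layers

sigmoid_bound = max_gradient_product(0.25, 15)  # sigmoid'(x) <= 0.25
relu_bound = max_gradient_product(1.0, 15)      # relu'(x) is 0 or 1

print(f"sigmoid, 15 layers: {sigmoid_bound:.2e}")  # ~9.31e-10
print(f"relu,    15 layers: {relu_bound:.2e}")     # 1.00e+00
```

In a real network the actual gradients are smaller still, since most inputs sit away from the sigmoid's peak slope.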

🚀 Challenge 3: Attention Visualization

Difficulty: ⭐⭐⭐ Intermediate-Advanced
Time: 2-3 hours
Concepts: Attention mechanism, visualization, interpretability

The Problem

Implement and visualize a simple attention mechanism for sequence classification.

Your Task

  1. Build an attention layer from scratch

  2. Train on text sentiment classification

  3. Visualize attention weights for sample inputs

  4. Show which words the model focuses on

  5. Compare with/without attention

Example Output

Input: "This movie was absolutely terrible and boring"
Attention weights:
- "terrible": 0.45 ██████████
- "boring": 0.38   ████████
- "absolutely": 0.10 ██
- Other words: <0.05
Prediction: Negative (0.92 confidence)

Success Criteria

  • Working attention implementation

  • Heatmap visualization of attention

  • Interpretable results

  • Performance comparison

💡 Hint Attention formula: α_i = softmax(score(h_i, query)), output = Σ_i α_i · h_i. Start with simple dot-product attention, where score(h_i, q) = h_i · q.
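The dot-product variant fits in a few lines of plain Python. A sketch to check your layer against (the `softmax` and `dot_product_attention` helper names are illustrative, not from any library):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(query, hidden_states):
    """alpha_i = softmax(h_i . q); output = sum_i alpha_i * h_i."""
    scores = [sum(qj * hj for qj, hj in zip(query, h)) for h in hidden_states]
    alphas = softmax(scores)
    dim = len(query)
    output = [sum(a * h[d] for a, h in zip(alphas, hidden_states))
              for d in range(dim)]
    return alphas, output

# Three 2-d hidden states; the second aligns most with the query.
H = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
q = [0.0, 1.0]
alphas, out = dot_product_attention(q, H)
print(alphas)  # weights sum to 1, largest on the second state
```

The same weights, reshaped per input token, are what your heatmap visualization would display.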

🚀 Challenge 4: Transfer Learning Master

Difficulty: ⭐⭐⭐⭐ Advanced
Time: 3-4 hours
Concepts: Transfer learning, fine-tuning, domain adaptation

The Problem

Use a pre-trained ImageNet model for a custom classification task with limited data.

Your Task

  1. Choose a small custom dataset (100-500 images, 5-10 classes)

  2. Load pre-trained ResNet/VGG/MobileNet

  3. Try 3 approaches:

    • Feature extraction (freeze all layers)

    • Fine-tuning (unfreeze last few layers)

    • Full training (train entire network)

  4. Compare results and training time

  5. Analyze which layers learn what

Dataset Suggestions

  • Food classification

  • Dog breed recognition

  • Flower species

  • Medical images

  • Your own photos

Analysis Required

  • Learning curves for each approach

  • Confusion matrices

  • Layer-wise feature visualization

  • Recommendations for when to use each approach

💡 Hint Feature extraction works well with <1000 images. Fine-tuning helps when the task is somewhat different from ImageNet. Full training needs lots of data.
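The three approaches differ only in which parameters receive gradients. A sketch of the freezing pattern in PyTorch, using a tiny `nn.Sequential` as a stand-in for a real pretrained backbone (in practice you would load ResNet/VGG/MobileNet weights from `torchvision` instead):

```python
import torch.nn as nn

# Stand-in for a pretrained backbone plus a new classification head.
backbone = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 16))
head = nn.Linear(16, 5)  # 5 custom classes
model = nn.Sequential(backbone, head)

def set_trainable(module, trainable):
    """Toggle gradient computation for every parameter in a module."""
    for p in module.parameters():
        p.requires_grad = trainable

# Approach 1: feature extraction -- freeze the whole backbone.
set_trainable(backbone, False)
set_trainable(head, True)

# Approach 2: fine-tuning -- additionally unfreeze the last backbone layer.
set_trainable(backbone[-1], True)

# Approach 3: full training -- everything trainable.
set_trainable(model, True)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```

When comparing training time, remember to pass only the trainable parameters to the optimizer, e.g. `filter(lambda p: p.requires_grad, model.parameters())`.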

🚀 Challenge 5: Neural Network Debugger

Difficulty: ⭐⭐⭐ Intermediate
Time: 1-2 hours
Concepts: Debugging, troubleshooting, systematic diagnosis

The Problem

You're given 5 broken neural network implementations. Find and fix the bugs!

Scenarios

Bug 1: The Never-Learning Network

# Network trains but loss doesn't decrease
# What's wrong?
for epoch in range(100):
    loss = compute_loss(model(X), y)
    gradients = backprop(loss)
    # weights stay the same! Why?

Bug 2: The Exploding Loss

# Loss goes to infinity after a few batches
# What's the issue?
def train_step(X, y):
    pred = model(X)
    loss = mse_loss(pred, y)
    # loss = 1e+10 after iteration 3

Bug 3: The Plateauing Network

# Accuracy stuck at 10% on MNIST (10 classes)
# Red flag! What's happening?

Bug 4: The Slow Learner

# Training takes 10x longer than expected
# Same architecture, same data
# Where's the bottleneck?

Bug 5: The Overfitting Champion

# Training accuracy: 99%
# Validation accuracy: 45%
# How do you fix this?

Your Task

  • Identify each bug

  • Explain why it happens

  • Provide the fix

  • Share prevention strategies

💡 Hints

  • Bug 1: Are you actually updating the weights?

  • Bug 2: Check the learning rate and weight initialization.

  • Bug 3: Is your model predicting all one class? 10% is chance level on 10 classes.

  • Bug 4: Profile your code. Is it the data loading?

  • Bug 5: Classic overfitting. What's missing?
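For Bug 1 specifically, the usual culprit is computing gradients without ever applying them. A minimal sketch in plain Python (fitting y = w·x by gradient descent on a toy dataset) showing the update step that Bug 1's loop is missing:

```python
# Fit y = w * x to data generated with w_true = 3.0.
X = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 6.0, 9.0, 12.0]

w = 0.0
lr = 0.01
for epoch in range(200):
    # MSE gradient: d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    grad = sum(2 * (w * xi - yi) * xi for xi, yi in zip(X, y)) / len(X)
    w -= lr * grad  # the step Bug 1's loop never takes: apply the update

print(round(w, 3))  # converges toward 3.0
```

In framework code the equivalent omission is forgetting `optimizer.step()` (or zeroing gradients but never stepping), so the loss is recomputed on identical weights every epoch.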

🌟 Meta Challenge: Build Your Own Framework

Difficulty: ⭐⭐⭐⭐⭐ Expert
Time: 8-12 hours
Concepts: Software engineering, API design, comprehensive understanding

The Ultimate Challenge

Build a mini deep learning framework (like PyTorch, but simpler).

Requirements

Your framework should support:

  • Automatic differentiation (autograd)

  • Common layers (Linear, Conv2D, ReLU, etc.)

  • Loss functions (MSE, CrossEntropy)

  • Optimizers (SGD, Adam)

  • Model save/load

  • GPU support (optional but impressive!)

Example Usage

from your_framework import Module, Linear, ReLU, CrossEntropyLoss, Adam

class MyModel(Module):
    def __init__(self):
        self.fc1 = Linear(784, 128)
        self.relu = ReLU()
        self.fc2 = Linear(128, 10)
    
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

model = MyModel()
optimizer = Adam(model.parameters(), lr=0.001)
criterion = CrossEntropyLoss()

# Training loop works just like PyTorch!
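The core of such a framework is reverse-mode automatic differentiation. One way to start is a scalar autograd node in the micrograd style; the `Value` class below is an illustrative sketch, not part of any existing library:

```python
class Value:
    """Scalar with reverse-mode autodiff: tracks children and local grads."""
    def __init__(self, data, _children=(), _local_grads=()):
        self.data = data
        self.grad = 0.0
        self._children = _children
        self._local_grads = _local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data,
                     (self, other), (other.data, self.data))

    def backward(self):
        # Topological sort, then chain-rule accumulation from the output.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    visit(c)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for child, local in zip(v._children, v._local_grads):
                child.grad += local * v.grad

x = Value(2.0)
y = Value(3.0)
z = x * y + x       # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

Tensor-valued `Linear`, `ReLU`, and the losses all follow the same pattern: each op records its inputs and local derivatives on the forward pass, and `backward` replays them in reverse.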

Bonus Points

  • Clean, documented API

  • Comprehensive tests

  • Tutorial notebooks

  • Performance benchmarks

  • PyTorch compatibility layer

📊 Challenge Completion Tracker

Mark off challenges as you complete them:

  • Challenge 1: Gradient Vanishing Detective

  • Challenge 2: Architecture Search

  • Challenge 3: Attention Visualization

  • Challenge 4: Transfer Learning Master

  • Challenge 5: Neural Network Debugger

  • Meta Challenge: Build Your Own Framework

πŸ† Share Your Solutions!ΒΆ

Completed a challenge? Share your solution:

  1. GitHub: Create a repo with your solution

  2. Discussions: Post in Challenges Section

  3. Tag: Use #ZeroToAI and #NNChallenge

  4. Help Others: Comment on others' solutions

💡 Learning Tips

  • Start small: Don't try all challenges at once

  • Debug systematically: Print intermediate values, visualize data

  • Compare with libraries: Check your implementation against PyTorch

  • Read papers: Many challenges have research papers behind them

  • Ask for help: Post in Discussions if stuck

Happy challenging! 🚀

Remember: The goal is to learn, not to complete everything perfectly. Each challenge deepens your understanding!