Pre-Quiz: Model Evaluation & Metrics
Test your baseline knowledge before starting Phase 15!
Time: 10 minutes
Questions: 10
Passing Score: 60%
Questions
1. What is the main limitation of using accuracy for imbalanced datasets?
A) Accuracy is too difficult to calculate
B) Accuracy can be misleadingly high even if the model doesn't work well
C) Accuracy only works for regression problems
D) Accuracy requires too much computational power
Show Answer
**Correct Answer: B**
Explanation: With imbalanced data (e.g., 99% negative, 1% positive), a model that always predicts "negative" achieves 99% accuracy yet never detects a single positive case.
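The effect is easy to reproduce with a few lines of plain Python. This is a minimal sketch on made-up labels (990 negatives, 10 positives); the "model" is just a constant prediction, not a real classifier.

```python
# Hypothetical labels: 990 negatives (0) and 10 positives (1).
y_true = [0] * 990 + [1] * 10
# A "model" that always predicts the negative class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the positive class: positives found / positives that exist.
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(y_true)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Accuracy comes out at 0.99 while recall on the positive class is 0.00, which is why accuracy alone says little here.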
2. Which metric answers: "Of all positive predictions, how many were correct?"
A) Recall
B) Precision
C) F1-Score
D) Accuracy
Show Answer
**Correct Answer: B**
Explanation: Precision = TP / (TP + FP), the proportion of positive predictions that are actually positive.
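As a rough illustration, precision and recall can be computed directly from the confusion counts. The function name `precision_recall` and the toy labels below are assumptions made for this sketch.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical labels and predictions: TP=2, FP=1, FN=2.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here two of the three positive predictions are correct (precision 2/3), but only two of the four actual positives are found (recall 1/2).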
3. What does RMSE stand for?
A) Root Mean Square Estimation
B) Regression Mean Squared Error
C) Root Mean Squared Error
D) Random Mean Square Evaluation
Show Answer
**Correct Answer: C**
Explanation: RMSE stands for Root Mean Squared Error, calculated as sqrt(MSE), so it is expressed in the same units as the target.
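A minimal sketch of the calculation, assuming plain Python lists of targets and predictions (the values are invented for illustration):

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Hypothetical targets and predictions; squared errors are 1, 0, 4, 1.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

print(f"RMSE={rmse(y_true, y_pred):.4f}")  # sqrt(6 / 4) = sqrt(1.5)
```

Because the errors are squared before averaging, the single error of 2 contributes more to the result than the two errors of 1 combined.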
4. Why is BLEU score primarily used for machine translation?
A) It only works with translated text
B) It measures n-gram overlap between generated and reference text
C) It can only evaluate English text
D) It measures grammatical correctness
Show Answer
**Correct Answer: B**
Explanation: BLEU scores n-gram overlap between generated and reference text, which suits translation, where lexical similarity to a reference matters.
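The n-gram overlap at the core of BLEU can be sketched as a clipped n-gram precision. This is only that one component, not a full BLEU implementation: the geometric mean over n-gram orders and the brevity penalty are left out, and the helper names are made up for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with each
    n-gram's count capped at its count in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    total = sum(cand_counts.values())
    return overlap / total if total else 0.0

# Hypothetical sentences, tokenized on whitespace.
reference = "the cat is on the mat".split()
candidate = "the the the cat mat".split()

p1 = clipped_precision(candidate, reference, 1)
p2 = clipped_precision(candidate, reference, 2)
print(f"unigram precision={p1:.2f}, bigram precision={p2:.2f}")
```

The candidate reuses the right words (4 of 5 unigrams count after clipping) but in the wrong order, so only 1 of its 4 bigrams matches. Higher-order n-grams are what make the score sensitive to word order.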
5. What is demographic parity in fairness metrics?
A) All groups must have equal prediction accuracy
B) All groups must have equal positive outcome rates
C) All groups must have the same features
D) All groups must be the same size
Show Answer
**Correct Answer: B**
Explanation: Demographic parity requires that the positive outcome rate (e.g., approval rate) is equal across different demographic groups.
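One simple way to check this is to compare positive prediction rates per group. The group names and predictions below are invented, and reporting the gap as "largest rate minus smallest rate" is just one common convention, assumed here for the sketch.

```python
def positive_rate(predictions):
    """Fraction of predictions that are positive (1)."""
    return sum(predictions) / len(predictions)

# Hypothetical binary predictions (1 = approved) for two groups.
preds_by_group = {
    "group_a": [1, 1, 0, 1, 0, 1, 1, 0, 1, 1],  # 7 of 10 approved
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],  # 4 of 10 approved
}

rates = {group: positive_rate(p) for group, p in preds_by_group.items()}
parity_gap = max(rates.values()) - min(rates.values())

print(rates)
print(f"demographic parity gap={parity_gap:.2f}")
```

A gap of zero would mean perfect demographic parity. Note that the true labels never enter the calculation, only the predictions.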
6. What does an R² score of 0 mean?
A) Perfect predictions
B) Model is as good as predicting the mean
C) Model is completely wrong
D) Cannot calculate R²
Show Answer
**Correct Answer: B**
Explanation: R² = 0 means the model performs no better than predicting the mean of the target for every sample.
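This follows from the usual definition, R² = 1 - SS_res / SS_tot. A small sketch with invented values shows the mean baseline landing exactly at 0; the function name `r_squared` is an assumption for this example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # model error
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # error of the mean
    return 1 - ss_res / ss_tot

# Hypothetical targets with mean 5.0.
y_true = [2.0, 4.0, 6.0, 8.0]

print(r_squared(y_true, y_true))                # perfect predictions
print(r_squared(y_true, [5.0] * 4))             # always predict the mean
print(r_squared(y_true, [8.0, 6.0, 4.0, 2.0]))  # worse than the mean
```

The three cases give 1.0, 0.0, and a negative value. R² has no lower bound, so a score below 0 means the model does worse than the mean baseline.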
7. Why use cross-validation instead of a single train/test split?
A) It's faster to compute
B) It uses less data
C) It provides a more reliable performance estimate
D) It always gives better accuracy
Show Answer
**Correct Answer: C**
Explanation: Cross-validation evaluates the model on several train/test splits and averages the results, so the performance estimate depends less on any one split.
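A hand-rolled k-fold loop makes the idea concrete. This is a sketch only: `k_fold_indices` is a hypothetical helper, the labels are toy data, and the "model" is a majority-class predictor standing in for a real one.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical toy labels; the "model" predicts the training majority class.
labels = [0, 1] * 10
scores = []
for train_idx, test_idx in k_fold_indices(len(labels), k=5):
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    accuracy = sum(labels[i] == majority for i in test_idx) / len(test_idx)
    scores.append(accuracy)

# Report the mean and the spread across folds, not a single number.
print(f"mean={statistics.mean(scores):.2f}, stdev={statistics.stdev(scores):.2f}")
```

The spread across folds shows how much a single train/test split could have misled you.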
8. What is the primary difference between BLEU and ROUGE?
A) BLEU is for classification, ROUGE is for regression
B) BLEU is precision-focused, ROUGE is recall-focused
C) BLEU only works in English
D) ROUGE is more accurate
Show Answer
**Correct Answer: B**
Explanation: BLEU emphasizes precision (how much of the generated text appears in the reference), while ROUGE emphasizes recall (how much of the reference appears in the generated text).
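The contrast shows up even with plain unigram counts. The sketch below is a simplification that assumes whitespace tokenization and unigrams only; it is not the full BLEU or ROUGE definition, and `unigram_overlap` is a name made up for this example.

```python
from collections import Counter

def unigram_overlap(candidate, reference):
    """Return (precision, recall) of unigram overlap between two token lists."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # per-word minimum of the two counts
    precision = overlap / sum(cand.values()) if candidate else 0.0  # BLEU-style
    recall = overlap / sum(ref.values()) if reference else 0.0  # ROUGE-style
    return precision, recall

# Hypothetical reference and a much shorter generated candidate.
reference = "the quick brown fox jumps over the lazy dog".split()
candidate = "the quick fox".split()

precision, recall = unigram_overlap(candidate, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Every candidate word appears in the reference, so precision is perfect, yet the candidate covers only 3 of the 9 reference words. A recall-oriented metric penalizes this omission, which is one reason ROUGE is a common choice for summarization.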
9. What does a low perplexity score indicate?
A) Model is confused
B) Model is confident/text is familiar
C) Model needs more training
D) Text is too complex
Show Answer
**Correct Answer: B**
Explanation: Low perplexity means the model assigns high probability to the actual tokens, indicating confidence and familiarity with the text.
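Perplexity is the exponential of the average negative log-probability the model gave to each actual token. A minimal sketch, with the per-token probabilities invented rather than taken from a real model:

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability of the observed tokens."""
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical probabilities a model assigned to the tokens that occurred.
confident = [0.9, 0.8, 0.95, 0.85]
uncertain = [0.1, 0.2, 0.05, 0.15]

print(f"confident: {perplexity(confident):.2f}")
print(f"uncertain: {perplexity(uncertain):.2f}")
print(f"uniform over 4 choices: {perplexity([0.25] * 4):.2f}")
```

The last case gives a useful intuition: a perplexity of about 4 means the model is, on average, as unsure as if it were picking uniformly among four tokens.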
10. What is the "80% rule" in fairness testing?
A) Model must be 80% accurate
B) The ratio of positive outcomes between groups should be ≥ 0.8
C) 80% of features must be fair
D) Training data must be 80% balanced
Show Answer
**Correct Answer: B**
Explanation: The 80% rule (or four-fifths rule) states that the selection rate for any protected group should be at least 80% of the rate for the group with the highest selection rate.
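The check reduces to dividing each group's selection rate by the highest rate. The rates and group names below are invented, and `disparate_impact_ratios` is a hypothetical helper written for this sketch.

```python
def disparate_impact_ratios(selection_rates):
    """Each group's selection rate divided by the highest group's rate."""
    highest = max(selection_rates.values())
    return {group: rate / highest for group, rate in selection_rates.items()}

# Hypothetical selection rates (share of each group with a positive outcome).
selection_rates = {"group_a": 0.50, "group_b": 0.35}

ratios = disparate_impact_ratios(selection_rates)
passes = all(ratio >= 0.8 for ratio in ratios.values())

print(ratios)
print("passes 80% rule:", passes)
```

Here group_b is selected at 70% of group_a's rate, which falls below the 0.8 threshold, so the check fails.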
Scoring Guide
9-10 correct: Excellent! You have strong foundational knowledge
7-8 correct: Good! Review a few concepts before starting
5-6 correct: Moderate. Pay extra attention during Phase 15
0-4 correct: Review prerequisite materials before Phase 15
Key Topics to Review
If you scored low, review these topics:
Classification metrics (precision, recall, F1, accuracy)
Regression metrics (MAE, RMSE, R²)
Fairness concepts (demographic parity, equalized odds)
LLM evaluation (BLEU, ROUGE, perplexity)
Cross-validation basics
Ready to start Phase 15? Let's go!