Interview Preparation

15 core ML questions, coding problems, 3 system design walkthroughs, and LLM/GenAI questions. Use after building at least one project.

Part 1: Core ML Concepts (15 Questions)

Q1: Bias-variance tradeoff?

Bias = underfitting (too simple). Variance = overfitting (too sensitive to training data). Goal: find the sweet spot. Diagnose: high train + val error = bias; low train, high val = variance.
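The diagnosis is easy to see in practice. A sketch using scikit-learn polynomial regression (data and degrees are illustrative): degree 1 underfits, degree 15 drives train error down and typically hurts validation error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)
X_tr, X_val, y_tr, y_val = X[:70], X[70:], y[:70], y[70:]

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    tr = mean_squared_error(y_tr, model.predict(X_tr))
    val = mean_squared_error(y_val, model.predict(X_val))
    # degree=1: high train AND val error (bias); degree=15: low train error (variance risk)
    print(f"degree={degree:2d}  train={tr:.3f}  val={val:.3f}")
```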

Q2: L1 vs L2 regularization?

L1 (Lasso): Drives some weights to exactly zero. Good for feature selection. L2 (Ridge): Shrinks all weights toward zero. More stable with correlated features. Elastic Net: Both combined.
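The zeroing behavior is easy to demonstrate. A sketch on synthetic data where only the first three of ten features are informative (the alpha values are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only features 0-2 carry signal; 3-9 are pure noise.
y = 3 * X[:, 0] + 2 * X[:, 1] + X[:, 2] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1: noise coefficients driven to exactly 0
ridge = Ridge(alpha=1.0).fit(X, y)   # L2: coefficients shrink but stay nonzero

print("L1 zero coefficients:", int(np.sum(lasso.coef_ == 0)))
print("L2 zero coefficients:", int(np.sum(ridge.coef_ == 0)))
```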

Q3: Gradient descent?

Iteratively adjust parameters to minimize loss. w = w - lr * gradient. Variants: Batch (full dataset), SGD (one sample), Mini-batch (best of both), Adam (adaptive per-parameter, most common).
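The update rule in a few lines of NumPy, as a sketch of plain batch gradient descent on linear regression (data and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(2)
lr = 0.1
for _ in range(200):
    grad = (2 / len(X)) * X.T @ (X @ w - y)  # gradient of mean squared error
    w -= lr * grad                           # w = w - lr * gradient

print(w)  # converges toward [2, -3]
```

Swapping the full-dataset gradient for a single sample gives SGD; for a small random subset, mini-batch.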

Q4: Class imbalance?

  1. Collect more minority data. 2. Use precision/recall/F1 instead of accuracy. 3. Resample (SMOTE / undersample). 4. Class weights. 5. If extreme (99.9%+), treat as anomaly detection.
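Option 4 (class weights) in scikit-learn, as a sketch on synthetic 95/5 imbalanced data: `class_weight="balanced"` reweights the loss by inverse class frequency, usually trading some precision for minority-class recall.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

plain = LogisticRegression(max_iter=1000).fit(X, y)
balanced = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X, y)

# Minority-class recall typically improves with balanced weighting.
print("recall (plain):   ", recall_score(y, plain.predict(X)))
print("recall (balanced):", recall_score(y, balanced.predict(X)))
```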

Q5: Precision, recall, F1?

  • Precision: TP / (TP + FP) - optimize when false positives are costly

  • Recall: TP / (TP + FN) - optimize when false negatives are costly

  • F1: Harmonic mean of both
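The three formulas above, computed directly from confusion-matrix counts (counts are illustrative):

```python
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(p, r, f1)  # 0.8, 0.666..., 0.727...
```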

Q6: Cross-validation?

Split data into K folds, train on K-1, validate on remaining, rotate K times. Use stratified K-fold for classification. Time series: use temporal splits. Never cross-validate on the test set.
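A minimal sketch of stratified 5-fold cross-validation with scikit-learn (model and dataset are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
# Stratified folds preserve the class distribution in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```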

Q7: Transformer architecture?

Processes sequences in parallel via self-attention (not sequentially like RNNs). Key components: token embeddings → positional encoding → multi-head self-attention → feed-forward → layer norm + residuals. Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Encoder (BERT) = bidirectional. Decoder (GPT) = causal/masked.
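The attention formula in a few lines of NumPy (single head, no masking; shapes are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4, 8))  # three (seq_len=4, d_k=8) matrices
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one weighted mix of V rows per query position
```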

Q8: Overfitting prevention?

More data, regularization (L1/L2/dropout), early stopping, data augmentation, simpler model, cross-validation, ensembles.

Q9: Bagging vs boosting?

Bagging (Random Forest): parallel training on random subsets, reduces variance. Boosting (XGBoost): sequential, each model corrects previous errors, reduces bias.
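A quick side-by-side with scikit-learn (synthetic data; both should score well here, the point is the contrast in how the ensembles are built):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# Bagging-style: independent trees on bootstrap samples, averaged (variance reduction).
rf = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
# Boosting: trees added sequentially to fix residual errors (bias reduction).
gb = cross_val_score(GradientBoostingClassifier(random_state=0), X, y, cv=5).mean()

print(f"RandomForest CV accuracy:     {rf:.3f}")
print(f"GradientBoosting CV accuracy: {gb:.3f}")
```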

Q10: Random Forest?

Multiple decision trees on bootstrap samples with random feature subsets. Average predictions. Works because averaging decorrelated trees reduces variance without increasing bias.

Q11: ROC curve and AUC?

ROC: TPR vs FPR at various thresholds. AUC: area under ROC (0.5 = random, >0.9 = outstanding). For imbalanced data, use precision-recall curves instead.
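The two anchor points of the AUC scale, verified with scikit-learn (toy labels and scores):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1])
perfect = np.array([0.1, 0.2, 0.8, 0.9])      # ranks every positive above every negative
uninformative = np.array([0.5, 0.5, 0.5, 0.5])  # no ranking information at all

print(roc_auc_score(y_true, perfect))        # 1.0
print(roc_auc_score(y_true, uninformative))  # 0.5
```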

Q12: Supervised vs unsupervised vs reinforcement?

Supervised: labeled data (classification, regression). Unsupervised: find patterns (clustering, PCA). Reinforcement: learn from rewards/penalties. Self-supervised: create own labels (BERT masking, GPT next-token).

Q13: PCA?

Finds directions of maximum variance, projects data onto them. Use for dimensionality reduction, visualization, noise reduction. Assumes linear relationships - use t-SNE/UMAP for non-linear.
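A two-line sketch with scikit-learn, projecting the 4-dimensional iris data onto its top two principal components:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2).fit(X)
X2 = pca.transform(X)  # 150 x 4  ->  150 x 2

# The first two components retain most of the total variance.
print(X2.shape, pca.explained_variance_ratio_.sum())
```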

Q14: Backpropagation?

Forward pass → compute loss → backward pass (chain rule to compute gradients) → update weights. Each layer needs only the gradient from above and its own local gradient.
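The chain rule on a deliberately tiny "network" (two scalar weights, one ReLU): each local gradient only multiplies the gradient flowing in from above.

```python
def forward_backward(w1, w2, x, y):
    # Forward pass: loss = (w2 * relu(w1 * x) - y)^2
    h = max(w1 * x, 0.0)
    pred = w2 * h
    loss = (pred - y) ** 2
    # Backward pass: chain rule, one local gradient per step.
    dpred = 2 * (pred - y)                  # dL/dpred
    dw2 = dpred * h                         # dL/dw2
    dh = dpred * w2                         # gradient flowing back through w2
    dw1 = dh * x if w1 * x > 0 else 0.0     # ReLU gates the gradient
    return loss, dw1, dw2

loss, dw1, dw2 = forward_backward(w1=0.5, w2=1.5, x=2.0, y=3.0)
print(loss, dw1, dw2)  # 2.25, -9.0, -3.0
```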

Q15: Loss functions?

MSE: regression, penalizes large errors. Cross-entropy: classification, measures probability divergence. Binary cross-entropy: binary classification. Others: Huber, hinge, focal, triplet, contrastive.
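The two workhorses in NumPy (clipping in the cross-entropy avoids log(0); values are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)  # guard against log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 2.0])))               # 0.125
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.1])))  # ~0.105
```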

Part 2: Coding Problems

Problem 1: Data Cleaning

Given a messy DataFrame: handle missing values, remove duplicates, compute stats, find correlations. (20 min)
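A sketch of the expected shape of a solution (the DataFrame and fill strategy are illustrative; median imputation is one reasonable choice among several):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [25, np.nan, 30, 30, 45],
    "income": [50000, 60000, np.nan, np.nan, 80000],
    "city":   ["NY", "LA", "NY", "NY", "SF"],
})

df = df.drop_duplicates()                          # remove exact duplicate rows
df["age"] = df["age"].fillna(df["age"].median())   # impute missing values
df["income"] = df["income"].fillna(df["income"].median())

print(df.describe())                 # summary stats
print(df[["age", "income"]].corr())  # pairwise correlations
```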

Problem 2: Train a Classifier

Iris dataset: split, train 2+ models, cross-validate, compare metrics, select best. (20 min)
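A compact sketch of the full workflow (model choices are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(random_state=0),
}
# Model selection on the training split only; the test set is touched once.
cv_means = {name: cross_val_score(m, X_tr, y_tr, cv=5).mean() for name, m in models.items()}
best = max(cv_means, key=cv_means.get)
final = models[best].fit(X_tr, y_tr)
print(best, accuracy_score(y_te, final.predict(X_te)))
```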

Problem 3: Simple RAG Pipeline

Chunk docs → embed with SentenceTransformer → store in ChromaDB → retrieve → construct LLM prompt. (30 min)
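SentenceTransformer and ChromaDB are the intended tools. To show just the retrieve-then-augment shape without those dependencies, the sketch below substitutes TF-IDF vectors for the embeddings and an in-memory matrix for the vector store (docs and query are illustrative):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "The refund policy allows returns within 30 days.",
    "Shipping takes 3-5 business days in the US.",
    "Our support team is available 24/7 via chat.",
]

vec = TfidfVectorizer()
doc_matrix = vec.fit_transform(docs)  # stand-in for embed + store

def retrieve(query, k=1):
    q = vec.transform([query])
    # TF-IDF rows are L2-normalized, so the dot product is cosine similarity.
    sims = (doc_matrix @ q.T).toarray().ravel()
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

query = "what is the refund policy for returns?"
context = retrieve(query)
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQuestion: " + query
print(prompt)
```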

Problem 4: Feature Engineering

Raw transactions → aggregated user features (count, sum, mean, max, std, unique categories, days active). (15 min)
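The aggregation maps cleanly onto a pandas named `groupby().agg()` (the transaction table is illustrative):

```python
import pandas as pd

tx = pd.DataFrame({
    "user_id":  [1, 1, 1, 2, 2],
    "amount":   [10.0, 25.0, 5.0, 100.0, 40.0],
    "category": ["food", "food", "travel", "tech", "tech"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-10",
                            "2024-02-01", "2024-02-01"]),
})

features = tx.groupby("user_id").agg(
    tx_count=("amount", "count"),
    total=("amount", "sum"),
    mean=("amount", "mean"),
    max=("amount", "max"),
    std=("amount", "std"),
    n_categories=("category", "nunique"),
    days_active=("date", lambda s: s.dt.normalize().nunique()),  # distinct calendar days
)
print(features)
```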

Part 3: System Design Walkthroughs

Design 1: RAG Document Q&A (10K+ PDFs)

Ingestion → Chunk + Embed → Vector DB (Qdrant) → FastAPI → Retrieval + LLM → Streamlit

Key decisions: 512-token chunks with 50-token overlap. Hybrid search (dense + BM25) with reranking. GPT-4.1-mini or Claude Haiku 4.5 for cost. Redis caching. RAGAS evaluation.

Design 2: Recommendation System (10M users, 1M products)

Two-stage: candidate generation (ANN via FAISS/Qdrant) → ranking (XGBoost on features). Real-time features from last 10 interactions. Cold start via content-based features. A/B test with 5% traffic. Latency budget: <100ms.

Design 3: LLM Customer Support Agent

LangGraph orchestrator with tools: lookup_order, process_return, search_faq, escalate_to_human. Guardrails: content moderation, PII redaction, rate limiting. GPT-4.1-mini for high volume, Claude Sonnet 4.6 for complex reasoning. Human-in-the-loop for high-stakes actions.

Part 4: LLM & GenAI Questions

Q1: RAG vs fine-tuning vs prompting? Try prompting first → add RAG if hallucinations persist → fine-tune if style/format is still wrong.

Q2: RAG pipeline end-to-end? Document → Chunk → Embed → Store → Query → Retrieve → Rerank → Augment prompt → Generate → Return with citations.

Q3: What is LoRA? Freeze pre-trained weights, inject small trainable adapter matrices. Train ~2-5% of parameters. QLoRA: 4-bit base model + LoRA = 7B model on 4-6GB VRAM.
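The LoRA idea in NumPy (shapes are illustrative; in real LoRA only A and B are trained while W stays frozen, and B starts at zero so the adapter initially contributes nothing):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                              # hidden size, LoRA rank
W = rng.normal(size=(d, d))               # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, d))   # trainable down-projection, small init
B = np.zeros((d, r))                      # trainable up-projection, zero init
alpha = 8                                 # scaling hyperparameter

def lora_forward(x):
    # y = x W^T + (x A^T B^T) * (alpha / r)
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

x = rng.normal(size=(1, d))
# Trainable parameters: 2*d*r = 512 vs d*d = 4096 frozen (real models are far larger).
print(np.allclose(lora_forward(x), x @ W.T))  # True: adapter is a no-op at init
```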

Q4: How to evaluate RAG? RAGAS: faithfulness, answer relevancy, context precision, context recall. Also: latency, cost/query, user satisfaction.

Q5: AI agents - what makes them reliable? Structured outputs, explicit error handling per tool, retry with backoff, human-in-the-loop for high stakes, comprehensive logging, test scenarios.

Part 5: Behavioral Questions

"Tell me about a challenging ML project." Use STAR: Situation, Task, Action, Result (quantify).

"How do you stay current?" Follow key researchers, read papers, participate in communities, build prototypes of new tools.

"Technical disagreements?" Define metrics, run an experiment, let the data decide.

"Explain ML to a non-technical audience?" Use analogies. RAG = "a smart student with an open-book exam."

Study Plan

  • Week 1: Review 15 concept questions + solve coding problems

  • Week 2: Walk through 3 system designs (35 min each: 5 requirements, 10 high-level, 15 deep dive, 5 scaling)

  • Week 3: Mock interviews, record yourself, practice on shared screen