Zero to AI

Foundations

  • Phase 0: Course Setup
    • Phase 0: Course Setup Catalog
    • AI Model Landscape: March 6, 2026
    • AI Coding Tools for ML Engineers (March 2026)
    • Troubleshooting Guide
  • Phase 1: Python Fundamentals
    • Phase 1: Python Fundamentals Catalog
  • Phase 2: Data Science Foundations
    • Phase 2: Data Science Foundations Catalog
    • How To Become a Data Engineer
    • NumPy Examples - Consolidated & Deduplicated
      • 📊 Content Statistics
      • 💡 Study Tips
      • 📖 Additional Resources
      • 🗺️ Next Steps
      • 01 Basics
        • Data types
        • Comparison Operators
        • Logic Operators
        • if, elif, else Statements
        • for Loops
        • while Loops
        • range()
        • List Comprehension
        • Functions
        • Lambda Expressions
        • map and filter
        • Methods
        • Great Job!
        • Exercises
        • Great job!
        • Load in NumPy
        • The Basics
        • Accessing/Changing Specific Elements, Rows, Columns
        • 3-D Array Example
        • Initializing Different Types of Arrays
        • Be Careful When Copying Arrays
        • Mathematics
        • Linear Algebra
        • Statistics
        • Reorganizing Arrays
        • Miscellaneous
        • NumPy Exercises
        • Great Job!
        • NumPy Operations
        • Great Job!
        • NumPy
        • Numpy Arrays
        • Great Job!
        • NumPy Indexing and Selection
        • Great Job!
      • 02 Intermediate
        • Copy vs. View in NumPy
        • Filtering NumPy Arrays with Boolean Indexing
        • NumPy Universal Functions (ufuncs)
        • Square Root – np.sqrt()
        • Absolute Value – np.absolute()
        • Exponential – np.exp()
        • Min and Max – np.min() / np.max()
        • Sign Function – np.sign()
        • Introduction to NumPy Arrays
        • Array Creation Methods
        • Iterating Over NumPy Arrays
        • Iterating a 1-D Array
        • Iterating 2-D and 3-D Arrays
        • Sorting NumPy Arrays
        • Sorting Numbers
        • Sorting Strings Alphabetically
        • Sorting Booleans
        • Searching NumPy Arrays with np.where()
        • Retrieving Search Results
        • Finding Even and Odd Numbers
        • Reshaping NumPy Arrays
        • 2-D Array Shape
        • Reshaping from 1-D to 2-D
        • Slicing 1-D NumPy Arrays
        • Slicing 2-D NumPy Arrays
      • 03 Exercises
        • 100 numpy exercises
        • 100 numpy exercises
        • Random Sampling
        • Random Sampling – Solutions
        • Set Routines
        • Set Routines – Solutions
        • Sorting, Searching, and Counting
        • Sorting, Searching, and Counting – Solutions
        • Statistics
        • Statistics – Solutions
        • Array Creation Routines
        • Array Creation Routines – Solutions
        • Array Manipulation Routines
        • Array Manipulation Routines – Solutions
        • String Operations
        • Comparison
        • String Information
        • String Operations – Solutions
        • Comparison
        • String Information
        • NumPy-Specific Help Functions
        • NumPy-Specific Help Functions – Solutions
        • Input and Output
        • Input and Output – Solutions
        • Linear Algebra
        • Linear Algebra – Solutions
        • Discrete Fourier Transform
        • Complex Numbers
        • Discrete Fourier Transform
        • Window Functions
        • Complex Numbers
        • Discrete Fourier Transform
        • Window Functions
        • Logic Functions
        • Logic Functions – Solutions
        • Mathematical Functions
        • Mathematical Functions – Solutions
      • 04 Advanced
        • Numpy Tutorials
          • Determining Moore’s Law with real data in NumPy
          • Pairing Jupyter notebooks and MyST-NB
          • Saving and sharing your NumPy arrays
          • Analyzing the impact of the lockdown on air quality in Delhi, India
          • Deep learning on MNIST
          • Deep reinforcement learning with Pong from pixels
          • Masked Arrays
          • Sentiment Analysis on notable speeches of the last decade
          • Plotting Fractals
          • Determining Static Equilibrium in NumPy
          • Learn to write a NumPy tutorial
          • Linear algebra on n-dimensional arrays
          • X-ray image processing
          • Data used for building the NLP from scratch tutorial
    • Pandas Examples - Consolidated & Organized
      • Pandas Examples Consolidation Summary
      • 📊 What You’ll Learn
      • 💡 Pro Tips for Success
      • 📖 Essential pandas Operations
      • 📈 Progress Tracking
      • 🔗 Additional Resources
      • 📊 Collection Statistics
      • 🎓 Ready to Begin?
      • 01 Basics
        • Differences between Shared Methods
        • Selecting One Column from a DataFrame
        • Select Two or More Columns in DataFrame
        • Add New Column to DataFrame
        • Broadcasting Operations
        • A Review of the .value_counts() Method
        • Drop Rows with Null Values
        • Fill in Null Values with the .fillna() Method
        • The .astype() Method
        • Sort a DataFrame with the .sort_values() Method, Part 1
        • Sort a DataFrame with the .sort_values() Method, Part 2
        • Sort a DataFrame with the .sort_index() Method
        • Rank Values with the .rank() Method
        • Filter a DataFrame Based on a Condition
        • Filter with More than One Condition (AND)
        • Filter with More than One Condition (OR)
        • The .isnull() and .notnull() Methods
        • The .between() Method
        • The .duplicated() Method
        • The .drop_duplicates() Method
        • The .unique() and .nunique() Methods
        • set_index() and reset_index() Methods
        • Retrieve Rows by Index Label with .loc[]
        • Retrieve Row(s) by Index Position with iloc
        • The Catch-All .ix[] Method
        • Second Argument to .loc[], .iloc[], and .ix[] Methods
        • Set New Values for Specific Cell or Row
        • Set Multiple Values in DataFrame
        • Rename Index Labels or Columns in a DataFrame
        • Delete Rows or Columns from a DataFrame
        • Create Random Sample
        • The .nsmallest() and .nlargest() Methods
        • Filtering with the where Method
        • The .query() Method
        • A Review of the .apply() Method on Single Columns
        • The .copy() Method
        • Data Cleaning in Pandas
        • EDA in Pandas
        • Filtering and Ordering
        • Group by and Aggregating
        • Indexing
        • Merge, Join, and Concatenate
        • Pandas Series and DataFrames
        • Reading In Files
        • Pandas Visualization
        • Pandas Numpy Lessons
          • Lesson1
            • Exploratory Analysis
            • Create a Pandas Dataframe
            • Introduction to Pandas
            • Loading data into Pandas
            • Load a CSV from a local file
            • Load JSON from a local file
            • You can read from many formats
            • Writing data from Pandas Dataframes
            • Write to many other destinations
            • Copy/Paste into other formats
          • Lesson2
            • Applying Functions
            • Common Dataframe operations
            • Manipulating text in DataFrames
            • Visualizing data
          • Lesson3
            • Common array operations
            • Introduction to NumPy arrays
            • More array operations
      • 02 Intermediate
        • Apple Health Data – Real-World Data Analysis Project
        • DataFrame Fundamentals – Inspection, Selection, and Cleaning
        • DataFrame Operations – Filtering, Deduplication, and Unique Values
        • DataFrame Operations – Indexing, Renaming, Querying, and Applying Functions
        • Electronic Production in India – Data Analysis and Visualization
        • GroupBy – Split-Apply-Combine for Aggregation
        • Input and Output – Reading and Writing Data
        • Merge, Join, and Concatenate – Combining DataFrames
        • MultiIndex – Hierarchical Indexing, Pivoting, and Reshaping
        • Options and Settings – Controlling Pandas Display Behavior
        • Data Cleaning in Pandas
        • EDA in Pandas
        • Filtering and Ordering
        • Group by and Aggregating
        • Indexing
        • Merge, Join, and Concatenate
        • Pandas Series and DataFrames
        • Reading In Files
        • Pandas Visualization
        • Panels – Three-Dimensional Data Structures (Deprecated)
        • India Electronic Production – Interactive Visualization with Plotly
        • Reading CSV and Working with Series
        • Creating Series in Python
        • Tamil Nadu Population Literacy Analysis
        • Visualization – Plotting Stock Data with Pandas and Matplotlib
        • Working with Date and Time
        • Working with Text Data
        • Geographic Visualization – Flight Path Maps with Plotly
      • 03 Exercises
        • 100 pandas puzzles
        • 100 pandas puzzles
      • 05 Real World Projects
        • Apple Health Data – Real-World Analysis Project
        • Electronic Production in India – Data Analysis and Visualization
        • India Production Data – Advanced Plotly Visualizations
    • Data Science Examples
      • 📖 Study Schedule Examples
      • 💡 Success Tips
      • 🎓 Skill Progression
      • 📊 Track Your Progress
      • 🔗 Essential Resources
      • 🎯 Your Next Steps
      • 💪 Motivation
      • 🚀 Ready to Begin?
      • Data Science for Beginners - A Curriculum
      • Are you a student?
      • Getting Started
        • Microsoft Open Source Code of Conduct
        • Contributing
        • Security
        • Reporting Security Issues
        • Preferred Languages
        • Policy
        • Support
        • Contribute by translating lessons
        • For Educators
        • Using the repo as is
        • Included in this curriculum:
        • Please give us your thoughts!
        • Introduction to Data Science
          • Defining Data Science
            • Assignment: Data Science Scenarios
            • Challenge: Analyzing Text about Data Science
            • Solution
              • Assignment: Data Science Scenarios
              • Challenge: Analyzing Text about Data Science
          • Introduction to Data Ethics
          • Assignment
            • Write A Data Ethics Case Study
            • Instructions
            • Rubric
          • Defining Data
            • Classifying Datasets
          • A Brief Introduction to Statistics and Probability
            • Introduction to Probability and Statistics
            • Assignment
            • Introduction to Probability and Statistics
            • Solution
              • Introduction to Probability and Statistics
              • Assignment
        • Working with Data
          • Working with Data: Relational Databases
            • Displaying airport data
          • Working with Data: Non-Relational Data
            • Soda Profits
          • Working with Data: Python and the Pandas Library
            • Assignment for Data Processing in Python
            • Estimation of COVID-19 Pandemic
            • Analyzing COVID-19 Papers
            • Basic Pandas Examples
            • DataFrame
            • Printing and Plotting
            • R
              • Pandas Usecase in R
              • Series
              • DataFrame
              • Printing and Plotting
          • Working with Data: Data Preparation
            • Assignment: Evaluating Data from a Form
            • Data Preparation
        • Visualizations
          • Visualizing Quantities
            • Lines, Scatters and Bars
            • Let’s learn about birds
            • Solution
              • Let’s learn about birds
          • Visualizing Distributions
            • Apply your skills
            • Bird distributions
            • Solution
              • Bird distributions
          • Visualizing Proportions
            • Try it in Excel
            • 🍄 Mushroom Proportions
            • Solution
              • 🍄 Mushroom Proportions
              • Pie chart
              • Donut chart
              • Waffle chart
          • Visualizing Relationships: All About Honey 🍯
            • Dive into the beehive
            • Visualizing Honey Production 🍯 🐝
            • Solution
              • Visualizing Honey Production 🍯 🐝
          • Making Meaningful Visualizations
            • Build your own custom vis
            • Dangerous Liaisons data visualization project
            • Dangerous Liaisons data visualization project
          • R
            • Visualizing Quantities
              • Lines, Scatters and Bars
            • Visualizing Distributions
              • Apply your skills
            • Visualizing Proportions
            • Visualizing Relationships: All About Honey 🍯
            • Making Meaningful Visualizations
        • The Data Science Lifecycle
          • Introduction to the Data Science Lifecycle
            • Assessing a Dataset
            • NYC Taxi data in Winter and Summer
          • The Data Science Lifecycle: Analyzing
            • NYC Taxi data in Winter and Summer
            • Use the cells below to do your own Exploratory Data Analysis
            • Analyzing Data
          • The Data Science Lifecycle: Communication
          • Introduction
          • Effective Communication
          • Communication Case Study
          • Conclusion
            • Tell a story
        • Data Science in the Cloud
          • Introduction to Data Science in the Cloud
            • Market Research
          • Data Science in the Cloud: The “Low code/No code” way
            • Low code/No code Data Science project on Azure ML
          • Data Science in the Cloud: The “Azure ML SDK” way
            • Data Science project using Azure ML SDK
            • Data Science in the Cloud: The “Azure ML SDK” way
            • Solution
              • Data Science in the Cloud: The “Azure ML SDK” way
        • Data Science in the Wild
          • Data Science in the Real World
            • Explore a Planetary Computer Dataset
        • Docs
        • Quizzes
        • Credits
      • Core Data Science Topics
        • Matplotlib Reference
          • Visualization with Matplotlib
          • Simple Line Plots
          • Simple Scatter Plots
          • Visualizing Errors
          • Density and Contour Plots
          • Histograms, Binnings, and Density
          • Customizing Plot Legends
          • Customizing Colorbars
          • Multiple Subplots
          • Text and Annotation
          • Customizing Ticks
          • Customizing Matplotlib: Configurations and Stylesheets
          • Three-Dimensional Plotting in Matplotlib
          • Geographic Data with Basemap
          • Visualization with Seaborn
          • Further Resources
          • matplotlib-applied
          • matplotlib
        • Numpy Reference
          • Introduction to NumPy
          • Understanding Data Types in Python
          • The Basics of NumPy Arrays
          • Computation on NumPy Arrays: Universal Functions
          • Aggregations: Min, Max, and Everything In Between
          • Computation on Arrays: Broadcasting
          • Comparisons, Masks, and Boolean Logic
          • Fancy Indexing
          • Sorting Arrays
          • Structured Data: NumPy’s Structured Arrays
          • NumPy
        • Pandas Reference
          • Data Manipulation with Pandas
          • Introducing Pandas Objects
          • Data Indexing and Selection
          • Operating on Data in Pandas
          • Handling Missing Data
          • Hierarchical Indexing
          • Combining Datasets: Concat and Append
          • Combining Datasets: Merge and Join
          • Aggregation and Grouping
          • Pivot Tables
          • Vectorized String Operations
          • Working with Time Series
          • High-Performance Pandas: eval() and query()
          • Further Resources
          • Pandas
      • Machine Learning Examples
        • Kaggle Notebooks
          • Kaggle Machine Learning Competition: Predicting Titanic Survivors
        • Scikit Learn Reference
          • Density Estimation: Gaussian Mixture Models
          • Introduction to scikit-learn
          • Machine Learning Models Cheat Sheet
          • The Estimator API
          • The Iris Dataset
          • K-Nearest Neighbors Classifier
          • K-Means Clustering with scikit-learn
          • K-Means on the Iris Dataset
          • The K-Means Algorithm: Expectation Maximization
          • Linear Regression with scikit-learn
          • Linear Regression
          • Principal Component Analysis (PCA) with scikit-learn
          • PCA on the Iris Dataset
          • Dimensionality Reduction: Principal Component Analysis in-depth
          • Random Forests with scikit-learn
          • Decision Trees: The Building Block
          • Creating a Decision Tree
          • Decision Trees and over-fitting
          • Ensembles of Estimators: Random Forests
          • Random Forest Limitations
          • Support Vector Machines (SVM) with scikit-learn
          • Linear SVM Classifier
          • Support Vector Machine with Kernels Classifier
          • Validation and Model Selection
          • Fig Code
            • scikit-learn
      • Deep Learning Examples
        • Tensorflow Keras
          • Deep Dream
            • Deep Dreams (with Caffe)
          • Keras Tutorial
            • Outline (Draft)
            • Yam Peleg, Valerio Maggio
            • Goal of this Tutorial
            • (Tentative) Schedule
            • Requirements
            • How to set up your environment
            • Recreate the Conda Environment
            • Test if everything is up and running
            • Consulting Material
            • Introduction to Deep Learning
            • Artificial Neural Networks (ANN)
            • Building Neural Nets from scratch
            • Addendum
            • Theano
            • Symbolic variables
            • Evaluating expressions
            • Other tensor types
            • Automatic differentiation
            • Shared Variables
            • Updates
            • About the data
            • Keras
            • “Data Sciencing” this example a little bit more
            • A simple implementation of ANN for MNIST
            • Convolutional Neural Network
            • The Problem Space
            • Convolutional Layer
            • Going Deeper Through the Network
            • CNN in Keras
            • ConvNet HandsOn with Keras
            • Basic data analysis on the dataset
            • Convolution Nets for MNIST
            • A simple CNN
            • Adding more Dense Layers
            • Adding Dropout
            • Adding more Convolution Layers
            • Exercise
            • Batch Normalisation
            • Practical Deep Learning
            • VGG16
            • Fine Tuning of a Pre-Trained Model
            • Hands On:
            • Unsupervised learning
            • Natural Language Processing using Artificial Neural Networks
            • Word Embeddings
            • Convolutional Neural Networks for Sentence Classification
            • Another Example
            • Recurrent Neural networks
            • Sentence Generation using RNN(LSTM)
            • RNN using LSTM
            • Using TFIDF Vectorizer as an input instead of one hot encoder
            • Sentence Generation using LSTM
            • Conclusions
            • Trained image classification models for Keras
          • Tensor Flow Examples
            • Download and Setup
            • Notebooks
              • 1 Intro
                • Basic Operations in TensorFlow
              • 2 Basic Classifiers
                • Linear Regression in TensorFlow
                • Logistic Regression in TensorFlow
                • Nearest Neighbor in TensorFlow
              • 3 Neural Networks
                • AlexNet in TensorFlow
                • Convolutional Neural Network (CNN) in TensorFlow
                • Multilayer Perceptron (MLP) in TensorFlow
                • Recurrent Neural Network (LSTM) in TensorFlow
              • 4 Multi Gpu
                • Basic Multi-GPU Computation in TensorFlow
              • 5 Ui
                • Graph Visualization with TensorBoard
                • Run the command line
                • Open http://localhost:6006/ into your web browser
                • Loss Visualization with TensorBoard
                • Run the command line
                • Open http://localhost:6006/ into your web browser
          • Exercises
            • Deep Learning with TensorFlow
            • Deep Learning with TensorFlow
            • Deep Learning with TensorFlow
            • Deep Learning with TensorFlow
            • Deep Learning with TensorFlow
            • Deep Learning with TensorFlow
          • Theano Tutorial
            • Intro Theano
              • Introduction to Theano
              • Graph definition and Syntax
              • Graph Transformations
              • Advanced Topics
              • Logistic Regression in Theano
            • Rnn Tutorial
              • Introduction
              • Recurrent Neural Networks in Theano
              • Generating sequences
            • Scan Tutorial
              • Introduction to Scan in Theano
            • Theano Mlp
              • Multilayer Perceptron in Theano
      • Reference Notebooks
        • Aws
          • Amazon Web Services (AWS)
        • Spark
          • HDFS
          • Spark
    • Matplotlib
      • Matplotlib interactive examples
      • Plot Types Jupyter
        • 3D
          • 3D Bar Charts with bar3d(x, y, z, dx, dy, dz)
          • 3D Fill Between with fill_between(x1, y1, z1, x2, y2, z2)
          • 3D Line Plots with plot(xs, ys, zs)
          • 3D Quiver (Vector Field) Plots with quiver(X, Y, Z, U, V, W)
          • 3D Scatter Plots with scatter(xs, ys, zs)
          • 3D Stem Plots with stem(x, y, z)
          • 3D Surface Plots with plot_surface(X, Y, Z)
          • Triangulated Surface Plots with plot_trisurf(x, y, z)
          • Voxel Plots with voxels([x, y, z], filled)
          • 3D Wireframe Plots with plot_wireframe(X, Y, Z)
        • Arrays
          • Wind Barb Plots with barbs(X, Y, U, V)
          • Contour Line Plots with contour(X, Y, Z)
          • Filled Contour Plots with contourf(X, Y, Z)
          • Image Display with imshow(Z)
          • Pseudocolor Mesh Plots with pcolormesh(X, Y, Z)
          • 2D Vector Field Plots with quiver(X, Y, U, V)
          • Streamline Plots with streamplot(X, Y, U, V)
        • Basic
          • Bar Charts with bar(x, height)
          • Fill Between Curves with fill_between(x, y1, y2)
          • Line and Marker Plots with plot(x, y)
          • Scatter Plots with scatter(x, y)
          • Stacked Area Charts with stackplot(x, y)
          • Step Function Plots with stairs(values)
          • Stem Plots with stem(x, y)
        • Stats
          • Box and Whisker Plots with boxplot(X)
          • Empirical Cumulative Distribution Functions with ecdf(x)
          • Error Bar Plots with errorbar(x, y, yerr, xerr)
          • Event Plots with eventplot(D)
          • Hexagonal Binning with hexbin(x, y, C)
          • 2D Histograms with hist2d(x, y)
          • Histograms with hist(x)
          • Pie Charts with pie(x)
          • Violin Plots with violinplot(D)
        • Unstructured
          • Contour Lines on Unstructured Grids with tricontour(x, y, z)
          • Filled Contours on Unstructured Grids with tricontourf(x, y, z)
          • Pseudocolor Plots on Unstructured Grids with tripcolor(x, y, z)
          • Triangulation Visualization with triplot(x, y)
    • Scikit-Learn Examples
      • Applications
        • Plot Cyclical Feature Engineering
        • Time-related feature engineering
        • Plot Digits Denoising
        • Image denoising using kernel PCA
        • Plot Face Recognition
        • Faces recognition example using eigenfaces and SVMs
        • Plot Model Complexity Influence
        • Model Complexity Influence
        • Plot Out Of Core Classification
        • Out-of-core classification of text documents
        • Plot Outlier Detection Wine
        • Outlier detection on a real data set
        • Plot Prediction Latency
        • Prediction Latency
        • Plot Species Distribution Modeling
        • Species distribution modeling
        • Plot Stock Market
        • Visualizing the stock market structure
        • Plot Time Series Lagged Features
        • Lagged features for time series forecasting
        • Plot Tomography L1 Reconstruction
        • Compressive sensing: tomography reconstruction with L1 prior (Lasso)
        • Plot Topics Extraction With Nmf Lda
        • Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation
        • Wikipedia Principal Eigenvector
        • Wikipedia principal eigenvector
      • Bicluster
        • Plot Bicluster Newsgroups
        • Biclustering documents with the Spectral Co-clustering algorithm
        • Plot Spectral Biclustering
        • A demo of the Spectral Biclustering algorithm
        • Plot Spectral Coclustering
        • A demo of the Spectral Co-Clustering algorithm
      • Calibration
        • Plot Calibration
        • Probability calibration of classifiers
        • Plot Calibration Curve
        • Probability Calibration curves
        • Plot Calibration Multiclass
        • Probability Calibration for 3-class classification
        • Plot Compare Calibration
        • Comparison of Calibration of Classifiers
      • Classification
        • Plot Classification Probability
        • Plot classification probability
        • Plot Classifier Comparison
        • Classifier comparison
        • Plot Digits Classification
        • Recognizing hand-written digits
        • Plot Lda
        • Normal, Ledoit-Wolf and OAS Linear Discriminant Analysis for classification
        • Plot Lda Qda
        • Linear and Quadratic Discriminant Analysis with covariance ellipsoid
      • Cluster
        • Plot Adjusted For Chance Measures
        • Adjustment for chance in clustering performance evaluation
        • Plot Affinity Propagation
        • Demo of affinity propagation clustering algorithm
        • Plot Agglomerative Clustering Metrics
        • Agglomerative clustering with different metrics
        • Imports for Plotting Hierarchical Clustering Dendrograms
        • Plot Birch Vs Minibatchkmeans
        • Compare BIRCH and MiniBatchKMeans
        • Plot Bisect Kmeans
        • Bisecting K-Means and Regular K-Means Performance Comparison
        • Plot Cluster Comparison
        • Comparing different clustering algorithms on toy datasets
        • Plot Coin Segmentation
        • Segmenting the picture of Greek coins in regions
        • Plot Coin Ward Segmentation
        • A demo of structured Ward hierarchical clustering on an image of coins
        • Plot Dbscan
        • Demo of DBSCAN clustering algorithm
        • Plot Dict Face Patches
        • Online learning of a dictionary of parts of faces
        • Plot Digits Agglomeration
        • Feature agglomeration
        • Plot Digits Linkage
        • Various Agglomerative Clustering on a 2D embedding of digits
        • Plot Face Compress
        • Vector Quantization Example
        • Plot Feature Agglomeration Vs Univariate Selection
        • Feature agglomeration vs. univariate selection
        • Imports for HDBSCAN: Hierarchical Density-Based Clustering
        • Plot Inductive Clustering
        • Inductive Clustering
        • Plot Kmeans Assumptions
        • Demonstration of k-means assumptions
        • Plot Kmeans Digits
        • A demo of K-Means clustering on the handwritten digits data
        • Plot Kmeans Plusplus
        • An example of K-Means++ initialization
        • Plot Kmeans Silhouette Analysis
        • Selecting the number of clusters with silhouette analysis on KMeans clustering
        • Plot Kmeans Stability Low Dim Dense
        • Empirical evaluation of the impact of k-means initialization
        • Plot Linkage Comparison
        • Comparing different hierarchical linkage methods on toy datasets
        • Plot Mean Shift
        • A demo of the mean-shift clustering algorithm
        • Plot Mini Batch Kmeans
        • Comparison of the K-Means and MiniBatchKMeans clustering algorithms
        • Plot Optics
        • Demo of OPTICS clustering algorithm
        • Plot Segmentation Toy
        • Spectral clustering for image segmentation
        • Plot Ward Structured Vs Unstructured
        • Hierarchical clustering with and without structure
      • Compose
        • Plot Column Transformer
        • Column Transformer with Heterogeneous Data Sources
        • Plot Column Transformer Mixed Types
        • Column Transformer with Mixed Types
        • Plot Compare Reduction
        • Selecting dimensionality reduction with Pipeline and GridSearchCV
        • Plot Digits Pipe
        • Pipelining: chaining a PCA and a logistic regression
        • Plot Feature Union
        • Concatenating multiple feature extraction methods
        • Plot Transformed Target
        • Effect of transforming the targets in regression model
      • Covariance
        • Plot Covariance Estimation
        • Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood
        • Plot Lw Vs Oas
        • Ledoit-Wolf vs OAS estimation
        • Imports for Robust Mahalanobis Distance Estimation
        • Imports for Robust vs Empirical Covariance Under Contamination
        • Plot Sparse Cov
        • Sparse inverse covariance estimation
      • Cross Decomposition
        • Plot Compare Cross Decomposition
        • Compare cross decomposition methods
        • Plot Pcr Vs Pls
        • Principal Component Regression vs Partial Least Squares Regression
      • Datasets
        • Plot Random Multilabel Dataset
        • Plot randomly generated multilabel dataset
      • Decomposition
        • Plot Faces Decomposition
        • Faces dataset decompositions
        • Plot Ica Blind Source Separation
        • Blind source separation using FastICA
        • Plot Ica Vs Pca
        • FastICA on 2D point clouds
        • Plot Image Denoising
        • Image denoising using dictionary learning
        • Plot Incremental Pca
        • Incremental PCA
        • Plot Kernel Pca
        • Kernel PCA
        • Plot Pca Iris
        • Principal Component Analysis (PCA) on Iris Dataset
        • Plot Pca Vs Fa Model Selection
        • Model selection with Probabilistic PCA and Factor Analysis (FA)
        • Plot Pca Vs Lda
        • Comparison of LDA and PCA 2D projection of Iris dataset
        • Plot Sparse Coding
        • Sparse coding with a precomputed dictionary
        • Plot Varimax Fa
        • Factor Analysis (with rotation) to visualize patterns
      • Developing Estimators
        • Sklearn Is Fitted
        • __sklearn_is_fitted__ as Developer API
      • Ensemble
        • Plot Adaboost Multiclass
        • Multi-class AdaBoosted Decision Trees
        • Plot Adaboost Regression
        • Decision Tree Regression with AdaBoost
        • Plot Adaboost Twoclass
        • Two-class AdaBoost
        • Plot Bias Variance
        • Single estimator versus bagging: bias-variance decomposition
        • Plot Ensemble Oob
        • OOB Errors for Random Forests
        • Plot Feature Transformation
        • Feature transformations with ensembles of trees
        • Plot Forest Hist Grad Boosting Comparison
        • Comparing Random Forests and Histogram Gradient Boosting models
        • Plot Forest Importances
        • Feature importances with a forest of trees
        • Plot Forest Iris
        • Plot the decision surfaces of ensembles of trees on the iris dataset
        • Plot Gradient Boosting Categorical
        • Categorical Feature Support in Gradient Boosting
        • Plot Gradient Boosting Early Stopping
        • Early stopping in Gradient Boosting
        • Plot Gradient Boosting Oob
        • Gradient Boosting Out-of-Bag estimates
        • Plot Gradient Boosting Quantile
        • Prediction Intervals for Gradient Boosting Regression
        • Plot Gradient Boosting Regression
        • Gradient Boosting regression
        • Plot Gradient Boosting Regularization
        • Gradient Boosting regularization
        • Plot Hgbt Regression
        • Features in Histogram Gradient Boosting Trees
        • Plot Isolation Forest
        • IsolationForest example
        • Plot Monotonic Constraints
        • Monotonic Constraints
        • Plot Random Forest Embedding
        • Hashing feature transformation using Totally Random Trees
        • Plot Random Forest Regression Multioutput
        • Comparing random forests and the multi-output meta estimator
        • Plot Stack Predictors
        • Combine predictors using stacking
        • Plot Voting Decision Regions
        • Visualizing the probabilistic predictions of a VotingClassifier
        • Plot Voting Regressor
        • Plot individual and voting regression predictions
      • Feature Selection
        • Plot F Test Vs Mi
        • Comparison of F-test and mutual information
        • Plot Feature Selection
        • Univariate Feature Selection
        • Plot Feature Selection Pipeline
        • Pipeline ANOVA SVM
        • Plot Rfe Digits
        • Recursive feature elimination
        • Plot Rfe With Cross Validation
        • Recursive feature elimination with cross-validation
        • Plot Select From Model Diabetes
        • Model-based and sequential feature selection
      • Frozen
        • Plot Frozen Examples
        • Examples of Using FrozenEstimator
      • Gaussian Process
        • Plot Compare Gpr Krr
        • Comparison of kernel ridge and Gaussian process regression
        • Plot Gpc
        • Probabilistic predictions with Gaussian process classification (GPC)
        • Plot Gpc Iris
        • Gaussian process classification (GPC) on iris dataset
        • Plot Gpc Isoprobability
        • Iso-probability lines for Gaussian Processes classification (GPC)
        • Plot Gpc Xor
        • Illustration of Gaussian process classification (GPC) on the XOR dataset
        • Plot Gpr Co2
        • Forecasting of CO2 level on Mauna Loa dataset using Gaussian process regression (GPR)
        • Plot Gpr Noisy
        • Ability of Gaussian process regression (GPR) to estimate data noise-level
        • Plot Gpr Noisy Targets
        • Gaussian Processes regression: basic introductory example
        • Plot Gpr On Structured Data
        • Gaussian processes on discrete data structures
        • Plot Gpr Prior Posterior
        • Illustration of prior and posterior Gaussian process for different kernels
      • Impute
        • Plot Iterative Imputer Variants Comparison
        • Imputing missing values with variants of IterativeImputer
        • Plot Missing Values
        • Imputing missing values before building an estimator
      • Inspection
        • Plot Causal Interpretation
        • Failure of Machine Learning to infer causal effects
        • Plot Linear Model Coefficient Interpretation
        • Common pitfalls in the interpretation of coefficients of linear models
        • Plot Partial Dependence
        • Partial Dependence and Individual Conditional Expectation Plots
        • Plot Permutation Importance
        • Permutation Importance vs Random Forest Feature Importance (MDI)
        • Plot Permutation Importance Multicollinear
        • Permutation Importance with Multicollinear or Correlated Features
      • Kernel Approximation
        • Plot Scalable Poly Kernels
        • Scalable learning with polynomial kernel approximation
      • Linear Model
        • Plot Ard
        • Comparing Linear Bayesian Regressors
        • Plot Bayesian Ridge Curvefit
        • Curve Fitting with Bayesian Ridge Regression
        • Plot Elastic Net Precomputed Gram Matrix With Weighted Samples
        • Fitting an Elastic Net with a precomputed Gram Matrix and Weighted Samples
        • Plot Huber Vs Ridge
        • HuberRegressor vs Ridge on dataset with strong outliers
        • Plot Lasso And Elasticnet
        • L1-based models for Sparse Signals
        • Plot Lasso Dense Vs Sparse Data
        • Lasso on dense and sparse data
        • Plot Lasso Lars Ic
        • Lasso model selection via information criteria
        • Plot Lasso Lasso Lars Elasticnet Path
        • Lasso, Lasso-LARS, and Elastic Net paths
        • Plot Lasso Model Selection
        • Lasso model selection: AIC-BIC / cross-validation
        • Plot Logistic L1 L2 Sparsity
        • L1 Penalty and Sparsity in Logistic Regression
        • Plot Logistic Multinomial
        • Decision Boundaries of Multinomial and One-vs-Rest Logistic Regression
        • Plot Logistic Path
        • Regularization path of L1-Logistic Regression
        • Plot Multi Task Lasso Support
        • Joint feature selection with multi-task Lasso
        • Plot Nnls
        • Non-negative least squares
        • Plot Ols Ridge
        • Ordinary Least Squares and Ridge Regression
        • Plot Omp
        • Orthogonal Matching Pursuit
        • Poisson Regression and Non-Normal Loss for Insurance Claims Modeling
        • Scoring Helper: Evaluating Regression Models with Multiple Metrics
        • Calibration Check: Mean Frequency by Predicted Risk Group
        • Plot Polynomial Interpolation
        • Polynomial and Spline interpolation
        • Plot Quantile Regression
        • Quantile regression
        • Plot Ransac
        • Robust linear model estimation using RANSAC
        • Plot Ridge Coeffs
        • Ridge coefficients as a function of the L2 Regularization
        • Plot Ridge Path
        • Plot Ridge coefficients as a function of the regularization
        • Plot Robust Fit
        • Robust linear estimator fitting
        • Plot Sgd Early Stopping
        • Early stopping of Stochastic Gradient Descent
        • Plot Sgd Iris
        • Plot multi-class SGD on the iris dataset
        • Plot Sgd Loss Functions
        • SGD: convex loss functions
        • Plot Sgd Penalties
        • SGD: Penalties
        • Plot Sgd Separating Hyperplane
        • SGD: Maximum margin separating hyperplane
        • Plot Sgd Weighted Samples
        • SGD: Weighted samples
        • Plot Sgdocsvm Vs Ocsvm
        • One-Class SVM versus One-Class SVM using Stochastic Gradient Descent
        • Plot Sparse Logistic Regression 20Newsgroups
        • Multiclass sparse logistic regression on 20newsgroups
        • Plot Sparse Logistic Regression Mnist
        • MNIST classification using multinomial logistic + L1
        • Plot Theilsen
        • Theil-Sen Regression
        • Tweedie Regression for Insurance Pure Premium Modeling
        • Data Loading Helper for the French Motor Third-Party Liability Dataset
        • Visualization Helper: Observed vs. Predicted by Feature Level
        • Multi-Metric Scoring Helper for Insurance Models
      • Manifold
        • Plot Compare Methods
        • Comparison of Manifold Learning methods
        • Plot Lle Digits
        • Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…
        • Plot Manifold Sphere
        • Manifold Learning methods on a severed sphere
        • Plot Mds
        • Multi-dimensional scaling
        • Plot Swissroll
        • Swiss Roll And Swiss-Hole Reduction
        • Plot T Sne Perplexity
        • t-SNE: The effect of various perplexity values on the shape
      • Miscellaneous
        • Plot Anomaly Comparison
        • Comparing anomaly detection algorithms for outlier detection on toy datasets
        • Plot Display Object Visualization
        • Visualizations with Display Objects
        • Plot Estimator Representation
        • Displaying estimators and complex pipelines
        • Plot Isotonic Regression
        • Isotonic Regression
        • Imports for Johnson-Lindenstrauss Random Projection Bounds
        • Plot Kernel Approximation
        • Explicit feature map approximation for RBF kernels
        • Plot Kernel Ridge Regression
        • Comparison of kernel ridge regression and SVR
        • Plot Metadata Routing
        • Metadata Routing
        • Plot Multilabel
        • Multilabel classification
        • Plot Multioutput Face Completion
        • Face completion with multi-output estimators
        • Plot Outlier Detection Bench
        • Evaluation of outlier detection estimators
        • Plot Partial Dependence Visualization Api
        • Advanced Plotting With Partial Dependence
        • Plot Pipeline Display
        • Displaying Pipelines
        • Plot Roc Curve Visualization Api
        • ROC Curve with Visualization API
        • Plot Set Output
        • Introducing the set_output API
      • Mixture
        • Plot Concentration Prior
        • Concentration Prior Type Analysis of Variational Bayesian Gaussian Mixture
        • Plot Gmm
        • Gaussian Mixture Model Ellipsoids
        • Plot Gmm Covariances
        • GMM covariances
        • Plot Gmm Init
        • GMM Initialization Methods
        • Plot Gmm Pdf
        • Density Estimation for a Gaussian mixture
        • Plot Gmm Selection
        • Gaussian Mixture Model Selection
        • Plot Gmm Sin
        • Gaussian Mixture Model Sine Curve
      • Model Selection
        • Plot Confusion Matrix
        • Evaluate the performance of a classifier with Confusion Matrix
        • Plot Cost Sensitive Learning
        • Post-tuning the decision threshold for cost-sensitive learning
        • Plot Cv Indices
        • Visualizing cross-validation behavior in scikit-learn
        • Plot Cv Predict
        • Plotting Cross-Validated Predictions
        • Plot Det
        • Detection error tradeoff (DET) curve
        • Plot Grid Search Digits
        • Custom refit strategy of a grid search with cross-validation
        • Plot Grid Search Refit Callable
        • Balance model complexity and cross-validated score
        • Plot Grid Search Stats
        • Statistical comparison of models using grid search
        • Plot Grid Search Text Feature Extraction
        • Sample pipeline for text feature extraction and evaluation
        • Plot Learning Curve
        • Plotting Learning Curves and Checking Models’ Scalability
        • Plot Likelihood Ratios
        • Class Likelihood Ratios to measure classification performance
        • Plot Multi Metric Evaluation
        • Demonstration of multi-metric evaluation on cross_val_score and GridSearchCV
        • Plot Nested Cross Validation Iris
        • Nested versus non-nested cross-validation
        • Plot Permutation Tests For Classification
        • Test with permutations the significance of a classification score
        • Plot Precision Recall
        • Precision-Recall
        • Plot Randomized Search
        • Comparing randomized search and grid search for hyperparameter estimation
        • Plot Roc
        • Multiclass Receiver Operating Characteristic (ROC)
        • Plot Roc Crossval
        • Receiver Operating Characteristic (ROC) with cross validation
        • Plot Successive Halving Heatmap
        • Comparison between grid search and successive halving
        • Plot Successive Halving Iterations
        • Successive Halving Iterations
        • Plot Train Error Vs Test Error
        • Effect of model regularization on training and test error
        • Plot Tuned Decision Threshold
        • Post-hoc tuning the cut-off point of decision function
        • Plot Underfitting Overfitting
        • Underfitting vs. Overfitting
      • Multiclass
        • Plot Multiclass Overview
        • Overview of multiclass training meta-estimators
      • Multioutput
        • Plot Classifier Chain Yeast
        • Multilabel classification using a classifier chain
      • Neighbors
        • Approximate Nearest Neighbors
        • Approximate nearest neighbors in TSNE
        • Plot Caching Nearest Neighbors
        • Caching nearest neighbors
        • Plot Classification
        • Nearest Neighbors Classification
        • Plot Digits Kde Sampling
        • Kernel Density Estimation
        • Plot Kde 1D
        • Simple 1D Kernel Density Estimation
        • Plot Lof Novelty Detection
        • Novelty detection with Local Outlier Factor (LOF)
        • Plot Lof Outlier Detection
        • Outlier detection with Local Outlier Factor (LOF)
        • Plot Nca Classification
        • Comparing Nearest Neighbors with and without Neighborhood Components Analysis
        • Plot Nca Dim Reduction
        • Dimensionality Reduction with Neighborhood Components Analysis
        • Plot Nca Illustration
        • Neighborhood Components Analysis Illustration
        • Plot Nearest Centroid
        • Nearest Centroid Classification
        • Plot Regression
        • Nearest Neighbors regression
        • Plot Species Kde
        • Kernel Density Estimate of Species Distributions
      • Neural Networks
        • Plot Mlp Alpha
        • Varying regularization in Multi-layer Perceptron
        • Plot Mlp Training Curves
        • Compare Stochastic learning strategies for MLPClassifier
        • Plot Mnist Filters
        • Visualization of MLP weights on MNIST
        • Plot Rbm Logistic Classification
        • Restricted Boltzmann Machine features for digit classification
      • Preprocessing
        • Plot All Scaling
        • Compare the effect of different scalers on data with outliers
        • Plot Discretization
        • Using KBinsDiscretizer to discretize continuous features
        • Plot Discretization Classification
        • Feature discretization
        • Plot Discretization Strategies
        • Demonstrating the different strategies of KBinsDiscretizer
        • Plot Map Data To Normal
        • Map data to a normal distribution
        • Plot Scaling Importance
        • Importance of Feature Scaling
        • Plot Target Encoder
        • Comparing Target Encoder with Other Encoders
        • Plot Target Encoder Cross Val
        • Target Encoder’s Internal Cross fitting
      • Release Highlights
        • Plot Release Highlights 0 22 0
        • Release Highlights for scikit-learn 0.22
        • Imports for scikit-learn 0.23 Release Highlights: GLMs, KMeans Improvements, and Monotonic Constraints
        • Imports for scikit-learn 0.24 Release Highlights: Successive Halving, Self-Training, and ICE Plots
        • Imports for scikit-learn 1.0 Release Highlights: Feature Names, SplineTransformer, and Quantile Regression
        • Imports for scikit-learn 1.1 Release Highlights: Quantile Loss, Feature Names, and BisectingKMeans
        • Imports for scikit-learn 1.2 Release Highlights: set_output API, Interaction Constraints, and New Displays
        • Imports for scikit-learn 1.3 Release Highlights: HDBSCAN, TargetEncoder, and Metadata Routing
        • Imports for scikit-learn 1.4 Release Highlights: Native Categorical DTypes, Polars Support, and Monotonic Constraints for Trees
        • Imports for scikit-learn 1.5 Release Highlights: Decision Threshold Tuning, PCA Speedups, and Custom Imputation
        • Smallest Abs
        • Levenshtein Distance
        • Imports for scikit-learn 1.6 Release Highlights: FrozenEstimator, Pipeline transform_input, and Free-Threaded CPython
        • Imports for scikit-learn 1.7 Release Highlights: Improved HTML Displays, Custom Validation Sets, and ROC from CV Results
        • Imports for scikit-learn 1.8 Release Highlights: Array API GPU Support, Temperature Scaling, and Gap Safe Screening
      • Semi Supervised
        • Plot Label Propagation Digits
        • Label Propagation digits: Demonstrating performance
        • Plot Label Propagation Digits Active Learning
        • Label Propagation digits: Active learning
        • Plot Label Propagation Structure
        • Label Propagation circles: Learning a complex structure
        • Plot Self Training Varying Threshold
        • Effect of varying threshold for self-training
        • Plot Semi Supervised Newsgroups
        • Semi-supervised Classification on a Text Dataset
        • Plot Semi Supervised Versus Svm Iris
        • Decision boundary of semi-supervised classifiers versus SVM on the Iris dataset
      • Svm
        • Plot Custom Kernel
        • SVM with custom kernel
        • Plot Iris Svc
        • Plot different SVM classifiers in the iris dataset
        • Plot Linearsvc Support Vectors
        • Plot the support vectors in LinearSVC
        • Plot Oneclass
        • One-class SVM with non-linear kernel (RBF)
        • Plot Rbf Parameters
        • RBF SVM parameters
        • Plot Separating Hyperplane
        • SVM: Maximum margin separating hyperplane
        • Plot Separating Hyperplane Unbalanced
        • SVM: Separating hyperplane for unbalanced classes
        • Plot Svm Anova
        • SVM-Anova: SVM with univariate feature selection
        • Plot Svm Kernels
        • Plot classification boundaries with different SVM Kernels
        • Plot Svm Margin
        • SVM Margins Example
        • Plot Svm Regression
        • Support Vector Regression (SVR) using linear and non-linear kernels
        • Scaling the Regularization Parameter C for SVMs
        • Plot Svm Tie Breaking
        • SVM Tie Breaking Example
        • Plot Weighted Samples
        • SVM: Weighted samples
      • Text
        • Plot Document Classification 20Newsgroups
        • Classification of text documents using sparse features
        • Plot Document Clustering
        • Clustering text documents using k-means
        • Plot Hashing Vs Dict Vectorizer
        • FeatureHasher and DictVectorizer Comparison
      • Tree
        • Plot Cost Complexity Pruning
        • Post pruning decision trees with cost complexity pruning
        • Plot Iris Dtc
        • Plot the decision surface of decision trees trained on the iris dataset
        • Plot Tree Regression
        • Decision Tree Regression
        • Plot Unveil Tree Structure
        • Understanding the decision tree structure
  • Phase 3: Mathematics for ML
    • Phase 3: Mathematics for ML Catalog
    • 3Blue1Brown Visual Mathematics
      • Calculus (3Blue1Brown)
        • Chapter 1: The Essence of Calculus
        • Chapter 2: The Paradox of the Derivative
        • Chapter 3: Derivative Formulas through Geometry
        • Chapter 4: Visualizing the Chain Rule and Product Rule
        • Chapter 5: Derivatives of Exponential
        • Chapter 6: Implicit Differentiation
        • Chapter 7: Limits and L’Hopital’s Rule
        • Chapter 8: Integration and the Fundamental Theorem
        • Chapter 9: What Does Area Have to Do with Slope?
        • Chapter 10: Higher Order Derivatives
        • Chapter 11: Taylor Series
        • Chapter 12: What Makes \(e^x\) So Special?
      • Chapter 1: Differential Equations - Introduction
      • Chapter 2: The Heat Equation
      • Chapter 3: Solving the Heat Equation
      • Chapter 4: Fourier Series and Complex Exponential
      • Chapter 5: Laplace Transforms
      • Chapter 6: Understanding the Laplace Transform
      • Chapter 7: Resonance and Forced Oscillations
      • Chapter 8: Matrix Exponents
      • Chapter 1: Abstract Vector Spaces
      • Chapter 2: Linear combinations, span, and basis vectors
      • Chapter 3: Linear transformations and matrices
      • Chapter 4: Matrix multiplication as composition
      • Chapter 5: Three-dimensional linear transformations
      • Chapter 6: The determinant
      • Chapter 7: Inverse matrices, column space and null space
      • Chapter 8: Nonsquare matrices as transformations between dimensions
      • Chapter 9: Dot products and duality
      • Chapter 10: Three-Dimensional Linear Transformations
      • Chapter 11: Eigenvalues and Eigenvectors
      • Chapter 12: Cramer’s rule, explained geometrically
      • Chapter 13: Change of basis
      • Chapter 14: Eigenvalues and Eigenvectors - Deep Dive
      • Chapter 15: Quick Eigenvalue Trick
      • Chapter 16: Abstract Vector Spaces
      • Chapter 1: But What is a Neural Network?
      • Chapter 2: Gradient Descent
      • Chapter 3: What is Backpropagation?
      • Chapter 4: Backpropagation Calculus
      • Chapter 5: GPT and Large Language Models
      • Chapter 6: Attention in Transformers
      • Chapter 7: Attention Mechanism Deep Dive
      • Chapter 8: How GPT Stores Facts
      • Chapter 9: Diffusion Models
    • Advanced Mathematics for Machine Learning
      • 1. The Learning Problem
      • 2. Generalization: The Core Challenge
      • Advanced Statistical Learning Theory
      • 3. The Bias-Variance Tradeoff
      • 4. Sample Complexity
      • 5. PAC Learning Framework
      • 6. Summary
      • 7. Exercises
      • References
      • 1. The Sampling Problem
      • 2. Markov Chain Basics
      • 3. Metropolis-Hastings Algorithm
      • 4. Gibbs Sampling
      • 5. Convergence and Diagnostics
      • 6. Summary
      • 1. Rademacher Complexity Definition
      • 2. Generalization Bound via Rademacher Complexity
      • 3. Rademacher Complexity for Different Hypothesis Classes
      • 4. Connection to Generalization
      • 5. Comparison: Rademacher Complexity vs VC Dimension
      • 6. Advanced: Gaussian Complexity
      • Summary
      • 1. PAC-Bayes Framework
      • 2. Proof Sketch
      • 3. Application: Gaussian Posterior over Weights
      • 4. Training with PAC-Bayes Bound
      • 5. Advantages of PAC-Bayes
      • Summary
      • 1. Motivation: Neural Networks as Kernel Methods
      • 2. Linearized Training Dynamics
      • 3. Computing NTK for Simple Networks
      • 4. Empirical NTK via Jacobian
      • 5. Training Dynamics: Kernel Gradient Descent
      • 6. NTK Evolution During Training
      • 7. Theoretical Implications
      • Summary
      • 1. Motivation: The Intractability Problem
      • 2. Evidence Lower Bound (ELBO)
      • 3. Mean-Field Approximation
      • 4. Example: Bayesian Gaussian Mixture (Simple Case)
      • 5. Black Box Variational Inference (BBVI)
      • Summary
      • 1. Why Concentration Inequalities?
      • 2. Markov and Chebyshev Inequalities
      • 3. Hoeffding’s Inequality
      • 4. Bernstein’s Inequality
      • 5. Application to Learning Theory
      • Summary
      • 1. Motivation: Infinite Flexibility
      • 2. Dirichlet Distribution Review
      • 3. Dirichlet Process (DP)
      • 4. Chinese Restaurant Process (CRP)
      • 5. DP Mixture Model for Clustering
      • Summary
      • 1. The EM Framework
      • 2. Mathematical Derivation
      • 3. Convergence Proof
      • 4. Gaussian Mixture Model (GMM)
      • 5. EM for GMM
      • 6. Visualizing the E and M Steps
      • 7. Summary
      • 1. Function Properties
      • 2. Gradient Descent Algorithm
      • 3. Convergence for L-Smooth Functions
      • 4. Strong Convexity: Linear Convergence
      • 5. Summary of Convergence Rates
      • Summary
      • 1. State Space Model
      • 2. Kalman Filter Algorithm
      • 3. Kalman Smoother (RTS)
      • Summary
      • 1. Motivation
      • 2. Sklar’s Theorem
      • 3. Common Copula Families
      • 4. Application: Constructing Joint Distributions
      • Summary
      • 1. Motivation: Diversity
      • 2. DPP Definition
      • 3. L-ensemble DPP
      • 4. Sampling Algorithm
      • 5. Application: Diverse Subset Selection
      • Summary
      • 1. The Johnson-Lindenstrauss Lemma
      • 2. Random Projection
      • 3. Implementation
      • 4. Sparse Random Projections
      • 5. Application: Nearest Neighbors
      • Summary
      • 1. Primal Problem
      • 2. Dual Problem
      • 3. KKT Conditions
      • 4. Application: SVM Dual
      • Summary
      • 1. Problem: Solve \(Ax = b\)
      • 2. Conjugate Directions
      • 3. Convergence Analysis
      • 4. Preconditioned CG
      • Summary
      • 1. Matrix Bernstein Inequality
      • 2. Matrix Chernoff Bound
      • 3. Random Matrix Theory
      • 4. Marchenko-Pastur Law
      • 5. Application: Compressed Sensing
      • Summary
    • Stanford CS229 Machine Learning
      • CS229: Machine Learning Course (Stanford University)
        • Lecture 1: Introduction & Linear Regression
        • Lecture 2: Linear Regression and Gradient Descent
        • Lecture 3: Locally Weighted Regression (LWR)
        • Lecture 3 & 4: Logistic Regression & Classification
        • Lecture 5 & 6: Generative Learning Algorithms
        • Lecture 6-7: Support Vector Machines
        • Lecture 8: Bias, Variance, and Regularization
        • Lecture 9: Learning Theory
        • Lecture 10: Decision Trees and Ensemble Methods
        • Lecture 11: Introduction to Neural Networks
        • Lecture 12: Backpropagation & Deep Learning
        • Lecture 13: Advice for Applying Machine Learning
        • Lecture 14: Expectation Maximization & Clustering
        • Lectures 15-17: EM, Factor Analysis, PCA & ICA
        • Lectures 18-20: Reinforcement Learning & MDPs
        • Lecture 1: Linear Regression (10 Problems)
        • Lecture 2: Logistic Regression (10 Problems)
        • Lecture 3: Regularization (10 Problems)
        • Lecture 4: Generative Models (10 Problems)
        • Lecture 5: SVMs (10 Problems)
        • Lecture 6: Neural Networks - Basics (15 Problems)
        • Comprehensive Projects (10 Projects)
        • Solutions and Hints
        • Progress Tracker
        • Anomaly Detection
        • Recommender Systems
    • Foundational Mathematics
      • Setup: Install Required Packages
      • 1. NumPy: Numerical Python
      • 2. Matplotlib: Visualization Basics
      • 3. Seaborn: Beautiful Statistical Plots
      • 4. SciPy: Scientific Computing
      • 5. scikit-learn: Machine Learning Library
      • Quick Reference Cheat Sheet
      • Practice Exercise
      • Bonus: Next Steps Examples
      • 🎓 Your Learning Journey
      • Linear Algebra Fundamentals
      • Calculus & Derivatives
      • Probability & Statistics
      • Gradient Descent
      • Information Theory
      • Statistical Inference
      • Neural Network Mathematics
      • Advanced Linear Algebra
      • The Architecture of Mathematics: Analytical vs Numerical Approaches
      • AI Foundations: Symbolic vs Non-Symbolic AI & Control Theory
      • Markov Models & Hidden Markov Models (HMMs)
      • Optimization from Scratch: Gradient Descent & Adam
    • Introduction to Statistical Learning with Python (ISLP)
      • Statistics & Probability for AI/ML
      • Chapter 1: Introduction to Statistical Learning
      • Chapter 2: Statistical Learning
      • Chapter 3: Linear Regression
      • Chapter 4: Classification
      • Chapter 5: Resampling Methods
      • Chapter 6: Linear Model Selection and Regularization
      • Chapter 7: Moving Beyond Linearity
      • Chapter 8: Tree-Based Methods
      • Chapter 9: Support Vector Machines
      • Chapter 10: Deep Learning
      • Chapter 11: Survival Analysis and Censored Data
      • Chapter 12: Unsupervised Learning
      • Chapter 13: Multiple Testing
      • Chapter 1: Introduction (Additional Problems)
      • Chapter 2: Statistical Learning (Additional Problems)
      • Chapter 3: Linear Regression (Additional Problems)
      • Chapter 4: Classification (Additional Problems)
      • Chapter 5: Resampling Methods (Additional Problems)
      • Chapter 6: Regularization (Additional Problems)
      • Chapter 7: Non-Linearity (Additional Problems)
      • Chapter 8: Tree Methods (Additional Problems)
      • Chapter 9: SVM (Additional Problems)
      • Chapter 10: Deep Learning (Additional Problems)
      • Chapter 11: Survival Analysis (Additional Problems)
      • Chapter 12: Unsupervised Learning (Additional Problems)
      • Chapter 13: Multiple Testing (Additional Problems)
      • Project Ideas
      • Challenge Problems
      • Solutions and Hints
      • Progress Tracker
      • Certificate of Completion
      • Part 1: Probability Fundamentals
      • Part 2: Descriptive Statistics
      • Part 3: Probability Distributions
      • Part 4: Statistical Inference
      • Part 5: Hypothesis Testing
      • Part 6: Correlation & Causation
      • Part 7: ML Applications
      • 🎯 Summary & Key Takeaways
      • 📚 Additional Resources
    • Machine Learning: A Probabilistic Perspective
      • 1. Basic Probability Rules
      • 2. Bayes’ Rule
      • 3. Common Probability Distributions
      • 4. Monte Carlo Sampling
      • 5. Information Theory Basics
      • Summary
      • Exercises
      • 1. Naive Bayes Classifier
      • 2. Gaussian Discriminant Analysis (GDA)
      • 3. Generative vs Discriminative Models
      • 4. Spam Detection with Naive Bayes
      • 5. Effect of the Naive Assumption
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Multivariate Gaussian Distribution
      • 2. Maximum Likelihood Estimation
      • 3. Gaussian Mixture Models (GMM)
      • 4. Expectation-Maximization (EM) Algorithm
      • 5. Missing Data Imputation with EM
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Bayesian Inference Fundamentals
      • 2. Conjugate Priors
      • 3. Posterior Predictive Distribution
      • 4. Bayesian Decision Theory
      • 5. Empirical Bayes
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Linear Regression
      • 2. Ridge Regression (L2 Regularization)
      • 3. Bayesian Linear Regression
      • 4. Logistic Regression for Binary Classification
      • 5. Multinomial Logistic Regression (Softmax)
      • 6. Model Selection and Regularization Comparison
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Lasso Regression (L1 Regularization)
      • 2. Regularization Path: Ridge vs Lasso
      • 3. Elastic Net: Combining L1 and L2
      • 4. Feature Selection with Lasso
      • 5. Sparse Logistic Regression
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Kernel Functions and the Kernel Trick
      • 2. Kernel Ridge Regression
      • 3. Support Vector Machines (SVM)
      • 4. Gaussian Processes for Regression
      • 5. GP Hyperparameter Optimization
      • Summary
      • Key Takeaways
      • When to Use
      • Exercises
      • 1. Bayesian Network Basics
      • 2. Conditional Independence and d-Separation
      • 3. Naive Bayes Classifier
      • 4. Markov Chains
      • Summary
      • Key Takeaways
      • Exercises
      • 1. Hidden Markov Model Definition
      • 2. Forward Algorithm (Filtering)
      • 3. Backward Algorithm
      • 4. Viterbi Algorithm (Decoding)
      • 5. Baum-Welch Algorithm (Learning)
      • 6. Application: Part-of-Speech Tagging
      • Summary
      • Key Takeaways
      • Computational Complexity
      • Limitations
      • Extensions
      • Exercises
      • 1. Monte Carlo Basics
      • 2. Metropolis-Hastings Algorithm
      • MCMC Theory: Advanced Mathematical Foundations
      • 3. Gibbs Sampling
      • 4. Convergence Diagnostics
      • 5. Bayesian Linear Regression with MCMC
      • Summary
      • Key Takeaways
      • Comparison
      • Modern MCMC
      • Exercises
      • 1. Gaussian Mixture Models (GMM)
      • 2. K-Means as Hard EM
      • 3. EM Algorithm Theory
      • 4. EM for GMM
      • 5. Model Selection (BIC, AIC)
      • 6. Mixture of Bernoullis
      • Summary
      • Key Takeaways
      • Extensions
      • Exercises
      • 1. Principal Component Analysis (PCA)
      • 2. Probabilistic PCA
      • 3. Factor Analysis (FA)
      • 4. Independent Component Analysis (ICA)
      • Summary
      • Comparison Table
      • Key Insights
      • Practical Tips
      • Exercises
      • 1. K-Means Clustering (Detailed)
      • 2. Hierarchical Clustering
      • 3. Spectral Clustering
      • 4. DBSCAN (Density-Based)
      • 5. Affinity Propagation
      • 6. Cluster Evaluation Metrics
      • Summary
      • Algorithm Comparison
      • Choosing an Algorithm
      • Practical Tips
      • Exercises
    • Mathematics for Machine Learning (MML)
      • Course
        • 2.1 Systems of Linear Equations
        • 2.2 Matrices
        • 2.3 Solving Systems of Linear Equations
        • 2.4 Vector Spaces
        • 2.5 Linear Independence
        • 2.6 Basis and Rank
        • 2.7 Linear Mappings
        • 2.8 Affine Spaces
        • Summary
        • 3.1 Norms
        • 3.2 Inner Products
        • 3.3 Lengths and Distances
        • 3.4 Angles and Orthogonality
        • 3.5 Orthonormal Basis
        • 3.8 Orthogonal Projections
        • 3.9 Rotations
        • Summary
        • 4.1 Determinant and Trace
        • 4.2 Eigenvalues and Eigenvectors
        • 4.3 Cholesky Decomposition
        • 4.4 Eigendecomposition and Diagonalization
        • 4.5 Singular Value Decomposition (SVD)
        • 4.6 Matrix Approximation
        • Summary
        • 5.1 Differentiation of Univariate Functions
        • 5.2 Partial Differentiation and Gradients
        • 5.3 Gradients of Vector-Valued Functions
        • 5.6 Backpropagation and Automatic Differentiation
        • 5.7 Higher-Order Derivatives
        • 5.8 Linearization and Multivariate Taylor Series
        • Summary
        • 6.1 Probability Space
        • 6.2 Discrete and Continuous Probabilities
        • 6.3 Sum Rule, Product Rule, and Bayes’ Theorem
        • 6.4 Summary Statistics and Independence
        • 6.5 Gaussian Distribution
        • Summary
        • 7.1 Gradient Descent
        • Advanced Optimization Theory for Deep Learning
        • 7.2 Constrained Optimization and Lagrange Multipliers
        • 7.3 Convex Optimization
        • Summary
        • 9.1 Problem Formulation
        • 9.2 Maximum Likelihood Estimation (MLE)
        • 9.3 Regularization: Ridge and Lasso
        • 9.4 Bayesian Linear Regression
        • 9.5 Model Selection and Evaluation
        • Summary
        • 10.1 Problem Setting
        • 10.2 Maximum Variance Perspective
        • 10.3 Projection Perspective
        • 10.4 PCA Algorithm
        • 10.5 PCA on Real Data: Handwritten Digits
        • 10.6 PCA for Data Preprocessing
        • Summary
        • 11.1 Gaussian Mixture Model
        • 11.2 Expectation-Maximization (EM) Algorithm
        • 11.3 Soft vs Hard Clustering
        • 11.4 Model Selection: Choosing K
        • 11.5 GMM Applications and Limitations
        • Summary
        • 12.1 Separating Hyperplanes
        • 12.2 Maximum Margin
        • 12.3 Primal Optimization Problem
        • 12.4 Dual Problem and Lagrange Multipliers
        • 12.5 Kernel Trick
        • 12.6 Soft Margin SVM
        • Summary
      • Exercises
        • Chapter 2: Linear Algebra
        • Chapter 3: Analytic Geometry
        • Chapter 4: Matrix Decompositions
        • Chapter 5: Vector Calculus
        • Chapter 6: Probability and Distributions
        • Chapter 7: Continuous Optimization
        • 🎯 Bonus Challenge: Integration Exercise 🔴🔴
        • Chapter 9: Linear Regression
        • Chapter 10: PCA (Principal Component Analysis)
        • Chapter 11: Gaussian Mixture Models
        • Chapter 12: Support Vector Machines
        • 🎯 Bonus Challenge: Complete ML Pipeline 🔴🔴
        • Chapter 2: Linear Algebra Solutions
        • Chapter 3: Analytic Geometry Solutions
        • Chapter 4: Matrix Decompositions Solutions
        • Chapter 9: Linear Regression Solutions
        • Chapter 10: PCA Solutions
    • Resources

Core AI

  • Phase 4: Tokenization
    • Phase 4: Tokenization Catalog
    • Production Tokenization Guide
    • Tokenization Comparison Guide
    • Integration Guide: Using Tokenizers with Popular Frameworks
    • Phase 1: Understanding Tokens
    • HuggingFace Tokenizers - Complete Learning Module: Blazing-fast tokenization with the 🤗 Tokenizers library
    • HuggingFace Tokenizers Library - Complete Learning Guide
    • Understanding Tokens: The Foundation of Language Models
    • Phase 4: Tokenization - Start Here
    • 01 Tokenizers Quickstart
    • HuggingFace Tokenizers - Quick Start Examples
    • 02 Tokenizers Training
    • HuggingFace Tokenizers - Training Examples
    • 03 Advanced Training Methods
    • Advanced Training Methods for Tokenizers
    • Setup
    • Part 1: Normalization
    • Part 2: Pre-tokenization
    • Part 3: Post-processing
    • Part 4: Decoders
    • Part 5: Complete Pipeline Examples
    • Summary
    • SentencePiece Example
    • Tiktoken Example
    • Basic Tokenization Example
    • Token Exercises
    • Token Exercises - Interactive Practice
    • Token Exploration
    • Token Exploration - Advanced Examples
  • Phase 5: Embeddings
    • Phase 5: Embeddings Catalog
    • Quick Start Guide - Phase 5 Embeddings
    • Embedding Models Comparison Guide
    • Phase 5: Embeddings - Start Here
    • Embeddings Intro
    • Embeddings Introduction
    • HuggingFace Embeddings
    • OpenAI Embeddings
    • Paraphrase Mining With Sentence Transformers
    • Semantic Search With Sentence Transformers
    • Semantic Similarity
    • Semantic Similarity Explorer
    • Semantic Textual Similarity (STS)
    • Sentence Transformers Quickstart
    • Sparse Encoders: SPLADE and Learned Sparse Representations
    • Vector Database Demo
  • Phase 6: Neural Networks
    • Phase 6: Neural Networks Catalog
    • Assignment: Build a Neural Network from Scratch
    • Attention Mechanism: The Breakthrough Innovation
    • Challenges: Neural Networks
    • Neural Networks: From Basics to Transformers
    • Transformer Architecture: Complete Guide
    • Verify Installation
    • 🎯 What You’ll Build
    • 🧪 Quick Neural Network Demo
    • 📖 Reading Material
    • 🎓 Prerequisites Review
    • 🚦 Next Steps
    • 📊 Progress Tracker
    • 🎯 Learning Goals
    • 🔗 Helpful Resources
    • 🚀 Let’s Begin!
    • What is a Neuron?
    • 2. Activation Functions
    • 3. Building a Neural Network Layer
    • 4. Building a Complete Neural Network
    • 5. Training a Neural Network
    • 6. Visualizing Decision Boundaries
    • 7. Experimenting with Architecture
    • Summary
    • The Problem
    • 2. The Chain Rule - Foundation of Backpropagation
    • 3. Backpropagation in a Simple Network
    • 4. Training with Backpropagation
    • 5. Multi-Layer Network with Matrix Operations
    • 6. Vanishing and Exploding Gradients
    • Summary
    • Creating Tensors
    • Tensor Operations
    • 2. Automatic Differentiation (Autograd)
    • 3. Building Neural Networks with nn.Module
    • 4. Training a Neural Network – The Complete Loop
    • 5. Modern Optimizers
    • 6. Real Dataset – MNIST Digit Classification
    • 7. Saving and Loading Models
    • Summary
    • Before Attention: The Bottleneck Problem
    • 2. Scaled Dot-Product Attention
    • Attention Mechanism: Mathematical Foundations
    • 3. Self-Attention Example – Understanding Context
    • 4. Multi-Head Attention
    • 5. Masked Attention - For Autoregressive Models
    • 6. Cross-Attention - Connecting Two Sequences
    • 7. Practical Application – Sequence Classification
    • Summary
    • Why Transformers Changed Everything
    • 2. Positional Encoding
    • 3. Feedforward Network
    • 4. Transformer Encoder Layer
    • 5. Complete Transformer Encoder
    • 6. Simple Classification with Transformer
    • 7. Using Pre-trained Transformers
    • 8. Fine-tuning Example
    • 9. Transformer Architecture Diagram
    • Summary
    • 🎓 Congratulations!

Applied AI

  • Phase 7: Vector Databases
    • Phase 7: Vector Databases Catalog
    • 📖 Learning Path
    • 🎯 Prerequisites
    • 🗄️ Database Comparison
    • 💡 Common Use Cases
    • 🔗 Additional Resources
    • 🚦 Ready to Start?
    • 🗺️ Your Complete Learning Journey
    • 1. Understanding Vectors and Embeddings
    • 2. Similarity Metrics
    • 3. Why Vector Databases?
    • 4. Simple Vector Database Implementation
    • 5. Using Our Simple Vector Database
    • 6. Update and Delete Operations
    • 7. Real-World Example with Sentence Embeddings
    • 8. Performance Comparison: Different Metrics
    • Key Takeaways
    • Next Steps
    • Chroma – Local Vector Database
    • Qdrant – Production Vector Database
    • Weaviate – Enterprise Vector Database
    • Milvus – Large-Scale Vector Database
    • 1. Connection Setup
    • 2. Enable pgvector Extension
    • 3. Create Tables for Embeddings
    • 4. Generate and Store Embeddings
    • 5. Semantic Search
    • 6. Filtered Semantic Search
    • 7. Hybrid Search (Vector + Full-Text)
    • 8. Product Search Example
    • 9. Performance Optimization
    • 10. Monitoring and Statistics
    • 11. Best Practices
    • 12. Integration with AWS Services
    • 13. Cleanup
    • Summary
  • Phase 8: RAG
    • Phase 8: RAG Catalog
    • Assignment: Build a Production-Ready RAG System
    • Challenges: RAG Systems
    • RAG: Retrieval-Augmented Generation - START HERE
    • Basic RAG from Scratch
    • Document Processing and Chunking
    • LangChain RAG
    • LlamaIndex RAG
    • Advanced Retrieval
    • Conversational RAG
    • RAG Evaluation
    • Advanced RAG Techniques (2025-2026 State of the Art)
    • GraphRAG and Visual RAG (Microsoft GraphRAG + ColPali)
    • Section A: Microsoft GraphRAG
    • Section B: ColPali Visual Document RAG
  • Phase 9: MLOps
    • Phase 9: MLOps Catalog
    • MLOps: Machine Learning in Production
    • Experiment Tracking with MLflow
    • Building ML APIs with FastAPI
    • Model Deployment Strategies
    • Containerizing ML Applications with Docker
    • Monitoring ML Models in Production
    • CI/CD for Machine Learning
    • Cloud Deployment for ML Models
    • LLM Infrastructure for Production (2025-2026 Essential Stack)
    • LLM Production Optimization
  • Phase 10: Specializations
    • Phase 10: Specializations Catalog
    • Phase 10: AI Specializations - Start Here
    • AI Agents Specialization
      • AI Agents Series - Completion Summary
      • AI Agents Specialization - Start Here
      • Function Calling & Tool Use
      • ReAct: Reasoning + Acting Agents
      • LangGraph: Stateful Agent Workflows
      • Multi-Agent Systems
      • Memory & State Management for Agents
      • Deploying Agents to Production
    • Computer Vision Specialization
      • Computer Vision Specialization - Start Here
      • Image Classification with Deep Learning
      • Object Detection: YOLO, DETR & Beyond
      • CLIP: Connecting Text and Images
      • Stable Diffusion & Image Generation
      • Multimodal RAG: Text + Images
    • Advanced NLP Specialization
      • Advanced NLP Specialization - Start Here
      • Named Entity Recognition (NER)
      • Machine Translation
      • Text Summarization: Extractive & Abstractive
      • Sentiment Analysis at Scale
      • Information Extraction from Documents

Advanced

  • Phase 11: Prompt Engineering
    • Phase 11: Prompt Engineering Catalog
    • Assignment: Build an Advanced Prompt Engineering System
    • Setup
    • Example 1: Basic vs. Improved Prompt
    • Example 2: Few-Shot Learning
    • Example 3: Chain-of-Thought Reasoning
    • Example 4: System Prompts
    • Example 5: Structured Output
    • Key Takeaways
    • Next Steps
    • 1. Zero-Shot Prompting
    • 2. One-Shot Prompting
    • 3. Few-Shot Prompting
    • 4. Dynamic Few-Shot Selection
    • 5. Best Practices
    • Key Takeaways
    • Model Selection (December 2025)
    • Next Steps
    • 1. The Classic Example
    • 2. Zero-Shot CoT
    • 3. Few-Shot CoT
    • 4. Self-Consistency
    • 5. Structured CoT
    • 6. CoT for Code Debugging
    • 7. Least-to-Most Prompting
    • Best Practices
    • Key Takeaways
    • Next Steps
    • Setup
    • 1. Simple ReAct Example
    • 2. ReAct Agent
    • 3. Real Tools – Wikipedia Search
    • 4. More Complex Example
    • 5. Custom Tools
    • 6. Error Handling and Self-Correction
    • Best Practices
    • Key Takeaways
    • Limitations
    • Next Steps
    • Structured LLM Outputs & Programmatic Prompting (2025-2026)
    • Long-Context Strategies: Working with 128K–1M Token Windows
  • Phase 12: LLM Fine-Tuning
    • Phase 12: LLM Fine-Tuning Catalog
    • Phase 12: LLM Fine-tuning - Start Here
    • Dataset Preparation for LLM Fine-tuning
    • Supervised Fine-Tuning (SFT) - Complete Workflow
    • LoRA Fine-tuning Basics (December 2025)
    • QLoRA - Memory-Efficient Fine-Tuning on Consumer GPUs
    • DPO Alignment: Teaching Models to Be Helpful and Harmless
    • Evaluating Fine-Tuned LLMs
    • Deploying Fine-Tuned LLMs to Production
    • GRPO Reasoning Training - Training R1-Style Thinking Models (2025)
    • Unsloth - 2x-5x Faster Fine-Tuning with 80% Less VRAM (2025)
    • Quantization: GPTQ, AWQ, GGUF & bitsandbytes
    • RLHF & Constitutional AI: Alignment Training
  • Phase 13: Multimodal AI
    • Phase 13: Multimodal AI Catalog
    • Multimodal AI - Start Here
    • Audio & Speech
      • Whisper: Speech Recognition & Audio Understanding
      • Text-to-Speech: TTS with OpenAI, Coqui & Edge TTS
    • Image Generation
      • Stable Diffusion: Text-to-Image Generation
      • ControlNet: Precise Control Over Image Generation
    • Vision-Language Models
      • CLIP Basics: Zero-Shot Vision with CLIP
      • Vision-Language Models: GPT-4V, LLaVA & Gemini Vision
      • Multimodal RAG: Retrieval-Augmented Generation with Images
  • Phase 14: Local LLMs
    • Phase 14: Local LLMs Catalog
    • Phase 14: Local LLMs - Start Here
    • Setup
    • 1. Download and Run a Model
    • 8. CLI Usage (from terminal)
    • Tips & Best Practices
    • Key Takeaways
    • Limitations
    • Next Steps
    • Part 1: Major Open Source Model Families
    • Part 2: Model Family Deep Dives
    • 🎯 Key Takeaways
    • 📝 Practice Exercises
    • 🔗 Resources
    • Local RAG with Ollama
    • Local LLM Servers and APIs
    • Speculative Decoding: 2-3x Faster LLM Inference
  • Phase 15: AI Agents
    • Phase 15: AI Agents Catalog
    • Phase 15: AI Agents - Assignment
    • Phase 15: AI Agents - Challenges
    • Phase 15: AI Agents - Post-Quiz
    • Phase 15: AI Agents - Pre-Quiz
    • Phase 15: AI Agents - Start Here
    • Part 1: What is an AI Agent?
    • Part 2: Chatbot vs Agent
    • Part 4: Agent Design Patterns
    • 🎯 Summary
    • Setup
    • Part 1: Function Calling Basics
    • Part 2: Tool Schema Design
    • Part 4: Error Handling
    • Part 5: Advanced Patterns
    • Part 6: Best Practices Summary
    • 🎯 Final Knowledge Check
    • 🚀 Next Steps
    • Part 1: What is ReAct?
    • 🎯 Final Knowledge Check
    • 🚀 Next Steps
    • Part 1: Framework Overview
    • Part 2: LangChain Agents
    • Part 3: LangGraph Workflows
    • Part 4: Memory Integration
    • Part 5: Framework Comparison
    • Part 6: Production Patterns
    • 🎯 Knowledge Check
    • 🚀 Next Steps
    • Part 1: Multi-Agent Basics
    • Part 2: Agent Coordination
    • Part 3: Role-Based Teams
    • Part 4: Communication Patterns
    • Part 5: Conflict Resolution
    • Part 6: Production Systems
    • 🎯 Final Knowledge Check
    • 🚀 Next Steps
    • MCP - Model Context Protocol
    • OpenAI Agents SDK + LangGraph 1.0
    • Section A - OpenAI Agents SDK
    • Section B - LangGraph 1.0
    • Section C - Comparison and Production Guidance
    • Notebook 08: Working with Reasoning Models
    • Autonomous AI Agents in 2026

Supplementary

  • Phase 16: Model Evaluation
    • Phase 16: Model Evaluation Catalog
    • Phase 16 Assignment: Complete Model Evaluation Pipeline
    • Phase 16 Challenges: Model Evaluation & Metrics
    • Post-Quiz: Model Evaluation & Metrics
    • Pre-Quiz: Model Evaluation & Metrics
    • Phase 16: Model Evaluation - Start Here
    • Part 1: Confusion Matrix Basics
    • Part 2: Core Metrics
    • Part 3: ROC Curves & AUC
    • Part 4: Handling Imbalanced Data
    • Part 5: Multi-Class Metrics
    • Part 6: Choosing the Right Metric
    • 🎯 Knowledge Check
    • 🚀 Next Steps
    • Part 1: Core Regression Metrics
    • Part 2: Understanding Residuals
    • Part 3: R-Squared Explained
    • Part 4: Choosing the Right Metric
    • Part 5: Outlier Handling
    • Part 6: Advanced Metrics
    • 🎯 Knowledge Check
    • 📚 Summary
    • 🚀 Next Steps
    • Part 1: Introduction to LLM Evaluation
    • Part 2: BLEU Score
    • Part 3: ROUGE Metrics
    • Part 4: Perplexity
    • Part 5: Semantic Similarity (BERTScore)
    • Part 6: Human Evaluation
    • Part 7: RAG Evaluation
    • 🎯 Knowledge Check
    • 📚 Summary
    • 🚀 Next Steps
    • Part 1: Understanding Bias in AI
    • Part 2: Fairness Metrics
    • Part 3: Detecting Bias
    • Part 4: Mitigation Strategies
    • Part 5: Real-World Case Studies
    • Part 6: Fairness Toolkits
    • 🎯 Knowledge Check
    • 📚 Summary
    • 🚀 Next Steps
    • Part 1: Cross-Validation
    • Part 2: Comparing Multiple Models
    • Part 3: Statistical Significance Testing
    • Part 4: A/B Testing
    • Part 5: Multi-Objective Selection
    • Part 6: Model Selection Framework
    • 🎯 Knowledge Check
    • 📚 Summary
    • 🚀 Next Steps
  • Phase 17: Debugging & Troubleshooting
    • Phase 17: Debugging & Troubleshooting Catalog
    • Assignment: Debug & Optimize a Broken ML Pipeline
    • Debugging & Troubleshooting Challenges
    • Post-Quiz: Debugging & Troubleshooting
    • Pre-Quiz: Debugging & Troubleshooting
    • Phase 17: Debugging & Troubleshooting - Start Here
    • Part 1: The Debugging Workflow
    • Part 2: Sanity Checks Checklist
    • Part 3: Baseline Models
    • Part 4: Debugging Checklist
    • Part 5: Logging and Instrumentation
    • 🎯 Key Takeaways
    • 📝 Practice Exercise
    • 🚀 Next Steps
    • Missing Data: The Silent Model Killer
    • Duplicate Detection: Preventing Data Leakage and Inflated Metrics
    • Outlier Detection: Distinguishing Signal from Noise
    • Label Noise Detection: When Your Ground Truth Lies
    • Distribution Shift Detection: When the World Changes Under Your Model
    • 🎯 Key Takeaways
    • 📝 Practice Exercise
    • CPU Profiling with cProfile: Finding Where Time Disappears
    • Vectorized Optimization: Replacing Loops with NumPy
    • Memory Profiling: Tracking Allocation and Leaks
    • Identifying Bottlenecks in ML Pipelines
    • Common Optimization Techniques for ML Code
    • Vectorization: Broadcasting Over Loops
    • Caching and Memoization: Trading Memory for Speed
    • Batch Processing: Amortizing Per-Call Overhead
    • 🎯 Key Takeaways
    • 📝 Optimization Checklist
    • Learning Curves: Diagnosing Bias vs. Variance
    • Overfitting vs. Underfitting: Visual Diagnosis with Polynomial Regression
    • Regularization Strategies: Constraining Model Complexity
    • Convergence Issues: When Gradient Descent Gets Lost
    • Model Complexity Trade-off: Validation Curves
    • 🎯 Key Takeaways
    • 📝 Debugging Checklist
    • Confusion Matrix Deep Dive: Beyond Aggregate Accuracy
    • Per-Class Error Analysis: Finding the Weakest Links
    • Failure Case Analysis: Learning from Mistakes
    • Confidence Analysis: Does the Model Know What It Does Not Know?
    • Error Analysis Report: Structured Communication of Findings
    • 🎯 Key Takeaways
    • 📝 Error Analysis Checklist
    • 🎉 Congratulations!
  • Phase 18: Low-Code AI Tools
    • Phase 18: Low-Code AI Tools Catalog
    • Assignment: Build and Deploy a Complete Low-Code ML Application
    • Challenges: Low-Code AI Tools
    • Post-Quiz: Low-Code AI Tools
    • Pre-Quiz: Low-Code AI Tools
    • Phase 18: Low-Code AI Tools - Start Here
    • Gradio Basics: From Python Function to Web Interface in Three Lines
    • Key Gradio Components
    • Image Classification Interface
    • Text Generation Interface: Controlling LLM Output
    • Advanced Layouts with Blocks: Beyond Simple Input-Output
    • Multi-Modal Interface: Combining Data Types
    • Sharing and Deployment: From Local to Global
    • Deployment Options
    • 🎯 Key Takeaways
    • πŸ“ Practice Exercises
    • πŸ”— Resources
    • Streamlit Basics: Python Scripts that Become Web Apps
    • Running the App
    • ML Model Deployment App: Interactive Classification with Sidebar Controls
    • Session State: Persisting Data Across Reruns
    • Caching for Performance: Avoiding Redundant Computation
    • Interactive Data Dashboard: Filters, KPIs, and Multi-Tab Visualization
    • 🎯 Key Takeaways
    • πŸ“ Practice Exercises
    • πŸ”— Resources
    • Part 1: Introduction to Hugging Face Spaces
    • Part 6: Advanced Space Configuration
    • 🎯 Key Takeaways
    • πŸ“ Practice Exercises
    • πŸ”— Resources
    • Part 1: Introduction to AutoML
    • 🎯 Key Takeaways
    • πŸ“ Practice Exercises
    • πŸ”— Resources
    • Data Preparation: Building a Realistic Churn Dataset
    • Model Training with AutoML: From Data to Tuned Model in Minutes
    • Build Gradio Interface: Making the Model Accessible
    • Create Deployable App Files: From Notebook to Production
    • Production Considerations: From Demo to Reliable System
    • 🎯 Key Takeaways
    • πŸ“ Project Extensions
    • πŸŽ“ Final Exercise
    • πŸ† Congratulations!
  • Phase 19: AI Safety & Red Teaming
    • Phase 19: AI Safety & Red Teaming Catalog
    • Phase 18 Assignment: Secure AI System Implementation
    • Phase 18 Challenges: AI Security & Red Teaming
    • Phase 18 Quiz: AI Safety & Red Teaming
    • Phase 19: AI Safety & Red Teaming β€” Start Here
    • Part 1: Understanding Prompt Injection Attacks
    • Summary & Best Practices
    • Practice Exercises
    • OpenAI Moderation API: Production-Grade Content Classification
    • Toxicity Detection with Detoxify: Local ML-Based Analysis
    • Custom Content Filters: Domain-Specific Safety Rules
    • Multi-Layer Content Moderation: Defense in Depth for Safety
    • Moderation Policies and Response Strategies: Beyond Binary Blocking
    • Production Implementation: Integrating All Components
    • Summary & Best Practices
    • Understanding PII Types: A Risk-Based Classification
    • Basic PII Detection with Regex: Fast Pattern Matching
    • Advanced PII Detection with Presidio: ML-Powered Entity Recognition
    • Anonymization Strategies: Choosing the Right Approach
    • Privacy Compliance: GDPR and CCPA Requirements
    • Production PII Protection Pipeline: End-to-End System
    • Summary & Best Practices
    • Understanding Bias Types: Where Unfairness Enters the ML Pipeline
    • Fairness Metrics: Mathematical Definitions of Equality
    • Visualizing Bias: Making Disparities Visible
    • Bias Mitigation Strategies: Pre-processing, In-processing, and Post-processing
    • Bias in LLMs: Detecting and Measuring Language Model Bias
    • Building Fairness-Aware Systems: Runtime Monitoring
    • Summary & Best Practices
    • Part 1: Red Team Methodology
    • Summary & Best Practices
  • Phase 20: Real-Time & Streaming AI
    • Phase 20: Real-Time & Streaming AI Catalog
    • Phase 20: Real-Time Streaming β€” Start Here
    • Streaming LLM Responses
    • WebSocket Connections for Real-Time AI Chat
    • Streaming RAG Pipeline
    • Production-Grade Streaming Systems

Reference

  • Phase 21: Quizzes
    • Phase 21: Quizzes Catalog
    • Phase 5: Neural Networks - Post-Quiz
    • Phase 5: Neural Networks - Pre-Quiz
    • Phase 7: Retrieval-Augmented Generation (RAG) - Pre-Quiz
  • Phase 22: References & Hands-On Labs
    • Phase 22: References & Hands-On Labs Catalog
    • Cloud Platform Labs & Resources ☁️
    • Microsoft Hands-On Labs πŸ§ͺ
    • AI/ML Video Learning Resources πŸŽ₯
  • Phase 23: Glossary & Foundations
    • Phase 23: Glossary & Foundations Catalog
    • AI/ML Glossary

Resources

  • Master Study Guide
  • Career Roadmap
  • Interview Preparation
  • AI/ML Decision Matrices & Comparison Guides
  • Setup Guide - Zero to AI
  • AI/ML Learning Checklist βœ…

Research

  • Phase 24: Advanced Deep Learning
    • Phase 24: Advanced Deep Learning Catalog
    • Phase 24: Advanced Deep Learning β€” Start Here
    • GAN Mathematics: Comprehensive Theory
    • Generative Adversarial Networks (GANs)
    • Wasserstein GAN (WGAN)
    • Variational Autoencoders (VAEs)
    • Neural Ordinary Differential Equations (Neural ODEs)
    • InfoGAN: Information-Maximizing Generative Adversarial Networks - Comprehensive Theory
    • Vision Transformers (ViT)
    • 1. Motivation: Multi-Scale Latent Representations
    • 2. ELBO for Hierarchical VAE
    • Summary
    • Conditional GANs (cGAN): Comprehensive Theory
    • 1. Motivation: Discrete Latent Spaces
    • 2. Vector Quantization Layer
    • Summary
    • Advanced Vector Quantized Variational Autoencoders (VQ-VAE): Theory and Practice
    • 1. Motivation: Exact Likelihood
    • 3. Composing Flows
    • 6. Modern Flow Architectures
    • Summary
    • Advanced Normalizing Flows Theory
    • 1. Motivation: Iterative Refinement
    • 2. Forward Diffusion Process
    • 3. Reverse Process & Training
    • 4. Sampling (Reverse Process)
    • Summary
    • Advanced Diffusion Models Theory
    • 1. BERT vs GPT
    • 1.5. Masked Language Modeling: Deep Dive
    • 2.5. Segment Embeddings and Special Tokens
    • 3.5. BERT Training: Advanced Techniques
    • Summary
    • 1. GPT vs BERT
    • 1.5. Causal Masking: Mathematical Foundation
    • 2.5. Positional Encoding: Theory and Variants
    • 3.5. Scaling Laws for Language Models
    • 4.5. Advanced Generation Strategies: Complete Analysis
    • Summary
    • 1. Attention Complexity Problem
    • 2. Linformer: Low-Rank Attention
    • 3. Performer: Kernel Approximation
    • Summary
    • Advanced Efficient Transformers Theory
    • 1. Message Passing Framework
    • 2. Graph Convolutional Network (GCN)
    • Advanced Message Passing Theory
    • 3. Graph Attention Network (GAT)
    • Summary
    • Advanced Graph Neural Networks Theory
    • 1. Meta-Learning Problem
    • 2. MAML Algorithm
    • Summary
    • Advanced Meta-Learning Theory
    • Advanced Meta-Learning and MAML Theory
    • 1. Few-Shot Classification
    • 2. Algorithm
    • Summary
    • Advanced Prototypical Networks Theory
    • Neural Radiance Fields (NeRF): Comprehensive Theory
    • 1. Style-Based Generator
    • Summary
    • Advanced StyleGAN: Mathematical Foundations and Architecture Deep Dive
    • 1. Contrastive Learning Framework
    • Summary
    • Advanced Contrastive Learning Theory
    • Advanced Adversarial Robustness Theory
    • 1. Adversarial Examples
    • 5. PGD Attack
    • Summary
    • 1. Knowledge Distillation
    • Summary
    • Advanced Knowledge Distillation Theory
    • 1. Point Cloud Basics
    • 2. T-Net (Transformation Network)
    • 3. PointNet Architecture
    • Summary
    • Advanced Point Cloud Networks Theory
    • 1. CycleGAN Theory
    • Summary
    • 1. Progressive Growing
    • Summary
    • Advanced Neural Network Interpretability Theory
    • 1. GradCAM Theory
    • Summary
    • Advanced Curriculum Learning Theory
    • 1. Curriculum Learning
    • Summary
    • 1. Catastrophic Forgetting
    • Summary
    • Advanced Continual Learning Theory
    • Advanced Continual Learning Theory
    • Advanced Continual Learning: Mathematical Foundations and Modern Approaches
    • Advanced Neural Architecture Search Theory
    • 1. DARTS: Differentiable Architecture Search
    • Summary
    • Advanced Neural Architecture Search Theory
    • Advanced Neural Architecture Search: Mathematical Foundations and Modern Methods
    • 1. Bahdanau Attention (Additive)
    • 2. Luong Attention (Multiplicative)
    • 3. Scaled Dot-Product Attention
    • 4. Multi-Head Attention
    • Summary
    • Advanced Attention Mechanisms: Mathematical Foundations and Modern Architectures
    • 1. Memory Networks Concept
    • 2. Content-Based Addressing
    • Summary
    • Advanced Memory Networks Theory
    • Advanced Memory Networks: Mathematical Foundations and Modern Architectures
    • 1. Capsule Networks Concept
    • 2. Dynamic Routing
    • 5. Margin Loss
    • Summary
    • Advanced Capsule Networks Theory
    • Advanced Capsule Networks Theory
    • Advanced Capsule Networks: Mathematical Foundations and Modern Architectures
    • 1. Score Matching
    • 3. Langevin Dynamics Sampling
    • 6. Annealed Langevin Dynamics
    • Summary
    • Advanced Score-Based Generative Models Theory
    • Advanced Score-Based Generative Models: Mathematical Foundations and Modern Architectures
    • 1. Energy-Based Models
    • 2. Contrastive Divergence
    • Summary
    • Advanced Energy-Based Models Theory
    • Advanced Energy-Based Models: Mathematical Foundations and Modern Architectures
    • 1. Mixture of Experts Concept
    • Summary
    • Advanced Mixture of Experts: Mathematical Foundations and Modern Architectures
    • 1. Implicit Neural Representations
    • Summary
    • Advanced Implicit Neural Representations Theory
    • 1. Gaussian Process Theory
    • 1.5. Gaussian Process Theory: Deep Mathematical Foundations
    • 7.5. Sparse Gaussian Processes: Scaling to Large Data
    • Summary
    • Advanced GP Topics and Extensions
    • Advanced Gaussian Processes Theory
    • 1. Bayesian Neural Networks
    • Bayesian Neural Networks: Deep Theory and Variational Inference
    • Summary
    • Advanced Bayesian Neural Networks Theory
  • Phase 25: Reinforcement Learning
    • Phase 25: Reinforcement Learning Catalog
    • Phase 25: Reinforcement Learning β€” Start Here
    • 01: Markov Decision Processes (MDPs)
    • 02: Value-Based Methods (Q-Learning)
    • 03: Deep Q-Networks (DQN)
    • 04: Policy-Based Methods (REINFORCE)
    • 05: Advanced Topics & Real-World Applications
    • 06: Practical Exercises & Implementations
  • Phase 26: Time Series Analysis & Forecasting
    • Phase 26: Time Series Analysis & Forecasting Catalog
    • Phase 26: Time Series Analysis β€” Start Here
    • 01: Time Series Fundamentals
    • 02: Classical Statistical Methods
    • 03: Facebook Prophet
    • 04: Deep Learning for Time Series
    • 05: Advanced Techniques & Applications
    • 06: Practical Applications & Exercises
  • Phase 27: Causal Inference
    • Phase 27: Causal Inference Catalog
    • Phase 27: Causal Inference β€” Start Here
    • 01: Causal Fundamentals
    • 02: Causal Graphs & DAGs
    • 03: Experimental Design
    • 04: Observational Methods
    • 05: Advanced Topics & Applications
    • 🎯 Quasi-Experimental Designs

Production

  • Phase 28: Practical Data Science
    • Phase 28: Practical Data Science Catalog
    • Data Science Interview Prep: The 30 Questions That Actually Come Up
    • Data Science Interview Prep β€” Part 2: Q16 to Q30
    • Phase 28: Practical Data Science
    • Computer Vision
      • Image Processing Basics: NumPy Arrays, PIL, and Computer Vision Fundamentals
      • CNNs From Scratch: Understanding Convolution, Pooling, and Classification
      • Transfer Learning: Leveraging Pretrained Models for Custom Image Tasks
      • Object Detection: From Bounding Boxes to Modern Detectors
      • Image Segmentation: Pixel-Level Understanding
    • Deep Learning & NLP
      • Transformers from Scratch: Self-Attention & Positional Encoding
      • BERT Text Classification: Fine-Tuning Transformers on Your Own Data
      • LLM Application Patterns: Building Production-Grade AI Features
      • Text Preprocessing: From Raw Text to Features
      • LLM Fine-Tuning with LoRA and QLoRA
    • Machine Learning
      • sklearn Pipelines: The Right Way to Build ML Workflows
      • Model Selection: Cross-Validation, Learning Curves & Bias-Variance
      • Ensemble Methods: Bagging, Boosting & Stacking
      • Imbalanced Datasets: Handling Class Imbalance Properly
      • Model Interpretability: SHAP, Permutation Importance & Partial Dependence
      • End-to-End ML Project: Customer Churn Prediction
    • Python for Data Science
      • Pandas Fundamentals: The Operations Every Data Scientist Must Know
      • Exploratory Data Analysis: A Systematic Framework
      • Data Visualization: From Exploratory Plots to Publication-Quality Figures
      • Data Cleaning Pipelines: From Messy Data to Model-Ready Features
      • Feature Engineering: Turning Raw Data into Model Fuel
    • Recommender Systems & Causal Inference
      • Collaborative Filtering: Building Recommender Systems from User Behavior
      • Content-Based Filtering: Recommending by Item Similarity
      • Neural Collaborative Filtering: Deep Learning for Recommendations
      • Causal Inference: Moving Beyond Correlation
      • Difference-in-Differences: Causal Inference from Natural Experiments
      • Building an A/B Testing Platform
    • Solutions
      • Solutions: Computer Vision Track
      • Solutions: Deep Learning & NLP Track
      • Solutions: Machine Learning Track
      • Solutions: Python & Data Science Track
      • Solutions: Recommender Systems & Causal Inference Track
      • Solutions: SQL & Data Engineering Track
      • Solutions: Statistics & MLOps Track
      • Solutions: Time Series & Forecasting Track
    • SQL & Data Engineering
      • Advanced SQL: Window Functions, CTEs, and Query Patterns That Actually Matter
      • SQL Query Optimization: From Slow to Fast in 10 Patterns
      • Data Pipelines with Airflow: DAGs, Operators, and Production Patterns
      • PySpark Fundamentals: Distributed Data Processing for Large Datasets
      • dbt: Data Modeling for Analytics Engineering
      • Streaming Data: Kafka, Windowed Aggregations, and Real-Time Pipelines
    • Statistics & MLOps
      • Hypothesis Testing: The Statistics Behind A/B Tests and Decisions
      • Bayesian Thinking: Updating Beliefs with Data
      • Model Deployment with FastAPI: From Notebook to Production API
      • ML Monitoring & Drift Detection: Keeping Models Healthy in Production
      • Feature Stores: From Training to Production Without Data Leakage
    • Time Series Forecasting
      • Time Series Fundamentals: Decomposition, Stationarity, and Autocorrelation
      • ARIMA & SARIMA: Statistical Forecasting for Practitioners
      • Prophet: Scalable Forecasting for Business Time Series
      • LSTM for Time Series: Sequence Modeling with Deep Learning
      • Anomaly Detection: Finding the Signal in Noisy Time Series
      • Forecasting Competition: ARIMA vs Prophet vs LSTM
  • Phase 29: AI Hardware & Validation
    • Phase 29: AI Hardware & Validation Catalog
    • Section 1: Hardware Validation
    • Section 2: Kernel Validation
    • Section 3: Framework Validation
    • Section 4: Model Performance Validation
    • Section 5: End-to-End Pipeline Validation
    • Section 6: Distributed Training Validation
    • Section 7: Datacenter Validation
    • Section 8: Regression & Release Validation
    • Chapter 9: Industry AI Benchmarking & Performance Analysis
    • Part 1 β€” LLM Performance Metrics: What Gets Measured
    • Part 2 β€” API Performance Benchmarking
    • Part 3 β€” Hardware Benchmarking: AA-SLT (System Load Test)
    • Part 4 β€” Hardware Benchmarking: AA-AgentPerf
    • Part 5 β€” Intelligence Benchmarking
    • Part 6 β€” Multi-Modal Benchmarking
    • Part 7 β€” Other Industry Benchmarks
    • Exercises
    • Key Takeaways
    • Lab 01: Hardware Validation
    • Lab 02: Kernel Validation
    • Lab 03: Model Performance Validation
    • Lab 04: Regression & Release Validation Suite
    • Lab 05: Distributed Training Validation
    • Lab 06: Framework Validation
    • Lab 07 β€” GPGPU Backends: CoreML Β· DirectML Β· Vulkan
    • Part 1 β€” CoreML (Apple)
    • Part 2 β€” DirectML (Microsoft / Windows)
    • Part 3 β€” Vulkan (Cross-Platform)
    • Part 4 β€” Cross-Backend Parity Validation
    • Part 5 β€” Exercises
    • Key Takeaways
    • Lab 08 β€” Industry Benchmarking: Hands-On
    • Part 1 β€” TTFT & Output Speed Measurement
    • Part 2 β€” Mini AA-SLT (System Load Test)
    • Part 3 β€” SLO-Based Capacity Planning (AA-AgentPerf Style)
    • Part 4 – Hardware Comparison Dashboard
    • Part 5 β€” Mini Intelligence Eval Runner
    • Exercises
    • Key Takeaways
  • Phase 30: Inference Optimization & Model Serving
    • Phase 30: Inference Optimization & Model Serving Catalog
    • vLLM Quickstart: High-Throughput Model Serving

About

  • Contributing to Zero to AI
  • Changelog
  • Contributor Covenant Code of Conduct
  • Complete Reference Materials β€” Videos, Repos, Courses, Papers
  • Workspace Learning Review
  • License
  • Support
Back to top
View this page
Edit this page

The Data Science Lifecycle

[Image: communication. Photo by Headway on Unsplash]

In these lessons, you'll explore several aspects of the Data Science lifecycle, including analyzing data and communicating what it shows.

Topics

  1. Introduction

  2. Analyzing

  3. Communication

Credits

These lessons were written with ❤️ by Jalen McGee and Jasmine Greenaway

Copyright © 2024, Pavan Mudigonda