Pandas Examples - Consolidated & Organized
This directory contains a comprehensive collection of pandas tutorials, exercises, and real-world projects, consolidated from multiple sources and organized for progressive learning.
Directory Structure
01-basics/
Beginner-friendly tutorials covering fundamental pandas concepts
Pandas 101 Series (YouTube Course):
Pandas 101 - Pandas Series and Dataframes.ipynb - Core data structures
Pandas 101 - Reading in Files.ipynb - Loading data from various sources
Pandas 101 - Data Cleaning in Pandas.ipynb - Handling missing data, duplicates
Pandas 101 - Filtering and Ordering in Pandas.ipynb - Boolean indexing, sorting
Pandas 101 - Group by and Aggregating in Pandas.ipynb - Grouping and aggregations
Pandas 101 - Indexing in Pandas.ipynb - Row/column selection techniques
Pandas 101 - Merge, Join, and Concatenate in Pandas.ipynb - Combining DataFrames
Pandas 101 - Visualizing Data in Pandas.ipynb - Creating plots
Pandas 101 - Exploratory Data Analysis in Pandas.ipynb - EDA techniques
DataFrame Fundamentals:
DataFrames I.ipynb - Introduction to DataFrames
DataFrames II.ipynb - Intermediate DataFrame operations
DataFrames III.ipynb - Advanced DataFrame techniques
Pandas & NumPy Integration:
pandas-numpy-lessons/ - Three lessons on pandas with NumPy
lesson1/ - Basic integration
lesson2/ - Intermediate techniques
lesson3/ - Advanced operations
Topics Covered:
Series and DataFrame creation
Reading CSV, Excel, JSON files
Data cleaning (missing values, duplicates, formatting)
Indexing and selection (.loc, .iloc, boolean indexing)
Filtering, sorting, and ordering
Basic aggregations and statistics
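As a rough illustration of the selection and sorting topics listed above, the sketch below builds a small DataFrame and selects from it by label, by position, and by boolean mask. The column names and values are invented for the example and do not come from the notebooks.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {"age": [25, 30, 35], "city": ["NYC", "SF", "LA"]},
    index=["alice", "bob", "charlie"],
)

# Label-based selection: row "bob", column "city"
city_of_bob = df.loc["bob", "city"]

# Position-based selection: first row, first column
first_age = df.iloc[0, 0]

# Boolean indexing: rows where age is at least 30
over_30 = df[df["age"] >= 30]

# Sorting by a column, largest first
by_age_desc = df.sort_values("age", ascending=False)
```

.loc works with index labels and column names, while .iloc works with integer positions, so the two can return different rows once the index is no longer 0, 1, 2.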
Recommended Order:
Start with Pandas 101 series (in order listed above)
Work through DataFrames I-III
Explore pandas-numpy-lessons
02-intermediate/
Intermediate topics for deepening pandas knowledge
Comprehensive course from "Data Analysis with Pandas and Python":
Core Operations:
DataFrames 1.ipynb - DataFrame fundamentals (review)
DataFrames 2.ipynb - Intermediate operations
DataFrames 3.ipynb - Advanced operations
GroupBy.ipynb - Advanced grouping and aggregation
Input and Output.ipynb - Reading/writing various formats
Merge, Join and Concat.ipynb - Combining datasets
Multiindex.ipynb - Hierarchical indexing
Options and Settings.ipynb - Customizing pandas behavior
Data Operations:
Filtering Methods.ipynb - Advanced filtering techniques
Missing Data.ipynb - Handling NaN and None values
Text Methods and Filtering.ipynb - String operations
Working with Dates and Times.ipynb - DateTime operations
Visualization & Analysis:
Visualizations.ipynb - Plotting with pandas
Working with Duplicates.ipynb - Finding and removing duplicates
Topics Covered:
Advanced groupby operations (multiple aggregations, transformations)
Multi-level indexing (hierarchical data)
DateTime manipulation and time series
String methods and text processing
Complex filtering and boolean logic
Data type conversions and optimization
Handling large datasets efficiently
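The groupby and multi-level indexing topics above might look roughly like the following in practice. This is a minimal sketch on invented sales data, not an excerpt from the course notebooks.

```python
import pandas as pd

# Hypothetical sales data, used only for illustration
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [100.0, 150.0, 200.0, 50.0],
})

# Several aggregations at once, using named aggregation
summary = sales.groupby("region").agg(
    total=("revenue", "sum"),
    average=("revenue", "mean"),
)

# transform returns a result aligned with the original rows,
# so each row can be compared with its own group's total
sales["share_of_region"] = (
    sales["revenue"] / sales.groupby("region")["revenue"].transform("sum")
)

# A two-level (hierarchical) index, then a lookup on the outer level
indexed = sales.set_index(["region", "product"]).sort_index()
west_rows = indexed.loc["West"]
```

The difference worth noticing is shape: agg collapses each group to one row, while transform keeps one value per original row.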
03-exercises/
Practice exercises to test and improve your pandas skills
100 Pandas Puzzles:
100-pandas-puzzles.ipynb - 100 curated pandas challenges
100-pandas-puzzles-with-solutions.ipynb - Same with solutions
Description: Inspired by 100 NumPy exercises, these puzzles focus on core DataFrame and Series manipulation, covering indexing, grouping, aggregating, and data cleaning.
Difficulty Levels:
Easy - Basic operations
Medium - Combining multiple techniques
Hard - Complex multi-step solutions
Topic-Based Exercises:
Located in topic-based-exercises/:
01 - Getting & Knowing Your Data:
Chipotle, Occupation, World Food Facts datasets
Basic exploration, info, describe, shape, columns
02 - Filtering & Sorting:
Chipotle, Euro12, Fictional Army datasets
Boolean indexing, sorting, conditional selection
03 - Grouping:
Alcohol Consumption, Occupation, Regiment datasets
GroupBy operations, aggregations, transformations
04 - Apply:
Students Alcohol Consumption, US Crime Rates
Apply, map, applymap functions
05 - Merge:
Auto MPG, Fictitious Names, Housing Market
Merge, join, concat operations
06 - Stats:
US Baby Names, Wind Stats
Statistical operations, rolling windows
07 - Visualization:
Chipotle, Online Retail, Scores, Tips, Titanic
Matplotlib integration, plotting techniques
08 - Creating Series and DataFrames:
Pokemon dataset
Programmatic DataFrame creation
09 - Time Series:
Apple Stock, Financial Data, Investor Flows
DateTime indexing, resampling, time-based operations
10 - Deleting:
Iris, Wine datasets
Dropping rows, columns, duplicates
11 - Indexing:
Advanced indexing exercises
Setting, resetting, multi-level indices
Each topic includes:
Exercises.ipynb (practice problems)
Solutions.ipynb (detailed solutions with explanations)
04-advanced/
Advanced techniques and specialized topics
pandas-cookbook/
A comprehensive cookbook with advanced recipes.
Located in: pandas-cookbook/
Contents:
Advanced data manipulation techniques
Performance optimization strategies
Memory-efficient operations
Complex transformations and aggregations
Integration with other libraries
Topics:
Custom aggregation functions
Window functions and rolling operations
Categorical data optimization
Working with large datasets
Advanced indexing patterns
Data pipeline design
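Two of the topics above, custom aggregation functions and rolling operations, can be sketched as follows. The data and the helper name value_range are hypothetical, chosen only for illustration; the cookbook's own recipes may differ.

```python
import pandas as pd

# Hypothetical sensor readings, used only for illustration
readings = pd.DataFrame({
    "sensor": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 8.0, 10.0, 10.0, 13.0],
})

def value_range(series: pd.Series) -> float:
    """Custom aggregation: spread between the largest and smallest value."""
    return series.max() - series.min()

# Apply the custom function per group alongside a built-in one
per_sensor = readings.groupby("sensor")["value"].agg(["mean", value_range])

# Rolling mean over a window of 2 rows; the first row has no full window,
# so its result is NaN
rolling_mean = readings["value"].rolling(window=2).mean()
```

Built-in aggregations passed by name (such as "mean") generally run faster than Python functions, so a custom function is best kept for logic that has no built-in equivalent.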
05-real-world-projects/
Real-world data analysis projects
Projects:
Apple Health Data.ipynb
Analyzing personal health data exports
Time series analysis of activity, heart rate, sleep
Visualization of health trends
Electronic Production India.ipynb
Economic data analysis
Industry production trends
Regional comparisons
Datasets:
bigmac.csv - Big Mac Index data (purchasing power parity)
chicago.csv - Chicago city data
crime_india.csv - Crime statistics
Additional real-world datasets
Skills Applied:
Data cleaning and preprocessing
Exploratory data analysis (EDA)
Time series analysis
Statistical analysis
Data visualization
Insight generation and reporting
Learning Path
Beginner (0-3 weeks)
01-basics/
├── Complete Pandas 101 series (9 notebooks)
├── DataFrames I-III
└── Practice: First 30 puzzles from 100-pandas-puzzles
Time: 2-3 hours daily
Goal: Understand Series, DataFrames, basic operations
Intermediate (3-8 weeks)
02-intermediate/
└── All notebooks (focus on GroupBy, MultiIndex, DateTime)
03-exercises/
├── Complete 100 pandas puzzles
└── Topics 01-05 from topic-based-exercises
Time: 1-2 hours daily
Goal: Master grouping, merging, advanced indexing
Advanced (8-12 weeks)
03-exercises/
└── Topics 06-11 (Stats, Viz, Time Series)
04-advanced/
└── pandas-cookbook (select relevant recipes)
05-real-world-projects/
└── Complete both projects
Time: 1-2 hours daily
Goal: Apply techniques to real-world scenarios
Content Statistics
Total Notebooks: 151 notebooks
Total Size: 173 MB
Exercise Sets: 100+ puzzles + 11 topic-based sets
Real-World Projects: 2 complete projects
Datasets: 20+ CSV/Excel files included
Source Repositories
This consolidated collection combines content from:
100-pandas-puzzles - https://github.com/ajcr/100-pandas-puzzles
Data-Analysis-with-Pandas-and-Python - Udemy course materials
data-analysis-with-python-and-pandas - Another pandas course
pandas_exercises - https://github.com/guipsamora/pandas_exercises
pandas-and-numpy - Integration tutorial
pandas-cookbook - Advanced recipes and techniques
PandasYouTubeSeries - YouTube Pandas 101 course
Getting Started
Installation
# Install pandas
pip install pandas
# Optional: Install visualization libraries
pip install matplotlib seaborn
# Optional: Install additional data libraries
pip install openpyxl xlrd
Quick Start
import pandas as pd
import numpy as np
# Verify installation
print(f"pandas version: {pd.__version__}")
# Create a simple DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NYC', 'SF', 'LA']
})
print(df)
Study Tips
Hands-On Practice
Type all examples (don't copy-paste)
Modify examples to test understanding
Complete exercises before looking at solutions
Use Documentation
df.method? in Jupyter for quick help
Official pandas docs: https://pandas.pydata.org/docs/
Practice Daily
30-60 minutes daily beats weekend cramming
Complete 3-5 exercises per session
Learn Shortcuts
Method chaining for cleaner code
Vectorized operations over loops
Use .pipe() for custom operations
Benchmark Performance
Use %timeit to compare approaches
Learn memory-efficient techniques
Understand when to use .loc vs .iloc
Real Data Practice
Use Kaggle datasets for practice
Analyze your own data (fitness, finance, etc.)
Contribute to open-source projects
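The "Learn Shortcuts" tips above (method chaining, vectorized operations, and .pipe()) can be combined in one short sketch. The helper functions add_total and keep_large are hypothetical names invented for this example.

```python
import pandas as pd

# Hypothetical order data, used only for illustration
orders = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
})

def add_total(df: pd.DataFrame) -> pd.DataFrame:
    """Vectorized column arithmetic instead of a Python loop over rows."""
    return df.assign(total=df["price"] * df["quantity"])

def keep_large(df: pd.DataFrame, minimum: float) -> pd.DataFrame:
    """Keep rows whose total is at least `minimum`."""
    return df[df["total"] >= minimum]

# Method chain: each step returns a new DataFrame, and .pipe()
# slots custom functions into the chain
result = (
    orders
    .pipe(add_total)
    .pipe(keep_large, minimum=40.0)
    .reset_index(drop=True)
)
```

In a Jupyter cell, %timeit can be placed in front of a vectorized expression and a loop-based equivalent to compare them on your own data.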
Key Pandas Concepts
Must-Know Operations
Selection: .loc[], .iloc[], boolean indexing
Filtering: Boolean conditions, .query()
Grouping: .groupby(), aggregations
Merging: .merge(), .join(), pd.concat()
Reshaping: .pivot(), .melt(), .stack(), .unstack()
DateTime: .dt accessor, resampling
Strings: .str accessor methods
Missing Data: .isna(), .fillna(), .dropna()
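As a quick tour of several operations from the list above, the following sketch cleans, merges, reshapes, and groups two small tables. All table and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical tables, used only for illustration
people = pd.DataFrame({
    "id": [1, 2, 3],
    "name": [" alice", "BOB ", "Charlie"],
    "joined": ["2024-01-15", "2024-02-20", None],
})
scores = pd.DataFrame({"id": [1, 2], "math": [90, 70], "art": [85, 95]})

# Strings: clean up names with the .str accessor
people["name"] = people["name"].str.strip().str.title()

# DateTime: parse text to datetimes, then read parts via .dt
people["joined"] = pd.to_datetime(people["joined"])
people["join_month"] = people["joined"].dt.month

# Merging: a left join keeps every person, matched or not
merged = people.merge(scores, on="id", how="left")

# Missing data: count the gaps, then fill them
missing_math = merged["math"].isna().sum()
merged["math"] = merged["math"].fillna(0)

# Reshaping: wide score columns to long (id, subject, score) rows
long_scores = scores.melt(id_vars="id", var_name="subject", value_name="score")

# Grouping: mean score per subject
mean_by_subject = long_scores.groupby("subject")["score"].mean()
```

Filling missing scores with 0 is only one possible choice; whether to fill, drop, or keep NaN depends on what the missing value means in the data.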
Performance Tips
Use categorical dtype for strings with few unique values
Use .query() for complex boolean operations
Prefer vectorized operations over .apply()
Use .pipe() for readable method chains
Consider chunking for very large datasets
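Three of the tips above (categorical dtype, .query(), and chunking) are sketched below. The data is synthetic, and an in-memory buffer stands in for a large CSV file on disk; actual memory savings depend on the data and the pandas version.

```python
import io
import pandas as pd

# Hypothetical data: a string column with only three distinct values
df = pd.DataFrame({
    "city": ["NYC", "SF", "LA"] * 1000,
    "sales": range(3000),
})

# Categorical dtype stores each distinct string once plus small integer codes
before = df["city"].memory_usage(deep=True)
df["city"] = df["city"].astype("category")
after = df["city"].memory_usage(deep=True)

# .query() expresses a compound condition as one string
subset = df.query("city == 'SF' and sales > 2990")

# Chunking: process a CSV in pieces instead of loading it whole.
# The StringIO buffer is a stand-in for a large file on disk.
csv_buffer = io.StringIO(df.to_csv(index=False))
total = 0
for chunk in pd.read_csv(csv_buffer, chunksize=500):
    total += chunk["sales"].sum()
```

Comparing before and after shows how much the conversion saved for this column; chunking helps when the per-chunk results (here, a running sum) can be combined without holding all rows in memory.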
Additional Resources
Official Documentation
Books (Free Online)
Python for Data Analysis (3rd Edition) by Wes McKinney (pandas creator)
Practice Datasets
Certification Readiness
This collection prepares you for:
Data Analyst roles
Data Science positions (pandas foundation)
Python for Data Analysis certifications
Kaggle competitions
Skills You'll Master:
Data cleaning and preprocessing
Exploratory data analysis (EDA)
Statistical analysis
Data visualization
Time series analysis
Data transformation and aggregation
Next Steps After Completion
Apply to Real Projects
Analyze public datasets
Contribute to data science blogs
Build a portfolio on GitHub
Advanced Topics
Dask for big data (out-of-memory datasets)
Polars (faster alternative to pandas)
PySpark for distributed computing
Domain Applications
Finance: stock analysis, portfolio optimization
Healthcare: patient data analysis
Marketing: customer segmentation
Sports: performance analytics
License
Individual directories may have their own licenses. Please refer to original repository licenses for attribution and usage rights.
Last Updated: December 2024
Consolidated By: Automated organization process
Original Size: 188 MB (166 files across 7 directories)
Consolidated Size: 173 MB (151 notebooks - 8% reduction)
Organization: Beginner → Intermediate → Exercises → Advanced → Projects