
import pandas as pd
import numpy as np

print(f"pandas version: {pd.__version__}")
print(f"numpy version: {np.__version__}")
print(f"\n✅ pandas is ready to use!")

# Quick test - Create a simple DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'City': ['NYC', 'SF', 'LA', 'Chicago'],
    'Salary': [70000, 85000, 92000, 78000]
})

print("\nSample DataFrame:")
print(df)
print(f"\nShape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print(f"\n🎉 You're ready to start learning pandas!")

📊 What You’ll Learn¶

Core Skills¶

✅ Reading data from CSV, Excel, JSON, SQL
✅ Data cleaning (missing values, duplicates, formatting)
✅ Filtering, sorting, and selection
✅ GroupBy operations and aggregations
✅ Merging, joining, and concatenating DataFrames
✅ DateTime manipulation and time series
✅ String operations and text processing
✅ Data visualization with pandas

Advanced Skills¶

✅ MultiIndex and hierarchical data
✅ Custom aggregation functions
✅ Window functions and rolling operations
✅ Performance optimization
✅ Memory-efficient operations
✅ Method chaining and .pipe()
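As a taste of the last item, here is a minimal sketch of method chaining with `.pipe()`, reusing the sample DataFrame from the quick test. The helper `add_salary_band` and its `threshold` parameter are hypothetical names made up for this illustration; they are not part of pandas.

```python
import numpy as np
import pandas as pd

# Same sample data as the quick test at the top of this notebook
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'City': ['NYC', 'SF', 'LA', 'Chicago'],
    'Salary': [70000, 85000, 92000, 78000],
})

def add_salary_band(frame, threshold):
    """Hypothetical helper: label each row 'high' or 'standard' by salary."""
    band = np.where(frame['Salary'] >= threshold, 'high', 'standard')
    return frame.assign(Band=band)

# Each step returns a new DataFrame, so the steps read top to bottom
result = (
    df
    .query('Age >= 28')                        # keep rows with Age of 28 or more
    .sort_values('Salary', ascending=False)    # highest salary first
    .pipe(add_salary_band, threshold=80000)    # call a custom function mid-chain
    .reset_index(drop=True)
)

print(result)
```

`.pipe()` passes the intermediate DataFrame as the first argument of the function, which lets custom steps sit inside the chain instead of wrapping it.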

💡 Pro Tips for Success¶

1. Practice Daily¶

# 30-60 minutes daily > 3 hours on weekends
# Consistency builds muscle memory

2. Type, Don’t Copy¶

# Typing code builds understanding
# Copy-paste doesn't stick

3. Use Documentation¶

# In Jupyter:
df.groupby?  # Shows documentation
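The `?` suffix only works in IPython and Jupyter. Outside a notebook, one way to reach the same docstring is the standard `inspect` module, sketched here:

```python
import inspect

import pandas as pd

# Fetch the docstring that `df.groupby?` would display in Jupyter
doc = inspect.getdoc(pd.DataFrame.groupby)
print(doc.splitlines()[0])  # first line is a one-sentence summary

# The signature lists the accepted parameters and their defaults
signature = inspect.signature(pd.DataFrame.groupby)
print(signature)
```

Calling `help(pd.DataFrame.groupby)` prints the full text in one go if you prefer to read it all.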

4. Experiment¶

# Modify examples
# Break things (safely)
# Understand WHY it works

5. Benchmark Performance¶

# Compare different approaches
%timeit df.groupby('column').sum()
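`%timeit` is an IPython magic, so it only runs inside a notebook. The sketch below does a similar comparison with the standard `timeit` module on synthetic data; the column names `group` and `value` and both function names are made up for this example.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: 10,000 rows spread over 10 groups
rng = np.random.default_rng(0)
data = pd.DataFrame({
    'group': rng.integers(0, 10, size=10_000),
    'value': rng.random(10_000),
})

def with_groupby():
    """Vectorized approach: let pandas do the grouping."""
    return data.groupby('group')['value'].sum()

def with_python_loop():
    """Naive approach: accumulate totals row by row in a dict."""
    totals = {}
    for g, v in zip(data['group'], data['value']):
        totals[g] = totals.get(g, 0.0) + v
    return pd.Series(totals).sort_index()

# Run each approach 20 times and report the total elapsed seconds
t_groupby = timeit.timeit(with_groupby, number=20)
t_loop = timeit.timeit(with_python_loop, number=20)
print(f"groupby: {t_groupby:.4f}s  loop: {t_loop:.4f}s")
```

Timings depend on your machine and pandas version, so treat the numbers as relative, and check that both approaches return the same result before comparing them.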

6. Read Error Messages¶

# Errors are learning opportunities
# Read them carefully
# Google specific error messages

📖 Essential pandas Operations¶

Must-Know Methods¶

Selection:

df.loc[row, column]      # Label-based
df.iloc[0, 0]            # Position-based
df[df['Age'] > 25]       # Boolean indexing
df.query('Age > 25')     # SQL-like syntax
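Here is how the four selection styles look on the sample DataFrame from the quick test. This is a small sketch; the variable names are only for illustration.

```python
import pandas as pd

# Same sample data as the quick test at the top of this notebook
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'City': ['NYC', 'SF', 'LA', 'Chicago'],
    'Salary': [70000, 85000, 92000, 78000],
})

# Label-based: row with index label 1, column 'Name'
name_by_label = df.loc[1, 'Name']      # 'Bob'

# Position-based: first row, first column
first_cell = df.iloc[0, 0]             # 'Alice'

# Boolean indexing and query() express the same filter
older_mask = df[df['Age'] > 25]
older_query = df.query('Age > 25')

print(name_by_label, first_cell)
print(older_mask['Name'].tolist())     # ['Bob', 'Charlie', 'David']
print(older_mask.equals(older_query))  # True
```

With the default integer index, labels and positions happen to coincide. They diverge as soon as you filter, sort, or set a custom index, which is when the `.loc` versus `.iloc` distinction starts to matter.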

GroupBy:

df.groupby('column').sum()                                 # Sum per group
df.groupby('column').agg({'col1': 'mean', 'col2': 'sum'})  # Different aggregation per column
df.groupby(['col1', 'col2']).size()                        # Row count per group combination
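A concrete version of these three patterns, as a sketch on made-up data. The `staff` DataFrame and its `Dept` column are invented for this example.

```python
import pandas as pd

# Hypothetical data: 'Dept' is invented for this illustration
staff = pd.DataFrame({
    'Dept': ['Eng', 'Eng', 'Sales', 'Sales'],
    'City': ['NYC', 'SF', 'NYC', 'NYC'],
    'Age': [25, 30, 35, 28],
    'Salary': [70000, 85000, 92000, 78000],
})

# Select the column first so only the numeric column is summed
total_salary = staff.groupby('Dept')['Salary'].sum()

# A different aggregation for each column
summary = staff.groupby('Dept').agg({'Age': 'mean', 'Salary': 'sum'})

# Number of rows in each (Dept, City) combination
counts = staff.groupby(['Dept', 'City']).size()

print(total_salary)
print(summary)
print(counts)
```

Selecting `['Salary']` before `.sum()` is a deliberate choice here: in recent pandas versions, summing the whole grouped frame also tries to aggregate string columns such as `City`.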

Merging:

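pd.merge(df1, df2, on='key')  # SQL-style join on a shared column (inner by default)
df1.join(df2)                 # Join on the index (left by default)
pd.concat([df1, df2])         # Stack rows on top of each other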
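The three calls behave differently on the same inputs. The sketch below uses two small made-up tables, `people` and `salaries`, to show which rows survive each one.

```python
import pandas as pd

# Hypothetical tables that share a 'key' column
people = pd.DataFrame({'key': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
salaries = pd.DataFrame({'key': [2, 3, 4], 'Salary': [85000, 92000, 78000]})

# merge: inner join by default, so only keys present in both tables remain
merged = pd.merge(people, salaries, on='key')

# join: matches on the index and keeps every row of the left table
joined = people.set_index('key').join(salaries.set_index('key'))

# concat: stacks rows; ignore_index=True renumbers the result from 0
stacked = pd.concat([people, people], ignore_index=True)

print(merged)
print(joined)
print(stacked)
```

Pass `how='left'`, `how='right'`, or `how='outer'` to `pd.merge` when unmatched rows should be kept rather than dropped.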

Missing Data:

df.isna()                # Check for NaN
df.fillna(0)             # Fill with value
df.dropna()              # Drop rows with NaN
df.ffill()               # Forward fill (fillna(method='ffill') is deprecated since pandas 2.1)
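To see what each call does, here is a sketch on a small made-up table with gaps. The `readings` DataFrame and its columns are invented for this example, and forward filling uses `ffill()`, which works on both older and current pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings with missing values
readings = pd.DataFrame({
    'temp': [20.0, np.nan, 22.0, np.nan],
    'humidity': [0.5, 0.6, np.nan, 0.7],
})

# Count the missing values in each column
missing_per_column = readings.isna().sum()

# Replace every NaN with a constant
zero_filled = readings.fillna(0)

# Keep only the rows that have no NaN at all
complete_rows = readings.dropna()

# Carry the last valid value forward into each gap
forward_filled = readings.ffill()

print(missing_per_column)
print(forward_filled)
```

None of these calls modify `readings`. Each returns a new DataFrame, so assign the result if you want to keep it.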

📈 Progress Tracking¶

Track Your Journey¶

Week 1-3: Basics ☐

Completed Pandas 101 series
Finished DataFrames I-III
Solved first 30 puzzles

Week 4-8: Intermediate ☐

All 02-intermediate notebooks
Completed 100 pandas puzzles
Topics 01-05 exercises

Week 9-12: Advanced ☐

Topics 06-11 exercises
pandas-cookbook recipes
Both real-world projects

🔗 Additional Resources¶

Official Documentation¶

Books¶

Python for Data Analysis by Wes McKinney (pandas creator)
Free online version available!

Practice Datasets¶

Kaggle - Thousands of datasets
UCI ML Repository
FiveThirtyEight

📊 Collection Statistics¶

Total Notebooks: 151
Exercise Sets: 100+ puzzles + 11 topic categories
Real Projects: 2 complete analyses
Datasets: 20+ CSV/Excel files
Total Size: 173 MB

🎓 Ready to Begin?¶

Next Steps:¶

Choose your path above (Beginner, Intermediate, or Advanced)
Read the README.md for detailed topic coverage
Start with: Pandas 101 - Series and Dataframes

Remember:¶

Start slow, build solid foundations
Practice daily for best results
Don’t skip exercises
Apply to real data when possible

Good luck on your pandas journey! 🐼