
Indexing

The index is a fundamental concept in Pandas that determines how rows are labeled, accessed, and aligned during operations. Understanding indexing – including setting custom indices, resetting to defaults, using single and multi-level indices, and accessing data with .loc[] and .iloc[] – is essential for efficient data manipulation.

This notebook covers reading data with a custom index column, switching between custom and default indices with set_index() and reset_index(), label-based row access with .loc[], position-based access with .iloc[], creating hierarchical (multi-level) indices for grouped data, and sorting multi-indexed DataFrames. Multi-level indexing is particularly important when working with panel data or any dataset that has a natural hierarchy, such as continent-country relationships.

import pandas as pd

# Load the dataset with the default integer index (adjust the path for your machine)
df = pd.read_csv(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\world_population.csv")

df

# Re-read the file, this time using the Country column as the row index
df = pd.read_csv(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\world_population.csv", index_col="Country")

df
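As a minimal sketch of what `index_col` does, the example below reads a small in-memory CSV instead of the tutorial file. The CSV text and its `Value` column are placeholders invented for illustration, not data from `world_population.csv`. It suggests that passing `index_col` at read time gives the same result as reading first and calling `set_index()` afterwards.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file; the values are placeholders.
csv_text = "Country,Continent,Value\nAlbania,Europe,1\nAngola,Africa,2\n"

# Option 1: choose the index column while reading.
with_index_col = pd.read_csv(io.StringIO(csv_text), index_col="Country")

# Option 2: read with the default integer index, then promote a column.
via_set_index = pd.read_csv(io.StringIO(csv_text)).set_index("Country")

print(with_index_col.index.tolist())
print(with_index_col.equals(via_set_index))
```

Either route works; `index_col` simply saves a step when you already know which column should label the rows.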

# Move Country back into a regular column and restore the default 0..n-1 index
df.reset_index(inplace=True)

df

# Make Country the index again; inplace=True modifies df instead of returning a copy
df.set_index('Country', inplace=True)

df
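The round trip between a custom index and the default one can be sketched on a small toy DataFrame. The frame and its values below are placeholders, not rows from the tutorial dataset, and the sketch uses the non-inplace form, which returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

# Toy data with placeholder values (not taken from world_population.csv).
toy = pd.DataFrame(
    {
        "Country": ["Albania", "Angola", "Brazil"],
        "Continent": ["Europe", "Africa", "South America"],
        "Value": [10, 20, 30],
    }
)

# set_index returns a new DataFrame whose row labels come from the column.
by_country = toy.set_index("Country")

# reset_index moves the labels back into a column and restores 0..n-1.
restored = by_country.reset_index()

print(by_country.index.tolist())  # ['Albania', 'Angola', 'Brazil']
print(restored.index.tolist())    # [0, 1, 2]
```

Assigning the result to a new name, as done here, is an alternative to `inplace=True` that makes each intermediate state easy to inspect.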

# Label-based lookup: the row whose index label is 'Albania'
df.loc['Albania']

# Position-based lookup: the second row (positions start at 0)
df.iloc[1]
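The difference between `.loc[]` and `.iloc[]` is easiest to see when the row order changes. The sketch below uses a toy DataFrame with placeholder values: the label lookup keeps finding the same row after sorting, while the positional lookup returns whichever row now sits at that position.

```python
import pandas as pd

# Toy frame indexed by country; the values are placeholders.
toy = pd.DataFrame(
    {"Value": [30, 10, 20]},
    index=pd.Index(["Albania", "Angola", "Brazil"], name="Country"),
)

# In the original order, label "Angola" and position 1 are the same row.
print(toy.loc["Angola", "Value"], toy.iloc[1]["Value"])

# Sorting by value changes the row order: Angola, Brazil, Albania.
reordered = toy.sort_values("Value")

# The label still finds Angola; position 1 now holds Brazil.
print(reordered.loc["Angola", "Value"])
print(reordered.iloc[1].name)
```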

# Return to the default index before building a multi-level one
df.reset_index(inplace=True)

# Two-level index: Continent is the outer level, Country the inner level
df.set_index(['Continent', 'Country'], inplace=True)

# Returns a sorted copy for display; df itself keeps its current row order
df.sort_index()
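Because `sort_index()` is called without `inplace=True` and its result is not assigned, the DataFrame it is called on is left as it was. A minimal sketch with a toy multi-indexed frame (placeholder values, hypothetical variable names) shows the sorted copy alongside the unchanged original.

```python
import pandas as pd

# Toy frame with a two-level index; the values are placeholders.
toy = pd.DataFrame(
    {
        "Continent": ["Europe", "Africa", "Africa"],
        "Country": ["Albania", "Angola", "Algeria"],
        "Value": [1, 2, 3],
    }
).set_index(["Continent", "Country"])

# sort_index orders rows by the outer level first, then the inner level,
# and returns a new DataFrame.
sorted_toy = toy.sort_index()

print(toy.index.tolist())         # original order is untouched
print(sorted_toy.index.tolist())  # Africa rows first, countries alphabetical
```

To keep the sorted order for later cells, assign the result back (for example `toy = toy.sort_index()`).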

# Optional: raise the display limit so every row is shown
# pd.set_option('display.max_rows', 235)

# With a multi-level index, give one label per level to select a single row
df.loc['Africa', 'Angola']

# iloc ignores the index labels entirely: still the second row by position
df.iloc[1]
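The same lookups can be sketched on a small multi-indexed frame. The data below is a toy example with placeholder values. Supplying both levels to `.loc[]` returns one row, supplying only the outer level returns every row under it, and `.iloc[]` works by position regardless of how many index levels there are.

```python
import pandas as pd

# Toy frame with a sorted two-level index; the values are placeholders.
toy = pd.DataFrame(
    {"Value": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [("Africa", "Algeria"), ("Africa", "Angola"), ("Europe", "Albania")],
        names=["Continent", "Country"],
    ),
)

# Both levels given: a single row, returned as a Series.
one_row = toy.loc[("Africa", "Angola")]

# Outer level only: all rows for that continent, indexed by Country.
one_continent = toy.loc["Africa"]

# Position only: the second row, whatever its labels are.
second_row = toy.iloc[1]

print(one_row["Value"])
print(one_continent.index.tolist())
print(second_row.name)
```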