
Reading In Files

Loading data from external files is the starting point of virtually every data analysis project. Pandas reads a wide range of file formats, including CSV, TSV, JSON, and Excel (reading .xlsx files requires an optional engine such as openpyxl). Each format has its own read_* function with parameters for handling details such as delimiters, headers, data types, and encoding.

This notebook demonstrates reading CSV files with pd.read_csv(), tab-separated files with pd.read_table(), JSON files with pd.read_json(), and Excel workbooks with pd.read_excel() (including sheet selection). It also covers display configuration with pd.set_option() for controlling how many rows and columns are shown, and basic inspection methods like .info(), .shape, .head(), .tail(), column selection, .loc[], and .iloc[] for getting an initial understanding of your data.

import pandas as pd

df = pd.read_csv(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\countries of the world.csv")
df
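The cell above depends on a file at a specific local path. As a minimal sketch that runs anywhere, the same call can be tried against an in-memory string. The sample rows and the names `csv_text` and `sample` are made up for illustration and are not taken from the tutorial's dataset.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = (
    "Country,Region,Population\n"
    "Aland,North,100\n"
    "Borduria,South,250\n"
)

sample = pd.read_csv(
    io.StringIO(csv_text),          # any file-like object works in place of a path
    sep=",",                        # field delimiter (comma is already the default)
    header=0,                       # row 0 holds the column names
    dtype={"Population": "int64"},  # force a specific type for one column
)
# encoding= is omitted here because the text is already decoded;
# it matters when reading from a path or a bytes buffer.
print(sample.dtypes)
```

Passing `dtype` explicitly is optional, but it can help when pandas would otherwise infer a type you do not want.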

#df = pd.read_csv(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\countries of the world.txt", sep = '\t')
#df

df = pd.read_table(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\countries of the world.csv", sep = ',')
df
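`pd.read_table()` behaves like `pd.read_csv()` with a tab as the default separator, which is why `sep=','` is needed above to read a comma-separated file. The sketch below illustrates this with made-up, in-memory tab-separated text; `tsv_text`, `from_table`, and `from_csv` are illustrative names only.

```python
import io
import pandas as pd

# Hypothetical tab-separated data standing in for a .txt or .tsv file.
tsv_text = "Country\tPopulation\nAland\t100\nBorduria\t250\n"

# read_table splits on tabs by default...
from_table = pd.read_table(io.StringIO(tsv_text))

# ...so it matches read_csv with an explicit tab separator.
from_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(from_table.equals(from_csv))
```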

df = pd.read_json(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\json_sample.json")
df
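The structure of `json_sample.json` is not shown here, so the following is only a sketch of one common layout: a JSON array of records, where each object becomes a row and each key becomes a column. The data and the names `json_text` and `people` are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical JSON in "list of records" form.
json_text = '[{"name": "Ada", "age": 36}, {"name": "Linus", "age": 54}]'

# Wrapping the string in StringIO makes it a file-like object,
# which read_json accepts just like a path.
people = pd.read_json(io.StringIO(json_text))
print(people)
```

Deeply nested JSON usually needs extra flattening before it fits a table, which this sketch does not cover.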

df2 = pd.read_excel(r"C:\Users\alexf\OneDrive\Documents\Pandas Tutorial\world_population_excel_workbook.xlsx", sheet_name = 'Sheet1')
df2

pd.set_option('display.max_rows', 235)
pd.set_option('display.max_columns', 40)
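The documented option names use underscores: `display.max_rows` and `display.max_columns`. As a small sketch, `pd.get_option()` can confirm a setting took effect, and `pd.option_context()` can apply one temporarily. The variable names below are illustrative.

```python
import pandas as pd

# Set the display limit globally, then read it back to confirm.
pd.set_option("display.max_rows", 235)
current = pd.get_option("display.max_rows")

# option_context changes a setting only inside the with-block.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

# Outside the block the earlier global value is restored.
after = pd.get_option("display.max_rows")
print(current, inside, after)
```

`pd.reset_option("display.max_rows")` returns the option to its default if you no longer want the larger limit.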

df2.info()

df2.shape

df2.head(10)

df2.tail(10)
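To see what these inspection methods return without the Excel workbook, here is a sketch on a small made-up DataFrame. The name `toy` and its columns are hypothetical.

```python
import pandas as pd

# Hypothetical 20-row table used only to demonstrate the inspection methods.
toy = pd.DataFrame({
    "Rank": range(1, 21),
    "Score": [x * 1.5 for x in range(1, 21)],
})

toy.info()            # prints column names, non-null counts, and dtypes
print(toy.shape)      # (number of rows, number of columns)

first = toy.head(3)   # first 3 rows (head() with no argument returns 5)
last = toy.tail(3)    # last 3 rows
print(first)
print(last)
```

Note that `.shape` is an attribute, so it takes no parentheses, while `.info()`, `.head()`, and `.tail()` are methods.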

df2['Rank']

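# .loc selects by index label; set the country name as the index first (assumes a 'Country' column)
df2.set_index('Country').loc['Uzbekistan']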

df2.iloc[224]
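The difference between the two selectors is that `.loc[]` looks rows up by index label, while `.iloc[]` looks them up by integer position. A string label such as a country name only works with `.loc[]` when that value is actually in the index. The sketch below uses a made-up three-row table; the column names and values are assumptions, not the workbook's contents.

```python
import pandas as pd

# Hypothetical data; the ranks are invented for illustration.
toy = pd.DataFrame({
    "Country": ["Uruguay", "Uzbekistan", "Vanuatu"],
    "Rank": [1, 2, 3],
})

rank_col = toy["Rank"]                 # select a single column as a Series

by_position = toy.iloc[1]              # second row, by integer position

indexed = toy.set_index("Country")     # make country names the index labels
by_label = indexed.loc["Uzbekistan"]   # row whose index label is "Uzbekistan"

# With the default integer index, the string label is not present.
print("Uzbekistan" in toy.index)
print("Uzbekistan" in indexed.index)
```

With the default integer index, `.loc[1]` and `.iloc[1]` return the same row, which is why the distinction only becomes visible after the index changes.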