LM Essential Pandas Functions for Data Analysis & Plotting

🧩 1. Essential Pandas Functions for Data Analysis & Plotting

To create statistical diagrams with Matplotlib later on, students should be familiar with the following Pandas functions and workflow steps:


Topic Important Functions Meaning
Load a file pd.read_csv() Load a CSV file
Quick look df.head(), df.tail() Show the first/last rows
Information df.info() Columns, data types, missing values
Statistics df.describe() Basic statistical summary
Count values df['col'].value_counts() e.g., number of males/females
Categories of value_counts df['col'].value_counts().index Returns category names (e.g., ['male','female'])
Frequencies of value_counts df['col'].value_counts().values Returns counts (e.g., [577, 314])
Select one column df['col'] Extract a single column
Select multiple columns df[['col1', 'col2']] Extract several columns
Filter rows df[df['Age'] > 18] Select rows matching a condition
Filter with multiple conditions df[(df['Sex']=='female') & (df['Survived']==1)] Combine logical conditions
Filter by multiple values df[df['col'].isin(list)] Select rows where values are in a given list
Sort data df.sort_values('col') Sort DataFrame by column
Check missing values df.isna().sum() Count missing values per column
Drop missing rows df.dropna() Remove rows with missing values
Fill missing values df.fillna(value) Replace missing values
Create a new column df['new'] = ... Add derived/computed values
Group data df.groupby('col') Group rows by a category
Aggregate data df.groupby('col').mean() Compute statistics per group
Set x-axis limits (plots) plt.xlim(min, max) Define start and end of the x-axis

Updated: