
🛠️ Working with CSV Data

In this section, you'll learn how to read, explore, and manipulate data from a CSV file using pandas.

CSV Table 📥 Download CSV file

📥 Import CSV into a DataFrame

import pandas as pd

url = "https://geomoer.github.io/moer-base-python/assets/tests/unit07/csv_example.csv"

df = pd.read_csv(url)
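
pd.read_csv() also accepts a local path or a file-like object, which is handy when the URL is not reachable. The following is a minimal sketch using a small made-up table; the name df_demo, the column names, and the values are assumptions for illustration and are not the contents of csv_example.csv.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the downloaded file.
csv_text = """Name,Age,City
Anna,28,Marburg
Bob,35,Kassel
Alex,41,Giessen
"""

# read_csv accepts a path, a URL, or a file-like object such as StringIO.
df_demo = pd.read_csv(io.StringIO(csv_text))

print(df_demo.shape)   # (number of rows, number of columns)
print(df_demo.dtypes)  # inferred data type of each column
```

Checking shape and dtypes right after loading is a quick way to confirm that the file was parsed as expected.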

📋 Get Column Names

print(df.columns)

View Rows

df.head()      # First 5 rows
df.head(10)    # First 10 rows
df.tail(3)     # Last 3 rows

🔢 Access Specific Rows

Use .iloc[] to access rows by position. Positions start at 0, and the end of a slice is excluded:

df.iloc[0]     # First row
df.iloc[1:3]   # Second and third rows (positions 1 and 2)
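
To see how positional slicing behaves, here is a small sketch on a made-up DataFrame (df_demo and its names are illustrative assumptions, not the course data):

```python
import pandas as pd

# Hypothetical data used only to illustrate positional access.
df_demo = pd.DataFrame({"Name": ["Anna", "Bob", "Alex", "Cara", "Dan"]})

first = df_demo.iloc[0]        # Series holding the first row
middle = df_demo.iloc[1:3]     # DataFrame with positions 1 and 2 (3 is excluded)
last_two = df_demo.iloc[-2:]   # negative positions count from the end

print(first["Name"])
print(middle["Name"].tolist())
print(last_two["Name"].tolist())
```

A single position returns a Series, while a slice returns a DataFrame, even if the slice contains only one row.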

Access Values from a Column

df["Name"].values returns a NumPy array that contains only the raw data from the "Name" column, without index, formatting, or metadata.

df["Name"].values
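
As a short sketch of what this array looks like, the example below uses a made-up column (df_demo is an assumption for illustration). It uses .to_numpy(), which the pandas documentation recommends over .values for new code; both return a NumPy array here.

```python
import pandas as pd

# Hypothetical data used only for illustration.
df_demo = pd.DataFrame({"Name": ["Anna", "Bob", "Alex"]})

names = df_demo["Name"].to_numpy()   # raw values, no index attached

print(type(names).__name__)          # ndarray
print(names.tolist())                # plain Python list of the values
print(len(names))                    # number of entries in the column
```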

Search for a Value in a Row

# Take the row at position 1 (the second row) and convert it into a regular Python list.
row_list = df.iloc[1].tolist()

if "Bob" in row_list:
    print("Found Bob in the second row!")
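
Converting the row to a list matters because the in operator on a pandas Series checks the index labels, not the values. A minimal sketch with made-up data (df_demo and its columns are illustrative assumptions):

```python
import pandas as pd

# Hypothetical data used only for illustration.
df_demo = pd.DataFrame({"Name": ["Anna", "Bob"], "City": ["Marburg", "Kassel"]})

row = df_demo.iloc[1]   # Series whose index labels are the column names

in_series = "Bob" in row            # tests the labels "Name" and "City"
label_found = "Name" in row         # a column name is a label of the row
in_list = "Bob" in row.tolist()     # tests the actual values

print(in_series, label_found, in_list)
```

For a row Series, the labels are the column names, so a value lookup needs .tolist() or .values.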

Loop Through All Rows

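# Print the name stored in every row.
for index, row in df.iterrows():
    print("Row " + str(index) + ": " + str(row["Name"]))

# Print only the rows whose text representation contains "Alex".
for index, row in df.iterrows():
    if "Alex" in row.to_string():
        print("Row " + str(index) + ": " + str(row["Name"]))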
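
For larger tables, iterrows() can be slow. One possible alternative is a vectorized string search, sketched below on made-up data (df_demo is an illustrative assumption). Unlike the loop above, which searches the whole row, this version looks only at the "Name" column.

```python
import pandas as pd

# Hypothetical data used only for illustration.
df_demo = pd.DataFrame({"Name": ["Anna", "Alex", "Alexandra", "Bob"]})

# Boolean mask: True where the name contains the plain substring "Alex".
mask = df_demo["Name"].str.contains("Alex", regex=False)
matches = df_demo[mask]

# Loop only over the matching rows to print them.
for index, name in matches["Name"].items():
    print(f"Row {index}: {name}")
```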

✅ Filter Rows by Condition


# The output looks like it comes from a loop, but no loop is written.
# This is one of the most powerful features of pandas: vectorized operations.
print(df["Name"] == "Anna")
# Result:
# 0     True
# 1    False
# 2    False
# 3    False
# 4    False
# 5    False
# 6    False
# 7    False
# 8    False
# 9    False
      
if "Anna" in df["Name"].values:
    # Keep only the rows where the value in column "Name" is "Anna".
    df_anna = df[df["Name"] == "Anna"]

    print("New DataFrame with Anna only:")
    print(df_anna)
else:
    df_anna = pd.DataFrame()  # Empty DataFrame as fallback
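
Boolean masks can also be combined, and a filter that matches nothing already returns an empty DataFrame. The sketch below assumes a made-up table with an "Age" column (df_demo, the column, and the values are illustrative assumptions):

```python
import pandas as pd

# Hypothetical data used only for illustration.
df_demo = pd.DataFrame({
    "Name": ["Anna", "Bob", "Anna", "Alex"],
    "Age": [28, 35, 52, 41],
})

is_anna = df_demo["Name"] == "Anna"
over_30 = df_demo["Age"] > 30

# Combine masks with & (and) or | (or); each condition needs its own parentheses
# when written inline, for example df_demo[(a == 1) & (b > 2)].
df_anna_over_30 = df_demo[is_anna & over_30]

# A filter with no matches yields an empty DataFrame that keeps the columns.
df_nobody = df_demo[df_demo["Name"] == "Zoe"]

print(df_anna_over_30)
print(df_nobody.empty)
```

Checking .empty on the filtered result is one way to handle the "no match" case without a separate membership test.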

⚙️ Use .apply() for Custom Actions


df["FullName"] = df.apply(lambda row: row["FirstName"] + " " + row["LastName"], axis=1)

print(df[["FirstName", "LastName", "FullName"]])


# df.apply(...): Applies a function to each row of the DataFrame.
# lambda row: ...: Defines a short function that combines two values from the row.
# row["FirstName"] + " " + row["LastName"]: Concatenates the first and last name with a space in between.
# axis=1: Means the function is applied row-wise (not column-wise).
# df["FullName"] = ...: Stores the result in a new column called "FullName".
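
For simple string concatenation, the same result can usually be obtained without .apply() by adding the columns directly. The sketch below compares both approaches on made-up data (df_demo and the names are illustrative assumptions):

```python
import pandas as pd

# Hypothetical data used only for illustration.
df_demo = pd.DataFrame({
    "FirstName": ["Anna", "Bob"],
    "LastName": ["Schmidt", "Meyer"],
})

# Vectorized version: the + operator works element-wise on string columns.
df_demo["FullName"] = df_demo["FirstName"] + " " + df_demo["LastName"]

# Row-wise version with apply, for comparison.
via_apply = df_demo.apply(
    lambda row: row["FirstName"] + " " + row["LastName"], axis=1
)

print(df_demo["FullName"].tolist())
print(via_apply.tolist() == df_demo["FullName"].tolist())
```

.apply() with axis=1 remains useful when the per-row logic is too complex to express with column operations.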
