Assignment
Working with Books CSV
Please save your solutions for Exercises 1 to 3 in a single Python script named unit07__ex(1-3)code.py.
For the bonus exercise, use a separate script named unit07__ex4code.py.
Save both scripts in the same unit07 folder, compress the folder into a .zip file, and upload it to ILIAS.
For more information, please visit the following link:
https://geomoer.github.io/moer-base-python/unit00/unit00-04_submission_guidelines.html
Make sure your code is clearly structured and includes comments where helpful.
Introduction
Use the online CSV file and follow the steps to explore the dataset using pandas.
CSV URL:
https://geomoer.github.io/moer-base-python/assets/tests/unit07/books.csv
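Before starting the tasks, it can help to load the data and take a first look at it. The sketch below is only an illustration: it reads a small in-memory CSV whose rows and column set ("Title", "Author", "Genre", "Year") are assumptions made for this example, so check the real file's columns yourself. `pd.read_csv` also accepts a URL string in place of the file-like object used here.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file; the real columns may differ.
csv_text = """Title,Author,Genre,Year
The Hobbit,J.R.R. Tolkien,Fantasy,1937
Dune,Frank Herbert,Science Fiction,1965
"""

# pd.read_csv accepts a URL string, a local path, or a file-like object.
books = pd.read_csv(io.StringIO(csv_text))

print(books.head())   # first rows of the table
print(books.dtypes)   # column types as pandas inferred them
print(books.shape)    # (number of rows, number of columns)
```

Looking at `dtypes` early shows whether a column such as "Year" was read as a number or as text.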
Task 1: Count Books in the Genre "Fantasy"
- Filter the DataFrame to find all books where the "Genre" is "Fantasy".
- Count how many such books there are.
- Print the number.
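Filtering by a column value is usually done with a boolean mask. The following is a minimal sketch on made-up rows, not on the course file; the titles are placeholders chosen for illustration.

```python
import pandas as pd

# Toy data for illustration only (not the course dataset).
books = pd.DataFrame({
    "Title": ["The Hobbit", "Dune", "A Wizard of Earthsea"],
    "Genre": ["Fantasy", "Science Fiction", "Fantasy"],
})

# The comparison yields a boolean Series; indexing with it keeps the True rows.
fantasy = books[books["Genre"] == "Fantasy"]

# len() of a DataFrame is its number of rows.
fantasy_count = len(fantasy)
print(fantasy_count)
```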
Task 2: List All Books Before the Year 1950
- Filter the DataFrame to include only books with a "Year" before 1950.
- Print the titles and years of those books.
Hint: Don't forget to convert "Year" to int before comparing.
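The hint about converting "Year" matters when the column holds text, because comparing strings with a number raises an error. Here is one possible sketch, assuming the years arrive as strings; the rows are invented for the example.

```python
import pandas as pd

# Toy data: the years are deliberately stored as strings.
books = pd.DataFrame({
    "Title": ["The Hobbit", "Dune", "Nineteen Eighty-Four"],
    "Year": ["1937", "1965", "1949"],
})

# Convert the whole column to integers before comparing.
books["Year"] = books["Year"].astype(int)

# Keep rows before 1950, then select only the two columns of interest.
before_1950 = books[books["Year"] < 1950]
print(before_1950[["Title", "Year"]])
```

If the column is already numeric, `astype(int)` leaves the values unchanged, so the conversion is safe to keep either way.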
Task 3: Print All Unique Genres
- Print a list of all unique values in the "Genre" column.
- Then, print how many different genres there are in total.
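pandas offers `unique()` for the distinct values and `nunique()` for their count. A short sketch on toy data (the genres listed are placeholders):

```python
import pandas as pd

# Toy data for illustration only.
books = pd.DataFrame({
    "Genre": ["Fantasy", "Science Fiction", "Fantasy", "Science Fiction"],
})

# unique() returns the distinct values in order of first appearance.
genres = books["Genre"].unique()
print(list(genres))

# nunique() counts the distinct values (missing values are ignored by default).
n_genres = books["Genre"].nunique()
print(n_genres)
```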
Task 4: Mark Old Books with apply()
- Use the .apply() method with axis=1 to create a new column called "Old".
- If the "Year" of a book is before 1950, set "Old" to "yes"; otherwise set it to "no".
- Export the new DataFrame, including the "Old" column.
Hint: Inside your function, make sure to convert "Year" to an integer before comparing.
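With `axis=1`, `apply()` calls your function once per row and passes that row as a Series. The sketch below shows the pattern on toy data; the function name `label_old` and the output file name are hypothetical choices for this example, and CSV is only one possible export format.

```python
import os
import tempfile
import pandas as pd

# Toy data for illustration only; years are stored as strings on purpose.
books = pd.DataFrame({
    "Title": ["The Hobbit", "Dune", "Nineteen Eighty-Four"],
    "Year": ["1937", "1965", "1949"],
})

def label_old(row):
    """Return "yes" for books published before 1950, otherwise "no"."""
    return "yes" if int(row["Year"]) < 1950 else "no"

# axis=1 applies the function row by row; the results become the new column.
books["Old"] = books.apply(label_old, axis=1)
print(books)

# Hypothetical output location; index=False leaves out the row index.
out_path = os.path.join(tempfile.gettempdir(), "books_with_old.csv")
books.to_csv(out_path, index=False)
```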
Task 5 (Bonus): Titanic Dataset - Beginner Exercises (Python & Pandas)
These exercises are designed to help you practice basic data analysis with the Titanic dataset using Pandas. Use the dataset from this URL:
https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
Load it with:
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
Exercise 1: All passengers younger than 10
Filter the dataset and show all passengers whose age is below 10.
Hint: Use a boolean condition on the Age column.
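A boolean condition on a numeric column can be sketched as follows. The rows are invented for the example; only the column names "Name" and "Age" match the Titanic file. Note that rows with a missing age compare as False and are therefore left out.

```python
import pandas as pd

# Toy data for illustration only; None becomes NaN (a missing age).
passengers = pd.DataFrame({
    "Name": ["Passenger A", "Passenger B", "Passenger C", "Passenger D"],
    "Age": [4.0, 35.0, None, 9.0],
})

# NaN < 10 evaluates to False, so passengers without an age are excluded.
children = passengers[passengers["Age"] < 10]
print(children)
```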
Exercise 2: Count men and women
Find out:
- how many male passengers are in the dataset
- how many female passengers are in the dataset
Hint: Use value_counts().
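`value_counts()` returns a Series whose index holds the distinct values and whose values are the counts. A minimal sketch on made-up rows:

```python
import pandas as pd

# Toy data for illustration only.
passengers = pd.DataFrame({
    "Sex": ["male", "female", "male", "male", "female"],
})

# Counts per distinct value, sorted from most to least frequent.
sex_counts = passengers["Sex"].value_counts()
print(sex_counts)

# Individual counts can be looked up by label.
print(sex_counts["male"], sex_counts["female"])
```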
Exercise 3: Average age per ticket class
Calculate the average age for each passenger class (Pclass).
Which class had the highest average age?
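One way to approach this is `groupby()` followed by `mean()`, with `idxmax()` to find the group with the largest value. The numbers below are invented and say nothing about the real dataset.

```python
import pandas as pd

# Toy data for illustration only; one age is missing.
passengers = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3],
    "Age": [40.0, 50.0, 30.0, 20.0, None],
})

# mean() skips missing values within each group.
mean_age = passengers.groupby("Pclass")["Age"].mean()
print(mean_age)

# idxmax() returns the index label (here: the class) of the largest mean.
oldest_class = mean_age.idxmax()
print(oldest_class)
```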
Exercise 4: Survival rate by gender
Compute the survival rate for:
- male passengers
- female passengers
Hint: groupby("Sex")["Survived"].mean().
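The hint works because "Survived" is coded as 0 or 1, so the mean of the column within a group is the share of survivors. A sketch on toy rows (not real results):

```python
import pandas as pd

# Toy data for illustration only; 1 = survived, 0 = did not survive.
passengers = pd.DataFrame({
    "Sex": ["male", "male", "female", "female"],
    "Survived": [0, 1, 1, 1],
})

# The mean of a 0/1 column is the fraction of ones in each group.
survival_rate = passengers.groupby("Sex")["Survived"].mean()
print(survival_rate)
```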
Exercise 5: Passengers whose name contains "Smith"
Show all rows where the passenger's name contains the string "Smith".
Hint: Use .str.contains("Smith") on the Name column.
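`.str.contains()` produces a boolean mask that can be used like any other filter. The names below are invented for the example. The match is case-sensitive by default, so a name like "Goldsmith" does not match "Smith"; `na=False` treats missing names as non-matches.

```python
import pandas as pd

# Toy data for illustration only (names are made up).
passengers = pd.DataFrame({
    "Name": ["Smith, Mr. James", "Braund, Mr. Owen", "Goldsmith, Mrs. Emily"],
})

# Case-sensitive substring match; na=False maps missing names to False.
mask = passengers["Name"].str.contains("Smith", na=False)
smiths = passengers[mask]
print(smiths)
```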
Good luck!