It's April 15, 1912. The RMS Titanic has just sunk in the North Atlantic Ocean. As a data analyst for the White Star Line shipping company, you've been tasked with analyzing passenger data to understand survival patterns. Your findings will influence future safety regulations and procedures.
The dataset contains information about 891 passengers, including whether they survived, their class, age, fare paid, and family relationships aboard the ship.
https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
createdb titanicdb
in pgAdmin create the database titanicdb,
Run the SQL script provided in titanic_table.sql
\COPY titanic FROM 'titanic.csv' WITH CSV HEADER;
titanic table → Import/Export DataUnderstanding who was on board
Exercise 1.1 - Show all passengers (just to see the data structure)
Exercise 1.2 - Find all survivors
Exercise 1.3 - Find all male passengers
Exercise 1.4 - Find all first-class passengers
Exercise 1.5 - Find children under 10 years old
Exercise 1.6 - Find passengers who embarked at Southampton (S)
Narrowing down the search
Exercise 2.1 - Find all women who survived
Exercise 2.2 - Find all first-class male passengers
Exercise 2.3 - Find children under 18 who did NOT survive
Exercise 2.4 - Find passengers who paid more than 50 for their fare
Exercise 2.5 - Find third-class passengers between ages 20 and 40
Exercise 2.6 - Find passengers with no cabin information (cabin IS NULL)
Finding the extremes
Exercise 3.1 - Show the 10 oldest passengers
Exercise 3.2 - Show the 5 most expensive tickets
Exercise 3.3 - List all children (age < 18) from youngest to oldest
Exercise 3.4 - Show the 10 cheapest first-class tickets
Exercise 3.5 - Find the 20 youngest survivors
Selecting specific information
Exercise 4.1 - Show only names and ages of all passengers
Exercise 4.2 - Show passenger names with their survival status and class
Exercise 4.3 - For women only, show name, age, and fare paid
Exercise 4.4 - Create a passenger list showing: name, sex, age for all children
Exercise 4.5 - Show name and embarkation port for first-class passengers
Making output readable
Exercise 5.1 - Show passenger names as "Passenger" and age as "Years"
Exercise 5.2 - Show survival status as "Rescued" (1) or not (0)
Exercise 5.3 - Display class as "Ticket_Class" and fare as "Price_Paid"
Combining information
Exercise 6.1 - Create a full description: name followed by their sex in parentheses
Exercise 6.2 - Combine age and sex into one column
Exercise 6.3 - Create a passenger summary: "Name - Class X"
Exercise 6.4 - Show port names with text: "Embarked from: X"
Dealing with missing data
Exercise 7.1 - Show all ages, replacing NULL with 0
Exercise 7.2 - Show cabin numbers, replacing NULL with 'No Cabin'
Exercise 7.3 - Display age, but show 'Unknown' for missing ages
Exercise 7.4 - Create a report showing fare, but display 0.00 for NULL fares
Converting data types
Exercise 8.1 - Convert survived (integer) to text for concatenation
Exercise 8.2 - Round ages to nearest integer and cast as INTEGER
Exercise 8.3 - Convert fare to text and add '$' prefix
Exercise 8.4 - Cast passenger_id to text for string operations
Putting it all together
Exercise 9.1 - Create a survivor report showing:
Exercise 9.2 - Create a first-class passenger manifest:
Exercise 9.3 - Find families (passengers with same last name):
Exercise 9.4 - Create a children's safety report:
Exercise 9.5 - Expensive tickets analysis:
Answer these questions about the tragedy
Exercise 10.1 - Which passenger paid the highest fare?
Exercise 10.2 - How many children under 10 were in third class?
Exercise 10.3 - List all passengers with the title "Master" (young boys)
Exercise 10.4 - Find all passengers traveling alone (sibsp = 0 AND parch = 0)
Exercise 10.5 - Who were the youngest and oldest survivors?
Challenge 1 - Create a formatted passenger card:
"PASSENGER CARD
Name: [name]
Class: [class] | Sex: [sex] | Age: [age or 'Unknown']
Fare: $[fare] | Cabin: [cabin or 'Not assigned']
Status: [SURVIVED or PERISHED]"
Challenge 2 - Find potential families by identifying:
Challenge 3 - Create age groups and analyze:
To check your query results:
\d titanic in psql to see column types