Chapter 17: Quiz/Exercises
A pandas quiz and exercise set, designed like a real classroom session.
Imagine we have been learning pandas together for a while. Now it’s time to test yourself — not to “pass or fail”, but to see what you really understand, what is still a bit fuzzy, and where we should go deeper next.
I prepared 3 difficulty levels:
- Level 1 – Basics everyone should be comfortable with
- Level 2 – Common real-work patterns
- Level 3 – Slightly trickier / more realistic situations
For each question I give:
- the task
- a small dataset (so you can try immediately)
- space to think / write your code
- after you try → my detailed solution + explanation + common mistakes
You can copy-paste the data and try in your notebook. Try to write the code yourself first before looking at solutions.
Level 1 – Core basics
Q1. Select only the rows where age > 25 and city is either ‘Pune’ or ‘Mumbai’. Show only name, city, and salary columns, sorted by salary descending.
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Priya', 'Rahul', 'Ananya', 'Sneha', 'Vikram', 'Meera', 'Arjun'],
        'age': [24, 31, 19, 28, 45, 22, 33],
        'city': ['Pune', 'Hyderabad', 'Bangalore', 'Mumbai', 'Pune', 'Chennai', 'Mumbai'],
        'salary': [72000, 145000, 88000, 112000, 210000, 68000, 95000]
    })
Q2. Create a new column bonus = 12% of salary if salary > 100000, else 8% of salary. Show the result rounded to the nearest rupee.
Q3. How many people earn more than the average salary? Show their names and salaries.
Q4. Count how many people there are per city. Show the result as percentages (rounded to 1 decimal).
Level 2 – Everyday real-work tasks
Q5. Group by city and show:
- number of employees
- average salary
- highest salary
- percentage of people earning > 1 lakh
Round numbers sensibly.
Q6. Add a column salary_rank_in_city — rank within each city (1 = highest salary in that city).
Q7. Create a column experience_category:
- ‘Senior’ if joined before 2022
- ‘Mid’ if joined 2022–2023
- ‘Junior’ if joined 2024 or later
(Assume we have a join_year column already extracted)
    # If you don't have join_year yet, create it first
    df['join_year'] = [2021, 2020, 2024, 2022, 2019, 2023, 2023]  # example
Q8. Show only the top 2 earners per city. Include name, city, salary.
Level 3 – More realistic & slightly trickier
Q9. Fill missing salary values with the median salary of their city. If the city has no non-missing salaries → use overall median.
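    # Add some missing values for practice
    import numpy as np

    # NaN is a float, so convert the column first (avoids an upcasting warning in newer pandas)
    df['salary'] = df['salary'].astype(float)
    df.loc[2, 'salary'] = np.nan
    df.loc[5, 'salary'] = np.nan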
Q10. Find people whose salary is above the average of their city AND above overall average. Show name, city, salary, city_avg, overall_avg.
Q11. Create a summary table like this:
    City       | Count | Avg Salary | % >1L | Highest Earner Name | Their Salary
    -----------|-------|------------|-------|---------------------|-------------
    Pune       | ...   | ...        | ...   | ...                 | ...
    Mumbai     | ...   | ...        | ...   | ...                 | ...
    ...
Q12. You are given this extra table:
    managers = pd.DataFrame({
        'city': ['Pune', 'Mumbai', 'Hyderabad', 'Bangalore'],
        'manager_name': ['Neha', 'Rohan', 'Suresh', 'Kavita'],
        'manager_since': [2020, 2019, 2022, 2021]
    })
Join it to the main table (left join on city). Then show only employees whose manager has been there less than 3 years (as of 2025).
How to use this quiz effectively
- Try at least 4–5 questions without looking at answers
- Write your code in a notebook
- Compare with solutions below
- Note which parts felt easy / hard / confusing
- Come back and tell me:
  - Which questions were easy?
  - Which ones were difficult or surprising?
  - Did you get stuck anywhere?
Then we can go deeper into those topics.
Solutions (scroll down only after trying!)
Q1 solution
    result = df[
        (df['age'] > 25) &
        (df['city'].isin(['Pune', 'Mumbai']))
    ][['name', 'city', 'salary']].sort_values('salary', ascending=False)

    print(result)
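A common mistake in Q1 is writing Python's `and` between the two conditions, or dropping the parentheses around each comparison. The sketch below rebuilds the quiz data so it runs on its own and shows an equivalent `.loc` version. The names `mask` and `result_loc` are just illustrative choices, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Priya', 'Rahul', 'Ananya', 'Sneha', 'Vikram', 'Meera', 'Arjun'],
    'age': [24, 31, 19, 28, 45, 22, 33],
    'city': ['Pune', 'Hyderabad', 'Bangalore', 'Mumbai', 'Pune', 'Chennai', 'Mumbai'],
    'salary': [72000, 145000, 88000, 112000, 210000, 68000, 95000]
})

# The comparison needs its own parentheses because & binds tighter than >.
mask = (df['age'] > 25) & df['city'].isin(['Pune', 'Mumbai'])

# Same filter as the solution above, with rows and columns selected in one .loc call.
result_loc = (
    df.loc[mask, ['name', 'city', 'salary']]
      .sort_values('salary', ascending=False)
)
print(result_loc)

# Using `and` asks pandas for a single True/False from a whole Series,
# which raises ValueError (the truth value of a Series is ambiguous).
try:
    df[(df['age'] > 25) and (df['city'] == 'Pune')]
except ValueError as exc:
    print("ValueError:", exc)
```

Both forms should return the same three rows. The `.loc` form is often a little easier to read once the filter grows.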
Q2 solution
    import numpy as np  # needed for np.where

    df['bonus'] = np.where(
        df['salary'] > 100000,
        df['salary'] * 0.12,
        df['salary'] * 0.08
    ).round(0).astype(int)

    print(df[['name', 'salary', 'bonus']])
Q3 solution
    avg_salary = df['salary'].mean()

    high_earners = df[df['salary'] > avg_salary][['name', 'salary']]

    print(f"Average salary: ₹{avg_salary:,.0f}")
    print(f"Number above average: {len(high_earners)}")
    print(high_earners)
Q4 solution
    city_counts = df['city'].value_counts(normalize=True).mul(100).round(1)

    print("Percentage distribution by city:")
    print(city_counts)
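As a small preview of Level 2, here is one possible sketch for Q5 using named aggregation. It is only one way to do it, and it assumes the original quiz data without the missing values added for Q9. The helper column `above_1_lakh` and the output column names are my own choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Priya', 'Rahul', 'Ananya', 'Sneha', 'Vikram', 'Meera', 'Arjun'],
    'age': [24, 31, 19, 28, 45, 22, 33],
    'city': ['Pune', 'Hyderabad', 'Bangalore', 'Mumbai', 'Pune', 'Chennai', 'Mumbai'],
    'salary': [72000, 145000, 88000, 112000, 210000, 68000, 95000]
})

summary = (
    # Temporary True/False flag; its mean per group is the share earning > 1 lakh.
    df.assign(above_1_lakh=df['salary'] > 100000)
      .groupby('city')
      .agg(
          employees=('name', 'count'),
          avg_salary=('salary', 'mean'),
          max_salary=('salary', 'max'),
          pct_above_1_lakh=('above_1_lakh', 'mean'),
      )
)

# Round sensibly: whole rupees for the average, one decimal for the percentage.
summary['avg_salary'] = summary['avg_salary'].round(0)
summary['pct_above_1_lakh'] = (summary['pct_above_1_lakh'] * 100).round(1)

print(summary)
```

The trick worth remembering is that the mean of a boolean column is the fraction of True values, so no separate counting step is needed.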
(continue to Level 2 & 3 solutions in the next message if you want — or tell me which question you want explained first)
Which questions would you like to see solutions for first? Or which ones did you find hardest? Tell me — we’ll go through them together slowly. 😊
