Chapter 8: Aggregate Functions & Grouping
This is a super important chapter because now we move from just looking at individual rows to analyzing the big picture! Aggregate functions help us answer questions like:
- How many students do we have?
- What is the average age?
- Who is the youngest/oldest student?
- How many students per gender?
And GROUP BY + HAVING let us group data and filter groups — exactly like creating Excel pivot tables!
Let’s first prepare good sample data with more variety so we can see aggregates properly. Run this in your my_coaching database (copy-paste all at once):
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
TRUNCATE TABLE students; INSERT INTO students (full_name, phone, email, date_of_birth, gender, join_date, is_active, notes) VALUES ('Priya Sharma', '9876543210', 'priya.sharma@gmail.com', '2008-05-12', 'F', '2025-01-05', TRUE, 'Regular'), ('Rahul Patel', '9123456789', 'rahul.p@gmail.com', '2007-11-20', 'M', '2025-01-10', TRUE, NULL), ('Sneha Joshi', '9988776655', 'sneha.joshi@yahoo.com', '2009-03-03', 'F', '2025-02-15', TRUE, 'Extra classes'), ('Aarav Khan', '9765432109', 'aarav.khan@outlook.com', '2006-08-15', 'M', '2025-03-01', TRUE, NULL), ('Kavya Iyer', '9898989898', 'kavya.iyer@gmail.com', '2010-04-22', 'F', '2025-03-10', TRUE, NULL), ('Rohan Mehta', '9012345678', 'rohan.mehta@live.com', '2005-12-01', 'M', '2024-12-20', FALSE, 'Relocated'), ('Ananya Gupta', '9223344556', 'ananya.gupta@gmail.com', '2009-07-18', 'F', '2025-02-28', TRUE, NULL), ('Vikram Singh', NULL, 'vikram.singh@proton.me', '2007-09-09', 'M', '2025-01-25', TRUE, 'No phone yet'), ('Meera Desai', '9876541234', 'meera.desai@hotmail.com', '2008-02-14', 'F', '2025-04-01', TRUE, NULL), ('Arjun Reddy', '9345678901', 'arjun.reddy@gmail.com', '2006-06-30', 'M', '2025-02-10', TRUE, NULL); |
Now we have 10 students — 5 female, 5 male, different join dates, some inactive, etc.
1. Aggregate Functions: COUNT(), SUM(), AVG(), MIN(), MAX()
These functions summarize many rows into one single value.
| Function | What it does | Example Use Case |
|---|---|---|
| COUNT() | Counts number of rows (or non-NULL values) | Total students, how many have phone |
| SUM() | Adds up all values in a column | Total fees (if we had a fees column) |
| AVG() | Average (mean) of values | Average age of students |
| MIN() | Smallest value | Youngest student’s birth date |
| MAX() | Largest value | Oldest student’s birth date |
Important:
- Aggregates ignore NULL values (except COUNT(*))
- You can use them with WHERE to filter first
Examples – Run these one by one!
A. Total number of students
|
0 1 2 3 4 5 6 7 8 |
SELECT COUNT(*) AS 'Total Students' FROM students; -- Result: 10 |
B. Number of students who have provided phone number
|
0 1 2 3 4 5 6 7 8 |
SELECT COUNT(phone) AS 'Students with Phone' FROM students; -- Result: 9 (Vikram has NULL phone) |
C. Number of active students
|
0 1 2 3 4 5 6 7 8 9 |
SELECT COUNT(*) AS 'Active Students' FROM students WHERE is_active = TRUE; -- Result: 9 |
D. Earliest and latest join date
|
0 1 2 3 4 5 6 7 8 9 |
SELECT MIN(join_date) AS 'First Join Date', MAX(join_date) AS 'Latest Join Date' FROM students; |
E. Youngest and oldest student (by birth date)
|
0 1 2 3 4 5 6 7 8 9 |
SELECT MIN(date_of_birth) AS 'Youngest Birth Date', MAX(date_of_birth) AS 'Oldest Birth Date' FROM students; |
F. Approximate average age (2026 – birth year)
|
0 1 2 3 4 5 6 7 8 9 |
SELECT AVG(YEAR(CURDATE()) - YEAR(date_of_birth)) AS 'Average Age (approx)' FROM students; -- Result: around 18–19 years |
2. GROUP BY – Grouping Rows & Aggregating Per Group
GROUP BY lets us split data into groups and apply aggregates per group.
Syntax:
|
0 1 2 3 4 5 6 7 8 |
SELECT column_to_group_by, AGGREGATE_FUNCTION(...) FROM table_name GROUP BY column_to_group_by; |
Examples – Real reports!
A. Number of students per gender
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT gender, COUNT(*) AS 'Number of Students' FROM students GROUP BY gender; |
Result:
|
0 1 2 3 4 5 6 7 8 9 10 11 |
+--------+--------------------+ | gender | Number of Students | +--------+--------------------+ | F | 5 | | M | 5 | +--------+--------------------+ |
B. Number of students who joined per month
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT MONTHNAME(join_date) AS 'Join Month', COUNT(*) AS 'Students Joined' FROM students GROUP BY MONTH(join_date) ORDER BY MONTH(join_date); |
C. Number of active vs inactive students
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT is_active, COUNT(*) AS 'Count' FROM students GROUP BY is_active; |
D. Average birth year per gender
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT gender, AVG(YEAR(date_of_birth)) AS 'Avg Birth Year' FROM students GROUP BY gender; |
3. HAVING Clause – Filtering Groups (vs WHERE)
| Clause | What it filters | When to use |
|---|---|---|
| WHERE | Filters individual rows before grouping | Filter before aggregation |
| HAVING | Filters groups after grouping | Filter based on aggregate results (COUNT, AVG, etc.) |
Important rule: WHERE comes before GROUP BY HAVING comes after GROUP BY
Example – Wrong vs Right
Wrong (won’t work):
|
0 1 2 3 4 5 6 7 8 9 |
SELECT gender, COUNT(*) FROM students WHERE COUNT(*) > 4 GROUP BY gender; |
Correct:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT gender, COUNT(*) AS student_count FROM students GROUP BY gender HAVING student_count > 4; -- Shows only genders with more than 4 students (both F & M will show in our case) |
Real example – Genders with at least 5 students
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT gender, COUNT(*) AS student_count FROM students GROUP BY gender HAVING COUNT(*) >= 5; |
Another example – Join months with more than 2 students
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT MONTHNAME(join_date) AS join_month, COUNT(*) AS students_joined FROM students GROUP BY MONTH(join_date) HAVING students_joined > 2; |
4. ROLLUP – Adding Subtotal & Grand Total Rows
ROLLUP adds summary rows automatically — super useful for reports!
Syntax:
|
0 1 2 3 4 5 6 |
GROUP BY column WITH ROLLUP; |
Example – Students per gender with totals
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT gender, COUNT(*) AS student_count FROM students GROUP BY gender WITH ROLLUP; |
Result (beautiful!):
|
0 1 2 3 4 5 6 7 8 9 10 11 12 |
+--------+---------------+ | gender | student_count | +--------+---------------+ | F | 5 | | M | 5 | | NULL | 10 | ← Grand Total +--------+---------------+ |
Multiple levels (e.g., per gender and per active status):
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT gender, is_active, COUNT(*) AS count FROM students GROUP BY gender, is_active WITH ROLLUP; |
You’ll see:
- Rows for each gender + active combination
- Subtotal per gender
- Grand total at the bottom
That’s it for Chapter 8! 🎉 You can now create professional-looking reports with counts, averages, groups, and subtotals!
Homework for today (do it right now – very important!) Run these 5 queries and paste the output of each here:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
-- 1. Total number of students and average age (approx) SELECT COUNT(*) AS total_students, AVG(YEAR(CURDATE()) - YEAR(date_of_birth)) AS avg_age FROM students; -- 2. Number of students per gender SELECT gender, COUNT(*) AS count FROM students GROUP BY gender; -- 3. Join months with more than 1 student SELECT MONTHNAME(join_date) AS month, COUNT(*) AS students FROM students GROUP BY MONTH(join_date) HAVING students > 1 ORDER BY MONTH(join_date); -- 4. Genders with at least 4 students SELECT gender, COUNT(*) AS count FROM students GROUP BY gender HAVING count >= 4; -- 5. Students per gender with grand total (using ROLLUP) SELECT gender, COUNT(*) AS count FROM students GROUP BY gender WITH ROLLUP; |
