Chapter 13: MongoDB Aggregation $group
1. What does $group actually do? (Simple teacher explanation)
$group collects documents that have the same value(s) in one or more fields (the grouping key), and then lets you calculate something about each group.
Think of it like:
- SQL: GROUP BY city + COUNT(*) + AVG(salary) + SUM(revenue)
- Excel: Pivot Table
- Real life: “Group all Valentine’s wishes by who sent them, then count how many each person sent and find the average rating per sender”
Core idea in one sentence:
$group partitions the documents into buckets (groups) based on the _id expression you give it, and then computes accumulators (sum, avg, count, min, max, push, etc.) inside each bucket.
2. Basic Syntax Pattern (Write this in your notebook!)
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 |
{ $group: { _id: <grouping key expression>, // ← mandatory — what to group by fieldName1: { <accumulator>: <expression> }, fieldName2: { <accumulator>: <expression> }, // ... as many as you want } } |
- _id is required — this becomes the group identifier
- You can name the output fields anything you want (they become fields in the result documents)
- Accumulators almost always start with $ (except a few special cases)
3. Most Important Accumulators (The Tools Inside $group)
| Accumulator | What it computes | Input type | Typical output example | Very common use case |
|---|---|---|---|---|
| $sum | Sum of values | number | total revenue | revenue, likes, points, quantity |
| $avg | Average (arithmetic mean) | number | average rating | scores, prices, response time |
| $min / $max | Smallest / largest value | number, date, string | earliest join date, highest score | min/max price, date ranges |
| $first / $last | First / last value per group (in sort order) | any | most recent comment | latest entry per group |
| $push | Build array of all values | any | list of product IDs | collect all items, tags, users |
| $addToSet | Build array of unique values | any | unique categories | distinct values per group |
| $count | Count documents in group (MongoDB 5.0+) | — | number of orders | cleaner than $sum: 1 |
| $stdDevPop / $stdDevSamp | Population / sample standard deviation | number | rating variability | statistics, quality control |
4. Hands-on Examples – Let’s Build Real Pipelines
Use the same movieAnalytics2026 database from last class:
|
0 1 2 3 4 5 6 7 8 9 |
use movieAnalytics2026 // Quick reminder of data (you can re-insert if needed) db.movies.find().pretty() |
Example 1: Simplest — Count movies per year
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
db.movies.aggregate([ { $group: { _id: "$year", // group by year movieCount: { $sum: 1 }, // count documents in each group titles: { $push: "$title" } // collect all movie titles } }, { $sort: { _id: -1 } } // newest year first ]) |
Result might look like:
|
0 1 2 3 4 5 6 7 8 9 |
{ "_id": 2025, "movieCount": 1, "titles": ["Pushpa 2"] } { "_id": 2024, "movieCount": 1, "titles": ["Kalki 2898 AD"] } { "_id": 2023, "movieCount": 1, "titles": ["Oppenheimer"] } { "_id": 2022, "movieCount": 1, "titles": ["RRR"] } |
Example 2: Real business case — Revenue & average rating per country
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
db.movies.aggregate([ { $match: { revenue: { $exists: true } } }, // filter early { $group: { _id: "$country", totalRevenueCr: { $sum: "$revenue" }, avgRating: { $avg: "$rating" }, movieCount: { $sum: 1 }, highestRated: { $max: "$rating" }, bestMovie: { $max: { rating: "$rating", title: "$title" } } // trick to get document with max } }, { $project: { country: "$_id", totalRevenueCr: 1, averageRating: { $round: ["$avgRating", 1] }, movies: "$movieCount", topRating: "$highestRated", bestMovieTitle: "$bestMovie.title", _id: 0 } }, { $sort: { totalRevenueCr: -1 } } ]) |
Example 3: Group by multiple fields (composite key)
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
db.movies.aggregate([ { $unwind: "$genres" }, // important: explode array first { $group: { _id: { country: "$country", genre: "$genres" }, count: { $sum: 1 }, avgRating: { $avg: "$rating" } } }, { $sort: { "avgRating": -1 } } ]) |
→ Groups like {India, Action}, {USA, Drama}, etc.
Example 4: Collect unique values + count per sender (Valentine style)
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
// Assume we have a wishes collection db.wishes.aggregate([ { $group: { _id: "$from", wishCount: { $sum: 1 }, toPeople: { $addToSet: "$to" }, // unique recipients totalLikes: { $sum: "$likes" }, avgRating: { $avg: "$rating" } } }, { $sort: { wishCount: -1 } } ]) |
5. Quick Cheat Sheet Table – Your $group Reference Card
| Goal | _id expression | Useful accumulators | Typical preceding stage |
|---|---|---|---|
| Count per category | “$category” | $sum: 1 or $count: {} | $match |
| Total & average per group | “$region” | $sum: “$amount”, $avg: “$score” | $match |
| Collect all items | “$userId” | $push: “$productId” | $unwind if array |
| Collect distinct items | “$department” | $addToSet: “$employeeId” | — |
| Get latest document per group | “$sensorId” | $max: “$timestamp”, $last: “$$ROOT” | $sort before group |
| Group by year-month (date) | { $dateToString: { format: “%Y-%m”, date: “$createdAt” } } | — | — |
6. Mini Exercise – Try These Right Now!
- Count how many movies per genre (remember $unwind first)
- Find total revenue and movie count per year
- For each country, collect the list of movie titles (use $push or $addToSet)
- Find the average rating per country — round to 2 decimals
Understood beta? $group is the heart of analytics in MongoDB — once you get comfortable with it, you can answer almost any “how many”, “average”, “total per …” question directly in the database.
Next class options:
- How to get top N per group (with $sort + $group tricks)?
- $lookup + $group together (join then summarize)?
- Date grouping patterns (by day/week/month)?
- Or a small dashboard-style project?
Tell me what you want next — class is yours! 🚀❤️
Any part of $group confusing? Ask freely — we’ll practice more examples together 😄
