Chapter 12: MongoDB Aggregation Pipelines

MongoDB Aggregation Pipelines. ❤️📊📈

This is not just another query. This is where MongoDB becomes a full-fledged data transformation and analytics engine inside the database, so there is no need to pull millions of documents into your app and process them in Node.js/Python.

1. What Exactly is an Aggregation Pipeline? (Big Picture First)

Imagine your collection is a factory production line:

  • Raw materials = documents coming in
  • Each stage = one machine that does one specific job (filter, group, reshape, sort, join, calculate…)
  • The output of one machine → input to the next machine
  • Final product = summarized, transformed, enriched data

In MongoDB terms:

  • Order matters a lot (usually start with $match to filter early)
  • Each stage processes documents one by one or in groups
  • Very efficient: MongoDB can use indexes in early stages (another reason to put $match first)
  • Can write results to a collection ($out, $merge), return a cursor, or just show results
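
As a rough sketch of the points above, a pipeline is just an array of stage objects passed to aggregate(). The orders collection and its status, customerId and amount fields are hypothetical, chosen only for illustration:

```javascript
// General shape of a pipeline: an ordered array of stage objects.
// Collection and field names here are hypothetical.
const spendPerCustomer = [
  // Stage 1: filter early so later stages see fewer documents
  { $match: { status: "completed" } },
  // Stage 2: one output document per customer, with the summed amount
  { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
  // Stage 3: biggest spenders first
  { $sort: { totalSpent: -1 } },
];

// `db` exists only inside mongosh; the guard lets this file load in plain Node too.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(spendPerCustomer).toArray());
}
```

Swapping the first two stages would still return the same totals only if the filter could be applied after grouping, which it usually cannot, so stage order is part of the logic and not just a performance detail.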

Think of it as MongoDB's version of SQL GROUP BY + JOIN + HAVING + SELECT + ORDER BY + LIMIT, but far more flexible and composable.

2. Most Important Stages (The Building Blocks – Learn These First!)

Here are the ten stages (counting $limit/$skip and $out/$merge as pairs) that you'll use most of the time in real projects:

| Stage | Purpose (one-liner) | SQL equivalent (approx.) | When to use it (real life) | Typical position in pipeline |
|---|---|---|---|---|
| $match | Filter documents (like find()) | WHERE | Early: reduce data fast, use indexes | Usually stage 1 |
| $group | Group and calculate aggregates (sum, avg, count…) | GROUP BY + aggregates | Core analytics: totals, averages per category | After $match |
| $project | Reshape documents (include/exclude/rename/new fields) | SELECT specific columns + expressions | Clean output, create computed fields | Near end |
| $sort | Sort documents | ORDER BY | Present results nicely | After group/project |
| $limit / $skip | Paginate results | LIMIT / OFFSET | Show page 3 of 10 results | End of pipeline |
| $lookup | Join with another collection (left outer join) | LEFT JOIN | Enrich data: add user info to orders | After $match |
| $unwind | Deconstruct array → one doc per array element | (none) | Handle arrays before grouping | Before $group on arrays |
| $addFields | Add new computed fields (like $project but keeps all) | (none) | Add calculated columns | Mid/late |
| $out / $merge | Write results to new/existing collection | CREATE TABLE AS SELECT | Materialized views, reporting tables | Last stage |
| $count | Count documents (after previous stages) | COUNT(*) | Quick total after filters | End |
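
The difference between $addFields and $project is easy to blur, so here is a small side-by-side sketch. It assumes hypothetical documents with item, price and quantity fields; nothing about these names comes from a real schema:

```javascript
// Hypothetical input documents: { item, price, quantity, internalNote }

// $addFields keeps every existing field and adds `total` on top.
const withTotal = [
  { $addFields: { total: { $multiply: ["$price", "$quantity"] } } },
];

// $project emits only the fields it lists (here `_id` is explicitly excluded),
// so `price`, `quantity` and `internalNote` disappear from the output.
const trimmed = [
  {
    $project: {
      _id: 0,
      item: 1,
      total: { $multiply: ["$price", "$quantity"] },
    },
  },
];

// Runs only inside mongosh, where `db` is predefined.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(withTotal).toArray());
  printjson(db.orders.aggregate(trimmed).toArray());
}
```

A common pattern is to use $addFields in the middle of a pipeline while fields are still needed, and a single $project near the end to shape the final output.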

3. Hands-on Example – Real Mini Project (Movie Reviews Analytics)

Let's create a small dataset in mongosh (you can copy-paste):
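
A minimal sketch of such a dataset follows. The titles and numbers are made-up placeholders, and the field names (title, year, country, genres, rating, revenue) are assumptions chosen to match the pipelines and exercises in this chapter; revenue is taken to be in millions:

```javascript
// Hypothetical sample documents; every value here is invented for practice.
const sampleMovies = [
  { title: "Movie A", year: 2021, country: "India", genres: ["Drama", "Action"], rating: 8.1, revenue: 120 },
  { title: "Movie B", year: 2021, country: "USA", genres: ["Comedy"], rating: 6.9, revenue: 310 },
  { title: "Movie C", year: 2022, country: "India", genres: ["Action", "Thriller", "Drama"], rating: 7.4, revenue: 95 },
  { title: "Movie D", year: 2023, country: "Japan", genres: ["Animation", "Drama"], rating: 8.6, revenue: 180 },
  { title: "Movie E", year: 2023, country: "USA", genres: ["Action", "Comedy"], rating: 7.0, revenue: 260 },
];

// `db` exists only inside mongosh; the guard lets this file load in plain Node too.
if (typeof db !== "undefined") {
  db.movies.insertMany(sampleMovies);
}
```

Keeping genres as an array (rather than a single string) is what makes $unwind and $size relevant later on.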

Now let's build pipelines step by step.

Pipeline 1: Average rating per genre (basic group)

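
One possible version of this pipeline, assuming a movies collection whose documents carry a genres array and a numeric rating (both field names are assumptions). Because genres is an array, the sketch unwinds it first so that each movie counts once for every genre it belongs to:

```javascript
// Average rating per genre: $unwind -> $group -> $sort.
// Field names (genres, rating) and output names (avgRating, movieCount) are assumptions.
const avgRatingPerGenre = [
  // One document per (movie, genre) pair
  { $unwind: "$genres" },
  // Group those pairs by genre and compute the aggregates
  {
    $group: {
      _id: "$genres",
      avgRating: { $avg: "$rating" },
      movieCount: { $sum: 1 },
    },
  },
  // Best-rated genres first
  { $sort: { avgRating: -1 } },
];

// Runs only inside mongosh, where `db` is predefined.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(avgRatingPerGenre).toArray());
}
```

If genre were stored as a single string instead of an array, the $unwind stage could simply be dropped.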

Output shape: one document per genre, with the genre as _id and the computed average rating alongside it.

Pipeline 2: Top countries by revenue + join user comments count

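
A hedged sketch of how this could look. It assumes a second, hypothetical comments collection in which each comment stores the _id of its movie in a movieId field; the movies fields country and revenue are assumptions as well:

```javascript
// Top countries by revenue, enriched with a comment count via $lookup.
// `comments` and `movieId` are hypothetical names, not a known schema.
const topCountriesByRevenue = [
  // Left outer join: attach each movie's comments as an array (empty if none)
  {
    $lookup: {
      from: "comments",
      localField: "_id",
      foreignField: "movieId",
      as: "comments",
    },
  },
  // Replace the joined array with just its length
  { $addFields: { commentCount: { $size: "$comments" } } },
  // Roll up per country
  {
    $group: {
      _id: "$country",
      totalRevenue: { $sum: "$revenue" },
      totalComments: { $sum: "$commentCount" },
      movieCount: { $sum: 1 },
    },
  },
  { $sort: { totalRevenue: -1 } },
  { $limit: 5 },
];

// Runs only inside mongosh, where `db` is predefined.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(topCountriesByRevenue).toArray());
}
```

Counting with $size right after the join avoids an $unwind, which would otherwise multiply each movie's revenue by its number of comments before the $group.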

4. Quick Reference Table – Your Pipeline Cheat Sheet

| Goal | Typical stage order | Key operators inside |
|---|---|---|
| Filter + group + sort | $match → $group → $sort | $avg, $sum, $count |
| Join + enrich | $match → $lookup → $unwind → $project | $lookup, $arrayElemAt |
| Paginated top 10 | $match → $group → $sort → $skip → $limit | $sort: { field: -1 } |
| Materialized report | … → $merge: { into: "reports" } | $merge / $out |
| Handle arrays before aggregate | $unwind → $group | $unwind (preserveNullAndEmptyArrays) |
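
The pagination row is mostly arithmetic, so a tiny helper can make it concrete. The function name pageStages and the revenue field are hypothetical; this is one way to build the $skip and $limit stages, not the only one:

```javascript
// Build the $skip/$limit pair for a 1-based page number.
// `pageStages` is a hypothetical helper, not part of MongoDB.
function pageStages(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return [{ $skip: (page - 1) * pageSize }, { $limit: pageSize }];
}

// Page 3 with 10 results per page: skip the first 20, return the next 10.
const topMoviesPage3 = [
  // The _id tie-breaker keeps page boundaries stable when revenues are equal
  { $sort: { revenue: -1, _id: 1 } },
  ...pageStages(3, 10),
];

// Runs only inside mongosh, where `db` is predefined.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(topMoviesPage3).toArray());
}
```

$skip has to come before $limit here; reversing them would first cut the result to 10 documents and then skip past all of them.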

5. Mini Exercise – Try These in mongosh Right Now!

  1. Find average rating per country
  2. Count movies per year (add { $group: { _id: "$year", count: { $sum: 1 } } })
  3. Find movies with more than 2 genres (use { $size: "$genres" } inside $expr; the query-operator form of $size only matches an exact length)
  4. Build a pipeline that shows only Indian movies sorted by revenue descending
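
If you get stuck, here is a hedged starting point for exercises 3 and 4. It assumes the movies documents have a genres array, a country string stored as "India", and a numeric revenue; adjust the names to your own data:

```javascript
// Exercise 3: movies with more than 2 genres.
// The aggregation form of $size returns the array length, so it can be
// compared inside $expr. $ifNull guards documents that have no genres field.
const moreThanTwoGenres = [
  {
    $match: {
      $expr: { $gt: [{ $size: { $ifNull: ["$genres", []] } }, 2] },
    },
  },
];

// Exercise 4: only Indian movies, highest revenue first.
const indianByRevenue = [
  { $match: { country: "India" } },
  { $sort: { revenue: -1 } },
];

// Runs only inside mongosh, where `db` is predefined.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(moreThanTwoGenres).toArray());
  printjson(db.movies.aggregate(indianByRevenue).toArray());
}
```

Exercises 1 and 2 follow the same $group pattern as the per-genre pipeline, just with a different _id.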

Got it? Aggregation pipelines are what separate beginners from pros: once you master them, you can do almost anything with data inside MongoDB.

Next class: what excites you?

  • Deep dive into $lookup (joins) with examples?
  • $unwind + $group patterns for arrays?
  • Atlas Search inside aggregation (vector + full-text)?
  • Performance tips: indexes, $match early, explain plans?
  • Or a small real project (sales dashboard, user analytics)?

Tell me, class is ready for the next level! 🚀❤️

Any confusion so far? Ask anything, we're in this together! 😄
