Chapter 17: MongoDB Aggregation $match
1. What does $match actually do? (Plain teacher language)
$match is the filter stage inside the aggregation pipeline.
It works exactly like the query document you pass to find() or updateMany() — but inside the pipeline.
In other words:
$match keeps only the documents that satisfy the condition(s) you write and discards all others before they reach the next (usually more expensive) stages.
Most important sentence of the day (write it in capital letters):
Always put $match as early as possible in the pipeline — preferably as the very first stage — whenever you can.
Why? Because filtering early means:
- fewer documents go through expensive stages like $group, $lookup, $unwind, $sort
- a $match at the start of the pipeline can use indexes (huge speed difference)
- less memory & CPU used
- queries can finish in milliseconds instead of seconds or minutes on large collections
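One way to see the effect of stage order is to write the same question as two pipelines and compare their plans. The sketch below is only an illustration: it assumes the `movies` collection used later in this chapter has `country`, `year` and `rating` fields, the index on `country` is an assumption, and the variable names are made up for this example. The `db` calls run only inside mongosh.

```javascript
// Early filter: only Indian movies ever reach $group.
const matchFirst = [
  { $match: { country: "India" } },
  { $group: { _id: "$year", avgRating: { $avg: "$rating" } } }
];

// Late filter: every movie is grouped, then most groups are thrown away.
const matchLast = [
  { $group: { _id: { country: "$country", year: "$year" }, avgRating: { $avg: "$rating" } } },
  { $match: { "_id.country": "India" } }
];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  db.movies.createIndex({ country: 1 }); // assumed index, for illustration only
  printjson(db.movies.explain("executionStats").aggregate(matchFirst));
  printjson(db.movies.explain("executionStats").aggregate(matchLast));
}
```

In the explain output, look for IXSCAN versus COLLSCAN and compare totalDocsExamined. The optimizer may rewrite some pipelines on its own, so read the plan rather than assume.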
2. Syntax — Identical to find() queries

    {
      $match: {
        field: value,
        "nested.field": { $gt: 100 },
        $and: [ … ],
        $or: [ … ],
        // … almost any query operator you already know
      }
    }

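→ Almost every query operator you learned earlier ($eq, $gt, $in, $exists, $regex, $elemMatch, $and/$or, etc.) works exactly the same inside $match. The exceptions: $where, $near and $nearSphere are not allowed, and a $text search is allowed only when $match is the first stage.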
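One detail that follows from sharing the find() syntax: $match compares a field against a constant, so comparing two fields of the same document needs an aggregation expression wrapped in $expr. A minimal sketch, assuming each movie has a numeric `revenue` and a hypothetical `budget` field (not part of the collection shown in this chapter):

```javascript
// Keep Indian movies whose revenue is more than double their budget.
// `budget` is a hypothetical field used only for this illustration.
const profitable = [
  {
    $match: {
      country: "India", // ordinary query condition
      // field-to-field comparison, written as an aggregation expression
      $expr: { $gt: ["$revenue", { $multiply: ["$budget", 2] }] }
    }
  }
];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(profitable).toArray());
}
```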
3. Hands-on Examples — Using Our Movie Collection

    use movieAnalytics2026

Example 1: Simplest — only Indian movies

    db.movies.aggregate([
      { $match: { country: "India" } },
      { $project: { title: 1, rating: 1, year: 1, _id: 0 } }
    ])

→ Only RRR, Kalki, Pushpa 2 remain — Oppenheimer is filtered out before anything else happens.
Example 2: Multiple conditions + comparison operators (very common)

    db.movies.aggregate([
      {
        $match: {
          $and: [
            { year: { $gte: 2022 } },
            { rating: { $gte: 8.0 } },
            { revenue: { $exists: true, $gt: 500 } }
          ]
        }
      },
      { $sort: { rating: -1 } },
      { $limit: 3 }
    ])

→ Only recent, well-rated, high-revenue movies go forward.
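The filter in Example 2 can also be written more compactly. Conditions on different fields inside one query document are ANDed implicitly, and `{ $gt: 500 }` never matches a document that lacks the field, so the explicit $and and $exists are optional. A sketch of the shorter form (the variable name is made up for this example):

```javascript
// Same filter as Example 2, using implicit AND.
// { $gt: 500 } already excludes documents without `revenue`.
const recentHits = [
  { $match: { year: { $gte: 2022 }, rating: { $gte: 8.0 }, revenue: { $gt: 500 } } },
  { $sort: { rating: -1 } },
  { $limit: 3 }
];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(db.movies.aggregate(recentHits).toArray());
}
```

The explicit $and form is still needed when the same field or operator key would otherwise appear twice in one object.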
Example 3: Array conditions + $elemMatch (very powerful)

    db.movies.aggregate([
      {
        $match: {
          // Both genre conditions sit under ONE key: repeating the "genres" key
          // in a JavaScript object silently keeps only the last one.
          genres: { $eq: "Action", $ne: "Romance" }, // contains "Action", does NOT contain "Romance"
          "comments.user": "Rahul"                    // at least one comment by Rahul
        }
      }
    ])

Better (more precise with $elemMatch):

    {
      $match: {
        genres: { $all: ["Action", "Drama"] }, // must have BOTH
        comments: {
          $elemMatch: {
            user: "Rahul",
            text: { $regex: "epic", $options: "i" }
          }
        }
      }
    }

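Why $elemMatch is "more precise" is easiest to see on a single document. The sketch below is plain JavaScript that mimics the two matching rules in memory; it is not how MongoDB implements them, and the sample movie and function names are invented for the illustration.

```javascript
// One movie: Rahul commented, and a different user wrote "epic".
const movie = {
  title: "Sample",
  comments: [
    { user: "Rahul", text: "Too long for me" },
    { user: "Priya", text: "Epic climax!" }
  ]
};

// Dot notation: each condition may be satisfied by a DIFFERENT array element.
// Mirrors { "comments.user": "Rahul", "comments.text": /epic/i }
function matchesDotNotation(doc) {
  return doc.comments.some(c => c.user === "Rahul") &&
         doc.comments.some(c => /epic/i.test(c.text));
}

// $elemMatch: ONE element must satisfy all conditions together.
// Mirrors { comments: { $elemMatch: { user: "Rahul", text: /epic/i } } }
function matchesElemMatch(doc) {
  return doc.comments.some(c => c.user === "Rahul" && /epic/i.test(c.text));
}

console.log(matchesDotNotation(movie)); // true
console.log(matchesElemMatch(movie));   // false
```

So if the goal is "Rahul himself called it epic", only the $elemMatch form expresses it.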
Example 4: $match after $unwind (when you must filter later)
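
    db.movies.aggregate([
      // Pre-filter: only post-2020 movies whose genres array contains "Sci-Fi"
      { $match: { genres: "Sci-Fi", year: { $gte: 2020 } } },
      { $unwind: "$genres" },            // explode array: one document per genre
      { $match: { genres: "Sci-Fi" } },  // only keep Sci-Fi entries
      {
        $group: {
          _id: "$title",
          genresFound: { $addToSet: "$genres" }
        }
      }
    ])

→ Here the second $match has to come after $unwind: only then does each genre sit in its own document, so the non-Sci-Fi entries can be dropped. The first $match still runs before $unwind, because matching a value against an array field already selects the documents that contain it.
Teacher rule: To pick documents by what an array contains → $match directly on the array field, no $unwind needed. $match after $unwind only when you want to discard the individual elements that don't qualify.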
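The cost of filtering before versus after $unwind can be checked on a few in-memory documents. This is a plain JavaScript simulation with invented sample data and helper names, meant only to show the document counts; it does not call MongoDB.

```javascript
const movies = [
  { title: "A", year: 2021, genres: ["Sci-Fi", "Action"] },
  { title: "B", year: 2019, genres: ["Sci-Fi", "Drama"] },
  { title: "C", year: 2022, genres: ["Romance", "Drama", "Comedy"] }
];

// Simplified stand-in for $unwind: one output document per array element.
const unwindGenres = docs =>
  docs.flatMap(d => d.genres.map(g => ({ ...d, genres: g })));

// Stand-in for { $match: { genres: "Sci-Fi", year: { $gte: 2020 } } }.
// Accepts both the array form (before unwind) and the string form (after).
const isRecentSciFi = d =>
  d.year >= 2020 && [].concat(d.genres).includes("Sci-Fi");

// Unwind everything first, filter afterwards.
const unwoundAll = unwindGenres(movies);
const late = unwoundAll.filter(isRecentSciFi);

// Filter first, unwind only the survivors, then drop their other genres.
const unwoundFew = unwindGenres(movies.filter(isRecentSciFi));
const early = unwoundFew.filter(isRecentSciFi);

console.log(unwoundAll.length, unwoundFew.length);           // 7 2
console.log(JSON.stringify(late) === JSON.stringify(early)); // true
```

Both orders give the same final result, but the pre-filtered version pushes 2 documents through the unwind step instead of 7.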
4. Performance Golden Rules — Write These Down!
| Rule # | Rule Text | Why it matters (2026 reality) |
|---|---|---|
| 1 | Put $match as stage 1 whenever possible | Uses indexes → can reduce 1M docs → 100 docs instantly |
| 2 | Use indexed fields in $match | Index scan vs collection scan (often orders of magnitude faster on large collections) |
| 3 | Filter on stored fields first; keep $match on computed / $lookup fields for later | A $match on computed or joined fields cannot use an index → slow |
| 4 | $match + $sort + $limit together = optimized top-K query | When $limit directly follows $sort, MongoDB only keeps the top K documents in memory while sorting |
| 5 | If you have to $unwind → $match before it whenever you can, and after it only when necessary | $unwind explodes data, so filter before if possible |
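Rules 1, 2 and 4 combine naturally into one pipeline. The sketch below assumes the `movies` collection with `country` and `rating` fields; the compound index is an assumption chosen to put the equality field first and the sort field second, and the variable name is made up for this example.

```javascript
// Top 3 Indian movies by rating: filter, sort, then cut.
const topIndian = [
  { $match: { country: "India" } },
  { $sort: { rating: -1 } },
  { $limit: 3 }
];

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  // Assumed compound index: equality field first, sort field second.
  db.movies.createIndex({ country: 1, rating: -1 });
  printjson(db.movies.explain("executionStats").aggregate(topIndian));
}
```

With such an index in place, the explain output is the place to confirm whether the sort is served by the index or done in memory.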
5. Quick Cheat Sheet Table — Your $match Reference
| Goal | Typical $match filter | Can use index? | Position preference |
|---|---|---|---|
| Only recent documents | { createdAt: { $gte: ISODate("2026-01-01") } } | Yes | Stage 1 |
| High-rated Indian movies | { country: "India", rating: { $gte: 8 } } | Yes (compound) | Stage 1 |
| Documents that have a field | { revenue: { $exists: true } } | Sometimes | Stage 1 |
| Filter after exploding array | { $unwind: "$tags" } → { tags: "urgent" } | No (post-unwind) | After $unwind |
| Complex OR conditions | { $or: [ { status: "active" }, { status: "pending" } ] } | Yes (if indexed) | Stage 1 |
| Regex search early (avoid if possible) | { title: { $regex: "^Pushpa", $options: "i" } } | Rarely | Only if no better way |
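A small note on the OR row: when every branch of the $or tests the same field for equality, $in expresses the same filter more compactly. The two stages below are meant to be equivalent; the `status` values come from the table, and the variable names are invented for this sketch.

```javascript
// Two ways to write "status is active or pending".
const withOr = { $match: { $or: [{ status: "active" }, { status: "pending" }] } };
const withIn = { $match: { status: { $in: ["active", "pending"] } } };

// Collect the values each form accepts, to show they cover the same set.
const orValues = withOr.$match.$or.map(cond => cond.status);
const inValues = withIn.$match.status.$in;

console.log(JSON.stringify(orValues) === JSON.stringify(inValues)); // true
```

Keep $or for cases where the branches test different fields.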
6. Mini Exercise — Try Right Now!
- Show only movies from India with rating ≥ 8.0 (use $match first)
- Count how many movies have “Action” in genres (match → count; no $unwind needed, because $match on an array field already checks its elements)
- Get top 3 highest rated movies released after 2023 (match → sort → limit)
- Compare performance: run a pipeline with $match first vs without (feel the difference if you have more data)
Understood beta? $match is the gatekeeper of your pipeline — let only the important documents pass through early, and everything downstream becomes faster, cheaper, and happier.
Next class — what shall we do?
- Why $match + $sort + $limit is magical (covered briefly above)
- $match inside $lookup pipeline (joined filtering)
- Common mistakes with $match and how to fix them
- Or continue building our “Movie Dashboard” with all stages so far?
Tell me — class is ready for the next step! 🚀❤️
Any doubt about $match? Ask freely — we’ll debug it together 😄
