Chapter 20: MongoDB Aggregation $lookup
1. What is $lookup? (Big picture – very clear explanation)
$lookup is MongoDB’s version of SQL LEFT OUTER JOIN (with some extra superpowers).
It lets you:
For each document in the source collection (the left side), look up matching documents from another collection (the right side), and attach those matching documents as an array field in the output.
Key points teacher repeats three times:
- It is always a left outer join — every document from the source collection survives (even if no matches → you get empty array [])
- The joined data comes as an array (even if only one match — it’s still [ {…} ])
- It can be very expensive if not used carefully → indexes, small result sets, early $match are your best friends
- Introduced in MongoDB 3.2 (2015) → massively improved in 3.6+ (let pipeline, uncorrelated subqueries)
2. Basic Syntax – The Classic / Most Common Form (Equality Join)
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 |
{ $lookup: { from: "otherCollectionName", // ← name of the collection you're joining to localField: "fieldInThisCollection", // ← field in current documents foreignField: "fieldInOtherCollection", // ← field in the joined collection as: "newArrayFieldName" // ← name of the new array field added } } |
After this stage → every document now has an extra field (the as name) which is an array of matching documents from the other collection.
3. Hands-on Example – Classic Use Case (Orders + Users)
Let’s create two small collections:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
use shop2026 // Users collection db.users.insertMany([ { _id: 101, name: "Rahul", city: "Hyderabad", email: "rahul@hyd.in" }, { _id: 102, name: "Priya", city: "Secunderabad", email: "priya@sec.in" }, { _id: 103, name: "Amit", city: "Hyderabad" } ]) // Orders collection (source – we'll start from here) db.orders.insertMany([ { orderId: "ORD001", userId: 101, amount: 1250, items: ["biryani", "chai"], date: ISODate("2026-02-10") }, { orderId: "ORD002", userId: 102, amount: 890, items: ["idli", "vada"], date: ISODate("2026-02-12") }, { orderId: "ORD003", userId: 101, amount: 450, items: ["filter coffee"], date: ISODate("2026-02-14") }, { orderId: "ORD004", userId: 999, amount: 300, items: ["rose bouquet"] } // non-existing user ]) |
Now — enrich orders with user information:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
db.orders.aggregate([ { $lookup: { from: "users", // join with users collection localField: "userId", // field in orders foreignField: "_id", // field in users as: "customer" // new field name (will be an array) } }, // Optional – very common: convert single-user array → single object { $addFields: { customer: { $arrayElemAt: ["$customer", 0] } // take first (and usually only) element } }, // Optional – show only what we care about { $project: { orderId: 1, amount: 1, date: 1, customerName: "$customer.name", customerCity: "$customer.city", _id: 0 } } ]) |
Sample output:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
{ "orderId": "ORD001", "amount": 1250, "date": ISODate("2026-02-10T..."), "customerName": "Rahul", "customerCity": "Hyderabad" } { "orderId": "ORD002", ... "customerName": "Priya", "customerCity": "Secunderabad" } { "orderId": "ORD003", ... "customerName": "Rahul", "customerCity": "Hyderabad" } { "orderId": "ORD004", "amount": 300, "date": ..., "customerName": null, // ← no match → fields are null/missing "customerCity": null } |
→ Notice: order ORD004 (non-existing user) still appears → left outer join behavior
4. Advanced & Modern $lookup (MongoDB 3.6+ → Very Powerful)
Since MongoDB 3.6 you can use let + pipeline syntax — this is much more flexible:
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
{ $lookup: { from: "users", let: { userIdFromOrder: "$userId" }, // variables passed to sub-pipeline pipeline: [ { $match: { $expr: { $eq: ["$_id", "$$userIdFromOrder"] } } }, { $project: { name: 1, city: 1, _id: 0 } } // only bring needed fields ], as: "customer" } } |
Advantages of let + pipeline style:
- You can do non-equality joins (e.g., range, regex, $expr conditions)
- You can add $match, $project, $sort inside the lookup → much less data transferred
- Better performance in many cases
5. Quick Reference Table – $lookup Cheat Sheet
| Goal | Which syntax? | Key fields / options | Important notes |
|---|---|---|---|
| Classic equality join (user ← orders) | Classic (local/foreign) | from, localField, foreignField, as | Fastest when fields are indexed |
| Join + filter / project only needed fields | let + pipeline | let + pipeline array inside | Preferred in 2026 – more control |
| Non-equality join (e.g. date range) | let + pipeline | $expr inside sub-pipeline | Very powerful but slower |
| Handle single match (not array) | After $lookup | $addFields: { field: { $arrayElemAt: [“$field”, 0] } } | Almost always done |
| Avoid large arrays | Inside pipeline | $limit: 1, $project only needed fields | Critical for performance |
6. Mini Exercise – Try Right Now!
- Join orders with users using classic syntax → see the array
- Convert the customer array to single object using $arrayElemAt
- Show only orders from Hyderabad users (add $matchbefore or after lookup – compare)
- Try the let + pipeline version – project only name and city
Understood beta? $lookup is the stage that makes MongoDB feel like it has relationships — but remember: use it carefully, index the foreignField, filter early, project only what you need — or your database will cry on Valentine’s Day.
Next class options:
- How to index for $lookup (huge perf difference)
- Uncorrelated subqueries in $lookup (MongoDB 5.0+ magic)
- $lookup + $unwind + $group pattern (very common)
- Or let’s finally build a complete “E-commerce Order Report” with joins, counts, sorting, limits?
Tell me what you want next — class is ready! 🚀❤️
Any part of $lookup confusing? Ask anything — we’ll do more live examples together 😄
