Chapter 22: Indexing & Search
1. What is Indexing in MongoDB? (The Big Picture)
Imagine your collection is a huge library with 1 million books (documents).
Without indexing:
- To find all books written by “Rahul Sharma” → librarian has to check every single book one by one → COLLSCAN (Collection Scan) → very slow!
With indexing:
- The librarian has a sorted phonebook-style list (index) of authors → jumps directly to “Rahul Sharma” section → finds only those books → IXSCAN (Index Scan) → lightning fast!
Simple definition (paraphrasing the MongoDB docs):
An index is a special data structure (a B-tree for most index types) that stores a small portion of the collection’s data in an easy-to-traverse, sorted form. It lets MongoDB answer the queries the index supports without scanning every document.
Without indexes → every query scans the entire collection → performance dies as data grows. With good indexes → selective queries can return in milliseconds, even on very large collections.
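To make the library analogy concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not MongoDB's implementation: a sorted array stands in for the B-tree, and the names books, collScan, buildIndex and ixScan are made up for illustration.

```javascript
// Toy model only: a sorted array stands in for MongoDB's B-tree index.
const books = [
  { _id: 1, author: "Rahul Sharma", title: "Indexing Basics" },
  { _id: 2, author: "Anita Rao", title: "Sharding 101" },
  { _id: 3, author: "Rahul Sharma", title: "Query Tuning" },
  { _id: 4, author: "Meera Iyer", title: "Schema Design" },
];

// COLLSCAN: examine every document, one by one.
function collScan(docs, author) {
  let examined = 0;
  const hits = docs.filter((d) => {
    examined++;
    return d.author === author;
  });
  return { hits, examined };
}

// "Index": [key, position] entries kept sorted by key.
function buildIndex(docs, field) {
  return docs
    .map((d, pos) => [d[field], pos])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : a[1] - b[1]));
}

// IXSCAN: binary-search to the first matching key, then walk forward.
function ixScan(docs, index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid;
  }
  const hits = [];
  let keysExamined = 0;
  for (let i = lo; i < index.length && index[i][0] === value; i++) {
    keysExamined++;
    hits.push(docs[index[i][1]]);
  }
  return { hits, keysExamined };
}

const authorIndex = buildIndex(books, "author");
console.log(collScan(books, "Rahul Sharma").examined); // 4 (every book checked)
console.log(ixScan(books, authorIndex, "Rahul Sharma").keysExamined); // 2 (only the matches)
```

With four books the difference is tiny, but the scan cost grows with the collection while the index lookup grows only with the number of matches plus a short search.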
2. Why Indexing Matters in 2026 (Real-World Reasons)
| Scenario | Without Index (COLLSCAN) | With Proper Index (IXSCAN) | Rough speedup (illustrative) |
|---|---|---|---|
| 1 million documents | 500–5000 ms | <10 ms | 50–500× |
| Filter + sort on common fields | Full scan + in-memory sort | Uses index for both filter + sort | 100–1000× |
| Text / partial search | Full scan (very slow) | Text index or Atlas Search | 100–10,000× |
| Sharded cluster balance | — | Hashed indexes for even distribution | — |
| High write throughput | — | Too many indexes → slower writes | Balance needed |
3. Main Types of Indexes in MongoDB (with Examples)
MongoDB supports many index types. These are the ones you’ll use most of the time:
A. Single Field Index (Most Basic & Common)
```javascript
// Create index on "email" field (ascending)
db.users.createIndex({ email: 1 })

// Descending
db.orders.createIndex({ createdAt: -1 }) // newest first
```
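→ Great for: equality ({ email: "rahul@hyd.in" }), range ({ age: { $gt: 18 } }), and sort. For a single-field index the direction does not matter for sorting, because MongoDB can traverse the index in either direction.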
B. Compound Index (Multiple Fields – Very Powerful)
Field order matters: follow the ESR rule (Equality → Sort → Range).
```javascript
// Best for queries like: city = "Hyderabad" AND sort by createdAt desc
db.students.createIndex({ city: 1, createdAt: -1 })

// Supports prefix queries:
// { city: "Hyderabad" }
// { city: "Hyderabad", createdAt: { $gte: ISODate(...) } }
// { city: "Hyderabad" } + sort({ createdAt: -1 })
```
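→ Prefixes: MongoDB can use any leading (left-to-right) subset of a compound index’s fields. So { city: 1, createdAt: -1 } also serves queries on city alone, but it does not efficiently serve queries on createdAt alone.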
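The ESR ordering and the prefix rule can be sketched as two small helpers. This is a simplified model under stated assumptions: esrKeyPattern and prefixLength are hypothetical names, and the real query planner weighs far more than this.

```javascript
// Simplified helpers; the real query planner considers much more than this.

// Order index fields by the ESR rule: Equality, then Sort, then Range.
function esrKeyPattern({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const f of equality) keys[f] = 1;
  for (const [f, dir] of Object.entries(sort)) if (!(f in keys)) keys[f] = dir;
  for (const f of range) if (!(f in keys)) keys[f] = 1;
  return keys;
}

// How many leading index fields does the filter constrain?
// 0 means the filter does not touch the index prefix at all.
function prefixLength(keyPattern, filter) {
  let n = 0;
  for (const field of Object.keys(keyPattern)) {
    if (!(field in filter)) break;
    n++;
  }
  return n;
}

const studentIndex = { city: 1, createdAt: -1 };
console.log(prefixLength(studentIndex, { city: "Hyderabad" })); // 1
console.log(prefixLength(studentIndex, { createdAt: { $gte: new Date(0) } })); // 0

// Equality on status, sort by createdAt desc, range on amount:
console.log(esrKeyPattern({ equality: ["status"], sort: { createdAt: -1 }, range: ["amount"] }));
// { status: 1, createdAt: -1, amount: 1 }
```

The object returned by esrKeyPattern can be passed straight to createIndex; treat it as a starting point and confirm with explain().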
C. Text Index (For Keyword / Full-Text Search – Legacy Style)
Only one text index per collection.
```javascript
db.articles.createIndex(
  { title: "text", content: "text", tags: "text" },
  { weights: { title: 10, content: 5, tags: 1 } } // title more important
)

db.articles.find({ $text: { $search: "mongodb hyderabad" } })
```
→ Gives a relevance score via { $meta: "textScore" }
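The relevance score is only returned if you ask for it. A minimal sketch, assuming the articles collection and text index from the example above; textSearchSpec is a hypothetical helper that just builds the filter, projection and sort objects.

```javascript
// Builds the pieces of a relevance-ranked $text query.
// textSearchSpec is an illustrative helper, not a MongoDB API.
function textSearchSpec(terms) {
  return {
    filter: { $text: { $search: terms } },
    projection: { title: 1, score: { $meta: "textScore" } },
    sort: { score: { $meta: "textScore" } }, // highest score first
  };
}

const spec = textSearchSpec("mongodb hyderabad");
console.log(JSON.stringify(spec.filter)); // {"$text":{"$search":"mongodb hyderabad"}}

// Usage in mongosh (needs a running server and the text index):
// db.articles.find(spec.filter, spec.projection).sort(spec.sort).limit(5)
```

Sorting on the textScore metadata puts the best matches first; without that sort, results come back in no particular relevance order.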
If you are on Atlas, Atlas Search is usually the better choice in 2026 (more powerful; see section 4 below).
D. Hashed Index (Mostly for Sharding)
```javascript
db.users.createIndex({ _id: "hashed" }) // classic for hashed sharding
```
→ Distributes data evenly across shards. Hashed indexes support equality matches but not range queries, so they are not a general-purpose index.
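Why hashing helps can be seen in a small simulation. This is a sketch under assumptions: four shards, sequential numeric ids, and the first bytes of an MD5 digest as a stand-in hash (not MongoDB's exact hashed-index function). The helper names are made up for illustration.

```javascript
const crypto = require("crypto");

const SHARDS = 4;

// Stand-in hash: first 4 bytes of MD5. Not MongoDB's exact hashed-index function.
function hashedShard(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % SHARDS;
}

// Range placement for comparison: ids 1..1000 split into 4 contiguous chunks.
function rangedShard(id) {
  return Math.min(SHARDS - 1, Math.floor((id - 1) / 250));
}

function countByShard(ids, place) {
  const counts = new Array(SHARDS).fill(0);
  for (const id of ids) counts[place(id)]++;
  return counts;
}

// The 100 newest documents (ids 901..1000), e.g. a burst of fresh inserts.
const newest = Array.from({ length: 100 }, (_, i) => 901 + i);

console.log(countByShard(newest, rangedShard)); // [ 0, 0, 0, 100 ] : one hot shard
console.log(countByShard(newest, hashedShard)); // spread over all four shards
```

With range placement, monotonically increasing keys pile onto the last shard; hashing scatters them, at the cost of losing efficient range queries on that key.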
E. Other Common Types (Quick Overview)
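- TTL Index: auto-deletes old documents. Example: db.logs.createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 }). With expireAfterSeconds: 0, each document expires at the time stored in its expireAt field.
- Geospatial: 2dsphere for location queries. Example: db.places.createIndex({ location: "2dsphere" })
- Multikey: created automatically when the indexed field holds an array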
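The TTL rule from the first bullet can be modelled in a few lines. This is only a sketch of the expiry condition; isExpired is a hypothetical helper, and the actual deletion is done by a background task on the server, so documents can linger a little past their expiry time.

```javascript
// Model of the TTL rule: a document becomes eligible for deletion once
// (indexed date + expireAfterSeconds) is in the past.
function isExpired(doc, field, expireAfterSeconds, now = new Date()) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // non-date values never expire
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2026-01-01T00:00:10Z");

// expireAfterSeconds: 0 → expire exactly at the stored expireAt time
console.log(isExpired({ expireAt: new Date("2026-01-01T00:00:00Z") }, "expireAt", 0, now)); // true

// expireAfterSeconds: 3600 on createdAt → keep each document for one hour
console.log(isExpired({ createdAt: new Date("2026-01-01T00:00:00Z") }, "createdAt", 3600, now)); // false

// A string is not a Date, so this document would never be removed
console.log(isExpired({ expireAt: "2026-01-01" }, "expireAt", 0, now)); // false
```

The last case is a common gotcha: storing dates as strings silently disables TTL cleanup for those documents.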
4. Atlas Search – The Modern Full-Text & Vector Search (Very Hot in 2026)
- Regular indexes (B-tree) → great for exact matches, ranges, sorts
- Atlas Search → built on Apache Lucene (inverted index) → full-text, fuzzy, autocomplete, synonyms, vector search for AI embeddings
Key differences (2026):
| Feature | Regular Index (B-tree) | Atlas Search (Lucene-based) |
|---|---|---|
| Use case | Exact equality, range, sort | Full-text, partial match, relevance, vectors |
| Syntax | find({ name: "Rahul" }) | $search operator in aggregation |
| Speed on text search | Slow (regex / $text limited) | Very fast |
| Relevance scoring | No | Yes (BM25, custom) |
| Fuzzy / autocomplete | No | Yes |
| Vector / AI similarity | No | Yes (very hot in 2026) |
| Where available | All MongoDB | MongoDB Atlas (self-managed support is newer; check the current docs) |
Quick Atlas Search example (in Atlas UI → create Search Index first):
```javascript
db.movies.aggregate([
  {
    $search: {
      index: "default", // your search index name
      text: {
        query: "action hyderabad",
        path: ["title", "genres", "description"]
      }
    }
  },
  { $limit: 10 },
  { $project: { title: 1, score: { $meta: "searchScore" } } }
])
```
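If you run this kind of query from application code, it can help to build the pipeline in one place. A minimal sketch: buildSearchPipeline is a hypothetical helper, and the fuzzy option shown (maxEdits: 1) is one of the settings of the Atlas Search text operator for tolerating typos.

```javascript
// Builds an Atlas Search pipeline like the one above.
// buildSearchPipeline is an illustrative helper, not part of any driver.
function buildSearchPipeline(query, paths, { index = "default", fuzzy = false, limit = 10 } = {}) {
  const text = { query, path: paths };
  if (fuzzy) text.fuzzy = { maxEdits: 1 }; // tolerate one character edit per term
  return [
    { $search: { index, text } },
    { $limit: limit },
    { $project: { title: 1, score: { $meta: "searchScore" } } },
  ];
}

const pipeline = buildSearchPipeline("acton hyderabad", ["title", "genres", "description"], { fuzzy: true });
console.log(JSON.stringify(pipeline[0].$search.text.fuzzy)); // {"maxEdits":1}

// Usage in mongosh (needs an Atlas cluster with a search index named "default"):
// db.movies.aggregate(pipeline)
```

With fuzzy matching on, the misspelled "acton" can still match "action". Keep $search as the first stage of the pipeline.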
5. How to Create, View & Manage Indexes (Practical Commands)
```javascript
// List all indexes on collection
db.students.getIndexes()

// Create compound index
db.students.createIndex({ age: 1, city: 1, "marks.math": -1 })

// Drop index
db.students.dropIndex("age_1_city_1_marks.math_-1")

// Explain plan: see if index is used
db.students.find({ city: "Hyderabad", age: { $gt: 15 } }).explain("executionStats")
```
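Look for:
- stage: "COLLSCAN" → bad (full scan)
- stage: "IXSCAN" → good (using index)
- totalDocsExamined close to nReturned → the index is selective; a much larger totalDocsExamined means wasted work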
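Reading explain output by eye gets tedious, so here is a minimal sketch of a summarizer. Assumptions: the result has the usual queryPlanner.winningPlan and executionStats fields; some server versions nest the plan under winningPlan.queryPlan, which the helper also checks. The sample object is hand-written for illustration, not real server output.

```javascript
// Collect stage names by walking inputStage / inputStages.
function collectStages(plan, out = []) {
  if (!plan) return out;
  if (plan.stage) out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// summarizeExplain is an illustrative helper, not a MongoDB API.
function summarizeExplain(explain) {
  const wp = explain.queryPlanner.winningPlan;
  const stages = collectStages(wp.queryPlan || wp); // some versions nest the plan
  const { nReturned, totalDocsExamined } = explain.executionStats;
  return {
    stages,
    usesIndex: stages.includes("IXSCAN"),
    fullScan: stages.includes("COLLSCAN"),
    // Close to 1 is ideal; large values mean many documents read per result.
    docsPerResult: nReturned > 0 ? totalDocsExamined / nReturned : totalDocsExamined,
  };
}

// Hand-written sample shaped like explain("executionStats") output.
const sample = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "city_1_age_-1" } },
  },
  executionStats: { nReturned: 120, totalKeysExamined: 120, totalDocsExamined: 120 },
};

console.log(summarizeExplain(sample));
// { stages: [ 'FETCH', 'IXSCAN' ], usesIndex: true, fullScan: false, docsPerResult: 1 }
```

In mongosh you could pass the result of .explain("executionStats") directly to summarizeExplain.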
6. Quick Summary Table – When to Use Which
| Your Query Pattern | Recommended Index Type | Example Command Snippet |
|---|---|---|
| Exact match on one field | Single field | { email: 1 } |
| Filter + sort on 2–3 fields | Compound (ESR order) | { status: 1, createdAt: -1 } |
| Keyword / full-text search (legacy) | Text index | { title: "text", body: "text" } |
| Full-text + fuzzy + relevance + vectors | Atlas Search index | Create in Atlas UI → use $search |
| Shard key distribution | Hashed | { userId: "hashed" } |
| Auto-delete old logs | TTL | { expireAt: 1 }, { expireAfterSeconds: 0 } |
7. Mini Exercise – Try Right Now!
- Create compound index on { city: 1, age: -1 }
- Run find({ city: "Hyderabad" }).sort({ age: -1 }) → check .explain()
- Create text index on articles collection → try $text search
- (If on Atlas) Create Atlas Search index → try $search for “mongodb tutorial”
Got it? Indexing is the #1 performance lever in MongoDB: do it right and your app flies; ignore it and even simple queries crawl.
Next class — what do you want?
- Deep dive into Atlas Search + vector search (very hot in 2026 for AI apps)?
- Index intersection vs compound — when which wins?
- explain() deep dive + slow query log?
- Or performance tuning mini-project with real data?
Tell me — class is yours! 🚀❤️
Any doubt so far? Ask anything — no question is silly 😄
