Mastering MongoDB Indexing for Optimal Query Performance

Unlock optimal query performance in MongoDB by mastering indexing techniques. This comprehensive guide covers single-field, compound, and multikey indexes, explains the power of covering queries, and guides you through using `explain()` for performance analysis. Learn best practices to speed up your read operations, reduce database load, and build a more responsive application. Essential reading for any MongoDB developer focused on performance tuning.

Mastering MongoDB Indexing for Optimal Query Performance

MongoDB indexing gets interesting when the database is no longer small enough for lucky guesses. A query that felt instant in development can become painful in production once a collection has millions of documents, a dashboard adds sorting, or an API endpoint starts filtering by several fields at once.

The goal is not to index every field. That usually makes writes slower, consumes memory and disk, and still leaves important queries uncovered. The goal is to understand the handful of query shapes your application actually depends on, then build indexes that match those shapes.

Understanding MongoDB Indexes

At its core, an index is like an index in a book. Instead of reading the entire book to find a topic, you consult a sorted reference and jump close to the right page. MongoDB indexes help the query planner locate matching documents without scanning the whole collection. Without a useful index, MongoDB may perform a collection scan, examining documents one by one until it finds the matches.

Collection scans are not always evil. Scanning a tiny collection may be fine. Running an admin report once a month may be fine. But a collection scan inside a high-traffic request path is different. It competes with normal reads and writes, grows worse as data grows, and often shows up as unpredictable latency.

How Indexes Work

MongoDB commonly uses B-tree style indexes for normal field indexes. The important practical detail is that indexed values are stored in order. That ordering helps MongoDB with equality filters, range filters, and sorts when the query shape lines up with the index.

For example, an index on { email: 1 } is perfect for:

db.users.findOne({ email: "[email protected]" })

It is not useful for:

db.users.find({ lastLoginAt: { $lt: ISODate("2025-01-01") } })

That second query needs an index that starts with lastLoginAt, or it needs to scan.

When to Use Indexes

Indexes are most beneficial for fields that are frequently used in:

  • Query criteria (find(), findOne()): Fields used in the filter document of your queries.
  • Sort criteria (sort()): Fields used to order the results of your queries.
  • _id field: By default, MongoDB creates an index on the _id field, ensuring uniqueness and fast lookups by ID.

However, indexes also have a cost:

  • Storage space: Indexes consume disk space.
  • Write performance: Indexes need to be updated whenever documents are inserted, updated, or deleted, which can slow down write operations.
  • Memory pressure: Frequently used index pages compete for cache. Too many large indexes can make the working set harder to keep in memory.

Therefore, it's crucial to create indexes strategically, focusing on fields that will yield the most significant performance gains for your common read operations.

Creating and Managing Indexes

MongoDB provides the createIndex() method to create indexes and getIndexes() to view existing ones. The dropIndex() method is used to remove them.

Basic Index Creation

To create a single-field index, you specify the field name and the index type (usually 1 for ascending or -1 for descending order).

db.collection.createIndex( { fieldName: 1 } );

Example: Indexing a username field in ascending order:

db.users.createIndex( { username: 1 } );

Viewing Indexes

To see the indexes on a collection:

db.collection.getIndexes();

Example: Viewing indexes on the users collection:

db.users.getIndexes();

This will return an array of index definitions, including the default _id index.

On a busy collection, create indexes deliberately. Modern MongoDB versions support online index builds in many common cases, but index builds still consume CPU, disk I/O, and memory. On production systems, schedule large index builds during quieter periods and watch replication lag if you run a replica set.

Dropping Indexes

To remove an index:

db.collection.dropIndex( "indexName" );

You can find the indexName from the output of getIndexes(). Alternatively, you can drop an index by specifying the indexed field(s) in the same format as createIndex():

db.collection.dropIndex( { fieldName: 1 } );

Example: Dropping the username index:

db.users.dropIndex( "username_1" ); // Using index name
// OR
db.users.dropIndex( { username: 1 } ); // Using index definition

Before dropping an index, check whether anything still uses it:

db.users.aggregate([{ $indexStats: {} }])

This shows access counters since the server started. A counter of zero is a clue, not absolute proof. The server may have restarted recently, or the query may run only during a weekly job. For important systems, combine $indexStats, application code search, query logs, and a short observation period.

Compound Indexes

Compound indexes involve multiple fields. The order of fields in a compound index is critical. MongoDB uses compound indexes for queries that involve multiple fields in the filter or sort clauses.

When to Use Compound Indexes

Compound indexes are most effective when your queries frequently filter or sort by a combination of fields. The index can satisfy queries that match the fields in the same order as they are defined in the index or prefix of the index.

Example: Consider a collection of orders with fields like userId, orderDate, and status. If you frequently query for orders by a specific user and sort them by date, a compound index on { userId: 1, orderDate: 1 } would be highly beneficial.

db.orders.createIndex( { userId: 1, orderDate: 1 } );

This index can efficiently support queries like:

  • db.orders.find( { userId: "user123" } ).sort( { orderDate: -1 } )
  • db.orders.find( { userId: "user123", orderDate: { $lt: ISODate() } } )

However, it might not be as effective for queries that only filter by orderDate if userId is not also specified, or if the fields are in a different order.

Field Order Matters

The order of fields in a compound index determines which query patterns it can support well. A useful rule of thumb is equality fields first, then sort fields, then range fields. This is often called the ESR guideline: equality, sort, range. It is a guideline, not a law, but it prevents many bad index designs.

Suppose your orders page runs this query:

db.orders.find({
  tenantId: "t1",
  status: "paid",
  createdAt: { $gte: ISODate("2025-01-01") }
}).sort({ createdAt: -1 })

A reasonable index might be:

db.orders.createIndex({ tenantId: 1, status: 1, createdAt: -1 })

tenantId and status are equality filters. createdAt supports the sort and range. If you instead create { createdAt: -1, status: 1, tenantId: 1 }, MongoDB may still use it in some cases, but it is usually less aligned with this query.

For queries that sort results, the order of fields in the index should match the order of fields in the sort() operation for optimal performance. If a query includes both a filter and a sort, and the index matches the filter fields, it can also be used for sorting without a separate collection scan for sorting.

Compound indexes can also serve prefix queries. An index on { tenantId: 1, status: 1, createdAt: -1 } can help a query on tenantId alone, or tenantId plus status. It generally cannot help much with a query on status alone because status is not the leading field.

Covering Queries

A covering query is a query where MongoDB can satisfy the entire query by using only the index. This means that the index contains all the fields that are being queried and projected. Covering queries avoid fetching documents from the collection itself, making them extremely fast.

How to Achieve Covering Queries

To achieve a covering query, ensure that:

  1. You have an index that includes all the fields used in the query's filter.
  2. You include only those indexed fields (or a subset of them) in your projection.

Example: Consider an employees collection with fields name, age, and city. If you have an index { city: 1, age: 1 } and want to retrieve the names and ages of employees in a specific city, you can create a covering query:

db.employees.find( { city: "New York" }, { name: 1, age: 1, _id: 0 } ).explain()

In this query, city is in the index, and name and age are included in the projection. If the index also contained name and age, it would be a covering query.

Let's refine the index and query for a true covering query:

// Create an index that includes all fields needed for the query and projection
db.employees.createIndex( { city: 1, age: 1, name: 1 } );

// Now, a query that filters by city and projects name and age can be covered
db.employees.find( { city: "New York" }, { name: 1, age: 1, _id: 0 } )

When you run explain("executionStats") on this query, a covered plan should examine index keys without fetching full documents from the collection. In many explain plans, that means you will see an IXSCAN without a FETCH stage, and totalDocsExamined should be 0. Explain output varies by MongoDB version and query shape, so focus on the actual plan stages and examined counts rather than looking for one exact label.

Covering queries are useful for hot read paths such as autocomplete, small list views, or permission checks. They are less useful if the projection includes large fields, many fields, or fields that change constantly. Adding too many fields to an index just to cover a query can create a bulky index that hurts write performance.

Other Important Index Types

MongoDB offers various index types for specific use cases:

Multikey Indexes

Multikey indexes are automatically created when you index an array field. They allow you to query elements within arrays.

Example: If you have a products collection with a tags array field ["electronics", "gadgets"]:

db.products.createIndex( { tags: 1 } );

This index will support queries like db.products.find( { tags: "electronics" } ).

Arrays require extra care in compound indexes. A multikey index stores entries for array elements, which can increase index size quickly. MongoDB also has restrictions around compound multikey indexes when more than one indexed field can contain arrays in the same document. If your data model has several arrays and complex filters, test the exact query with representative data before assuming a compound index will behave the way a scalar-field index does.

Text Indexes

Text indexes support efficient searching of string content in documents. They are used for text search queries using $text operator.

db.articles.createIndex( { content: "text" } );

This allows for searches like: db.articles.find( { $text: { $search: "database performance" } } ).

Text indexes are useful for basic text search, but they are not a full search platform. If you need advanced relevance tuning, typo tolerance, faceting, highlighting, or language-specific search behavior, MongoDB Atlas Search or a dedicated search engine may be a better fit.

Geospatial Indexes

Geospatial indexes are used for efficient querying of geographical data using the $near, $geoWithin, and $geoIntersects operators.

db.locations.createIndex( { loc: "2dsphere" } ); // For 2dsphere index

Unique Indexes

Unique indexes enforce uniqueness for a field or a combination of fields. If a duplicate value is inserted or updated, MongoDB will return an error.

db.users.createIndex( { email: 1 }, { unique: true } );

For production user tables, normalize before enforcing uniqueness. Email addresses are a common example. If your application treats [email protected] and [email protected] as the same user, store a normalized field such as emailLower and put the unique index there. Do not rely on application code alone to prevent duplicates under concurrency.

Partial Indexes

Partial indexes index only documents that match a filter expression. They are useful when a query focuses on a subset of a collection.

db.orders.createIndex(
  { tenantId: 1, createdAt: -1 },
  { partialFilterExpression: { status: "open" } }
)

This can help if your application frequently reads open orders and closed orders make up most of the collection. The index is smaller because it excludes documents that do not match the partial filter. The query must include a compatible condition for MongoDB to use it.

TTL Indexes

TTL indexes automatically remove documents after a configured time. They are commonly used for sessions, temporary tokens, or short-lived events.

db.sessions.createIndex(
  { expiresAt: 1 },
  { expireAfterSeconds: 0 }
)

TTL deletion is not instant at the exact expiration time. MongoDB removes expired documents in the background. Use it for cleanup, not for precise security timing where a token must become invalid immediately. Your application should still check expiration during reads.

Performance Analysis with explain()

Understanding how MongoDB executes your queries is crucial for optimizing them. The explain() method provides insights into the query execution plan, including whether an index was used and how.

db.collection.find( {...} ).explain( "executionStats" );

Key fields to look for in the explain() output:

  • winningPlan.stage: Indicates the stage of the execution plan (e.g., COLLSCAN for collection scan, IXSCAN for index scan).
  • executionStats.totalKeysExamined: The number of index keys examined.
  • executionStats.totalDocsExamined: The number of documents examined.

A good execution plan will have totalDocsExamined close to or equal to the number of documents returned, and totalKeysExamined significantly less than the total number of documents in the collection. If totalDocsExamined is very high, or COLLSCAN is used, it suggests an index is missing or not being used effectively.

Here is the quick way I read an explain plan:

  1. Look for COLLSCAN. If this is a hot path and the collection is large, that is usually the first problem.
  2. Look for IXSCAN followed by FETCH. A fetch is normal when the query needs fields outside the index, but excessive document examination means the index is not selective enough.
  3. Compare nReturned, totalKeysExamined, and totalDocsExamined. Returning 20 documents after examining 25 keys is healthy. Returning 20 documents after examining 500,000 keys is not.
  4. Watch for in-memory sorts. If MongoDB has to sort a large result set after filtering, a compound index that supports the sort may help.

Use realistic filters when testing. An explain plan for tenantId: "demo" may not match a large tenant with millions of documents. Data distribution matters.

A Practical Index Design Walkthrough

Imagine an application with a tickets collection. Support agents use a queue page with these filters:

db.tickets.find({
  tenantId: "acme",
  status: "open",
  assigneeId: "u123"
}).sort({ updatedAt: -1 }).limit(50)

Start with the query shape, not the field list. The collection is multi-tenant, agents usually filter by status and assignee, and the UI sorts newest updates first. A practical index is:

db.tickets.createIndex({
  tenantId: 1,
  status: 1,
  assigneeId: 1,
  updatedAt: -1
})

Now consider another page: managers view all open tickets, regardless of assignee:

db.tickets.find({
  tenantId: "acme",
  status: "open"
}).sort({ updatedAt: -1 }).limit(100)

The previous index can use the prefix { tenantId, status }, but assigneeId sits before updatedAt, so it may not support the sort as well for this manager query. You may need a second index:

db.tickets.createIndex({
  tenantId: 1,
  status: 1,
  updatedAt: -1
})

That is a normal tradeoff. One index rarely serves every screen perfectly. The job is to support the important paths without creating a pile of overlapping indexes that all cost writes.

Best Practices for MongoDB Indexing

  • Index only what you need: Avoid creating indexes on fields that are rarely queried or sorted. Each index adds overhead.
  • Use compound indexes wisely: Order fields correctly based on query patterns. Consider the most selective fields first.
  • Aim for covering queries: If read performance is critical, design indexes to cover common read operations.
  • Monitor index usage: Regularly review index usage using explain() and db.collection.aggregate([{ $indexStats: {} }]) to identify unused or inefficient indexes.
  • Consider index selectivity: Indexes on fields with low cardinality (few distinct values) might not be as effective as those on fields with high cardinality.
  • Keep indexes small: Avoid including large fields or arrays in indexes unless absolutely necessary for covering queries.
  • Test your indexes: Always test the impact of new indexes on both read and write performance under realistic load conditions.
  • Remove redundant indexes carefully: If you have { a: 1, b: 1 }, a separate { a: 1 } index may be redundant for many workloads. Confirm usage before deleting.
  • Design around real screens and jobs: Indexes should map to application behavior: login lookup, queue page, report filter, background worker scan.
  • Revisit after schema changes: A new field, new sort order, or new tenant model can make an old index less useful.

What Good Indexing Feels Like

Good MongoDB indexing is usually quiet. The important queries examine roughly the amount of data they return. Sorts do not spill into expensive work. Writes are not burdened by a dozen speculative indexes. When a new feature adds a new query shape, you test it with explain("executionStats") before it becomes a production incident.

The practical habit is simple: collect the real query, design the smallest useful index for that query shape, test with representative data, and keep checking index usage over time. That habit will do more for MongoDB performance than memorizing every index type.