Monitoring MongoDB Performance: Key Commands and Metrics Explained
Learn to proactively monitor your MongoDB performance using essential shell commands. This guide details how to track connection status via `db.currentOp()` and `db.serverStatus()`, analyze slow queries using the profiling commands (`db.setProfilingLevel`), and interpret crucial metrics related to resource utilization and index health for optimal database tuning.
Effective database management hinges on robust monitoring. For MongoDB, a leading NoSQL document database, understanding performance metrics is critical for maintaining high availability and responsiveness. Slow queries, excessive resource consumption, or unexpected connection spikes can severely impact application performance.
When MongoDB slows down, the first useful question is not "is the database bad?" It is "what is the server doing right now, and is that different from normal?" The commands below are the ones I use for a first pass before changing indexes, resizing hardware, or blaming the application.
Essential Monitoring Commands in the MongoDB Shell (mongosh)
The primary interface for running these commands is the MongoDB Shell (mongosh), or the legacy mongo shell. All commands shown here are executed within this shell environment.
1. Understanding Current Connections: db.currentOp() and db.serverStatus()
Monitoring active connections is vital for preventing connection exhaustion and identifying long-running operations that might be blocking resources.
db.currentOp()
This command returns information about operations currently in progress on the mongod instance. It is indispensable for identifying slow or blocking operations in real time.
Usage Example:
To see all operations currently running:
db.currentOp()
To specifically look for operations running longer than a certain threshold (e.g., operations running for more than 5 seconds):
db.currentOp({"secs_running": {$gt: 5}})
The output includes details like opid, op, ns (namespace), command (shown as query in older server versions), and secs_running.
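The raw output can be long on a busy server, so a small helper that keeps only the long-running entries is often easier to read. The following is a minimal sketch, not an official utility: `summarizeLongOps` and the sample array are hypothetical names and hand-written data, and the function assumes it receives the `inprog` array that `db.currentOp()` returns.

```javascript
// Reduce currentOp output to the operations that have run at least `minSeconds`,
// longest first. Number() is used because the shell may return 64-bit values
// as wrapper objects rather than plain numbers.
function summarizeLongOps(inprog, minSeconds) {
  return inprog
    .filter((op) => Number(op.secs_running || 0) >= minSeconds)
    .sort((a, b) => Number(b.secs_running) - Number(a.secs_running))
    .map((op) => ({
      opid: op.opid,
      op: op.op,
      ns: op.ns,
      secs_running: Number(op.secs_running),
      waitingForLock: Boolean(op.waitingForLock),
    }));
}

// Hand-written sample shaped like currentOp output (illustrative, not real data).
const sampleInprog = [
  { opid: 101, op: "query", ns: "shop.orders", secs_running: 12, waitingForLock: false },
  { opid: 102, op: "update", ns: "shop.users", secs_running: 1 },
  { opid: 103, op: "command", ns: "shop.$cmd", secs_running: 40, waitingForLock: true },
];

console.log(summarizeLongOps(sampleInprog, 5));

// Assumed usage inside mongosh:
//   summarizeLongOps(db.currentOp({ active: true }).inprog, 5)
```

The opid in each row is the value `db.killOp()` expects, should an operation need to be stopped.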
db.serverStatus()
While this command provides comprehensive status information, its connections section is crucial for monitoring connection pooling and limits.
Key Metrics within serverStatus (Connections Section):
- current: The number of incoming client connections currently open on the server, including idle ones.
- available: The number of additional connections that can still be established (based on the configured maximum).
db.serverStatus().connections
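The two numbers are easier to act on as a single ratio. The sketch below assumes that `current + available` approximates the effective connection limit; `connectionUsage` and the sample values are hypothetical and only illustrate the arithmetic.

```javascript
// Turn the connections section into a usage fraction.
// Assumption: current + available is close to the effective connection limit.
function connectionUsage(connections) {
  const current = Number(connections.current);
  const available = Number(connections.available);
  const limit = current + available;
  return {
    current,
    available,
    limit,
    usedFraction: limit > 0 ? current / limit : 0,
  };
}

// Illustrative values, not taken from a real server.
const sampleConnections = { current: 180, available: 620, totalCreated: 5400 };
console.log(connectionUsage(sampleConnections));

// Assumed usage inside mongosh:
//   connectionUsage(db.serverStatus().connections)
```

A fraction that keeps climbing between checks is usually more informative than any single reading.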
2. Analyzing Query Performance: db.getProfilingStatus() and db.setProfilingLevel()
MongoDB provides built-in profiling tools that log the execution details of database operations, making it possible to identify resource-intensive queries.
Profiling Levels
Profiling levels determine what operations are logged:
- 0 (Off): No operations are profiled.
- 1 (Slow Operations): Only operations slower than the configured threshold (slowms) are profiled.
- 2 (All Operations): All operations are profiled, which generates significant write load and should only be used briefly for targeted troubleshooting.
Checking Status
To see the current profiling level:
db.getProfilingStatus()
Setting the Level (Example)
To enable profiling only for slow operations (operations exceeding 100 milliseconds):
// Set slowms to 100 milliseconds (default is usually 100)
db.setProfilingLevel(1, { slowms: 100 })
Tip: Always return profiling to level 0 after you have gathered the necessary information to prevent performance degradation caused by excessive logging.
Viewing Profiled Slow Queries
Profiled operations are stored in the system.profile collection within the specific database being monitored. Each entry carries a ts timestamp, so to view the 10 slowest operations recorded in the last hour:
db.system.profile.find({ ts: { $gte: new Date(Date.now() - 60 * 60 * 1000) } }).sort({ millis: -1 }).limit(10).pretty()
3. Resource Utilization Metrics
Understanding how MongoDB utilizes CPU, memory, and I/O resources is essential for scaling decisions.
Memory and Storage Usage: db.serverStatus()
The mem and globalLock sections within serverStatus show how much memory the process holds and whether operations are queuing for locks. Storage engine detail, such as cache usage, is reported separately under the wiredTiger section.
Memory Indicators:
- resident: Amount of physical memory (RAM) the process is using, in mebibytes (MiB).
- virtual: Total virtual memory allocated by the process, in MiB.
db.serverStatus().mem
Lock Contention Monitoring
MongoDB uses internal locking mechanisms. Monitoring lock acquisition and waits helps identify concurrency bottlenecks.
Key Metrics in globalLock:
- currentQueue.readers: Number of operations queued waiting for a read lock.
- currentQueue.writers: Number of operations queued waiting for a write lock.
- totalTime: Time, in microseconds, since the server last started and created the global lock. It roughly tracks uptime and does not measure time spent waiting.
db.serverStatus().globalLock
High values in currentQueue often indicate that indexes are missing or that write operations are excessively long, causing readers/writers to queue up.
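A single snapshot of the queue is easy to misread, so it helps to reduce it to a few numbers that can be logged and compared over time. This is a rough sketch: `lockQueueSummary` is a hypothetical helper, and the sample object is hand-written to mirror the shape of the globalLock section.

```javascript
// Summarize queued and active clients from the globalLock section.
// Number() guards against values that arrive as 64-bit wrapper objects.
function lockQueueSummary(globalLock) {
  const queue = globalLock.currentQueue || {};
  const active = globalLock.activeClients || {};
  const queuedReaders = Number(queue.readers || 0);
  const queuedWriters = Number(queue.writers || 0);
  return {
    queuedReaders,
    queuedWriters,
    queuedTotal: queuedReaders + queuedWriters,
    activeTotal: Number(active.total || 0),
  };
}

// Illustrative values only.
const sampleGlobalLock = {
  totalTime: 123456789,
  currentQueue: { total: 7, readers: 2, writers: 5 },
  activeClients: { total: 30, readers: 10, writers: 20 },
};
console.log(lockQueueSummary(sampleGlobalLock));

// Assumed usage inside mongosh:
//   lockQueueSummary(db.serverStatus().globalLock)
```

Taking this summary several times, a few seconds apart, shows whether the queue is draining or growing.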
4. Index Usage and Health: db.collection.stats()
Poorly utilized or missing indexes are among the most common causes of performance degradation. The stats() command shows how much space each index occupies, which is the starting point for deciding whether an index is worth keeping.
When run on a specific collection (e.g., users):
db.users.stats()
Key Metrics to Check:
- totalIndexSize: The total disk space consumed by all indexes on that collection.
- indexSizes: A breakdown of space usage per index.
- If an index is present but never used for reads, it is overhead that should be considered for removal. Note that stats() reports size only; usage counts come from the $indexStats aggregation stage.
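Index usage can be checked with the `$indexStats` aggregation stage, which returns one document per index with an `accesses.ops` counter and an `accesses.since` timestamp. The sketch below filters that output for indexes with zero recorded reads; `findUnusedIndexes` and the sample documents are hypothetical.

```javascript
// Return indexes with no recorded accesses, skipping the mandatory _id index.
// Input: the array produced by the $indexStats aggregation stage.
function findUnusedIndexes(indexStats) {
  return indexStats
    .filter((ix) => ix.name !== "_id_")
    .filter((ix) => Number(ix.accesses.ops) === 0)
    .map((ix) => ({ name: ix.name, since: ix.accesses.since }));
}

// Hand-written sample shaped like $indexStats output (illustrative only).
const sampleIndexStats = [
  { name: "_id_", key: { _id: 1 }, accesses: { ops: 0, since: new Date("2025-01-01T00:00:00Z") } },
  { name: "email_1", key: { email: 1 }, accesses: { ops: 5210, since: new Date("2025-01-01T00:00:00Z") } },
  { name: "legacyFlag_1", key: { legacyFlag: 1 }, accesses: { ops: 0, since: new Date("2025-01-01T00:00:00Z") } },
];
console.log(findUnusedIndexes(sampleIndexStats));

// Assumed usage inside mongosh:
//   findUnusedIndexes(db.users.aggregate([{ $indexStats: {} }]).toArray())
```

The counters are kept per node and reset when the server restarts, so a zero is only meaningful if `since` covers a representative period and every replica set member that serves reads has been checked.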
5. Disk I/O and Throughput: db.serverStatus() (Network and Operations)
Monitoring network activity and the rate of operations gives a view into database throughput.
Operations Rate (from opcounters):
opcounters tracks the total number of operations executed since the last server restart, categorized by type: insert, query, update, delete, getmore, command.
By tracking changes to these counters over time (e.g., comparing two consecutive serverStatus calls), you can calculate the operational throughput (operations per second).
Example Comparison:
- Run db.serverStatus().opcounters at time T1.
- Run db.serverStatus().opcounters at time T2.
- Subtract T1 values from T2 values to get the total operations executed in that interval, then divide by the interval length in seconds to get operations per second.
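The subtraction and division can be wrapped in a small function so the same calculation is applied to every counter. This is a sketch with hypothetical names (`opRates`, `OP_TYPES`) and made-up snapshot values; it assumes both snapshots come from the same server process with no restart in between.

```javascript
// Counter names reported under opcounters.
const OP_TYPES = ["insert", "query", "update", "delete", "getmore", "command"];

// Convert two opcounters snapshots into per-second rates.
function opRates(t1, t2, elapsedSeconds) {
  if (!(elapsedSeconds > 0)) {
    throw new RangeError("elapsedSeconds must be positive");
  }
  const rates = {};
  for (const type of OP_TYPES) {
    rates[type] = (Number(t2[type]) - Number(t1[type])) / elapsedSeconds;
  }
  return rates;
}

// Illustrative snapshots taken 10 seconds apart.
const atT1 = { insert: 1000, query: 5000, update: 300, delete: 20, getmore: 40, command: 9000 };
const atT2 = { insert: 1050, query: 5600, update: 320, delete: 20, getmore: 45, command: 9500 };
console.log(opRates(atT1, atT2, 10));
```

A negative rate would indicate that the counters were reset between the two snapshots, most likely by a restart, and that sample should be discarded.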
Best Practices for Proactive Monitoring
- Automation is Key: Relying solely on manual shell commands is inefficient. Integrate monitoring using tools like MongoDB Cloud Manager/Ops Manager or third-party monitoring solutions that query these endpoints automatically.
- Establish Baselines: Run commands when the system is healthy to establish a performance baseline. A sustained deviation from this baseline warrants investigation.
- Focus on Latency: While operation counts are useful, prioritize latency metrics (like the time reported by profiling logs) over raw throughput when diagnosing end-user experience issues.
- Check Connections Frequently: In high-traffic applications, connection limits are often hit first. Monitor db.serverStatus().connections.current relative to the configured maximum.
A Practical First-Pass Checklist
When someone says "MongoDB is slow," avoid jumping straight to index changes. Start with a short checklist and write down what you see.
Check whether the server is overloaded with active operations:
db.currentOp({
  active: true,
  secs_running: { $gt: 2 }
});
A few long-running operations may be normal for analytics jobs. A large pile of writes, collection scans, or blocked operations is different. Look for the namespace in ns, the operation type in op, and the query shape. If many operations are waiting behind one update or index build, the fix is not the same as a missing index on a read query.
Then check connections:
db.serverStatus().connections;
current rising quickly can mean an application connection pool was misconfigured, a deploy created too many workers, or clients are timing out and reconnecting. available near zero is an urgent signal because new clients may fail to connect. The right answer might be pool tuning in the app, not raising the server limit.
Next, check operation counters twice, a short interval apart:
const a = db.serverStatus().opcounters;
sleep(5000);
const b = db.serverStatus().opcounters;
printjson({
  insertPer5s: b.insert - a.insert,
  queryPer5s: b.query - a.query,
  updatePer5s: b.update - a.update,
  deletePer5s: b.delete - a.delete,
  commandPer5s: b.command - a.command
});
Counters since startup are useful for long-term context, but differences over a known interval tell you what is happening now. If command traffic is high but queries are low, you may be looking at metadata checks, monitoring noise, or driver behavior rather than normal reads.
Using explain() Before Blaming Hardware
The profile collection can tell you which operations are slow. explain() helps you understand why a query is slow before you add CPU or memory.
db.users.find({ email: "user@example.com" }).explain("executionStats");
In the output, compare totalDocsExamined with nReturned. If MongoDB examines a huge number of documents to return one user, the query likely needs a better index or a different filter. If totalKeysExamined is high relative to nReturned, an index is being used but may not be selective enough for the query pattern.
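The comparison is simple enough to compute directly from the `executionStats` section. The helper below is a sketch with a hypothetical name (`examinedPerReturned`) and hand-written sample values; it divides by at least 1 so that a query returning nothing still produces a finite ratio.

```javascript
// Compute how much work the server did per returned document.
// Input: the executionStats section of an explain("executionStats") result.
function examinedPerReturned(executionStats) {
  const returned = Math.max(Number(executionStats.nReturned), 1);
  return {
    docsPerReturned: Number(executionStats.totalDocsExamined) / returned,
    keysPerReturned: Number(executionStats.totalKeysExamined) / returned,
    millis: Number(executionStats.executionTimeMillis),
  };
}

// Illustrative values for a lookup that scanned the whole collection.
const sampleExecutionStats = {
  nReturned: 1,
  totalKeysExamined: 0,
  totalDocsExamined: 250000,
  executionTimeMillis: 180,
};
console.log(examinedPerReturned(sampleExecutionStats));

// Assumed usage inside mongosh:
//   examinedPerReturned(
//     db.users.find({ email: "user@example.com" }).explain("executionStats").executionStats
//   )
```

Ratios close to 1 suggest the index matches the query well; ratios in the thousands are worth a closer look.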
For a compound query, index order matters:
db.orders.find({
  accountId: "acct_123",
  status: "open",
  createdAt: { $gte: ISODate("2025-11-01T00:00:00Z") }
}).sort({ createdAt: -1 });
A useful index might be:
db.orders.createIndex({ accountId: 1, status: 1, createdAt: -1 });
That is not a universal rule. The best index depends on cardinality, sort order, and the full set of queries hitting the collection. The point is to make the database show you the execution plan instead of guessing.
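One way to read a plan without scanning the whole tree by eye is to list its stage names. The sketch below walks a winning plan recursively; `collectStages` is a hypothetical helper, and the keys it follows (`inputStage`, `inputStages`, and `queryPlan`) are an assumption about the plan layout, which can differ between server versions.

```javascript
// Collect stage names from a plan tree, outermost stage first.
function collectStages(plan, found = []) {
  if (!plan || typeof plan !== "object") return found;
  if (typeof plan.stage === "string") found.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, found);
  if (Array.isArray(plan.inputStages)) {
    plan.inputStages.forEach((child) => collectStages(child, found));
  }
  // Some versions nest the readable plan under a queryPlan key (assumption).
  if (plan.queryPlan) collectStages(plan.queryPlan, found);
  return found;
}

// Hand-written sample of a plan that uses the compound index.
const samplePlan = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    indexName: "accountId_1_status_1_createdAt_-1",
  },
};
console.log(collectStages(samplePlan));

// Assumed usage inside mongosh:
//   collectStages(db.orders.find({ ... }).explain().queryPlanner.winningPlan)
```

A `COLLSCAN` stage means the whole collection was scanned, and a `SORT` stage means the sort was done in memory rather than satisfied by index order.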
Reading Profiling Data Without Overreacting
Profiling level 2 logs every operation and can add overhead on busy systems. Use it only for a short, targeted window. Level 1 with a reasonable slowms threshold is safer for finding slow operations.
db.setProfilingLevel(1, { slowms: 200 });
After collecting data, inspect the slowest entries:
db.system.profile.find(
  {},
  {
    ns: 1,
    op: 1,
    millis: 1,
    command: 1,
    keysExamined: 1,
    docsExamined: 1,
    nreturned: 1
  }
).sort({ millis: -1 }).limit(20).pretty();
One slow query does not always mean a production incident. A scheduled report, a cold cache after restart, or a rare maintenance task can show up near the top. Patterns matter more than a single sample. If the same query shape appears repeatedly and examines far more documents than it returns, you have a real tuning candidate.
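To look for patterns rather than single samples, profile entries can be grouped and totalled. The sketch below groups by operation type and namespace, which is only a coarse stand-in for query shape; `groupProfileEntries` and the sample entries are hypothetical.

```javascript
// Group profile entries by "op ns" and rank the groups by total time spent.
function groupProfileEntries(entries) {
  const groups = new Map();
  for (const entry of entries) {
    const key = `${entry.op} ${entry.ns}`;
    const group = groups.get(key) || { key, count: 0, totalMillis: 0, docsExamined: 0, nreturned: 0 };
    group.count += 1;
    group.totalMillis += Number(entry.millis || 0);
    group.docsExamined += Number(entry.docsExamined || 0);
    group.nreturned += Number(entry.nreturned || 0);
    groups.set(key, group);
  }
  return [...groups.values()]
    .map((g) => ({ ...g, docsPerReturned: g.docsExamined / Math.max(g.nreturned, 1) }))
    .sort((a, b) => b.totalMillis - a.totalMillis);
}

// Hand-written sample shaped like system.profile documents (illustrative only).
const sampleProfile = [
  { op: "query", ns: "shop.orders", millis: 300, docsExamined: 100000, nreturned: 10 },
  { op: "update", ns: "shop.users", millis: 400, docsExamined: 1, nreturned: 0 },
  { op: "query", ns: "shop.orders", millis: 250, docsExamined: 90000, nreturned: 10 },
];
console.log(groupProfileEntries(sampleProfile));

// Assumed usage inside mongosh:
//   groupProfileEntries(db.system.profile.find().toArray())
```

In this sample the single slowest entry is the update, but the repeated query on shop.orders accounts for more total time and examines far more documents than it returns.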
Monitoring Replica Sets and Storage Pressure
For replica sets, performance is not only about the primary. A secondary that falls behind can affect failover confidence and read workloads if clients use secondary reads.
rs.status();
Look for members that are not healthy, unexpected state changes, or replication lag that does not recover. The exact acceptable lag depends on the application. A queue-like workload may tolerate a little delay. A dashboard that promises near-real-time reads may not.
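Replication lag can be estimated from the `members` array of `rs.status()` by comparing each secondary's `optimeDate` with the primary's. The helper below is a sketch with a hypothetical name (`replicationLagSeconds`) and made-up member data; mongosh also provides `rs.printSecondaryReplicationInfo()` for a quick human-readable view.

```javascript
// Estimate lag per secondary as primary optimeDate minus secondary optimeDate.
// Returns null when no primary is visible, since lag is then undefined.
function replicationLagSeconds(members) {
  const primary = members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null;
  const primaryTime = new Date(primary.optimeDate).getTime();
  return members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      healthy: m.health === 1,
      lagSeconds: (primaryTime - new Date(m.optimeDate).getTime()) / 1000,
    }));
}

// Hand-written sample shaped like rs.status().members (illustrative only).
const sampleMembers = [
  { name: "db1:27017", stateStr: "PRIMARY", health: 1, optimeDate: new Date("2025-11-01T12:00:10Z") },
  { name: "db2:27017", stateStr: "SECONDARY", health: 1, optimeDate: new Date("2025-11-01T12:00:08Z") },
  { name: "db3:27017", stateStr: "SECONDARY", health: 1, optimeDate: new Date("2025-11-01T11:58:10Z") },
];
console.log(replicationLagSeconds(sampleMembers));

// Assumed usage inside mongosh:
//   replicationLagSeconds(rs.status().members)
```

On an idle primary the figure can look larger or smaller than the true delay, so it is best read as a trend across several samples.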
Storage pressure needs the same context. db.serverStatus() can show storage engine and WiredTiger metrics, but disk-level tools still matter. If MongoDB is waiting on slow disks, shell commands inside the database will show symptoms rather than the root cause. Correlate with host metrics such as disk latency, filesystem usage, CPU steal, and memory pressure.
Turning Manual Checks Into Alerts
Manual commands are best during investigation. For normal operations, convert the useful signals into automated checks: connection usage, replication health, slow query rate, disk usage, page faults or cache pressure where available, and operation latency. Alert on sustained bad behavior, not every one-minute spike.
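The "sustained, not spiky" rule can be expressed as a check for several consecutive samples above a threshold. This is a minimal sketch, not a replacement for a monitoring system: `isSustainedBreach`, the threshold, and the sample series are all hypothetical.

```javascript
// Return true when at least `minConsecutive` samples in a row exceed `threshold`.
function isSustainedBreach(samples, threshold, minConsecutive) {
  let run = 0;
  for (const value of samples) {
    run = value > threshold ? run + 1 : 0;
    if (run >= minConsecutive) return true;
  }
  return false;
}

// Illustrative connection-usage fractions sampled once per minute.
const oneSpike = [0.5, 0.95, 0.6, 0.7];
const sustained = [0.5, 0.92, 0.95, 0.97];

console.log(isSustainedBreach(oneSpike, 0.9, 3));
console.log(isSustainedBreach(sustained, 0.9, 3));
```

The first series contains a single spike and does not trigger; the second stays above the threshold for three samples in a row and does.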
Good alerts include context. "MongoDB slow queries high" is less helpful than an alert that includes the database, collection, query shape, current rate, and a link to recent profile samples or dashboard panels. The goal is to shorten the first ten minutes of the incident.
What Not to Do During a Slowdown
Avoid making several changes at once. Adding an index, increasing connection limits, restarting the application, and changing pool sizes in the same incident may clear the symptom, but it leaves you with no idea which action helped. Make one change, watch the metric that should improve, and keep notes.
Be careful with killOp. It can be useful when one operation is clearly harmful, but killing random long-running operations can make application behavior worse. If the operation belongs to a migration, backup, index build, or reporting job, identify the owner before stopping it unless the database is already in serious trouble.
Do not treat serverStatus() as a single magic health score. It is a collection of counters and snapshots. A high value can be normal on a large busy system, and a low value can be bad on a small latency-sensitive system. The useful question is whether the value changed in a way that matches the user-facing problem.
Also separate database symptoms from deployment symptoms. A fresh release that changes query shape, opens larger connection pools, or starts a background migration can make MongoDB look like the root cause. Compare the timing of slow operations with deploys, job schedules, backups, and traffic changes before making a database-only fix.
MongoDB monitoring works best when you compare current behavior to a known baseline. db.currentOp(), db.serverStatus(), profiling, explain(), and replica set checks give you enough evidence to decide whether the problem is a query, an index, client connection behavior, replication, or the host underneath the database.