MySQL Slow Query Troubleshooting: A Step-by-Step Guide

A practical workflow for finding slow MySQL queries, reading plans, fixing indexes, and proving the change worked.

MySQL Slow Query Troubleshooting: A Step-by-Step Guide

MySQL slow query troubleshooting starts with one uncomfortable rule: do not guess from the query text alone. A query that looks ugly may be harmless because it runs once a day. A query that looks simple may be ruining the database because it runs thousands of times a minute, scans too many rows, or waits behind locks.

The useful workflow is boring in the best way. Capture real slow queries, group them by cost, inspect the execution plan, change one thing, and measure again. That keeps you from adding random indexes, changing global settings blindly, or blaming MySQL when the application is sending an avoidable query pattern.

I usually start with three questions:

  • Which query is hurting users, not just looking suspicious?
  • Is the time spent reading rows, sorting, waiting on locks, or waiting on the application?
  • Can I prove the fix with EXPLAIN, timing, and fresh slow log data?

Start with the slow query log

The MySQL slow query log records statements that cross the configured threshold. According to the MySQL manual, the log is disabled by default, long_query_time defaults to 10 seconds, and a statement normally needs to run at least that long and examine at least min_examined_row_limit rows before it is logged. If log_queries_not_using_indexes is enabled, MySQL can also log statements that do not use indexes for row lookups. That option is useful during diagnosis, but it can produce a lot of noise on busy systems.

A practical starting configuration looks like this:

[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 1
log_output = FILE

On many production systems, 1 second is a reasonable first pass. For a latency-sensitive API, you may temporarily lower it to 0.5 or 0.2. Do that with a plan and a disk-space watch. A high-traffic database can write a surprising amount of slow log data once the threshold drops.

You can check the active settings from a MySQL session:

SHOW VARIABLES LIKE 'slow_query_log';
SHOW VARIABLES LIKE 'slow_query_log_file';
SHOW VARIABLES LIKE 'long_query_time';
SHOW VARIABLES LIKE 'log_output';
SHOW VARIABLES LIKE 'min_examined_row_limit';
SHOW VARIABLES LIKE 'log_queries_not_using_indexes';

For a temporary investigation, you can enable the log without editing the config file:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

Remember that SET GLOBAL changes may not survive a restart unless you also persist them through your normal configuration process. On MySQL 8, some teams use SET PERSIST, but I still prefer committing the intended setting to config management so the next operator can see it.

If you turn on log_queries_not_using_indexes, consider also setting log_throttle_queries_not_using_indexes so one noisy endpoint does not flood the log. MySQL supports that throttle specifically because no-index logging can grow quickly.

Group the log before reading individual queries

Raw slow logs are repetitive. You may see the same query hundreds of times with different IDs. Reading the file top to bottom wastes time and makes rare scary queries look more important than common expensive ones.

Start with mysqldumpslow, which ships with MySQL installations in many environments:

sudo mysqldumpslow -s t -t 10 /var/log/mysql/mysql-slow.log

That asks for the top ten patterns sorted by query time. The exact flags vary by version and platform, so check mysqldumpslow --help if your command behaves differently. Useful sorts include total time, average time, lock time, and rows examined.

For production investigations, pt-query-digest from Percona Toolkit is often better because it gives richer grouping and percentile-style detail. The tool is not magic; it simply saves you from doing arithmetic by hand. The important thing is to rank by impact. A query that takes eight seconds once per night may be less urgent than a query that takes 120 milliseconds but runs 600 times per second.

When reading the grouped output, look for patterns:

  • High total time: likely user-visible or resource-heavy.
  • High count: often an application loop or missing cache.
  • High rows examined with low rows sent: usually an indexing or filtering problem.
  • High lock time: possibly a transaction, write contention, metadata lock, or DDL issue.

Do not assume high Rows_examined is always bad. Reporting queries and batch jobs sometimes scan intentionally. The question is whether the scan matches the job and whether it happens at the right time.

Reproduce one query safely

Pick one query pattern and get a real sample with parameters. If the slow log normalized literals, find the original query in application logs, APM traces, or the raw slow log entry.

Before running it manually, check the blast radius. A slow SELECT on a replica is usually safe. A slow UPDATE in production is not something to casually rerun. For write queries, inspect the plan and transaction pattern first, or test against a staging copy with realistic data.

A useful scratch note for each query looks like this:

Endpoint: GET /customers/123/orders
Query pattern: orders by customer and status, newest first
Observed: 1.8s average, 420k rows examined, 20 rows sent
Table size: 12M rows
Expected result size: one page of orders
Suspicion: missing composite index for customer_id, status, created_at

That note keeps the work tied to a real user path instead of a random SQL snippet.

Use EXPLAIN, then read it like an operator

Run EXPLAIN on the slow query:

EXPLAIN
SELECT id, customer_id, status, created_at, total
FROM orders
WHERE customer_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

For MySQL 8, EXPLAIN ANALYZE can execute the query and show actual timing information. Use it carefully on expensive queries because it really runs the statement. For plain SELECT queries in a controlled environment, it can be very helpful.

The columns I check first are type, possible_keys, key, rows, filtered, and Extra.

type tells you the access pattern. const, eq_ref, ref, and range are usually good signs. index means MySQL is scanning an index, which may still be too much work. ALL means a full table scan. A full scan is not automatically wrong on a tiny table, but it is suspicious on a hot table with millions of rows.

key shows the index MySQL chose. If possible_keys lists a promising index but key is different, the optimizer may think the other index is cheaper. That can happen because of poor selectivity, stale statistics, or an index that does not match the filter and sort together.

rows is an estimate, not a promise. If the estimate is wildly wrong, run ANALYZE TABLE during an appropriate maintenance window or review whether the data distribution is skewed.

Extra often tells the story. Using filesort means MySQL needs a separate sort step; it does not necessarily mean disk sorting, but it is worth checking when the result set is large. Using temporary often appears with grouping, distinct queries, or complex sorts. Using index can be good because the query is satisfied from the index without reading table rows.

Fix indexes with the full query shape in mind

The most common slow-query fix is not "add an index to the column in the WHERE clause." The better rule is: build an index that matches the way the query filters, joins, sorts, and limits rows.

For the orders query above, a single-column index on customer_id might help, but it may still leave MySQL sorting many rows for that customer. A composite index is often more useful:

CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at);

If the query orders by newest first, MySQL can often scan the index in reverse. In MySQL 8 you can also define descending indexes when that fits a larger pattern:

CREATE INDEX idx_orders_customer_status_created_desc
ON orders (customer_id, status, created_at DESC);

Column order matters. Put equality filters first, then range or sort columns when that matches the query. For example, with WHERE customer_id = ? AND status = ? ORDER BY created_at DESC LIMIT 20, customer_id, status, created_at is usually more useful than created_at, customer_id, status.

Do not add every index that seems helpful. Indexes speed reads but slow writes and consume storage. If the table receives heavy inserts or updates, a new composite index has a real cost. Check existing indexes first:

SHOW INDEX FROM orders;

Sometimes the right answer is to replace two weak indexes with one better composite index, not keep all three.

Rewrite queries that block index use

Some slow queries are slow because they hide indexed values behind functions or patterns MySQL cannot use efficiently.

This version is common and painful:

SELECT *
FROM orders
WHERE YEAR(created_at) = 2026;

If created_at is indexed, wrapping it in YEAR() can prevent a normal range lookup. Write the predicate as a range instead:

SELECT id, customer_id, status, created_at, total
FROM orders
WHERE created_at >= '2026-01-01'
  AND created_at <  '2027-01-01';

The same idea applies to leading wildcard searches:

WHERE email LIKE '%@example.com'

A normal B-tree index cannot jump into the middle of a string. If suffix searches matter, you may need a generated column, a separate normalized field, or a search system designed for that use case.

Also watch for SELECT *. It looks harmless during development, but it can force MySQL to read table rows when a smaller projection could use a covering index. It also sends unnecessary data across the network.

Check locks when the query plan looks fine

A query can have a decent plan and still be slow because it is waiting. Slow log Lock_time can point you in that direction, but it does not explain every kind of wait. If users report random pauses, check active sessions:

SHOW PROCESSLIST;

On MySQL 8, the Performance Schema and sys schema views can give better detail, depending on how the server is configured. For a quick look, I often check for long-running transactions and blocked statements before changing indexes.

A real example: an UPDATE orders SET status = ? WHERE id = ? query should be fast. If it appears in the slow log with a primary key lookup, the problem may be a transaction that left the row locked while doing unrelated work. The fix is not another index. The fix is shortening the transaction and moving slow external calls outside it.

Metadata locks can create a similar trap. A migration running ALTER TABLE may wait on an old transaction, while new queries pile up behind the pending DDL. The slow query log will show symptoms, but the root cause is deployment behavior.

Tune server settings only after query work

Configuration matters, but it is easy to overuse as a first answer. If a query scans five million rows to return ten, increasing memory may only make the bad plan hurt less.

For InnoDB-heavy systems, innodb_buffer_pool_size is the first setting to review. On a dedicated MySQL server it is often set to a large share of memory, but the right value depends on what else runs on the host, dataset size, and workload. Do not copy a percentage blindly from a blog post.

Also check whether the database is waiting on disk. If the working set does not fit in memory, or the storage is saturated, even well-indexed queries can stall. Pair query review with host metrics: CPU, disk latency, IOPS, memory pressure, and connection count.

Connection pools can make slow queries look worse. If one endpoint fires too many slow statements, the pool fills, unrelated requests wait for connections, and the whole app feels broken. In that case, fixing the query is still the main job, but pool limits and timeouts determine how gracefully the system fails.

Prove the fix

After adding an index or rewriting a query, run EXPLAIN again. You want to see fewer estimated rows, a better chosen key, and fewer expensive extra steps. Then test the actual query with realistic parameters.

Do not stop at one fast run. Warm cache can hide problems. Try common, large, and awkward cases:

  • A customer with many orders.
  • A customer with no matching orders.
  • A date range that spans a busy period.
  • A status value that matches most rows.

Then watch the slow log after deployment. The best outcome is not "the query looked better in staging." The best outcome is that the query pattern disappears from the top offenders, CPU or I/O pressure drops, and the user path is faster.

MySQL slow query troubleshooting is mostly disciplined evidence gathering. Enable the log with sane thresholds, group the expensive patterns, inspect the plan, fix the query shape, and validate with fresh data. That habit prevents both under-fixing and over-fixing, which is exactly what you want when the database is already under pressure.