Common Jenkins Performance Bottlenecks and How to Fix Them
Struggling with a slow Jenkins instance? This comprehensive guide dives into common Jenkins performance bottlenecks, including memory leaks, disk space issues, and excessive logging. Learn to identify symptoms, understand root causes, and implement actionable solutions like JVM tuning, intelligent build history management, log optimization, and efficient pipeline coding. Discover essential monitoring tools and best practices to keep your CI/CD pipelines running smoothly, ensuring faster builds, responsive UI, and an overall more efficient software delivery process.
A slow Jenkins instance usually does not have one cause. The UI feels heavy, builds wait in the queue, agents disconnect, logs take forever to open, and someone says, "Jenkins is down again." Underneath that complaint, the problem is often one of a few ordinary bottlenecks: controller heap pressure, slow disk, overloaded agents, plugin trouble, bad pipeline behavior, or network delays to source control and artifact systems.
The fastest way to fix Jenkins performance is to separate controller problems from build problems. If the Jenkins UI, queue, and job pages are slow even when no builds are running, start with the controller. If the UI is fine but builds take too long, start with agents, workspaces, caches, and external systems.
Controller memory and garbage collection
The Jenkins controller is a Java process. It needs enough heap for job configuration, plugins, build metadata, queue state, and web requests. When the heap is too small, the controller spends too much time in garbage collection. When a plugin leaks memory or stores too much data in memory, increasing heap may only delay the next incident.
Symptoms include a slow UI, long pauses, OutOfMemoryError, frequent agent disconnects, or build queue delays that do not match available executors.
Check the process and logs first:
ps -o pid,rss,vsz,etime,cmd -C java
journalctl -u jenkins --since "2 hours ago" | grep -Ei 'OutOfMemory|GC overhead|heap|killed'
For a moderate controller, a heap such as 2 to 4 GB may be enough. Busy installations may need more. Do not blindly set heap to most of the machine's RAM. The OS still needs memory for filesystem cache, process overhead, and monitoring agents.
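The variable name depends on how Jenkins is installed: the official Docker image reads JENKINS_JAVA_OPTS (or JAVA_OPTS), while the systemd packages read JAVA_OPTS from the service unit or a drop-in override. Typical options look like this:
JENKINS_JAVA_OPTS="-Xms1g -Xmx4g -XX:+UseG1GC"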
After changing JVM options, restart Jenkins during a maintenance window and watch behavior under normal load. If memory climbs steadily for days and never settles, take a heap dump and review recently updated or installed plugins. Keep plugins current, but avoid updating a large plugin set without a rollback plan.
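Before reaching for a heap dump, it can help to confirm that the old generation really stays full. The following is a minimal sketch, assuming the JDK's jstat tool is available on the controller; the function name old_gen_over and the 90 percent default are illustrative choices, not Jenkins recommendations.

```shell
# Reads `jstat -gcutil` output on stdin and prints every sample where
# old-generation occupancy (the "O" column) is at or above a threshold.
# The column is located by its header name, so extra columns do not matter.
old_gen_over() {
  threshold="${1:-90}"   # assumed alert level in percent
  awk -v limit="$threshold" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "O") col = i; next }
    col && $col + 0 >= limit + 0 { print "sample " (NR - 1) ": old gen " $col "%" }
  '
}

# Example use (assumes the controller command line contains "jenkins.war"
# and that jstat runs as the same user as the Jenkins process):
#   jstat -gcutil "$(pgrep -f jenkins.war)" 5000 12 | old_gen_over 90
```

If most samples are flagged even right after full collections, the heap is either too small for the workload or something is retaining memory, and a heap dump (for example with jcmd <pid> GC.heap_dump <file>) becomes worth the disruption.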
Disk space and disk I/O
Jenkins uses disk constantly. JENKINS_HOME stores job config, build records, fingerprints, plugin data, secrets, logs, and sometimes too many artifacts. Agents use disk for workspaces, dependency caches, Docker layers, test reports, and temporary files.
A full disk is obvious. A slow disk is more annoying because nothing looks broken; everything just waits.
Check both capacity and latency:
df -h
du -sh /var/lib/jenkins/* 2>/dev/null | sort -h | tail
iostat -xz 1
If %util and the await columns (r_await and w_await in recent sysstat versions) stay high during builds, the disk is likely a bottleneck. Common fixes are moving workspaces to faster storage, pruning old Docker layers, reducing artifact retention, and stopping jobs from archiving entire directories when only reports or packages are needed.
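The capacity half of that check is easy to script. This is a rough sketch, assuming POSIX-format df -P output; the function name disks_over and the 85 percent default are hypothetical, and mount points containing spaces are not handled.

```shell
# Reads `df -P` output on stdin and prints mount points whose usage is at
# or above a limit. Column 5 is the capacity percentage, column 6 the mount.
disks_over() {
  limit="${1:-85}"   # assumed alert threshold in percent
  awk -v limit="$limit" '
    NR > 1 {
      pct = $5
      sub(/%/, "", pct)
      if (pct + 0 >= limit + 0) print $6 " " pct "%"
    }
  '
}

# Example use on a controller or agent:
#   df -P | disks_over 85
```

Run from cron or a monitoring agent, a check like this catches a filling JENKINS_HOME or workspace volume before builds start failing.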
Set build discarding policies on jobs:
options {
    buildDiscarder(logRotator(numToKeepStr: '30', artifactNumToKeepStr: '10'))
}
Be careful with manual cleanup in JENKINS_HOME. Do not delete random XML files or plugin directories while Jenkins is running. Use Jenkins retention settings, plugin-specific cleanup tools, and backups.
Too much work on the controller
One of the most damaging configurations is running builds on the controller. The controller should not be compiling, running tests, building Docker images, or doing large checkouts.
Set controller executors to 0 in most installations. Put builds on agents. If you already run controller builds, move them gradually and watch for hidden assumptions such as local tool paths, credentials bindings, or files that only exist on the controller.
Also check pipelines for controller-heavy Groovy code. Pipeline steps such as sh run on agents, but the Groovy logic in the Jenkinsfile itself runs on the controller. Avoid reading huge files into Groovy variables, building massive maps, or doing large JSON processing in pipeline script. Use shell, Python, jq, or your build tool on the agent for heavy data work.
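As an illustration of keeping data work on the agent, the sketch below reduces a large log file to one summary line, so the pipeline only ever receives a short string. The function name summarize_levels and the ERROR/WARN/INFO log format are assumptions made for the example.

```shell
# Counts lines per log level in a (possibly very large) file and prints a
# one-line summary. The first field on each line that is exactly ERROR,
# WARN, or INFO decides the level for that line.
summarize_levels() {
  awk '
    { for (i = 1; i <= NF; i++) if ($i ~ /^(ERROR|WARN|INFO)$/) { n[$i]++; break } }
    END { printf "ERROR=%d WARN=%d INFO=%d\n", n["ERROR"], n["WARN"], n["INFO"] }
  ' "$1"
}

# Example use inside an agent workspace:
#   summarize_levels build/app.log
```

In a Jenkinsfile, a script like this would typically be called through the sh step with returnStdout: true, so the controller handles a few dozen bytes instead of the whole file.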
Agents are overloaded or mismatched
If queue time is high for one label, adding generic executors will not help. A job requiring linux && docker && large-memory needs that exact capacity.
Look at queue reasons and label usage. Then check the agent OS:
uptime
free -h
mpstat 1
iostat -xz 1
docker system df
If an agent is swapping, reduce executors or increase memory. If CPU is pegged and build duration rises during busy periods, reduce concurrency or add agents. If I/O wait is high, move caches and workspaces to faster storage or lower the number of concurrent jobs on that node.
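To turn the swap check into a number you can alert on, something like the following may be enough. It is a sketch that assumes a Linux agent with the usual /proc/meminfo layout; the function name swap_used_pct is hypothetical.

```shell
# Reads /proc/meminfo-style text on stdin and prints the percentage of
# swap currently in use (0 when no swap is configured).
swap_used_pct() {
  awk '
    $1 == "SwapTotal:" { total = $2 }
    $1 == "SwapFree:"  { free = $2 }
    END {
      if (total > 0) printf "%d\n", (total - free) * 100 / total
      else print 0
    }
  '
}

# Example use on an agent:
#   swap_used_pct < /proc/meminfo
```

Swap occupancy alone does not prove the node is actively swapping; pair it with the si and so columns of vmstat 1 before reducing executors.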
For Kubernetes agents, resource requests matter as much as Jenkins executor count. A pod requesting too little CPU or memory may be scheduled onto an already busy node, then Jenkins sees a ready agent while the build crawls. For disposable pods, one executor per pod is usually easier to reason about than multiple executors sharing the same container.
Plugin problems
Plugins are one of Jenkins' strengths, and also a common source of performance trouble. A plugin can add page rendering cost, slow job loading, retain too much build history, or make external calls during normal UI actions.
When performance changes suddenly, ask what changed recently:
- Jenkins core upgrade.
- Plugin upgrade.
- New plugin installation.
- New global configuration.
- New pipeline library version.
Use the administrative monitors and system log under "Manage Jenkins", along with plugin changelogs. If the UI became slow after a plugin update, test rollback in a staging controller if you have one. Keep a backup of JENKINS_HOME and a record of plugin versions before large upgrades.
Do not keep plugins "just in case." Each installed plugin adds maintenance surface. Remove unused plugins after checking job dependencies.
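Recording plugin versions before an upgrade pays off when you can diff them afterwards. The sketch below assumes you already have two snapshot files with one name:version entry per line (the same shape as the plugins.txt file used with the official Docker image); the function name plugin_changes and the file names are hypothetical.

```shell
# Compares two plugin snapshots (name:version per line) and prints which
# plugins were added, removed, or changed version between them.
# Assumes the "before" file is not empty.
plugin_changes() {
  awk -F: '
    NR == FNR { before[$1] = $2; next }
    { after[$1] = $2 }
    END {
      for (p in after) {
        if (!(p in before)) print "added " p " " after[p]
        else if (before[p] != after[p]) print "changed " p " " before[p] " -> " after[p]
      }
      for (p in before) if (!(p in after)) print "removed " p " " before[p]
    }
  ' "$1" "$2" | sort
}

# Example use:
#   plugin_changes plugins-before.txt plugins-after.txt
```

A short list of changed plugins is a much better starting point for a rollback test than guessing which update made the UI slow.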
SCM and artifact repository delays
Many "Jenkins is slow" reports are really Git, package registry, container registry, or artifact repository problems.
Check build logs for repeated slow steps:
git fetch
mvn dependency:resolve
npm ci
docker pull
docker push
archiveArtifacts
If every job waits on dependency downloads, add a nearby proxy or cache. If git fetch is slow, check repository size, branch discovery, shallow clone settings, and network path from agents to the Git server. If Docker pulls are slow on ephemeral agents, use a registry mirror or BuildKit registry cache.
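When build logs do not already show how long each external call takes, a small wrapper can make the cost visible. This is a minimal sketch; the function name timed and the "[timing]" log prefix are illustrative, and the variables are global, so nested calls would overwrite each other.

```shell
# Runs a command, then prints how many whole seconds it took and its exit
# status. The wrapped command's exit status is passed through unchanged.
timed() {
  label="$1"
  shift
  start="$(date +%s)"
  "$@"
  status=$?
  end="$(date +%s)"
  echo "[timing] $label: $((end - start))s (exit $status)"
  return "$status"
}

# Example use inside a build script:
#   timed "git fetch" git fetch --prune
#   timed "npm ci" npm ci
```

Grepping console output for the prefix across several builds shows quickly whether the time goes to source control, dependency downloads, or image pushes.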
Keep the diagnosis honest: Jenkins schedules the work, but it cannot make a distant package registry fast.
Log and build history bloat
Large console logs slow page rendering and take storage. Jobs that print every test fixture, every HTTP response, or full debug logs during normal builds eventually make Jenkins painful to use.
Fix the job first. Reduce normal log verbosity and archive detailed logs as compressed artifacts only when needed. Keep console output focused on progress and failure context.
Then set retention. When both limits are set, a build is discarded as soon as it exceeds either one:
options {
    buildDiscarder(logRotator(daysToKeepStr: '30', numToKeepStr: '50'))
}
For compliance-heavy environments, move long-term artifacts and logs to an external storage system designed for retention, search, and lifecycle policies.
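To find which jobs produce oversized console logs in the first place, you can scan the build directories. The sketch below assumes the default on-disk layout, where each build stores its console output in a file named log under jobs/<job>/builds/<number>/; the function name large_logs and the 50 MiB default are hypothetical.

```shell
# Lists console log files larger than a size cutoff (in KiB) under a
# Jenkins home or jobs directory. Read-only: it only reports paths.
large_logs() {
  root="$1"
  min_kb="${2:-51200}"   # assumed cutoff: 50 MiB
  find "$root" -type f -name log -size +"${min_kb}k"
}

# Example use (the path is the common package default, adjust as needed):
#   large_logs /var/lib/jenkins/jobs 51200
```

The output tells you which jobs to quiet down; leave the actual deletion to the retention settings rather than removing files by hand.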
A practical incident path
When Jenkins is slow right now, use this order:
- Check whether the controller UI is slow.
- Check controller CPU, memory, GC symptoms, and disk.
- Check queue reasons and which labels are waiting.
- Check the busiest agents for CPU, memory, disk, and workspace growth.
- Compare recent plugin, job, and shared library changes.
- Read one slow build log and identify the repeated expensive step.
That path prevents random tuning. Increasing heap will not fix a saturated Docker agent. Adding executors will not fix a full disk. Pruning workspaces will not fix a plugin causing controller pauses.
Keep Jenkins maintainable
Healthy Jenkins installations have boring habits: controller executors set to zero, agents sized for their workload, build retention configured, dependency caches intentional, plugin updates tracked, and basic metrics exported to Prometheus, Grafana, CloudWatch, or whatever monitoring system the team already uses.
The best fix is often small and specific. Move Docker builds to dedicated agents. Cut a noisy job's log output. Add a Maven proxy. Reduce executors on a swapping node. Remove an unused plugin. Put retention on a job that has kept every build for years.
Jenkins performance improves when you stop treating it as one black box and start following the work: from queue, to controller, to agent, to filesystem, to network dependency, and back to the build log.
Example: the build is slow but Jenkins is fine
Suppose developers report that pull request checks take 25 minutes. The Jenkins UI is responsive. The queue is short. Agents are online. The slow log shows:
git fetch: 20 seconds
npm ci: 9 minutes
unit tests: 4 minutes
docker build: 10 minutes
archive artifacts: 1 minute
This is not primarily a Jenkins controller problem. The likely fixes are package caching, Dockerfile layer ordering, BuildKit cache, and maybe splitting tests. Increasing controller heap would change nothing.
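The breakdown above adds up to 24 minutes 20 seconds, and ranking the steps by share makes the priorities explicit. Here is a small sketch that does the arithmetic, assuming input lines in the same "name: N seconds" or "name: N minutes" shape; the function name rank_steps is hypothetical.

```shell
# Reads "step name: N seconds|minutes" lines on stdin and prints each step
# as "<seconds> <name> <share of total>", largest first.
rank_steps() {
  awk -F': ' '
    {
      split($2, d, " ")
      secs = d[1] * (d[2] ~ /^minute/ ? 60 : 1)
      total += secs
      name[NR] = $1
      dur[NR] = secs
    }
    END {
      for (i = 1; i <= NR; i++)
        printf "%d %s %.0f%%\n", dur[i], name[i], dur[i] * 100 / total
    }
  ' | sort -rn
}

# With the five lines from the example as input, the first two rows are:
#   600 docker build 41%
#   540 npm ci 37%
```

Docker build and npm ci together account for roughly 78 percent of the run, which is why caching is the first thing to try here.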
Example: everything waits but agents are idle
If jobs are queued while agents appear idle, read the queue reason. You may find that jobs require linux && docker, while the idle agents only have linux. Or the jobs may be blocked by disableConcurrentBuilds, a lockable resource, or a cloud plugin that is failing to provision matching agents.
That kind of bottleneck is configuration, not raw capacity. Adding two more unmatched agents will not help.
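Queue reasons are also exposed through the JSON API at /queue/api/json, where each queued item carries a why field. The sketch below tallies those reasons; it is a crude text extraction rather than a real JSON parser, the function name queue_reasons is hypothetical, and JENKINS_URL, JENKINS_USER, and JENKINS_TOKEN are placeholder variable names.

```shell
# Reads the queue API JSON on stdin and prints each distinct "why" reason
# with its count, most frequent first. Relies on compact JSON output and
# on reasons not containing escaped double quotes.
queue_reasons() {
  grep -o '"why":"[^"]*"' |
    sed 's/^"why":"//; s/"$//' |
    sort | uniq -c | sort -rn |
    awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "x " $0 }'
}

# Example use (-g stops curl from treating the brackets as a glob):
#   curl -g -s -u "$JENKINS_USER:$JENKINS_TOKEN" \
#     "$JENKINS_URL/queue/api/json?tree=items[why]" | queue_reasons
```

If one reason dominates and names a label or a lock, you have found a configuration bottleneck without opening a single job page.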
Example: the controller slows down every afternoon
If the UI degrades at the same time each day, look for scheduled jobs: branch indexing, backups, large artifact cleanup, vulnerability scans, or nightly pipelines starting too early. Check controller CPU, heap, and disk I/O during that window. Also check whether many jobs start at exactly the same minute due to cron expressions like 0 2 * * *.
In Jenkins schedules, prefer hashed timing where possible:
H 2 * * *
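That spreads jobs across the hour instead of starting them all at the same minute. The H value is derived from a hash of the job name, so each job keeps a stable slot from run to run.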
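To see why hashing spreads load, the toy function below maps a job name to a minute between 0 and 59. It uses cksum purely for illustration and is not the hash Jenkins uses internally; the function name hashed_minute and the job names are made up.

```shell
# Maps a job name to a stable minute in the range 0-59.
# Illustrative only: Jenkins computes H with its own hashing scheme.
hashed_minute() {
  printf '%s' "$1" | cksum | awk '{ print $1 % 60 }'
}

# Example use: different names usually land on different minutes, and the
# same name always lands on the same one.
#   hashed_minute team-a/nightly-build
#   hashed_minute team-b/nightly-build
```

The same idea applies to any field in the schedule: the job runs once per period, but not in lockstep with every other job that shares the period.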
What good monitoring should answer
At minimum, monitoring should answer these questions without logging into the server:
- Is the controller process alive and responsive?
- How much heap is used, and how often is garbage collection running?
- How long are jobs waiting in the queue by label?
- Which agents are offline or repeatedly reconnecting?
- Are controller and agent disks close to full?
- Are builds getting slower over time for the same jobs?
You do not need a perfect dashboard to start. Even a few metrics and alerts for disk, heap, queue length, and agent availability will catch many failures before developers report them.