Mastering Git Garbage Collection for Peak Performance
Learn when to run git gc, what it cleans up, and how to avoid risky aggressive cleanup on active repositories.
Git garbage collection keeps your repository from collecting loose objects, stale unreachable commits, and inefficient pack files forever. If your repo feels slow, takes too much disk space, or has had a lot of rebases and branch cleanup, git gc is one of the first maintenance tools to understand.
You usually do not need to run it every day. Git runs automatic maintenance during normal commands when certain thresholds are reached. Still, knowing what it does helps you avoid two common mistakes: ignoring a bloated repository for months, or running aggressive cleanup on a shared repo without understanding the impact.
What Git Garbage Collection Does
Git stores data as objects: commits, trees, blobs, and tags. New objects may start as loose files under .git/objects/. Over time, Git can pack many objects together into compact packfiles. Packed objects use disk more efficiently and are usually faster for Git to scan.
git gc performs several maintenance tasks, including:
- Packing loose objects into packfiles.
- Consolidating existing packfiles when useful.
- Removing unreachable objects that are old enough to prune.
- Cleaning up temporary files left by interrupted operations.
- Updating auxiliary data such as commit-graph files in modern Git setups when configured.
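Unreachable does not always mean safe to delete immediately. A commit can become unreachable after a rebase, amend, reset, or deleted branch. Git normally keeps recently unreachable objects for a grace period (two weeks by default, controlled by gc.pruneExpire), and it also keeps any object that an unexpired reflog entry still references, so you have time to recover with git reflog.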
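To see which commits currently depend on that grace period, a small wrapper around git fsck can help. This is only a sketch: the function name list_unreachable_commits and the branch name rescue are made up for the example, and the helper reads the repository without deleting anything.

```shell
# Hypothetical helper: print the IDs of commits that no branch or tag reaches.
# --no-reflogs makes fsck ignore reflog entries, so commits that are only kept
# alive by the reflog (for example after a reset or amend) are listed as well.
list_unreachable_commits() {
  git fsck --unreachable --no-reflogs 2>/dev/null |
    awk '$1 == "unreachable" && $2 == "commit" { print $3 }'
}

# Usage, from inside a repository:
#   list_unreachable_commits
#   git show <id>              # inspect a candidate
#   git branch rescue <id>     # "rescue" is a placeholder branch name
```

Pointing a branch at a commit makes it reachable again, so garbage collection will leave it alone.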
Check Repository Size Before You Clean
Start by measuring the repository instead of guessing:
git count-objects -vH
Useful fields include count, size, in-pack, packs, and size-pack. A high loose object count can make everyday Git operations slower. A large size-pack may simply mean the repository has a lot of real history, large binary files, or vendor assets.
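If you want to check the loose object count from a script, one possible approach is to parse the count field. The helper name loose_object_status is invented for this sketch, and the default limit of 6700 is borrowed from Git's documented gc.auto default rather than being a recommendation.

```shell
# Hypothetical helper: read `git count-objects -v` output on stdin and print
# "high <count>" or "ok <count>" depending on a loose-object limit.
# The limit defaults to 6700, which matches the documented gc.auto default.
loose_object_status() {
  threshold="${1:-6700}"
  awk -F': ' -v limit="$threshold" '
    $1 == "count" {
      status = ($2 + 0 > limit + 0) ? "high" : "ok"
      print status, $2
    }
  '
}

# Usage, from inside a repository:
#   git count-objects -v | loose_object_status
#   git count-objects -v | loose_object_status 1000
```

Reading from stdin keeps the parsing separate from the Git call, which makes the helper easy to try on saved output.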
To inspect disk usage directly, run:
du -sh .git
If .git is huge because someone committed build artifacts or large archives, garbage collection alone may not solve the real problem. You may need to remove large files from future commits, move them to Git LFS, or rewrite history with a tool such as git filter-repo after coordinating with the team.
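Before deciding on Git LFS or a history rewrite, it helps to know which files are responsible. The sketch below lists the largest blobs reachable from any ref; the function name largest_blobs and the default of ten results are choices made for this example.

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref.
# Output format: <size-in-bytes> <object-id> <path>
largest_blobs() {
  n="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk '$1 == "blob"' |
    sort -k2,2nr |
    head -n "$n" |
    cut -d' ' -f2-
}

# Usage, from inside a repository:
#   largest_blobs        # top 10
#   largest_blobs 25     # top 25
```

The sizes shown are uncompressed object sizes, so they indicate which files to investigate rather than the exact space each one occupies in a packfile.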
Run Normal Garbage Collection
For routine cleanup, use:
git gc
This is the safe default. It lets Git decide what maintenance work is worth doing and respects normal pruning rules.
You can ask Git to do automatic maintenance only if thresholds say it is needed:
git gc --auto
Most users do not need to call --auto manually because Git already does this behind the scenes. It is still useful in scripts where you want a low-cost cleanup pass without forcing a full repack every time.
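The main threshold behind --auto is the gc.auto setting, which defaults to 6700 loose objects; a related setting, gc.autoPackLimit, defaults to 50 packfiles, and setting gc.auto to 0 disables automatic collection. As a small sketch, a script could report the effective value before calling --auto. The function name effective_gc_auto is hypothetical.

```shell
# Hypothetical helper: print the loose-object threshold `git gc --auto` uses
# in this repository. Falls back to 6700, the documented default, when
# gc.auto is not configured anywhere.
effective_gc_auto() {
  git config --get gc.auto || echo 6700
}

# Usage in a script:
#   echo "gc.auto threshold: $(effective_gc_auto)"
#   git gc --auto
```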
If you want to remove unreachable objects immediately instead of waiting out the standard grace period, run:
git gc --prune=now
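Use --prune=now carefully. It deletes unreachable objects regardless of their age, so anything no longer referenced by a branch, tag, or unexpired reflog entry is gone for good. Git's documentation also warns that it increases the risk of corruption if another process is writing to the repository at the same time. Avoid it right after a complicated rebase, branch deletion, or reset unless you are certain you do not need the old objects.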
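One way to reduce the risk is to preview what would be deleted first. This sketch wraps git prune --dry-run, which lists loose unreachable objects without removing them. Unreachable objects already stored inside packfiles are not shown, so treat the output as a partial picture. The function name preview_prune is invented for this example.

```shell
# Hypothetical helper: list loose unreachable objects older than a cutoff
# without deleting anything. The cutoff defaults to "now".
# Output format: <object-id> <type>
preview_prune() {
  git prune --dry-run --expire="${1:-now}"
}

# Usage, from inside a repository:
#   preview_prune                # loose objects an immediate prune could remove
#   preview_prune 2.weeks.ago    # only objects past the default grace period
```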
Be Careful with --aggressive
git gc --aggressive tells Git to spend more CPU time trying to optimize object packing:
git gc --aggressive
It is not a magic speed button. On many repositories, the extra work gives little benefit compared with normal git gc, and it can take a long time on large histories. Use it only when you have measured a real repository-size or performance problem and can afford the maintenance window.
For day-to-day work, prefer plain git gc. If your repository regularly needs aggressive cleanup, the deeper issue is often large files, generated artifacts, or a workflow that creates lots of unreachable history.
Use Modern Git Maintenance for Ongoing Care
Recent Git versions include git maintenance, which can schedule background tasks such as prefetching, commit-graph updates, and incremental repacking depending on your platform and configuration.
To run maintenance once:
git maintenance run
To register the current repository and enable scheduled background maintenance under your user account:
git maintenance start
Check your Git version and local documentation before relying on scheduled maintenance in automation, because the exact scheduler integration differs by operating system and Git build.
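If you would rather run only a couple of inexpensive tasks than a full pass, git maintenance run accepts --task options. The following is a sketch under the assumption that your Git build ships git maintenance; the function name run_light_maintenance and the choice of tasks are illustrative.

```shell
# Hypothetical helper: run two inexpensive maintenance tasks once, in order.
# Assumes a Git release that includes `git maintenance`.
run_light_maintenance() {
  git maintenance run --task=commit-graph --task=loose-objects
}

# Usage, from inside a repository:
#   run_light_maintenance
#   git maintenance stop    # remove the background schedule if you enabled one
```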
Practical Cleanup Workflow
A safe cleanup flow for a local repository looks like this:
git status
git count-objects -vH
git gc
git count-objects -vH
Make sure your working tree is clean before maintenance. Git can run garbage collection with local changes present, but a clean tree removes doubt if you need to troubleshoot afterward.
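The same flow can be wrapped in a small function so the before and after numbers appear side by side. This is a minimal sketch for a local, non-bare repository; the name gc_with_report is made up, and the decision to warn about a dirty tree rather than stop is an assumption you may want to change.

```shell
# Hypothetical helper: run a normal gc and print object counts before and after.
gc_with_report() {
  # Warn, but do not stop, when the working tree has uncommitted changes.
  if [ -n "$(git status --porcelain)" ]; then
    echo "warning: working tree has uncommitted changes" >&2
  fi
  echo "before:"
  git count-objects -vH
  git gc --quiet
  echo "after:"
  git count-objects -vH
}

# Usage, from inside a repository:
#   gc_with_report
```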
For a shared bare repository on a server, schedule maintenance during a quiet period. Avoid running heavy repacks during peak CI activity, because clone, fetch, and push operations may compete for disk and CPU.
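On a server, one modest precaution is to run the job at reduced CPU priority. The sketch below uses the standard nice utility, which affects CPU scheduling only and does nothing for disk contention. The function name gc_low_priority and the example path are placeholders.

```shell
# Hypothetical helper: run a normal gc on the given repository at the lowest
# CPU priority, so it yields to clone, fetch, and push traffic.
gc_low_priority() {
  repo="$1"
  nice -n 19 git -C "$repo" gc --quiet
}

# Usage (the path is a placeholder):
#   gc_low_priority /srv/git/project.git
```

Calling this from whatever scheduler the server already uses keeps the timing decision outside the helper.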
When Garbage Collection Will Not Help
Garbage collection cannot fix every slow Git repository. It will not remove files that are still reachable in history. It will not make a monorepo small if the active history genuinely contains years of large assets. It will not repair a corrupt repository by itself.
If performance is still poor after normal cleanup, look for these causes:
- Large binary files committed directly to Git.
- Too many generated files tracked in the repository.
- Antivirus or filesystem indexing scanning .git on every operation.
- Slow network storage hosting the working tree.
- Very large working trees where sparse checkout may help.
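Two of these causes are easy to check from the command line. As a rough sketch, the helpers below count tracked files and list files that are tracked even though an ignore rule matches them, which often points to generated output committed by mistake. Both function names are invented for this example.

```shell
# Hypothetical helper: number of files Git tracks in the current repository.
tracked_file_count() {
  git ls-files | wc -l
}

# Hypothetical helper: tracked files that match an ignore rule
# (.gitignore, .git/info/exclude, or the global excludes file).
tracked_but_ignored() {
  git ls-files --cached --ignored --exclude-standard
}

# Usage, from inside a repository:
#   tracked_file_count
#   tracked_but_ignored
#   git rm --cached <path>    # stop tracking a file but keep it on disk
```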
Use git gc as maintenance, not as a substitute for repository hygiene. Run normal cleanup when object counts grow, avoid aggressive cleanup unless you have measured a need, and treat large tracked artifacts as a workflow problem to fix at the source.