Reduce Docker Image Size: A Practical Guide to Faster Builds
Make Docker images smaller and builds faster with multi-stage builds, cache-friendly Dockerfiles, and clean build contexts.
Reducing Docker image size is not about chasing the smallest number on docker image ls. The useful goal is a build that is fast, repeatable, easy to patch, and small enough that it does not slow down CI, deployments, or local development.
Bloated images and slow, unpredictable builds usually come from a few ordinary habits: copying the whole repository too early, installing build tools into the runtime image, leaving package manager caches behind, using latest tags, and sending a huge build context to Docker. Fix those habits and most teams see cleaner builds without doing anything exotic.
Start by measuring the current image:
docker image ls my-app
docker history --no-trunc my-app:latest
docker history shows which Dockerfile instructions added size. If one RUN step adds a large amount, inspect that step before changing everything else. For a deeper look, tools such as dive can show files added and removed in each layer, but the built-in commands are enough for the first pass.
Pick a base image for compatibility first, size second
Base image choice matters, but the smallest image is not always the best image. Alpine is small because it uses musl libc and a minimal userspace. That works well for many Go, Node, Python, and utility images, but it can create friction with native extensions, prebuilt binaries, DNS behavior, or packages that expect glibc.
A practical order is:
- Use an official image for your runtime.
- Prefer a versioned tag such as python:3.12-slim over python:latest.
- Try slim before jumping to Alpine if your app has native dependencies.
- Use distroless or scratch when you already know exactly what the runtime needs.
For example, a Python service with psycopg, image libraries, or cryptography packages may be easier to maintain on python:3.12-slim than python:3.12-alpine. The Alpine image may be smaller at first, but if you add compilers and compatibility packages to make dependencies build, the final result may not be simpler or much smaller.
For Go services, scratch or distroless can be excellent because a statically compiled binary may need very little at runtime. Even then, remember TLS certificates, timezone data, and non-root users. A tiny image that cannot make HTTPS requests is not a useful production image.
Use multi-stage builds to keep build tools out of production
Multi-stage builds are the most reliable way to reduce Docker image size for compiled apps and frontend builds. The builder stage can be large. It can contain compilers, package managers, test tools, and source files. The final stage should contain only what the app needs to run.
A Go example:
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /out/server ./cmd/server
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]
The important part is not the exact base image. The important part is the boundary: source code, module cache, and compiler stay in builder; the final image gets the binary.
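If the final stage is scratch instead of distroless, the earlier caveats about TLS certificates, timezone data, and non-root users have to be handled by hand. The following is a minimal sketch, not a definitive recipe. It assumes the Debian-based golang image, installs the two packages explicitly so the copied paths are guaranteed to exist, and uses 65532 as an arbitrary unprivileged ID.

```dockerfile
FROM golang:1.22 AS builder
# Install the CA bundle and timezone database so the paths copied below exist.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates tzdata \
 && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /out/server ./cmd/server

FROM scratch
# CA certificates so outbound HTTPS connections can be verified.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
# Timezone database for time.LoadLocation.
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /out/server /server
# scratch has no /etc/passwd, so the user is given as a numeric UID:GID.
# 65532 is an arbitrary unprivileged ID chosen for this sketch.
USER 65532:65532
ENTRYPOINT ["/server"]
```

The distroless static image already ships the certificates, timezone data, and a nonroot user, which is one reason it is often the lower-maintenance choice.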
A Node app has a similar shape, but you need to separate development dependencies from production dependencies:
FROM node:22-slim AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
FROM deps AS build
COPY . .
RUN npm run build
FROM node:22-slim AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/server.js"]
For frontend assets served by nginx, the final image does not need Node at all:
FROM node:22-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM nginx:1.27-alpine
COPY --from=build /app/dist /usr/share/nginx/html
That pattern removes node_modules, the source tree, and build tooling from the runtime image.
Make the Docker cache work for you
Docker cache invalidation is simple but unforgiving. If an instruction changes, or the files a COPY or ADD instruction brings in change, Docker rebuilds that layer and every layer after it. If you copy the whole project before installing dependencies, every code edit can force a dependency reinstall.
This is slow:
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
This is usually faster:
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
The dependency layer only rebuilds when requirements.txt changes. Source changes only rebuild the later COPY . . layer.
The same rule applies to Maven, Gradle, npm, pnpm, Cargo, and Go modules. Copy dependency manifests first, install or download dependencies, then copy the rest of the source.
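For instance, a Maven build can follow the same manifest-first order. This is a sketch under assumptions: the image tags, the /app paths, and a build that produces target/app.jar are illustrative and not taken from any particular project.

```dockerfile
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /app
# Copy only the manifest so this layer is reused until pom.xml changes.
COPY pom.xml ./
# Resolve dependencies and plugins before any source is copied.
RUN mvn -B dependency:go-offline
# Source edits invalidate the cache from here down only.
COPY src ./src
RUN mvn -B package -DskipTests

FROM eclipse-temurin:21-jre
WORKDIR /app
# Assumes the build's finalName is "app" (hypothetical).
COPY --from=build /app/target/app.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

The dependency:go-offline goal may not prefetch everything a given build needs, so the package step can still download a few artifacts. The bulk of the dependency work stays cached either way.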
Keep the build context small with .dockerignore
Docker sends the build context to the daemon before it starts building. If your context includes .git, local node_modules, coverage reports, videos, screenshots, test databases, and build artifacts, your build is slower before the Dockerfile even runs.
A basic .dockerignore might look like this:
.git
.gitignore
.env
.env.*
# Patterns match from the context root; a **/ prefix also matches nested paths.
**/node_modules
coverage
dist
build
target
**/.pytest_cache
**/__pycache__
**/*.log
**/.DS_Store
Be careful with broad ignores. If you ignore dist but your Dockerfile expects to copy a prebuilt dist, the build will fail. If you ignore .env, that is usually good for secrets, but your build should not depend on it. The build should receive required values through explicit build args or CI configuration.
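As a rough illustration of passing a value explicitly instead of reading .env, the sketch below declares a build argument. APP_VERSION and its default are hypothetical names chosen for this example, and it assumes the project's build script reads that environment variable.

```dockerfile
FROM node:22-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Hypothetical non-secret setting supplied by the caller.
# Declared after npm ci so a new value does not invalidate the dependency layer.
ARG APP_VERSION=dev
# ARG values are exposed to later RUN steps as environment variables.
RUN npm run build
```

A call such as docker build --build-arg APP_VERSION=1.4.2 -t my-app . would set the value. Build arguments are visible in the image history, so they suit non-secret values only.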
You can see context size at the start of classic Docker build output, and BuildKit output also makes large context transfers visible. If the first step takes a long time before any Dockerfile instruction runs, check .dockerignore.
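One way to see what actually ends up in the context, with .dockerignore applied, is a throwaway Dockerfile that copies the context and lists it. A minimal sketch, assuming a hypothetical file name Dockerfile.context and the public busybox image:

```dockerfile
FROM busybox:1.36
# Copy everything Docker received as build context after ignore rules.
COPY . /ctx
# Print the 20 largest files and directories, sizes in KiB.
RUN du -ak /ctx | sort -rn | head -n 20
```

Building it with docker build --no-cache --progress=plain -f Dockerfile.context . prints the listing in the build log. Anything surprising near the top of that list is a candidate for .dockerignore.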
Clean package manager caches in the same layer
A frequent mistake is installing packages in one layer and deleting caches in a later layer. The files disappear from the final filesystem view, but they are still stored in the earlier layer, so the image does not get smaller.
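The mistake described above can be sketched as follows. The base image and package are only examples. This is the pattern to avoid, not a recommendation:

```dockerfile
FROM debian:bookworm-slim
# Layer 1: package lists and packages are written and committed.
RUN apt-get update && apt-get install -y --no-install-recommends curl
# Layer 2: this hides the lists in the final filesystem, but layer 1
# still carries them, so the image is no smaller.
RUN rm -rf /var/lib/apt/lists/*
```

Running docker history on an image built this way shows the first RUN layer at its full size and the second adding almost nothing.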
For Debian and Ubuntu images:
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl && rm -rf /var/lib/apt/lists/*
For Alpine:
RUN apk add --no-cache ca-certificates curl
For Python:
RUN pip install --no-cache-dir -r requirements.txt
For npm:
RUN npm ci --omit=dev && npm cache clean --force
Do not cargo-cult cleanup commands. Use the cleanup that matches the package manager and base image. Also avoid apt-get upgrade in most application Dockerfiles. It makes builds less predictable and can drag in more changes than you intended. Patch by rebuilding regularly from maintained base images and pinning the image family you expect.
Prefer COPY unless you need ADD
COPY copies files from the build context. ADD has extra behavior: it automatically unpacks local tar archives and can fetch remote URLs, which it does not unpack. That extra behavior can surprise maintainers and make cache behavior harder to reason about.
Use COPY for normal application files:
COPY ./src ./src
Use ADD only when you intentionally want its archive extraction behavior. For remote downloads, prefer curl or wget in a RUN step where you can verify checksums and fail clearly.
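A checksum-verified download in a RUN step might look like the sketch below. The URL, the archive layout, and the TOOL_SHA256 argument are hypothetical placeholders. The real values would come from whoever publishes the artifact.

```dockerfile
FROM debian:bookworm-slim
# Hypothetical artifact location; replace with the real download URL.
ARG TOOL_URL=https://example.com/downloads/tool-1.0.0.tar.gz
# No default on purpose: an empty or wrong checksum makes the build fail.
ARG TOOL_SHA256
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/* \
 # -f makes curl exit non-zero on HTTP errors instead of saving an error page.
 && curl -fsSL -o /tmp/tool.tar.gz "$TOOL_URL" \
 # sha256sum -c exits non-zero if the digest does not match.
 && echo "$TOOL_SHA256  /tmp/tool.tar.gz" | sha256sum -c - \
 # Assumes the archive contains the binary at its top level.
 && tar -xzf /tmp/tool.tar.gz -C /usr/local/bin \
 && rm /tmp/tool.tar.gz
```

Download, verification, extraction, and cleanup share one layer, so the archive never persists in the image.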
Reduce what you install, not only what you delete
The best cache cleanup is not installing unnecessary files in the first place. Use --no-install-recommends with apt-get. Avoid installing editors, shells, package managers, and debugging tools in production images unless you have a reason.
That said, do not remove every tool if it makes operations harder. A production image can be minimal while still observable. For some teams, a slim image with a shell is a better tradeoff than a distroless image because on-call debugging is simpler. For other teams with strong logging, tracing, and ephemeral debug containers, distroless is a good fit.
The honest answer is workload-specific. Smaller images usually pull faster and expose fewer packages to patch, but maintainability still matters.
Use BuildKit cache mounts for dependency downloads
BuildKit can keep package caches outside the final image while reusing them across builds. This is different from leaving caches inside the image.
For npm:
# syntax=docker/dockerfile:1.7
FROM node:22-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
For apt, cache mounts can help in CI environments where repeated builds happen on the same builder, although many CI systems use fresh workers unless you configure persistent cache. BuildKit features are powerful, but keep the Dockerfile readable for the team that has to maintain it.
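For apt, a cache mount setup could look like the following sketch, which follows the pattern shown in Docker's own documentation. The packages installed are only examples, and the benefit assumes builds run repeatedly on the same builder.

```dockerfile
# syntax=docker/dockerfile:1.7
FROM debian:bookworm-slim
# Official Debian-based images ship a docker-clean hook that deletes
# downloaded packages; remove it so the cache mount has something to keep.
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
 && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
# sharing=locked serializes concurrent builds that use the same cache.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl
```

Because the package lists and archives live in the cache mounts, they are not written into the image layer, so no rm -rf /var/lib/apt/lists/* step is needed here.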
Check the final image contents
After changes, prove what improved:
docker build -t my-app:optimized .
docker image ls my-app
docker history my-app:optimized
Then run the image in the same way production runs it. A smaller image that fails because it lacks certificates, locale data, timezone files, or a writable directory is not an improvement.
Useful smoke tests include:
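# Arguments after the image name reach the app only if the image uses ENTRYPOINT.
docker run --rm my-app:optimized --version
# Run detached so the shell is free for the health check.
docker run -d --rm --name my-app-smoke -p 8080:8080 my-app:optimized
curl -f --retry 5 --retry-connrefused http://localhost:8080/health
docker stop my-app-smoke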
If the app writes files, test that path. If it calls HTTPS services, test TLS. If it runs as non-root, test permissions.
A practical optimization order
When I review an oversized Docker image, I use this order:
- Add or fix .dockerignore.
- Reorder the Dockerfile so dependency installation can be cached.
- Introduce multi-stage builds.
- Switch from full base images to slim, Alpine, distroless, or scratch where compatible.
- Remove package manager caches and development dependencies in the same layer they are created.
- Inspect with docker history or dive and repeat only where the evidence points.
This order avoids premature cleverness. It also keeps the Dockerfile understandable, which matters more than squeezing out the last few megabytes.