Troubleshooting Docker Containers: Common Startup Issues and Solutions

When a Docker container will not start, the fastest fix usually comes from resisting the urge to guess. A container is just a process with a filesystem, environment, network settings, and limits wrapped around it. If that process exits, Docker records why. Your job is to pull the evidence in the right order.

I usually start with three questions: did Docker create the container, did the main process start, and did something outside the process kill or block it? Those questions separate a bad image name from a broken command, a port conflict from an application crash, and a permission problem from a memory limit.

Start with the boring command that tells you the truth:

docker ps -a

Look at STATUS, PORTS, and NAMES. Created means Docker made the container but did not get its main process running. Exited (1) usually means the application itself reported a general error. Exited (127) commonly points to a missing command. Exited (137) often means the process was killed from the outside with SIGKILL (137 is 128 + 9), frequently because of memory pressure. These codes are clues, not final answers, but they keep you from debugging the wrong layer.
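As a rough aid, the codes above can be folded into a small lookup. This is a sketch, not a Docker feature: explain_exit_code is a hypothetical helper name, and the descriptions only restate common shell conventions (codes above 128 are usually 128 plus a signal number).

```shell
# Hypothetical helper: map a container exit code to a likely cause.
explain_exit_code() {
    case "$1" in
        0)   echo "exited cleanly; the main process finished" ;;
        1)   echo "application error; read the logs" ;;
        126) echo "command found but not executable" ;;
        127) echo "command not found" ;;
        137) echo "killed by SIGKILL (128 + 9); check for an OOM kill" ;;
        143) echo "terminated by SIGTERM (128 + 15)" ;;
        *)   echo "no common meaning; read the logs" ;;
    esac
}

# Usage (assumes a container named "web" exists):
#   explain_exit_code "$(docker inspect web --format '{{.State.ExitCode}}')"
```

Treat the output as a starting hypothesis. The application can return any of these codes for its own reasons.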

Then read the logs:

docker logs --tail 100 <container>
docker logs -f <container>

If the container dies immediately, docker logs is usually more useful than rerunning the same docker run command. Application frameworks often print the exact missing environment variable, migration failure, invalid config file, or bind error before exiting.

For low-level state, inspect the container:

docker inspect <container> --format '{{json .State}}'
docker inspect <container> --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} error={{.State.Error}}'

That second command is worth memorizing. It tells you whether Docker saw an OOM kill, what exit code was recorded, and whether the runtime itself had an error.

If the container exits immediately

A container stays alive only while its main process stays alive. If the command finishes, Docker stops the container. That surprises people when they run scripts that start a daemon in the background and then return.

For example, this pattern often exits:

CMD service nginx start

The service command can start nginx and then finish. Docker sees the main process end and stops the container. The container-friendly pattern is to run the server in the foreground:

CMD ["nginx", "-g", "daemon off;"]

The same idea applies to Node, Python, Java, and worker processes. The command in CMD or ENTRYPOINT should be the long-running process, not a launcher that backgrounds the real work and exits.
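When some setup has to happen before the server starts, a wrapper script can still keep the real process in the foreground by ending with exec. The sketch below is illustrative only: the script contents are an assumption, and it is written to a temporary file just so the example can be run outside an image.

```shell
# Write a hypothetical entrypoint script to a temp file for demonstration.
entrypoint=$(mktemp)
cat > "$entrypoint" <<'EOF'
#!/bin/sh
set -e
# One-time setup would go here (for example, rendering a config file).
# exec replaces this shell with the command passed as arguments, so that
# command becomes the main process instead of a child of a script that exits.
exec "$@"
EOF

# Stand-in for the real server: the wrapper hands control to whatever it is given.
sh "$entrypoint" echo "server would run here"
```

In an image, a script like this would typically be the ENTRYPOINT, with the foreground server command supplied through CMD so it arrives as "$@".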

If logs show command not found, no such file or directory, or exec format error, test the image interactively:

docker run --rm -it --entrypoint sh <image>

Some images do not include bash, especially Alpine and distroless-style images. Use sh first unless you know bash exists. Once inside, check the file path, permissions, and interpreter:

ls -l /app
which python || true
head -1 /app/start.sh

A script can exist and still fail with no such file or directory if its shebang points to a missing interpreter, such as #!/bin/bash in an image that only has /bin/sh. Another common cause is Windows line endings. If a shell script was edited on Windows, the invisible carriage return (\r) at the end of the shebang line can make Linux look for an interpreter literally named /bin/sh\r, which does not exist.
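Both shebang problems can be checked mechanically. The following is a minimal sketch, assuming a POSIX shell with head and sed available. check_shebang is a hypothetical helper name, and it only verifies the first word of the shebang, so a line like #!/usr/bin/env python passes as long as /usr/bin/env exists.

```shell
# Hypothetical helper: check a script's first line for a trailing carriage
# return, a missing shebang, or an interpreter that is not present.
check_shebang() {
    first_line=$(head -n 1 "$1")
    cr=$(printf '\r')
    case "$first_line" in
        *"$cr") echo "crlf line endings"; return 1 ;;
    esac
    case "$first_line" in
        '#!'*) ;;
        *) echo "no shebang"; return 1 ;;
    esac
    # Strip "#!" and any arguments to get the interpreter path.
    interpreter=$(printf '%s\n' "$first_line" | sed 's/^#![[:space:]]*//; s/[[:space:]].*$//')
    if [ ! -x "$interpreter" ]; then
        echo "missing interpreter: $interpreter"
        return 1
    fi
    echo "ok"
}

# Usage inside the container shell (the path is the example used above):
#   check_shebang /app/start.sh
```

If the check reports CRLF line endings, one portable fix is to strip the carriage returns with tr -d '\r' into a new file and rebuild the image from that copy.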

If Docker says the port is already allocated

Port conflicts happen on the host side. In -p 8080:80, 8080 is the host port and 80 is the container port. If anything is already listening on host port 8080, Docker cannot bind it.

You may see an error like bind: address already in use or port is already allocated. Find the listener:

sudo lsof -i :8080
# or
sudo ss -ltnp 'sport = :8080'

On macOS, lsof is usually the easiest. On Linux servers, ss is often available by default. On Windows PowerShell, use:

Get-NetTCPConnection -LocalPort 8080

Then choose a different host port or stop the service that owns it:

docker run -d -p 8081:80 nginx

Do not change the container port unless the application inside the container actually listens on that new port. If nginx listens on 80 inside the container, -p 8081:80 is correct. -p 8081:8081 will fail from the browser if nothing inside the container is listening on 8081.
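To keep the two sides of a mapping straight, it can help to spell them out. This is a small sketch with a hypothetical helper name. It handles only the simple HOST:CONTAINER form used in this section, not mappings with an IP prefix or a protocol suffix.

```shell
# Hypothetical helper: explain a simple HOST:CONTAINER mapping as used with -p.
explain_port_mapping() {
    host_port=${1%%:*}        # everything before the first colon
    container_port=${1##*:}   # everything after the last colon
    echo "host port $host_port -> container port $container_port"
}

explain_port_mapping 8080:80
explain_port_mapping 8081:80
```

Only the left-hand number changed between the two calls, which is the safe edit when the conflict is on the host.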

If the app starts but cannot find configuration

Many startup failures are missing environment variables. The image is fine, the command is fine, but the app expects DATABASE_URL, REDIS_URL, an API key, or a config file.

Check what Docker passed in:

docker inspect <container> --format '{{range .Config.Env}}{{println .}}{{end}}'

For Compose projects, inspect the resolved configuration rather than only reading docker-compose.yml:

docker compose config

This catches indentation mistakes, .env file surprises, and variables that expanded to empty strings. A real example: DATABASE_URL=${DATABASE_URL} looks harmless, but if the shell or .env file does not define it, your application may receive an empty value and fail during startup.
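One way to catch empty values early is to check them before the application starts, for instance at the top of an entrypoint script. The sketch below assumes a POSIX shell with printenv available. require_env is a hypothetical helper name, and the variable names in the usage comment are the examples from this section.

```shell
# Hypothetical helper: fail if any named variable is unset or empty in the
# environment that child processes (such as the application) would inherit.
require_env() {
    missing=0
    for name in "$@"; do
        # printenv prints nothing for unset variables and for empty values.
        if [ -z "$(printenv "$name")" ]; then
            echo "missing or empty: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage near the top of an entrypoint script:
#   require_env DATABASE_URL REDIS_URL || exit 1
```

Compose can enforce the same rule at interpolation time: writing ${DATABASE_URL:?DATABASE_URL must be set} makes it stop with that message when the variable is unset or empty.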

Be careful with secrets in logs and terminal history. For quick local debugging, passing -e NAME=value is fine. For shared systems, use your platform's secret mechanism or an environment file with controlled permissions.

If bind mounts or volumes cause permission errors

A container can fail at startup because it cannot read a config file, write a PID file, create a cache directory, or initialize a database directory. The logs usually say permission denied, read-only file system, or operation not permitted.

First inspect the mount:

docker inspect <container> --format '{{json .Mounts}}'

Then check which user the container runs as:

docker inspect <container> --format 'user={{.Config.User}}'

If user is empty, the container runs as root by default, but many production images set a non-root user. A host directory owned by your local UID may not be writable when the container runs as a different UID, such as 1001 or a service-specific user.

A practical debugging sequence is:

ls -ld ./data
docker run --rm -it -v "$PWD/data:/data" --entrypoint sh <image>
id
ls -ld /data
touch /data/test

Avoid solving every permission issue with chmod 777. It can hide the immediate problem while creating a worse one. Prefer matching ownership or using named volumes for application data:

docker volume create app_data
docker run -d -v app_data:/var/lib/app <image>

Named volumes are especially useful on Docker Desktop, where bind mounts cross a virtualization boundary and can behave differently from native Linux filesystems.
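Matching ownership starts with knowing which UID owns the host directory and which UID the container runs as. Here is a minimal sketch, assuming ls and awk on the host. owner_uid and check_owner are hypothetical helper names, and the container UID is whatever id -u printed in the debugging session above.

```shell
# Hypothetical helper: print the numeric UID that owns a path.
# Parsing `ls -ldn` avoids the GNU/BSD differences in stat flags.
owner_uid() {
    ls -ldn "$1" | awk '{print $3}'
}

# Compare a host directory's owner with the UID the container runs as.
check_owner() {
    dir=$1
    container_uid=$2
    actual=$(owner_uid "$dir")
    if [ "$actual" = "$container_uid" ]; then
        echo "match"
    else
        echo "mismatch: $dir is owned by UID $actual, container runs as UID $container_uid"
        return 1
    fi
}

# Usage (1001 is an example UID, not a recommendation):
#   check_owner ./data 1001
```

On a mismatch, the usual options are to chown the host directory to the container's UID or, if the image tolerates an arbitrary user, to start it with --user "$(id -u):$(id -g)".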

If the container was killed for memory

Exit code 137 is a strong hint that the process received SIGKILL. In Docker work, that often means the kernel or Docker Desktop killed it because memory ran out. Confirm with inspect:

docker inspect <container> --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}}'

If OOMKilled is true, you have two jobs: give the process enough memory to start, and understand why it needed that much. Raising the limit may be the right production fix for a database or JVM service. For a small web service, it may reveal a bad default.

Java apps are a classic example. Older JVMs sized their default heap from the host's total memory rather than the container's limit, and even modern JVMs still need sensible -Xmx or percentage-based settings such as -XX:MaxRAMPercentage for predictable behavior. Node services may need --max-old-space-size in memory-constrained environments. Databases may need explicit cache settings.

For a one-off test:

docker run --memory=1g <image>

If you use Docker Desktop, also check the memory assigned to the Docker VM. A container limit cannot help if the VM itself is starved.
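To see the limit the process itself is running under, you can read the cgroup files from inside the container. The sketch below assumes the standard cgroup mount at /sys/fs/cgroup. cgroup_memory_limit is a hypothetical helper name, and its optional argument exists only so the function can be pointed at a different directory.

```shell
# Hypothetical helper: print the memory limit visible under a cgroup root.
cgroup_memory_limit() {
    root=${1:-/sys/fs/cgroup}
    if [ -r "$root/memory.max" ]; then
        # cgroup v2: a byte count, or "max" when no limit is set
        cat "$root/memory.max"
    elif [ -r "$root/memory/memory.limit_in_bytes" ]; then
        # cgroup v1: a byte count (a very large number when no limit is set)
        cat "$root/memory/memory.limit_in_bytes"
    else
        echo "unknown"
    fi
}

# Usage inside a container started with --memory=1g:
#   cgroup_memory_limit
```

With --memory=1g the reported value should be 1073741824 bytes. If it shows max or an enormous number, the limit was not applied to that container.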

If the image never pulls or the build never produced an image

Sometimes there is no container problem because there is no usable image. If docker run fails before creating a container, verify the image separately:

docker image ls | grep my-app
docker pull my-registry/my-app:tag

For private registries, confirm authentication:

docker login <registry>

For local images, make sure the tag you run is the tag you built:

docker build -t my-app:dev .
docker run --rm my-app:dev

A common local mistake is building my-app:dev and running my-app:latest, which may point to an older image or nothing at all.
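The mismatch is easy to miss because Docker fills in :latest when a reference has no tag. The following sketch makes that default visible. effective_ref is a hypothetical helper name, and the parsing is simplified: it handles plain names, tags, registry ports, and digests, not every valid reference.

```shell
# Hypothetical helper: show the reference Docker will actually resolve.
effective_ref() {
    ref=$1
    case "$ref" in
        *@*) echo "$ref"; return ;;   # pinned by digest, no tag default applies
    esac
    # Look for a tag only after the last slash, so a registry port such as
    # localhost:5000 is not mistaken for one.
    last_component=${ref##*/}
    case "$last_component" in
        *:*) echo "$ref" ;;
        *)   echo "$ref:latest" ;;
    esac
}

effective_ref my-app
effective_ref my-app:dev
```

The first call prints my-app:latest and the second prints my-app:dev, which are two different images as far as docker run is concerned.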

If networking is blamed but the service is not listening

When a browser cannot reach a container, people often jump to Docker networking. First prove the application is listening inside the container.

docker exec -it <container> sh
ss -ltnp || netstat -ltnp

If the app is bound to 127.0.0.1 inside the container, Docker port publishing will not help. The app must listen on 0.0.0.0 or the container's interface address. This is common with development servers. For example, many frameworks default to localhost and need a flag such as --host 0.0.0.0.
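Reading the listener's address is the whole test, so it can be written down as a rule. This is a sketch with a hypothetical helper name. It classifies the local address column as printed by ss or netstat and makes no attempt to cover every format those tools can emit.

```shell
# Hypothetical helper: classify a "Local Address:Port" value from ss or netstat.
classify_listener() {
    case "$1" in
        127.*:*|'[::1]':*|::1:*|localhost:*)
            echo "loopback only: not reachable through a published port" ;;
        0.0.0.0:*|'*':*|'[::]':*|:::*)
            echo "all interfaces: reachable through a published port" ;;
        *)
            echo "specific address: reachable only if it is the container's interface" ;;
    esac
}

classify_listener 127.0.0.1:3000
classify_listener 0.0.0.0:3000
```

A loopback result means the fix belongs in the application's bind setting, not in the docker run flags.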

Then confirm the published port:

docker port <container>
docker ps --format 'table {{.Names}}\t{{.Ports}}'

You want to see something like 0.0.0.0:8080->3000/tcp. If there is no published port, the service may work from another container on the same network but not from your host browser.

A reliable startup checklist

Use this order when you are stuck:

1. docker ps -a to see whether the container exists and how it exited.
2. docker logs --tail 100 <container> to read the application's own complaint.
3. docker inspect <container> to check exit code, OOM status, command, user, mounts, and ports.
4. docker run --rm -it --entrypoint sh <image> to test the image by hand.
5. Remove one variable at a time: first run without mounts, then without custom networks, then with only required environment variables.

That last step matters. A long docker run command with ports, volumes, env files, custom DNS, memory limits, and a custom entrypoint gives you too many suspects. Strip it down until the image starts, then add settings back until it breaks. The setting you just added is usually where the real problem lives.
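The first three checklist steps are read-only, so they can be bundled into one function for repeated use. The sketch below assumes the docker CLI is on the PATH. triage is a hypothetical helper name, and the commands are the ones already shown in this article plus a name filter on docker ps.

```shell
# Hypothetical helper: run the first three checklist steps for one container.
triage() {
    container=$1
    echo "== status =="
    docker ps -a --filter "name=$container"
    echo "== last 100 log lines =="
    # Containers often write errors to stderr, so merge it into the output.
    docker logs --tail 100 "$container" 2>&1
    echo "== state =="
    docker inspect "$container" \
        --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} error={{.State.Error}}'
}

# Usage (assumes a container named "web" exists):
#   triage web
```

The interactive steps are left out on purpose: testing the image by hand and removing settings one at a time need a person watching the results.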