Understanding Exit Codes: Effective Error Handling with $? and exit
Use Bash exit codes, $?, exit, set -e, and pipefail to make script failures clear and controlled.
When a Bash script finishes, its exit code tells the caller what to do next: continue, retry, alert, or stop. Understanding exit codes, $?, and exit is the difference between automation that hides failures and automation that reports them clearly.
This guide shows how Bash tracks command status and how you can use that status for simple, reliable error handling.
The Concept of Exit Statuses
Every command or program executed in a Unix-like shell environment—whether it's a built-in command like cd, an external utility like grep, or another shell script—returns an integer value upon completion. This integer is the exit code, which signals the outcome of the operation to the calling process.
The Standard Convention
The convention for exit codes is universally recognized:
- 0 (Zero): Signifies success. The command executed exactly as expected, and no errors occurred.
- 1 to 255: Signify failure or a specific error condition. The meaning of each non-zero value is defined by the program that returns it, so check that program's documentation. Bash itself reserves a few values: 126 (command found but not executable), 127 (command not found), and 128+N (command terminated by signal N).
Note on Range: Exit codes are an 8-bit value (0-255), and most shell scripts only distinguish 0 for success from non-zero for failure. A value passed to exit outside that range wraps modulo 256, so exit 256 is reported to the caller as 0, which looks like success.
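The wrap-around is easy to observe in a subshell. This is a minimal sketch; the values 256 and 300 are arbitrary illustrations.

```shell
# Run each exit in a subshell so the current shell keeps running.
(exit 256); wrapped_zero=$?
(exit 300); wrapped_other=$?

# 256 mod 256 = 0, so the caller sees "success"; 300 mod 256 = 44.
echo "exit 256 is seen as: $wrapped_zero"
echo "exit 300 is seen as: $wrapped_other"
```

Staying within 1 to 125 for your own error codes avoids both the wrap-around and the values the shell uses itself.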
Inspecting the Last Exit Code: The $? Variable
The special shell variable $? (dollar question mark) is central to monitoring command status. Immediately after any command executes, the shell stores its exit code in $?.
How to Use $?
You must check $? immediately after the command you are interested in, because every subsequent command, including the echo that prints it, replaces its value. If you need the status more than once, save it in a variable first.
Example 1: Checking Success and Failure
# 1. A successful command
echo "Success test" > /dev/null
echo "Exit code for success: $?"
# 2. A failing command (e.g., trying to list a non-existent file)
ls /non/existent/path
echo "Exit code for failure: $?"
Expected Output (with GNU ls; the exact message and the non-zero code vary by platform):
Exit code for success: 0
ls: cannot access '/non/existent/path': No such file or directory
Exit code for failure: 2
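Because $? is replaced by every command, a common pattern is to copy it into a variable right away. The sketch below uses the standard false command as a stand-in for any failing command; the variable names are illustrative.

```shell
# 'false' is a standard command that always fails with status 1.
false
status=$?    # capture immediately, before anything else runs

echo "Doing some other work..."    # a successful echo resets $? to 0
after_echo=$?

echo "Saved status: $status"
echo "Status after the echo: $after_echo"
```

The saved copy still holds 1, while $? itself now reflects the echo that ran afterwards.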
Implementing Conditional Error Checking
Simply knowing the exit code isn't enough; the power comes from using this information to control script flow. This is typically done using if statements or short-circuit operators (&& and ||).
Using if Statements
This is the most explicit way to handle errors:
if grep -q "important data" logfile.txt; then
    echo "Data found successfully."
else
    # Inside else, $? still holds the status of the grep in the if condition.
    LAST_STATUS=$?
    echo "Error: grep exited with status $LAST_STATUS (1 = no match, 2+ = error)." >&2
    # Consider exiting here if the script cannot proceed
fi
In the example above, grep -q suppresses output (-q) and returns 0 only if a match is found. The if structure checks the exit status automatically, but explicitly capturing $? inside the else block is useful for detailed logging.
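When the different non-zero values matter, a case statement on the captured status keeps them apart. The following is a sketch, not a definitive implementation: check_log is a hypothetical helper, and it relies on grep's documented convention of 1 for "no match" and a value greater than 1 for an actual error.

```shell
# check_log is a hypothetical helper; the pattern and file are its arguments.
check_log() {
    local pattern=$1 file=$2 status=0
    # '|| status=$?' records grep's status without ending the function.
    grep -q -- "$pattern" "$file" 2>/dev/null || status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "grep error (status $status)" >&2; return "$status" ;;
    esac
}

# Demo on a throwaway file created with mktemp.
tmp=$(mktemp)
printf 'important data\n' > "$tmp"
found=$(check_log "important data" "$tmp")
missing=$(check_log "absent text" "$tmp")
rm -f "$tmp"
echo "$found / $missing"
```

Treating "no match" as a normal answer and only statuses above 1 as errors keeps a missing or unreadable file from being reported as "data not found".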
Using Short-Circuit Logic (&& and ||)
For simple sequential checks, short-circuit operators provide concise error handling:
- && (AND): The command following && only executes if the preceding command succeeded (returned 0).
- || (OR): The command following || only executes if the preceding command failed (returned non-zero).
Example 2: Concise Error Handling
# 1. Only run process_data.sh if fetch_data.sh succeeds
./fetch_data.sh && ./process_data.sh
# 2. Log an error only if the primary operation fails
rsync -a source/ dest/ || echo "rsync failed on $(date)" >> /var/log/rsync_errors.log
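When a failure needs more than one follow-up action, the right-hand side of || can be a brace group. The sketch below also shows a caveat of chaining both operators. copy_or_report is a hypothetical helper, and cp stands in for any command that might fail.

```shell
# copy_or_report is a hypothetical helper: report on stderr and return
# non-zero if the copy fails, otherwise confirm on stdout.
copy_or_report() {
    cp -- "$1" "$2" 2>/dev/null || {
        echo "copy of $1 failed" >&2
        return 1
    }
    echo "copied"
}

# Caveat: 'a && b || c' is not if/else. Here 'a' (true) succeeds,
# 'b' (false) fails, and so 'c' still runs.
chain_result=$(true && false || echo "fallback ran")
echo "$chain_result"
```

If the "else" branch must run only when the first command fails, an explicit if statement is the safer choice.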
Controlling Script Termination with exit
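The exit command immediately terminates the current shell, and with it the running script, and returns a specified exit status to the caller (which might be another script or the user's terminal). Calling exit inside a function also ends the whole script; use return to leave only the function.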
Syntax and Usage
The syntax is simply exit [status_code].
If no status is provided, exit uses the status of the most recently executed command. An explicit exit 0 always reports success, regardless of what ran before it.
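Both behaviors can be checked in subshells, which end on exit without closing the current shell. A minimal sketch, with false standing in for any failing command:

```shell
# Bare 'exit' reuses the status of the last command ('false' gives 1).
(false; exit); bare_exit=$?

# An explicit status always wins, whatever ran before.
(false; exit 0); explicit_exit=$?

echo "bare exit: $bare_exit, exit 0: $explicit_exit"
```

Ending a script with an explicit status makes the intended outcome visible instead of depending on whichever command happened to run last.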
Example 3: Exiting on Pre-Condition Failure
This script ensures a required configuration file exists before proceeding.
CONFIG_FILE="/etc/app/config.conf"
if [[ ! -f "$CONFIG_FILE" ]]; then
    # Report the problem on stderr so it is not mixed into normal output
    echo "Error: Configuration file not found at $CONFIG_FILE." >&2
    # Terminate script immediately with a specific error code (e.g., 20)
    exit 20
fi
echo "Configuration loaded. Continuing script..."
# ... rest of script
exit 0
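Scripts with several pre-conditions often wrap this pattern in a small helper. The sketch below is one possible shape, not a standard API: die and require_file are hypothetical names, and the status 20 simply mirrors the example above.

```shell
# die is a hypothetical helper: print a message to stderr, then exit
# with the given status.
die() {
    local status=$1
    shift
    echo "Error: $*" >&2
    exit "$status"
}

# require_file is also hypothetical; it exits with 20 if the file is missing.
require_file() {
    [[ -f "$1" ]] || die 20 "Configuration file not found at $1."
}

# Demo in a subshell so the exit does not end the current shell.
(require_file "/non/existent/config.conf") 2>/dev/null
echo "require_file exited with status $?"
```

Keeping the message and the status in one call makes it harder to print an error and then forget to exit.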
Best Practice: Using Meaningful Exit Codes
While 0 and 1 cover most basic cases, using different non-zero codes helps the calling script diagnose the exact problem:
| Code | Meaning (Example) |
|---|---|
| 0 | Success |
| 1 | General catch-all error |
| 2-10 | Syntax errors, argument parsing issues |
| 20 | Missing prerequisite (e.g., file not found) |
| 30 | Permission issue |
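A calling script can turn such codes into actions with a case statement. The sketch below assumes the example values from the table; the constant names and the describe_status function are hypothetical.

```shell
# Hypothetical names for the example codes in the table above.
readonly E_OK=0 E_GENERAL=1 E_MISSING=20 E_PERMISSION=30

# describe_status maps a status to a message a caller could log.
describe_status() {
    case $1 in
        "$E_OK")         echo "success" ;;
        "$E_MISSING")    echo "missing prerequisite" ;;
        "$E_PERMISSION") echo "permission issue" ;;
        *)               echo "general error" ;;
    esac
}

# Simulate a child script that fails with the 'missing prerequisite' code.
(exit "$E_MISSING")
describe_status "$?"
```

Named constants keep the numbers in one place, so the script that exits and the script that interprets the status cannot drift apart silently.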
Making Scripts Fail Fast: The set Command
For maximum reliability in complex scripts, it is a strong best practice to enable error checking globally using the set command options at the top of your script:
#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e
# Treat unset variables as an error when substituting.
set -u
# Pipefail: Ensures that a pipeline's return status is the status of the rightmost command that exited with a non-zero status.
set -o pipefail
# (Optional but helpful) Print commands as they are executed for debugging
# set -x
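# If any command below fails, the script stops immediately.
# Keep the steps on separate lines: set -e ignores a failure inside an
# && list unless it comes from the last command in the list.
ls /valid/path
grep pattern file.txt
./next_step.sh
# The following line will ONLY run if all preceding commands succeeded.
echo "All steps complete."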
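The effect of pipefail is easiest to see in isolation. In this minimal sketch, false | true stands in for any pipeline whose first stage fails; each pipeline runs in a subshell so the option does not leak into the current shell.

```shell
# Without pipefail: only the last command (true) counts, so the failure is hidden.
(false | true); without_pipefail=$?

# With pipefail: the failing 'false' decides the pipeline's status.
(set -o pipefail; false | true); with_pipefail=$?

echo "without pipefail: $without_pipefail"
echo "with pipefail:    $with_pipefail"
```

The first status is 0 and the second is 1, which is why pipefail matters for pipelines such as a download piped into a parser.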
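When set -e is active, many unhandled non-zero statuses stop the script before later commands run on bad assumptions. It does not apply everywhere, though: Bash ignores failures in the condition of if, while, or until, in any command of an && or || list except the last, in any command of a pipeline except the last, and in commands negated with !. Handle expected failures explicitly rather than relying on set -e to catch them.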
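A few of these cases can be compared side by side. The sketch below runs each one in a command substitution, which is a subshell, so set -e cannot end the current shell; the word "survived" is only a marker showing that execution continued.

```shell
# 1. A failure used as an 'if' condition is ignored by set -e.
in_condition=$(set -e; if false; then :; fi; echo "survived")

# 2. A failure that is not the last command of an && list is also ignored.
in_and_list=$(set -e; false && true; echo "survived")

# 3. A plain failing command does stop the subshell, so nothing is printed.
plain_failure=$(set -e; false; echo "survived")

echo "condition: '$in_condition', && list: '$in_and_list', plain: '$plain_failure'"
```

Only the third case stops early, which is why a long && chain under set -e can keep running after one of its steps has failed.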
For example, grep returns 1 when it finds no match. That may be a normal result, not a fatal error:
if grep -q "READY" status.txt; then
    echo "Service is ready."
else
    echo "Service is not ready yet."
fi
Takeaway
Check critical commands where they run, write errors to stderr, and exit with a non-zero status when the script cannot continue safely. Use set -euo pipefail for fail-fast scripts, but do not rely on it as your only error-handling strategy.