There is exactly one way the shell decides whether to take a branch, retry a command, exit early, run the next pipe stage, or trigger set -e: it looks at the exit status of the most recently executed command. Zero means success. Non-zero means failure. Everything else — every if, every while, every &&, every ||, every test in [ ] or [[ ]] — is a thin syntactic wrapper over that single primitive.
Most beginners write shell as if if and while were like the same constructs in Python or C. They are not. They are command-runners that branch on exit codes, and once you internalise that, every weird shell-conditional behaviour you’ve ever encountered makes complete sense.
This lesson covers the exit-code contract, the three test commands and when to use each, the precise difference between && / || chaining and if blocks (and why this difference matters for set -e), and the conditional idioms you’ll write thousands of times.
1. The exit code is the only truth
Every command — every binary, every shell built-in, every function, every pipeline — terminates with an integer exit code in the range 0–255. Conventionally:
- 0 means success. Always. Always. Always.
- non-zero means failure. The specific number sometimes encodes a category (1 = generic error, 2 = misuse, 126 = found-but-not-executable, 127 = command-not-found, 128+N = killed by signal N, 130 = killed by SIGINT (Ctrl+C), 137 = SIGKILL, 143 = SIGTERM), but the vast majority of code only cares about zero vs non-zero.
You can see the exit code of the last command with the special variable $?:
ls /tmp
echo "$?" # 0 — ls succeeded
ls /nonexistent
echo "$?" # 2 — ls failed with "no such file or directory"
false
echo "$?" # 1 — false always exits 1
true
echo "$?" # 0 — true always exits 0
true and false are real, executable commands (or shell built-ins, depending on the shell) whose only purpose is to provide guaranteed exit codes. They are useful in conditionals, infinite loops, and tests:
while true; do
echo "Running..."
sleep 1
done
Note: $? is reset by every command. If you need to use it more than once, capture it immediately:
my-tool
RC=$?
if (( RC != 0 )); then
echo "Failed with code $RC" >&2
exit "$RC"
fi
If you wrote if ((... != 0)) first and then tried to use $?, the if itself would have reset $?.
The colon command
There’s a built-in command called : (colon). It does nothing and always returns 0. It’s used as a no-op:
if [ -f /tmp/foo ]; then
: # do nothing if the file exists
else
touch /tmp/foo
fi
You’ll also see it as a portable way to expand variables for their side effects (e.g. with ${VAR:?error}):
: "${REQUIRED_VAR:?REQUIRED_VAR must be set}"
That single line, at the top of a script, is a clean way to fail-fast on missing required variables without polluting output.
2. if is a command-runner, not a boolean expression
This is the central concept. In C or Python, if (expr) evaluates expr to a Boolean and branches. In shell:
if COMMAND; then
THEN_BRANCH
else
ELSE_BRANCH
fi
if runs COMMAND, looks at its exit code, and if zero it runs the then branch, otherwise the else. COMMAND is a real, executable command. It can be grep, curl, [, [[, a function call, a pipeline, anything.
if grep -q "ERROR" /var/log/app.log; then
alert-pager
fi
That works because grep -q exits 0 if it found a match and 1 if it didn’t. The -q flag tells grep to be quiet (no output) and just return the exit code.
if curl -fsS https://api.example.com/health >/dev/null; then
echo "API is up"
else
echo "API is DOWN"
exit 1
fi
That works because curl -f exits non-zero on HTTP error responses (4xx/5xx). The -s is silent (no progress bar), -S shows errors even when silent (so set -e can pick them up). This is the canonical curl-in-script flag set: -fsS for “fail silently but show errors and exit non-zero on HTTP failure.”
The body of if can be a pipeline. The exit status of a pipeline is normally the exit status of the last command (we covered set -o pipefail in lesson 2):
if grep -E '^ERROR' /var/log/app.log | grep -q 'database'; then
echo "Database errors found"
fi
You can also chain multiple commands with ; or &&:
if cd /app && [ -f config.toml ]; then
./run.sh
fi
cd /app && [ -f config.toml ] succeeds only if both cd and the file test succeed. If cd fails, the && short-circuits and the whole condition is the failure code from cd.
The if-elif-else chain
if [[ "$1" == "start" ]]; then
start_service
elif [[ "$1" == "stop" ]]; then
stop_service
elif [[ "$1" == "restart" ]]; then
stop_service
start_service
else
echo "Usage: $0 {start|stop|restart}" >&2
exit 2
fi
Mechanically identical to nested if blocks. For more than two or three branches, prefer case (section 7).
3. The three test commands: test, [, and [[
This is the area that confuses beginners most. There are three commands that look like conditional expressions. They are not the same. They have different syntax, different operators, and very different behaviour around quoting and word splitting.
test EXPR — the original POSIX form
if test -f /etc/hostname; then
echo "exists"
fi
test is a real command (built into bash, but also a separate binary at /usr/bin/test). It evaluates the expression and exits 0 (true) or 1 (false). It’s POSIX-compliant — works in dash, ash, busybox sh, every shell ever.
[ EXPR ] — exactly the same as test, but uglier and prettier
if [ -f /etc/hostname ]; then
echo "exists"
fi
The [ is also a real command. It’s a binary at /bin/[ (or a shell built-in) whose name is literally [. It expects its last argument to be ]. The brackets are not syntax — they’re command-name and argument. That’s why you must have spaces around them:
[ -f /etc/hostname ] # CORRECT — '[' is the command, '-f' '/etc/hostname' ']' are arguments
[-f /etc/hostname ] # WRONG — bash looks for a command called "[-f"
[ -f /etc/hostname] # WRONG — the last argument is "/etc/hostname]" not "]"
For all practical purposes, [ EXPR ] and test EXPR are interchangeable. POSIX-compliant. Works everywhere.
The classic gotcha with [:
NAME=""
if [ $NAME = "alice" ]; then # expands to: if [ = alice ]; — syntax error, "test" gets confused
echo "hi alice"
fi
When NAME is empty, the unquoted $NAME expands to nothing (zero tokens), and [ = alice ] is a malformed test expression. The fix: always quote variables in [:
if [ "$NAME" = "alice" ]; then # expands to: if [ "" = alice ]; — works
echo "hi alice"
fi
This is one of many reasons to prefer [[, which we’ll cover next.
[[ EXPR ]] — bash’s superior form
if [[ -f /etc/hostname ]]; then
echo "exists"
fi
[[ is a shell keyword, not a command. It’s part of bash’s grammar. This is a fundamental difference: because it’s grammar, bash parses it specially and the rules inside are different.
Inside [[ ]]:
- Word splitting does not happen on variables.
[[ $NAME = "alice" ]]works correctly even whenNAMEis empty. - Pathname expansion does not happen.
[[ $FOO = *.txt ]]does pattern matching, not glob expansion. - The
&&and||operators work —[[ A && B ]]is “A AND B”. - The regex operator
=~works. <and>do lexicographic string comparison (no need to escape them).
Inside [ ]:
- Variables get word-split unless quoted.
&&and||don’t work — you’d have to write[ A ] && [ B ].- No regex match.
<and>would be interpreted as redirections by bash and need escaping (\<).
So which one should you use?
Use [[ ]] in bash scripts. Use [ ] (or test) only in POSIX-portable scripts — scripts whose shebang is #!/bin/sh and need to run on dash, ash, or busybox.
In this course, every example uses [[ ]] unless we’re explicitly discussing portability (lesson 31).
Operator reference
Here are the operators you’ll use most. They work in all three forms unless noted.
File tests (all forms):
| Operator | Meaning |
|---|---|
-e PATH |
exists (any type) |
-f PATH |
is a regular file |
-d PATH |
is a directory |
-L PATH |
is a symlink |
-r PATH |
readable |
-w PATH |
writable |
-x PATH |
executable |
-s PATH |
exists and is non-empty (size > 0) |
-p PATH |
is a named pipe (FIFO) |
-S PATH |
is a socket |
-N PATH |
modified since last read |
A -nt B |
file A is newer than B |
A -ot B |
file A is older than B |
A -ef B |
same file (same device + inode, follows links) |
if [[ -f /etc/passwd ]]; then echo "exists"; fi
if [[ -d /var/log ]]; then echo "is directory"; fi
if [[ -x /usr/bin/jq ]]; then echo "jq available"; fi
if [[ /etc/passwd -nt /etc/passwd.bak ]]; then echo "passwd has changed since backup"; fi
String tests:
| Operator | Meaning |
|---|---|
-z STR |
string is empty (length zero) |
-n STR |
string is non-empty |
STR1 = STR2 |
strings are equal (POSIX) |
STR1 == STR2 |
strings are equal (bash; in [[ ]], RHS is glob) |
STR1 != STR2 |
strings differ |
STR1 < STR2 |
lexicographically less than (only [[ ]]) |
STR1 > STR2 |
lexicographically greater than (only [[ ]]) |
NAME="alice"
if [[ -z "$NAME" ]]; then echo "name is empty"; fi
if [[ "$NAME" == "alice" ]]; then echo "exact match"; fi
if [[ "$NAME" == a* ]]; then echo "starts with a"; fi # glob match (no quotes on RHS!)
if [[ "$NAME" == "a*" ]]; then echo "literal a*"; fi # literal (RHS quoted = glob disabled)
The unquoted right-hand side of == inside [[ ]] is a glob pattern. This is enormously useful — you can do prefix and suffix matching without invoking grep. But beware: if you want a literal match against a string that might contain * or ?, quote the RHS.
Numeric tests:
| Operator | Meaning |
|---|---|
N1 -eq N2 |
equal |
N1 -ne N2 |
not equal |
N1 -lt N2 |
less than |
N1 -le N2 |
less than or equal |
N1 -gt N2 |
greater than |
N1 -ge N2 |
greater than or equal |
COUNT=5
if [[ "$COUNT" -gt 3 ]]; then echo "more than 3"; fi
if [[ "$COUNT" -eq 5 ]]; then echo "exactly 5"; fi
The -eq/-ne/-lt/-le/-gt/-ge operators work in [ ] and [[ ]]. They are integer-only. For floating-point comparison you need awk or bc.
An alternative for numbers: the (( )) arithmetic command:
if (( COUNT > 3 )); then echo "more than 3"; fi
if (( COUNT == 5 )); then echo "exactly 5"; fi
(( )) is the arithmetic command — separate from $(( )) which is arithmetic expansion. It evaluates the C-style expression and returns 0 (success) if the result is non-zero, 1 (failure) if zero. It’s the cleanest way to do numeric comparisons in bash.
i=0
while (( i < 10 )); do
echo "$i"
(( i++ ))
done
Inside (( )), you don’t prefix variables with $ — i works like in C. You can use +, -, *, /, %, **, ==, !=, <, >, <=, >=, &&, ||, !, bitwise operators, ++, --, ternary ? :. Use (( )) for numeric tests in bash. Use [[ ... -eq ... ]] for POSIX-only constraints.
Logical operators inside [[ ]]
if [[ -f "$FILE" && -r "$FILE" ]]; then
echo "file exists and is readable"
fi
if [[ "$NAME" == "alice" || "$NAME" == "bob" ]]; then
echo "alice or bob"
fi
if [[ ! -f "$FILE" ]]; then
echo "file does not exist"
fi
In [ ] you’d have to write [ -f "$FILE" ] && [ -r "$FILE" ] — two separate command invocations chained at the shell level. In [[ ]] it’s one expression. Cleaner, faster.
4. The =~ operator: regex matching in bash
[[ STRING =~ REGEX ]] performs ERE (Extended Regular Expression) matching. This is one of the most powerful shell features and most beginners don’t know it exists.
EMAIL="alice@example.com"
if [[ "$EMAIL" =~ ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$ ]]; then
echo "looks like an email"
fi
VERSION="v1.2.3"
if [[ "$VERSION" =~ ^v([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
MAJOR="${BASH_REMATCH[1]}"
MINOR="${BASH_REMATCH[2]}"
PATCH="${BASH_REMATCH[3]}"
echo "Major: $MAJOR, Minor: $MINOR, Patch: $PATCH"
fi
BASH_REMATCH is a special array set by =~ after a successful match. Index 0 is the full match; indices 1, 2, 3 are the capture groups. This is enormously useful for parsing structured strings.
The single critical rule: the regex on the RHS must NOT be quoted. If you quote it, it becomes a literal string match:
if [[ "$VERSION" =~ "^v[0-9]" ]]; then ... fi # literal — matches strings containing "^v[0-9]"
if [[ "$VERSION" =~ ^v[0-9] ]]; then ... fi # regex — matches strings starting with "v" then digit
If you need a literal character class (e.g. . to match a literal dot), put it in a variable and use the variable unquoted:
PATTERN='^v[0-9]+\.[0-9]+\.[0-9]+$'
if [[ "$VERSION" =~ $PATTERN ]]; then echo "version-like"; fi
This pattern (capture into variable, expand unquoted) is the safest way to write complex regexes — it sidesteps every quoting subtlety inside [[ ]].
5. && and || chaining vs if blocks — a critical distinction
Bash gives you two ways to express “do B if A succeeds”:
A && B
vs.
if A; then B; fi
These are almost equivalent. The exit code is the same. The behaviour is the same in most cases. But there is one critical difference, and it interacts with set -e.
With set -e:
- Inside the condition of
if A; then ..., a failure ofAis expected and does not triggerset -e. Bash deliberately suppresseserrexitin conditional contexts. - In
A && B, a failure ofAdoes not triggerset -eeither, as long as the chain continues toB. But ifBfails, that does triggerset -e(in most bash versions).
This means the following two snippets behave differently under set -e:
set -e
# snippet 1: chain
my-tool && echo "ok" || echo "fail"
# snippet 2: if
if my-tool; then echo "ok"; else echo "fail"; fi
The chain form has a notorious gotcha: A && B || C does not mean “if A then B else C” the way it would in C. It means “if A succeeds and B succeeds, you’re done; if either A or B fails, run C.” So if B fails, C runs even though A succeeded. This is rarely what you want.
The rule: use && and || only for two-step chains. For three branches or anything more complex, use if.
The cleanest uses of chaining:
mkdir -p /tmp/cache && cd /tmp/cache # cd only if mkdir succeeded
command -v jq >/dev/null || { echo "jq not installed" >&2; exit 1; } # die if missing
These are short, single-purpose, and unambiguous. Anything more complex deserves a real if.
command -v is the right “is this binary installed?” check
You’ll see scripts that check for a binary like this:
which jq # WRONG-ISH — `which` is non-standard, output varies by distro, exit code unreliable
type jq # works but verbose output
hash jq # works but unintuitive
command -v jq # correct, POSIX, returns 0 if found
Use command -v jq >/dev/null as the canonical “is it installed and on PATH?” test.
6. The case statement — pattern matching done right
For three or more branches based on a single value, case is dramatically cleaner than nested if-elif-else:
case "$1" in
start)
start_service
;;
stop)
stop_service
;;
restart|reload)
stop_service
start_service
;;
status)
status_service
;;
*)
echo "Usage: $0 {start|stop|restart|status}" >&2
exit 2
;;
esac
The patterns on the left of each branch are globs — same syntax as *.txt, [Yy]es, etc. They’re not regexes. The | separates alternatives. The terminating ;; means “stop here, don’t fall through to the next pattern.”
Glob patterns let you do prefix/suffix matching without regex:
case "$FILE" in
*.tar.gz|*.tgz)
tar -xzf "$FILE"
;;
*.tar.bz2|*.tbz2)
tar -xjf "$FILE"
;;
*.zip)
unzip "$FILE"
;;
*)
echo "Unknown archive type: $FILE" >&2
exit 1
;;
esac
In bash 4+, you can also use ;& (fall through to next branch) and ;;& (continue evaluating subsequent patterns):
case "$INPUT" in
[yY]es)
echo "Got yes"
;;& # also evaluate the next pattern
[yY]*)
echo "Starts with y"
;;
esac
# If INPUT is "yes", you'd see both messages.
Use these sparingly — falls-through cases are confusing. The default ;; is almost always what you want.
Use case for argument dispatch
The classic shell pattern is parsing a CLI argument:
case "${1:-}" in
-h|--help)
show_help
exit 0
;;
-v|--version)
echo "1.0.0"
exit 0
;;
--verbose)
VERBOSE=1
;;
*)
echo "Unknown option: $1" >&2
exit 2
;;
esac
Lesson 17 (argument parsing) covers getopts and full long-option support; for now case "${1:-}" is the lightweight pattern.
7. Status propagation: how to make scripts return meaningful exit codes
A script’s exit code is the exit code of its last command, unless you call exit N explicitly. Best practice: make every script’s exit code meaningful.
#!/usr/bin/env bash
set -euo pipefail
main() {
if ! curl -fsS https://api.example.com/health >/dev/null; then
echo "Health check failed" >&2
return 1
fi
if ! check_disk_space; then
echo "Low disk space" >&2
return 2
fi
echo "All OK"
return 0
}
main "$@"
exit $?
The exit $? at the end is explicit — it says “exit with whatever code main returned.” Without it, the script would still exit with main’s code (because main was the last command), but the explicit form documents intent.
Convention for shell-script exit codes:
0— success1— generic failure2— usage error (wrong arguments)3–125— application-defined; pick a code per failure mode and document it126— file found but not executable127— command not found128+N— killed by signal N (130 = Ctrl+C, 137 = SIGKILL, 143 = SIGTERM)255— out of range; reserve for “we don’t know”
Avoid using codes 126, 127, and 128+ for your own application errors — those slots are reserved by the shell.
Inverting an exit code
If you want to negate an exit code (treat success as failure and vice versa):
if ! grep -q ERROR /var/log/app.log; then
echo "no errors found"
fi
The leading ! inverts the exit code. This is the cleanest “if NOT” form. Note that ! does not trigger set -e even if the underlying command fails — ! is documented as a context where errexit is suppressed. So this is safe:
set -e
if ! my-tool; then
recover
fi
If my-tool fails, ! inverts to 0, the if takes the then branch, and set -e does not fire.
8. trap ERR — catching errors anywhere in a script
Lesson 10 covers signal handling and trap in depth, but for conditionals it’s worth knowing now: bash supports a pseudo-signal called ERR that fires whenever a command exits non-zero (subject to the same suppression rules as set -e).
#!/usr/bin/env bash
set -euo pipefail
on_error() {
local lineno="$1"
local code="$2"
echo "ERROR at line ${lineno} with exit code ${code}" >&2
}
trap 'on_error ${LINENO} $?' ERR
# ... rest of script ...
This gives you a stack-trace-like behaviour. ${LINENO} is automatically set to the line number of the failing command. $? is the exit code. We’ll combine this with EXIT traps for cleanup in lesson 10.
9. Ten conditional idioms to memorise
# 1. File existence
[[ -f "$FILE" ]] && echo "exists"
# 2. Variable empty / non-empty
[[ -z "$VAR" ]] && echo "empty"
[[ -n "$VAR" ]] && echo "not empty"
# 3. String equality (exact)
[[ "$NAME" == "alice" ]]
# 4. String prefix / suffix (glob)
[[ "$FILE" == *.log ]]
[[ "$URL" == https://* ]]
# 5. Numeric comparison
(( COUNT > 3 ))
# 6. Regex match with capture
[[ "$VERSION" =~ ^v([0-9]+)\.([0-9]+) ]] && MAJOR="${BASH_REMATCH[1]}"
# 7. Command available?
command -v jq >/dev/null || { echo "jq required" >&2; exit 1; }
# 8. Run only if previous succeeded
make build && make deploy
# 9. Run only if previous failed
my-tool || retry
# 10. Default-or-die for required input
: "${DATABASE_URL:?DATABASE_URL must be set}"
These ten patterns cover 95% of every conditional you’ll write. Keep them in your fingers.
10. A complete, idiomatic example
#!/usr/bin/env bash
# verify-deployment.sh
# Smoke-tests a freshly-deployed service. Returns:
# 0 — all checks pass
# 1 — health check failed
# 2 — required env missing
# 3 — required tool missing
set -euo pipefail
IFS=$'\n\t'
# 1. Required tools
for tool in curl jq; do
command -v "$tool" >/dev/null || { echo "Missing: $tool" >&2; exit 3; }
done
# 2. Required environment
: "${SERVICE_URL:?SERVICE_URL must be set}"
: "${EXPECTED_VERSION:?EXPECTED_VERSION must be set}"
# 3. Health check
echo "Checking ${SERVICE_URL}/health..."
if ! curl -fsS "${SERVICE_URL}/health" >/dev/null; then
echo "Health check failed" >&2
exit 1
fi
# 4. Version check
VERSION="$(curl -fsS "${SERVICE_URL}/version" | jq -r '.version')"
if [[ "$VERSION" != "$EXPECTED_VERSION" ]]; then
echo "Version mismatch: got ${VERSION}, expected ${EXPECTED_VERSION}" >&2
exit 1
fi
# 5. Pattern check on response
ENVELOPE="$(curl -fsS "${SERVICE_URL}/api/info")"
if ! [[ "$ENVELOPE" =~ \"status\":[[:space:]]*\"ok\" ]]; then
echo "Bad info response" >&2
exit 1
fi
# 6. Optional latency check
LATENCY_MS="$(curl -fsS -o /dev/null -w '%{time_total}' "${SERVICE_URL}/health" | awk '{print int($1 * 1000)}')"
if (( LATENCY_MS > 500 )); then
echo "Warning: latency ${LATENCY_MS}ms exceeds 500ms threshold" >&2
fi
echo "All checks passed (version=${VERSION}, latency=${LATENCY_MS}ms)"
exit 0
Things to notice:
- Strict mode and IFS hardening at the top.
- Required tools checked with
command -vand a deliberate exit code. - Required env checked with
: "${VAR:?...}". - All variable expansions quoted.
curl -fsSfor fail-loud HTTP.[[ ... =~ ... ]]for pattern check on the response body.(( ))for the numeric latency comparison.- Distinct exit codes per failure mode.
- A success message before
exit 0.
This is what production-grade shell looks like. Every shell script you ship should be roughly this shape.
11. What you must internalise before lesson 4
Before moving on, make sure all of these are in your reflexes:
- What’s the difference between exit code 0 and non-zero? (0 = success, anything else = failure.)
- What does
if COMMAND; thenactually do? (RunsCOMMAND, looks at its exit code, branches on zero vs non-zero.) - When do you use
[,[[,(( )), andtest? ([[/((for bash;[/testfor POSIX portability;((for numeric.) - Why is
[[ $VAR == *.txt ]]glob and[[ $VAR == "*.txt" ]]literal? (Unquoted RHS of==inside[[ ]]is a glob; quoting disables glob.) - How do you do regex with capture groups? (
[[ STR =~ REGEX ]]then readBASH_REMATCH[1]etc.) - What’s the difference between
A && B || Candif A; then B; else C; fi? (The chain form runs C if either A or B fails; the if form runs C only if A fails.) - How do you check if a binary is installed? (
command -v BINARY >/dev/null.) - What’s the canonical fail-fast for a missing required env var? (
: "${VAR:?VAR must be set}".) - Why does
!not triggerset -e? (Bash explicitly suppresses errexit when the command is preceded by!or is the condition ofif/while/until.) - What’s a sensible exit-code policy for your scripts? (0 success; 1 generic; 2 usage; 3-125 application-defined; avoid 126, 127, 128+.)
If any of those felt fuzzy, re-read the relevant section. Lesson 4 (loops and substitution) is where conditionals start composing into real iteration patterns — and where set -e’s subtle exceptions inside while and until will come up again.
What’s next
Lesson 4 covers for, while, until, case, break, continue, the C-style for ((i=0;i<10;i++)) form, the mapfile/readarray idioms for line-by-line iteration without word-splitting bugs, and the most common iteration anti-pattern in shell (for f in $(ls)) and how to replace it. Bring everything from lessons 1–3.