Shell Lesson 4 of 42

Loops & Iteration: for, while, until, case, break/continue & Command Substitution Gotchas — How to Iterate in Shell Without Destroying Your Production Filesystem

Loops are where good shell scripts grow up and bad shell scripts destroy production. Almost every notorious shell-script disaster you’ve ever read about — files deleted by accident, partial deployments left in inconsistent states, log-rotation jobs that ate the wrong directory — boils down to one of three categories of loop bug:

  1. Word-splitting bugs: iterating over $(ls) or unquoted variables and getting wrong filenames when names contain spaces, tabs, newlines, or globs.
  2. Subshell bugs: cmd | while read line; do COUNT=$((COUNT+1)); done — the loop body runs in a subshell, so COUNT doesn’t update in the parent, and after the loop your counter is still zero.
  3. Empty-glob bugs: for f in *.log when no .log files exist — bash leaves the literal *.log as the iterator value, so you process a “file” called *.log.

Every one of these is a specific corollary of the lessons we covered in L2 (quoting and IFS) and L3 (exit codes). This lesson shows you the loop forms themselves, and shows you which iteration patterns are correct and which are landmines.

Read this carefully. Type the examples. Run them with weird filenames (spaces, newlines, leading dashes) and watch them break — then fix them with the patterns below.


1. The four loop forms in bash

Bash has four loop constructs. Three you’ll use constantly; one (select) you’ll see rarely.

for VAR in LIST; do BODY; done
for (( init; condition; update )); do BODY; done
while CONDITION; do BODY; done
until CONDITION; do BODY; done

The for-in form iterates over a list of words. The C-style for (( )) is bash-specific and gives you C-like loops. while runs the body as long as CONDITION exits zero (success). until is the inverse — runs the body until CONDITION exits zero. Both while and until evaluate CONDITION before each iteration.

There is no do-while form; the closest you can get is to put the work at the start of the body and break when done.
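
A minimal sketch of that do-while emulation. The names attempts and limit are made up for illustration; limit is set so that an ordinary while test would skip the body entirely, yet the body still runs once.

```shell
# Emulated do-while: the body always runs at least once.
attempts=0
limit=0            # a plain "while [ attempts -lt limit ]" would run zero times

while :; do
  attempts=$((attempts + 1))               # the work goes first
  echo "pass $attempts"
  [ "$attempts" -ge "$limit" ] && break    # the loop condition goes last
done
# prints: pass 1
```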


2. The for-in loop and the four ways to use it correctly

The for-in loop iterates over a list of words. The fundamental thing to remember: any unquoted variable or command substitution in that list is split on IFS (whitespace by default), and the resulting words are then glob-expanded. You almost never want this for arbitrary input.

Form 1: explicit list of words

for fruit in apple banana cherry; do
  echo "$fruit"
done

Output:

apple
banana
cherry

This works because the words are literal — no variables, no globs, no IFS surprises. The list can also span multiple lines:

for service in \
  postgres \
  redis \
  nginx; do
  systemctl restart "$service"
done

The trailing backslashes continue the line. This is a perfectly fine pattern for short, fixed lists.

Form 2: glob expansion (the right way to iterate over files)

for log in /var/log/*.log; do
  [ -e "$log" ] || continue              # handle the no-match case
  echo "Processing $log"
  gzip "$log"
done

Bash expands /var/log/*.log to a list of matching paths. This is the correct way to iterate over files. Unlike for f in $(ls), glob expansion is byte-exact — filenames with spaces, tabs, newlines, glob characters in their names, all of them work correctly.

The [ -e "$log" ] || continue line handles the empty-glob case: when no files match, bash by default leaves the pattern as the literal value. So for log in /var/log/*.log with no logs would set log="/var/log/*.log" for one iteration, which is almost never what you want.

There are two ways to handle this:

Option A (per-loop) — check existence at the top of the body:

for log in /var/log/*.log; do
  [ -e "$log" ] || continue
  process "$log"
done

Option B (script-wide) — enable nullglob:

shopt -s nullglob          # bash-only; non-matching globs expand to nothing
for log in /var/log/*.log; do
  process "$log"
done
shopt -u nullglob          # restore default if needed

nullglob is bash-only. POSIX shells don’t have it. If you’re writing portable shell, use option A. We’ll cover nullglob, dotglob, globstar, and the glob options in detail in lesson 11.

Form 3: iterate over an array (the safe way to iterate over collected data)

LOGS=(
  /var/log/app.log
  "/var/log/with space.log"
  /var/log/another.log
)

for log in "${LOGS[@]}"; do
  echo "Processing: $log"
done

The "${LOGS[@]}" form expands to one quoted token per array element — preserving spaces, newlines, every byte exactly. This is the right way to iterate over a list of names you’ve collected from somewhere. Lesson 6 covers arrays in depth.

Form 4: iterate over $@ — the script’s positional parameters

for arg in "$@"; do
  echo "Got argument: $arg"
done

This is the canonical way to iterate over command-line arguments. The double-quotes are essential — for arg in $@ (unquoted) splits arguments on IFS again, breaking arguments that contain spaces. We covered this in L2. Always "$@".

You can also write for arg do ... done — bash treats a for with no in as iterating over "$@" automatically. Useful shorthand:

for arg do
  echo "Got: $arg"
done

The four wrong ways to iterate

These are the common mistakes. Each one breaks under specific inputs:

for f in $(ls /var/log)            # WRONG — splits on whitespace, breaks on filenames with spaces
for f in `ls /var/log`              # WRONG — same as above, with deprecated syntax
for f in $FILES                     # WRONG — splits on $IFS, breaks if FILES has weird chars
cat list.txt | while read f         # SUBTLE BUG — body runs in subshell; variables don't propagate (section 5)

Memorise: never iterate over the output of ls, never iterate over an unquoted variable expansion, never use a pipe to feed while read if you need to capture state. The right replacements are: globs, find -print0 + xargs -0, mapfile/readarray, and while read fed by input redirection not a pipe (section 5).
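
To see the first of those mistakes in action, here is a small sketch that assumes nothing but a scratch directory from mktemp and two made-up filenames, one of which contains a space. It counts how many items each loop style sees.

```shell
# Scratch directory with one awkward filename (names are illustrative).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.log" "$demo_dir/with space.log"

# Wrong: the output of ls is word-split, so "with space.log" becomes two items.
ls_count=0
for f in $(ls "$demo_dir"); do
  ls_count=$((ls_count + 1))
done

# Right: the glob yields exactly one word per file.
glob_count=0
for f in "$demo_dir"/*.log; do
  [ -e "$f" ] || continue
  glob_count=$((glob_count + 1))
done

echo "ls loop saw $ls_count items; glob loop saw $glob_count"
# prints: ls loop saw 3 items; glob loop saw 2
rm -rf "$demo_dir"
```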


3. The C-style for (( )) loop

When you need a numeric counter, the C-style for is far cleaner than seq-and-for-in:

for (( i = 0; i < 10; i++ )); do
  echo "Iteration $i"
done

Inside (( )) you have full C arithmetic — increment (i++), decrement (i--), compound assignment (i += 5), bitwise, etc. No $ prefix on variables.

The three sections (init, condition, update) are all optional — for ((;;)) is a forever-loop, like while true:

for ((;;)); do
  echo "Press Ctrl+C to stop"
  sleep 1
done

Compared to the older seq form (seq ships with GNU and BSD systems but is not specified by POSIX):

# Works wherever seq is installed, but slower (forks seq)
for i in $(seq 0 9); do
  echo "Iteration $i"
done

# Bash-only, faster (no fork)
for ((i = 0; i < 10; i++)); do
  echo "Iteration $i"
done

The C-style form is faster (no seq subprocess), more flexible (custom step, descending counts), and more readable when you’re doing real arithmetic.

If you need to iterate by a non-default step:

for ((i = 100; i > 0; i -= 5)); do
  echo "Countdown: $i"
done

Or with bash’s brace expansion (lesson 11), for fixed ranges:

for i in {0..9}; do
  echo "$i"
done

for i in {0..100..5}; do        # step of 5; bash 4+
  echo "$i"
done

Brace expansion is faster than seq and doesn’t fork, but it’s expanded eagerly: {0..1000000} builds the entire list in memory before the loop starts. For very large counts, use the C-style for (( )) loop.


4. while and until

while and until are command-runners just like if. They run a command, look at its exit code, and decide whether to enter (or re-enter) the body.

COUNT=0
while (( COUNT < 5 )); do
  echo "Count: $COUNT"
  (( COUNT++ ))
done

i=0
until (( i >= 5 )); do
  echo "i: $i"
  (( i++ ))
done

while CONDITION; do BODY; done is read “while CONDITION is true (zero exit), run BODY.” until CONDITION; do BODY; done is “until CONDITION becomes true, run BODY.” Mechanically until X is exactly while ! X. Use while 99% of the time; until only when “until X happens” reads more naturally.
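
One case where until tends to read naturally is polling. The sketch below is only an illustration: the flag file, the tries counter, and the cap of five are invented, and the "work" is faked so that the flag appears on the third pass.

```shell
# Poll until a flag file appears, giving up after max_tries.
work_dir=$(mktemp -d)
flag="$work_dir/ready"
tries=0
max_tries=5

until [ -e "$flag" ] || [ "$tries" -ge "$max_tries" ]; do
  tries=$((tries + 1))
  # Stand-in for real work: the flag shows up on the third pass.
  [ "$tries" -eq 3 ] && touch "$flag"
done

echo "stopped after $tries tries"
# prints: stopped after 3 tries
rm -rf "$work_dir"
```

The same loop written with while would need the whole condition negated: while ! { [ -e "$flag" ] || [ "$tries" -ge "$max_tries" ]; }; do.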

Infinite loops

while true; do
  echo "Forever"
  sleep 1
done

while :; do                # the colon command — minimal overhead, equivalent to true
  echo "Forever"
  sleep 1
done

while : is a tiny bit faster than while true (: is a built-in always; true is a built-in in bash but a separate binary in some shells). Rare to matter, but a common idiom.

Reading input line-by-line (the right way)

This is the most important while pattern in shell. To process a file (or any input) one line at a time:

while IFS= read -r line; do
  echo "Got line: $line"
done < input.txt

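Three critical pieces:

  1. IFS= (empty, and set only for this read) stops read from trimming leading and trailing whitespace.
  2. -r stops read from treating backslashes as escape characters.
  3. done < input.txt feeds the loop by redirection rather than a pipe, so the body runs in the current shell and its variables persist (section 5).

This pattern preserves the content of every line: spaces, tabs, leading dashes, embedded glob characters, embedded backslashes. Two limits remain: a bash variable cannot hold a NUL byte, and a final line with no trailing newline is skipped unless the condition is written as read -r line || [ -n "$line" ].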
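
A quick way to convince yourself that IFS= and -r matter is to read the same line with and without them. This sketch uses an invented sample string with surrounding spaces and backslashes; the variable names are illustrative.

```shell
# Illustrative input: leading/trailing spaces plus backslashes.
sample='   C:\temp\new   '

# Without IFS= and -r: whitespace is trimmed and backslashes are eaten.
read line_plain <<EOF
$sample
EOF

# With both: the line arrives exactly as written.
IFS= read -r line_exact <<EOF
$sample
EOF

printf 'plain: [%s]\n' "$line_plain"    # plain: [C:tempnew]
printf 'exact: [%s]\n' "$line_exact"    # exact: [   C:\temp\new   ]
```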

Parsing structured input with read

read can split a line into multiple variables:

echo "alice 30 engineer" | while read -r name age role; do
  echo "Name: $name, Age: $age, Role: $role"
done

If you give read more variables than fields, the extras are empty. If you give read fewer variables than fields, the last variable gets the whole remainder of the line, with the separators between the leftover fields kept exactly as they appeared in the input. This behaviour is sometimes useful, sometimes surprising.
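
Both cases are easy to check by hand. The sketch below feeds two made-up lines to read through here-documents, so nothing runs in a subshell and the variables are still set afterwards.

```shell
# Fewer variables than fields: the last variable keeps the rest of the line.
read -r first rest <<EOF
alice 30 senior engineer
EOF

# More variables than fields: the extras are set to empty strings.
read -r a b c <<EOF
only two
EOF

printf 'first=[%s] rest=[%s]\n' "$first" "$rest"   # first=[alice] rest=[30 senior engineer]
printf 'a=[%s] b=[%s] c=[%s]\n' "$a" "$b" "$c"     # a=[only] b=[two] c=[]
```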

For CSV-like input, set IFS for the read:

while IFS=',' read -r name age role; do
  echo "$name | $age | $role"
done < users.csv

For tab-separated files:

while IFS=$'\t' read -r col1 col2 col3; do
  echo "$col1 | $col2 | $col3"
done < data.tsv

The IFS=',' or IFS=$'\t' is set only for the read invocation — bash’s prefix-environment-variable syntax (lesson 1). The rest of the script’s IFS is unaffected.
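
If you want to verify that the prefix assignment really is scoped to read, a sketch like this one saves IFS, runs a comma-separated read on an invented record, and compares afterwards. The variable names are illustrative.

```shell
# The assignment before read applies to that one command only.
ifs_before=$IFS

IFS=',' read -r name age role <<EOF
alice,30,site reliability engineer
EOF

if [ "$IFS" = "$ifs_before" ]; then
  ifs_unchanged=yes
else
  ifs_unchanged=no
fi

printf '%s | %s | %s (IFS unchanged: %s)\n' "$name" "$age" "$role" "$ifs_unchanged"
# prints: alice | 30 | site reliability engineer (IFS unchanged: yes)
```

Note that the spaces inside the third field survive, because with IFS set to a comma a space is no longer a separator.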


5. The subshell trap: pipes and while read

This is one of the most subtle, surprising bugs in bash. Watch closely.

COUNT=0
echo "line1
line2
line3" | while read -r line; do
  COUNT=$((COUNT + 1))
done
echo "Count: $COUNT"

What does this print? You might expect 3. It actually prints 0.

The reason: bash runs each stage of a pipeline in its own subshell. (POSIX allows this but does not require it; ksh and zsh run the last stage in the current shell, so the same script behaves differently there.) The while loop on the right side of | is a subshell with its own copy of COUNT. When the loop ends, the subshell exits, and the parent’s COUNT is unchanged.

This is one of the most common bugs in production shell scripts, and it is dangerous because nothing fails: the script silently produces wrong results.

The fix: use input redirection instead of a pipe. Input redirection runs the body in the current shell, so variables persist:

COUNT=0
while read -r line; do
  COUNT=$((COUNT + 1))
done < <(echo "line1
line2
line3")
echo "Count: $COUNT"          # 3 — correct

The < <(echo ...) is process substitution (lesson 7) — <(cmd) produces a filename that bash makes readable as input. The outer < redirects that file into the loop’s stdin. Same effect as a pipe, but the while runs in the parent shell.

Or read from a real file:

COUNT=0
while read -r line; do
  (( COUNT++ ))
done < input.txt
echo "Count: $COUNT"          # works

Or use mapfile to read the whole file into an array first:

mapfile -t lines < input.txt      # lowercase name: bash itself assigns LINES (terminal height)
COUNT="${#lines[@]}"
echo "Count: $COUNT"

mapfile is bash 4+. The -t flag strips the trailing newline from each line. Lesson 6 covers mapfile and arrays in depth.

There are bash settings that change this behaviour:

shopt -s lastpipe            # bash-only; when in non-interactive mode,
                             # the last pipe stage runs in the current shell

With lastpipe enabled, cmd | while read line; do COUNT=...; done works as expected. But it’s a non-default setting (bash 4.2+), and it only takes effect when job control is off, which in practice means scripts rather than interactive shells. The explicit fix is input redirection from a file or here-document (portable to any POSIX shell) or process substitution (bash, ksh, and zsh).
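
A small sketch of the difference, meant to be saved and run as a bash script (4.2 or newer) rather than typed at a prompt, since job control is normally active in an interactive shell. The counter names are invented for the comparison.

```shell
# Default behaviour: the loop is the last stage of a pipe, so it is a subshell.
piped_default=0
printf 'a\nb\nc\n' | while read -r line; do
  piped_default=$((piped_default + 1))     # updates the subshell's copy only
done

# With lastpipe, bash runs that last stage in the current shell.
shopt -s lastpipe
piped_lastpipe=0
printf 'a\nb\nc\n' | while read -r line; do
  piped_lastpipe=$((piped_lastpipe + 1))
done
shopt -u lastpipe

echo "default: $piped_default, lastpipe: $piped_lastpipe"
```

When run as a script, the first counter should stay at 0 and the second should reach 3.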

The rule: never use cmd | while read if the loop body needs to update variables. Use < <(cmd) instead.

A real-world example: counting matches

# WRONG — the count is always 0
COUNT=0
grep -E '^ERROR' /var/log/app.log | while read -r line; do
  (( COUNT++ ))
done
echo "Errors: $COUNT"

# RIGHT — process substitution preserves variable scope
COUNT=0
while read -r line; do
  (( COUNT++ ))
done < <(grep -E '^ERROR' /var/log/app.log)
echo "Errors: $COUNT"

# BEST — let grep do the counting
COUNT="$(grep -cE '^ERROR' /var/log/app.log)"
echo "Errors: $COUNT"

The third form is fastest and clearest. Whenever you find yourself counting in a shell loop, ask whether grep -c, wc -l, or awk could do it for you. Shell loops over many lines are slow; built-in tools are fast.


6. break and continue

break exits the innermost loop. continue skips to the next iteration. Both can take an integer to operate on outer loops:

for i in 1 2 3; do
  for j in a b c; do
    if [[ "$j" == "b" ]]; then
      break        # exits the inner loop
    fi
    echo "$i $j"
  done
done
# Output:
# 1 a
# 2 a
# 3 a

for i in 1 2 3; do
  for j in a b c; do
    if [[ "$j" == "b" ]]; then
      break 2      # exits BOTH loops
    fi
    echo "$i $j"
  done
done
# Output:
# 1 a

for i in 1 2 3; do
  for j in a b c; do
    if [[ "$j" == "b" ]]; then
      continue    # skip this j, move to next j
    fi
    echo "$i $j"
  done
done
# Output:
# 1 a
# 1 c
# 2 a
# 2 c
# 3 a
# 3 c

break N and continue N work on N levels of nesting. Use break 2 instead of flag variables — it’s clearer and faster.
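
The examples above show break, break 2, and continue; for completeness, here is a sketch of continue 2 on the same nested loops. The pairs variable is only there to collect what the body reaches.

```shell
# continue 2 abandons the rest of the inner loop and starts the next
# pass of the OUTER loop.
pairs=""
for i in 1 2 3; do
  for j in a b c; do
    if [ "$j" = "b" ]; then
      continue 2
    fi
    pairs="$pairs$i$j "
  done
  pairs="${pairs}end$i "     # never reached: continue 2 always fires first
done
echo "$pairs"
# prints: 1a 2a 3a
```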


7. The case statement inside loops

case was introduced in lesson 3 as a multi-branch conditional. Inside a loop, it’s the cleanest way to dispatch on per-item values:

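# A while/shift loop rather than for arg in "$@": shift inside a for loop
# does not change the list the loop is already iterating over.
POSITIONAL=()
while (( $# > 0 )); do
  case "$1" in
    -v|--verbose)
      VERBOSE=1
      ;;
    -q|--quiet)
      QUIET=1
      ;;
    -h|--help)
      show_help
      exit 0
      ;;
    --)
      shift                 # drop the -- itself
      break                 # whatever is left in "$@" is positional
      ;;
    -*)
      echo "Unknown option: $1" >&2
      exit 2
      ;;
    *)
      POSITIONAL+=("$1")
      ;;
  esac
  shift                     # consume the argument just handled
done
POSITIONAL+=("$@")          # arguments after --, if any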

This is a hand-rolled argument parser. getopts (lesson 17) does this more rigorously, but the hand-rolled form is often clearer and supports long options without extra work.

Inside a case arm, ;; ends the arm. ;& falls through to the next arm. ;;& continues evaluating subsequent patterns. We covered these in lesson 3.


8. Command substitution gotchas

Lesson 2 covered the basics of $(cmd). Inside a loop, the gotchas multiply.

Trailing newlines are stripped

NAME=$(echo "hello")
printf '%q\n' "$NAME"           # prints hello: the newline that echo added is gone

$(...) strips all trailing newlines from the command’s output. This is usually what you want. But if the trailing newlines are significant (rare, but possible — e.g., a file’s exact byte content), you need to preserve them with a sentinel:

CONTENT=$(cat file.txt; printf x)
CONTENT="${CONTENT%x}"          # remove the sentinel

The trailing printf x adds a non-newline byte that isn’t stripped; then we remove it with parameter expansion. Niche but occasionally critical.

Output that contains globs

FILES_OUTPUT=$(ls /tmp)
for f in $FILES_OUTPUT; do      # WRONG: word-split on IFS, glob-expand
  echo "$f"
done

If /tmp has a file named *, the unquoted $FILES_OUTPUT will glob-expand and you’ll iterate over every file in your current directory instead. Never iterate over $(ls)-style output unquoted.

Multi-line output and IFS

TEXT=$(printf 'line1\nline2\nline3\n')
for line in $TEXT; do           # works ONLY because IFS includes \n by default
  echo "$line"
done

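This works only because these particular lines contain no spaces, tabs, or glob characters. Under the default IFS (space, tab, newline), a line such as “hello world” becomes two iterations. With the strict-mode IFS=$'\n\t', spaces survive, but a line containing a tab is still split there. In both cases a word such as * is glob-expanded. The robust replacement:

mapfile -t lines < <(printf 'line1\nline2\nline3\n')     # lowercase: bash sets LINES itself
for line in "${lines[@]}"; do
  echo "$line"
done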

mapfile -t reads input line-by-line into an array, splitting only on newlines. It’s the modern, correct replacement for for line in $(cmd).


9. Iterating over files: the canonical patterns

This is the section everyone needs, copied from real production shell scripts.

Pattern A: glob (when files are on disk in a known location)

shopt -s nullglob              # optional but recommended
for f in /path/to/*.log; do
  process "$f"
done
shopt -u nullglob

Or, without nullglob:

for f in /path/to/*.log; do
  [ -e "$f" ] || continue
  process "$f"
done

Pattern B: find -print0 + while read -d ''

For deeply nested directories or filtered traversal:

while IFS= read -r -d '' f; do
  process "$f"
done < <(find /path -type f -name '*.log' -print0)

find -print0 separates filenames with NUL bytes (\0), the only byte that can’t appear in a path. read -d '' (an empty delimiter means NUL) parses NUL-separated input. Together they handle any filename correctly, including names that contain newlines, which no line-based pattern can.

Pattern C: mapfile for line-oriented input

mapfile -t FILES < <(find /path -type f -name '*.log')
for f in "${FILES[@]}"; do
  process "$f"
done

Simpler than the find -print0 form, but breaks on filenames containing newlines. Acceptable for trusted inputs (your own deployment artifacts), risky for user-supplied data.
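
If you want the array and the newline safety together, one option is mapfile with a NUL delimiter. This is a sketch that assumes bash 4.4 or newer (where mapfile gained -d); the scratch directory and filenames are invented, and one of them deliberately contains a newline.

```shell
# Requires bash 4.4+ for mapfile -d.
scratch=$(mktemp -d)
touch "$scratch/a.log" "$scratch/two words.log" "$scratch/line"$'\n'"break.log"

# -d '' makes NUL the record delimiter; -t strips it from each element.
mapfile -d '' -t found < <(find "$scratch" -type f -name '*.log' -print0)

echo "found ${#found[@]} files"
for f in "${found[@]}"; do
  printf '%q\n' "$f"        # %q makes the embedded newline visible
done
rm -rf "$scratch"
```

With the line-oriented mapfile -t form, the third file would have been split into two array elements.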

Pattern D: xargs for parallel processing

find /path -type f -name '*.log' -print0 | xargs -0 -P 4 -I {} process "{}"

xargs -0 parses NUL-separated input. -P 4 runs 4 in parallel. -I {} substitutes the filename for {}. This is the right pattern for “process N files concurrently.” Lesson 14 covers xargs and parallelism in depth.
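
One thing to keep in mind with Pattern D: xargs starts external programs, so process has to be an executable, not a shell function defined in your script. A common workaround, sketched here with invented files and a line-count task standing in for real work, is to wrap the logic in sh -c and pass the filename as a positional argument.

```shell
scratch=$(mktemp -d)
printf 'one\n'        > "$scratch/a.log"
printf 'two\nthree\n' > "$scratch/b c.log"

# The filename arrives as $1 inside the inline script; it is never spliced
# into the script text itself. The "_" fills $0.
line_counts=$(
  find "$scratch" -type f -name '*.log' -print0 |
    xargs -0 -P 4 -n 1 sh -c 'printf "%s %s\n" "$(( $(wc -l < "$1") ))" "${1##*/}"' _ |
    sort
)
echo "$line_counts"
rm -rf "$scratch"
```

Passing the name as $1 instead of writing {} inside the quoted script avoids a filename being interpreted as shell code. The sort is there because parallel workers finish in no particular order.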

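Choosing between them

  1. Pattern A (glob): the files sit in one known directory and a simple pattern selects them.
  2. Pattern B (find -print0 + read -d ''): you need recursion or find’s filters, and the names may contain anything.
  3. Pattern C (mapfile): you want the names in an array and trust them not to contain newlines.
  4. Pattern D (xargs -0 -P): each file is independent and you want to process several at once.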


10. Real-world example: archive and compress old logs

#!/usr/bin/env bash
# rotate-logs.sh — compress logs older than N days, then delete after M days
set -euo pipefail
IFS=$'\n\t'

LOG_DIR="${LOG_DIR:-/var/log/myapp}"
COMPRESS_AFTER_DAYS="${COMPRESS_AFTER_DAYS:-7}"
DELETE_AFTER_DAYS="${DELETE_AFTER_DAYS:-30}"

# 1. Compress logs older than N days, but not already compressed
COMPRESSED_COUNT=0
while IFS= read -r -d '' f; do
  echo "Compressing: $f"
  if gzip -- "$f"; then
    COMPRESSED_COUNT=$((COMPRESSED_COUNT + 1))   # not (( n++ )): that returns 1 when n is 0 and trips set -e
  else
    echo "Failed to compress: $f" >&2
  fi
done < <(find "$LOG_DIR" -type f -name '*.log' -mtime "+${COMPRESS_AFTER_DAYS}" -print0)

echo "Compressed ${COMPRESSED_COUNT} files."

# 2. Delete compressed logs older than M days
DELETED_COUNT=0
while IFS= read -r -d '' f; do
  echo "Deleting: $f"
  if rm -- "$f"; then
    DELETED_COUNT=$((DELETED_COUNT + 1))         # same reason as above
  else
    echo "Failed to delete: $f" >&2
  fi
done < <(find "$LOG_DIR" -type f -name '*.log.gz' -mtime "+${DELETE_AFTER_DAYS}" -print0)

echo "Deleted ${DELETED_COUNT} files."

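Things to notice:

  1. find -print0 feeding read -r -d '' keeps every filename intact, whatever characters it contains.
  2. The loops read from process substitution, not a pipe, so the counters are updated in the main shell and are available to the summary lines.
  3. gzip -- and rm -- stop a filename that begins with a dash from being parsed as an option.
  4. gzip and rm run as if conditions, so a single failure is reported on stderr and the loop moves on to the next file.
  5. When find matches nothing, the loop body never runs; there is no literal-pattern iteration to guard against.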

This is the production-grade form. The naive form using for f in $(ls *.log) would have broken in three different ways on this filesystem: spaces, leading-dash filenames, and the no-match case.


11. The ten loop idioms you should have in muscle memory

# 1. Iterate over files in a directory (with empty-glob protection)
for f in /path/*.log; do
  [ -e "$f" ] || continue
  process "$f"
done

# 2. Iterate over command-line arguments
for arg in "$@"; do
  echo "$arg"
done

# 3. C-style numeric loop
for ((i=0; i<10; i++)); do
  echo "$i"
done

# 4. Read a file line by line
while IFS= read -r line; do
  echo "Got: $line"
done < file.txt

# 5. Read CSV
while IFS=',' read -r col1 col2 col3; do
  echo "$col1 | $col2 | $col3"
done < data.csv

# 6. Iterate over the output of a command (variables persist)
while IFS= read -r line; do
  (( COUNT++ ))
done < <(my-command)

# 7. Find with NUL-safe iteration
while IFS= read -r -d '' f; do
  process "$f"
done < <(find . -type f -print0)

# 8. mapfile into array
mapfile -t lines < file.txt          # lowercase: bash sets LINES itself
for line in "${lines[@]}"; do
  echo "$line"
done

# 9. Retry loop with exponential backoff
for ((i=1; i<=5; i++)); do
  if my-command; then
    break
  fi
  sleep $((2 ** i))
done

# 10. Forever loop with controlled exit
while :; do
  if [[ -f /tmp/stop ]]; then break; fi
  do_one_iteration
  sleep 5
done

Internalise these and 90% of your loop-writing time disappears.
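
Idiom 9 is deliberately bare: it sleeps even after the last failed attempt and never tells the caller whether the command eventually worked. One possible way to package it, offered as a sketch rather than a standard recipe, is a small function. The name retry, its argument order, and the flaky test command are all invented here.

```shell
# retry MAX BASE_DELAY CMD [ARGS...]
# Runs CMD until it succeeds or MAX attempts are used up; the delay doubles
# after each failure. Returns 0 on success, 1 when every attempt failed.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max" ] && return 1    # no sleep after the final failure
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Demo: a command that fails twice, then succeeds (base delay 0 keeps it fast).
calls=0
flaky() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

if retry 5 0 flaky; then
  echo "succeeded after $calls calls"
else
  echo "gave up after $calls calls" >&2
fi
# prints: succeeded after 3 calls
```

In real use you would pass a non-zero base delay, for example retry 5 2 my-command, which waits 2, 4, 8, and 16 seconds between the five attempts.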


12. What you must internalise before lesson 5

If any felt fuzzy, re-read. Lesson 5 (functions, scope, return) builds on all of this — every function is an exit-code-returning thing that often contains loops.


What’s next

Lesson 5 covers functions, local scope, the difference between return and exit, argument passing ($1, $@, $*), positional-parameter manipulation with shift, recursive functions, and the function-based style for organising larger scripts (the main "$@" pattern). Bring everything from lessons 1–4.
