Shell Lesson 14 of 42

Argument Parsing: getopts, getopt, Manual Parsing & the Long-Options Pattern — Building CLIs That Feel Like git

You’ve written enough scripts now to feel the friction of “first positional arg is the env, second is the tag, third is optional flag.” It works, until someone wants --dry-run. Or -v for verbose. Or sub-commands like mytool deploy vs mytool rollback. Suddenly your script needs proper argument parsing — the kind kubectl, git, aws, and every other production CLI has.

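Bash gives you three tools for this, none of them obvious: the getopts built-in, the external getopt command, and a hand-written while/case loop.

By the end of this lesson, you’ll know when to use which, and you’ll have a copy-paste long-option parser that handles the common edge cases (--key=value, --, missing values, and, with a little extra code, combined flags and optional arguments).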


1. Positional arguments — the baseline

Before parsing, recap what we have:

#!/usr/bin/env bash
set -Eeuo pipefail
# die: print a message to stderr and exit non-zero
die() { printf '%s\n' "$*" >&2; exit 1; }
[[ $# -ge 2 ]] || die "usage: $0 <env> <tag>"
ENV="$1"
TAG="$2"

This works for simple cases. It breaks down once you have optional arguments or flags.

./script staging v1.2.3 --dry-run        # how do we get --dry-run out?
./script -v staging v1.2.3                # how do we get -v?

You can hand-roll all of this with if [[ "$1" == "-v" ]]; then …, but it gets ugly fast. Use a parser.


2. getopts — the bash built-in

getopts is the POSIX-standard short-option parser, built into bash. It iterates through $@ once, recognising -x and -x value patterns.

Basic usage

#!/usr/bin/env bash
set -Eeuo pipefail

VERBOSE=0
CONFIG=""
DRY_RUN=0

while getopts ":vc:n" opt; do
  case "$opt" in
    v) VERBOSE=1 ;;
    c) CONFIG="$OPTARG" ;;
    n) DRY_RUN=1 ;;
    \?) echo "Invalid option: -$OPTARG" >&2; exit 2 ;;
    :)  echo "Option -$OPTARG requires an argument" >&2; exit 2 ;;
  esac
done
shift $((OPTIND - 1))

# Now positional args are in $@
echo "verbose=$VERBOSE config='$CONFIG' dry_run=$DRY_RUN remaining=$*"

Run:

$ ./script -v -c /etc/app.conf -n staging prod
verbose=1 config='/etc/app.conf' dry_run=1 remaining=staging prod

The option string explained

":vc:n"

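Reading the string left to right: the leading : switches on silent error reporting (see “Error handling” below); v is a flag with no argument; c: is an option that requires an argument, which getopts stores in OPTARG; n is another flag.

getopts has no syntax for declaring an optional argument. You can fake it (covered later) or use long options with manual parsing.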

OPTIND — the index of the next argument

getopts updates OPTIND (Option InDex) as it consumes args. After the loop:

shift $((OPTIND - 1))

This shifts away the consumed flags, leaving the positional arguments in $@.
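To make the arithmetic concrete, here is a small sketch that parses its own arguments and prints where OPTIND ends up. The function name show_optind is made up for illustration. It declares OPTIND local, which you need whenever getopts runs inside a function that may be called more than once.

```shell
#!/usr/bin/env bash
# show_optind is a hypothetical helper, not part of the lesson's scripts.
show_optind() {
  local OPTIND=1 opt            # local OPTIND so every call starts from argument 1
  while getopts ":vc:" opt; do
    :                           # ignore the options; we only care about the index
  done
  echo "OPTIND=$OPTIND consumed=$((OPTIND - 1))"
  shift $((OPTIND - 1))
  echo "positional: $*"
}

show_optind -v -c app.conf staging prod
# OPTIND=4 consumed=3
# positional: staging prod
```

OPTIND is 4 because three words (-v, -c, app.conf) were consumed, and the next one to look at is word number 4.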

Combined flags (-vn)

getopts supports combined short flags:

$ ./script -vn config-file
# parses as -v, -n, then "config-file" is the positional

This is standard Unix behaviour and just works.

-c=value is NOT supported

Unlike GNU long options, getopts has no = form. The value goes in the next argument (or directly after the letter, as in -c/etc/app.conf):

./script -c /etc/app.conf      # CORRECT
./script -c=/etc/app.conf      # WRONG — getopts treats "=/etc/..." as the value

This is the #1 surprise for newcomers.
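A quick way to see the difference is to print what lands in OPTARG for each spelling. This is only a sketch; show_config is a hypothetical name.

```shell
#!/usr/bin/env bash
# show_config is a hypothetical helper that prints the value getopts stored for -c.
show_config() {
  local OPTIND=1 opt config=""
  while getopts ":c:" opt; do
    case "$opt" in
      c) config="$OPTARG" ;;
    esac
  done
  printf 'config=[%s]\n' "$config"
}

show_config -c /etc/app.conf    # config=[/etc/app.conf]
show_config -c/etc/app.conf     # config=[/etc/app.conf]
show_config -c=/etc/app.conf    # config=[=/etc/app.conf]   <- the = is part of the value
```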

-- ends option parsing

The -- separator says “no more flags; rest are positional”:

./script -v -- -file-with-leading-dash.txt
# -v is parsed; then -- signals end; the dash-file is the positional

getopts handles this automatically.

Error handling — \? and :

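In silent mode (option string starts with :), getopts reports the two failure cases separately:

- unknown option: opt is set to ? and OPTARG holds the offending option character
- missing argument: opt is set to : and OPTARG holds the option that needed a value

Without the leading :, getopts prints its own error message and sets opt to ? in both cases, so your case statement can’t tell them apart and you don’t control the wording. Always use silent mode.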
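To see what silent mode hands to your case statement, this sketch prints opt and OPTARG for each failure. The name probe is invented for the demo.

```shell
#!/usr/bin/env bash
# probe is a hypothetical helper: it prints what getopts reports in silent mode.
probe() {
  local OPTIND=1 opt
  while getopts ":c:" opt; do
    printf 'opt=[%s] OPTARG=[%s]\n' "$opt" "${OPTARG-}"
  done
}

probe -x          # unknown option     -> opt=[?] OPTARG=[x]
probe -c          # missing argument   -> opt=[:] OPTARG=[c]
probe -c file     # normal case        -> opt=[c] OPTARG=[file]
```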

A more complete example with usage()

#!/usr/bin/env bash
set -Eeuo pipefail

readonly SCRIPT="${0##*/}"

usage() {
  cat <<EOF
Usage: $SCRIPT [-v] [-n] [-c CONFIG] [-h] <env> <tag>

  -v          verbose mode
  -n          dry run; don't actually deploy
  -c CONFIG   path to config file (default: \$HOME/.app.conf)
  -h          show this help

Examples:
  $SCRIPT staging v1.2.3
  $SCRIPT -v -n -c ./test.conf prod v1.2.3
EOF
  exit "${1:-0}"
}

VERBOSE=0
DRY_RUN=0
CONFIG="${HOME}/.app.conf"

while getopts ":vnc:h" opt; do
  case "$opt" in
    v) VERBOSE=1 ;;
    n) DRY_RUN=1 ;;
    c) CONFIG="$OPTARG" ;;
    h) usage 0 ;;
    \?) printf 'Invalid option: -%s\n\n' "$OPTARG" >&2; usage 2 ;;
    :)  printf 'Option -%s requires an argument\n\n' "$OPTARG" >&2; usage 2 ;;
  esac
done
shift $((OPTIND - 1))

[[ $# -eq 2 ]] || usage 2
ENV="$1"; TAG="$2"

echo "ENV=$ENV TAG=$TAG VERBOSE=$VERBOSE DRY_RUN=$DRY_RUN CONFIG=$CONFIG"

This is the production-grade getopts pattern. Use it when short options are enough.

getopts summary

Feature                            Supported
Short options (-v)                 yes
Short option with arg (-c FILE)    yes
Combined flags (-vn)               yes
Long options (--verbose)           no
--key=value                        no
Optional argument                  no (workaround possible)
Sub-commands                       no (do it yourself)

When you need long options, switch to manual parsing or GNU getopt.


3. Manual long-option parsing — the canonical pattern

For full power and full portability (works on any bash), use a while loop with a case over $1, shifting as you go.

The template

#!/usr/bin/env bash
set -Eeuo pipefail

VERBOSE=0
DRY_RUN=0
CONFIG=""
ENV=""
TAG=""

usage() {
  cat <<EOF
Usage: ${0##*/} [OPTIONS] <env> <tag>

Options:
  -v, --verbose          verbose mode
  -n, --dry-run          don't actually deploy
  -c, --config FILE      path to config file
  -h, --help             show this help
EOF
  exit "${1:-0}"
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    -v|--verbose)
      VERBOSE=1
      shift
      ;;
    -n|--dry-run)
      DRY_RUN=1
      shift
      ;;
    -c|--config)
      [[ $# -ge 2 ]] || { echo "missing value for $1" >&2; usage 2; }
      CONFIG="$2"
      shift 2
      ;;
    --config=*)
      CONFIG="${1#*=}"
      shift
      ;;
    -h|--help)
      usage 0
      ;;
    --)
      shift
      break
      ;;
    -*)
      echo "Unknown option: $1" >&2
      usage 2
      ;;
    *)
      # First non-option argument breaks out; rest are positional
      break
      ;;
  esac
done

# Remaining are positional
[[ $# -eq 2 ]] || usage 2
ENV="$1"
TAG="$2"

echo "ENV=$ENV TAG=$TAG VERBOSE=$VERBOSE DRY_RUN=$DRY_RUN CONFIG=$CONFIG"

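This pattern handles:

- short and long spellings of each flag (-v / --verbose)
- options with a value in both forms (--config FILE and --config=FILE)
- a missing value for -c / --config, reported with a usage error
- -- to end option parsing
- unknown options, rejected with a usage error
- positional arguments after the options (the loop stops at the first non-option, so options must come first)

This is what most production shell scripts use. Master this template and copy-paste it.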

Variations

Multiple values: collect into an array

INCLUDE=()

case "$1" in
  -i|--include)
    INCLUDE+=("$2")
    shift 2
    ;;
  --include=*)
    INCLUDE+=("${1#*=}")
    shift
    ;;
esac

# Use:
for path in "${INCLUDE[@]}"; do …; done

Now --include foo --include bar --include baz builds up a list.
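Put inside a full loop, the array variant might look like the sketch below. The function name collect_includes is hypothetical, and the missing-value check is an addition that the fragment above leaves out.

```shell
#!/usr/bin/env bash
# collect_includes is a hypothetical wrapper around the fragment above.
collect_includes() {
  local include=() path
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -i|--include)
        [[ $# -ge 2 ]] || { echo "missing value for $1" >&2; return 2; }
        include+=("$2")
        shift 2
        ;;
      --include=*)
        include+=("${1#*=}")
        shift
        ;;
      *) break ;;
    esac
  done
  echo "count=${#include[@]}"
  # The ${arr[@]+...} guard avoids an "unbound variable" error under set -u
  # on older bash versions when the array is empty.
  for path in ${include[@]+"${include[@]}"}; do
    echo "include: $path"
  done
}

collect_includes -i "a b" --include=c rest
# count=2
# include: a b
# include: c
```

Quoting "${include[@]}" is what keeps "a b" together as one entry.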

Counting occurrences (e.g. -vvv)

VERBOSE=0

case "$1" in
  -v|--verbose) VERBOSE=$((VERBOSE+1)); shift ;;
  -vv) VERBOSE=$((VERBOSE+2)); shift ;;
  -vvv) VERBOSE=$((VERBOSE+3)); shift ;;
esac

Separate flags (-v -v -v) are handled by the first branch matching once per pass of the enclosing while loop. Combined -vvv needs explicit cases like the ones above, or a parser that decomposes combined short flags (rare in shell; most scripts don’t bother).
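If listing -vv and -vvv by hand bothers you, one alternative is to match any run of v characters and add its length. This is a sketch under that assumption, not the lesson's template; count_verbose is a made-up name.

```shell
#!/usr/bin/env bash
# count_verbose is a hypothetical helper: it prints the verbosity level.
count_verbose() {
  local verbose=0 flags
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --verbose)
        verbose=$((verbose + 1))
        shift
        ;;
      -v*)
        flags="${1#-}"                      # "-vvv" -> "vvv"
        if [[ "$flags" =~ ^v+$ ]]; then     # accept only runs of v
          verbose=$((verbose + ${#flags}))
          shift
        else
          echo "unknown option: $1" >&2
          return 2
        fi
        ;;
      *) break ;;
    esac
  done
  echo "$verbose"
}

count_verbose -vvv                   # 3
count_verbose -v -vv --verbose file  # 4
```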

Optional arguments to a flag

case "$1" in
  --color)
    # treat the next arg as the value unless it is missing or looks like another flag
    if [[ $# -ge 2 && "$2" != -* ]]; then
      COLOR="$2"
      shift 2
    else
      COLOR="auto"
      shift
    fi
    ;;
  --color=*)
    COLOR="${1#*=}"
    shift
    ;;
esac

This is messy — it’s why “optional argument to flag” is unusual in CLIs. Most tools just use --color=value (mandatory =) for optional values.
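A sketch of that "mandatory =" convention: a bare --color means auto, a value is only accepted as --color=VALUE, and the value is checked against a fixed list. The function name parse_color, the three modes, and the default of never when the flag is absent are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# parse_color is a hypothetical helper; it looks only at its first argument.
parse_color() {
  local color="never"                 # assumed default when the flag is absent
  case "${1:-}" in
    --color)   color="auto" ;;        # bare flag: no value is ever taken from $2
    --color=*) color="${1#*=}" ;;
  esac
  case "$color" in
    auto|always|never) echo "$color" ;;
    *) echo "invalid --color value: $color" >&2; return 2 ;;
  esac
}

parse_color --color            # auto
parse_color --color=always     # always
parse_color staging            # never
```

Because the bare form never looks at the next word, ./script --color staging can't swallow staging by accident.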

Boolean flags with --no- form

case "$1" in
  --color)    COLOR=1; shift ;;
  --no-color) COLOR=0; shift ;;
esac

Standard pattern. --color enables, --no-color disables explicitly. Useful when the default is configurable and the user wants to override.


4. GNU getopt — long options without rolling your own

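GNU getopt (the external command, not the bash getopts built-in) handles long options for you. It is not portable to macOS by default: macOS ships a BSD getopt that doesn’t support long options. You either:

- install GNU getopt on macOS (brew install gnu-getopt) and put it ahead of the system one on PATH, or
- stay portable and use the manual parser from Section 3.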

Detection

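# Check if we have GNU getopt (its --test mode exits with status 4).
# Capture the status with || so that set -e does not abort on the non-zero exit.
rc=0
getopt --test >/dev/null 2>&1 || rc=$?
if [[ $rc -ne 4 ]]; then
  echo "this script requires GNU getopt; on macOS: brew install gnu-getopt" >&2
  exit 1
fi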

GNU getopt’s --test option exits with status 4 specifically. BSD getopt doesn’t recognise --test and exits with a different status.

The pattern

#!/usr/bin/env bash
set -Eeuo pipefail

# Define short and long option strings
SHORT="vnc:h"
LONG="verbose,dry-run,config:,help"

# Run getopt to canonicalise; trap errors
PARSED=$(getopt --options="$SHORT" --longoptions="$LONG" --name "$0" -- "$@") || { usage 2; }

# Reset the positional args to the canonicalised form
eval set -- "$PARSED"

# Now parse normally with case (no need to handle --opt=value; getopt already split it)
VERBOSE=0; DRY_RUN=0; CONFIG=""

while true; do
  case "$1" in
    -v|--verbose) VERBOSE=1; shift ;;
    -n|--dry-run) DRY_RUN=1; shift ;;
    -c|--config)  CONFIG="$2"; shift 2 ;;
    -h|--help)    usage 0 ;;
    --)           shift; break ;;
    *)            echo "internal error" >&2; exit 2 ;;
  esac
done

[[ $# -eq 2 ]] || usage 2      # without this check, set -u aborts on a missing positional
ENV="$1"; TAG="$2"

The trick is eval set -- "$PARSED": getopt outputs a canonicalised, properly-quoted argument list, and eval set -- re-applies it as the new $@. After that, parsing is straightforward.

Why getopt is convenient: it handles --config=FILE for you (splits to --config FILE), supports option abbreviations (--ver matches --verbose if unambiguous), and groups short flags. You don’t need to handle the --config=* case or worry about -vn.
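To see the canonicalisation for yourself, this sketch prints one argument per line after the eval set -- step. It assumes GNU getopt is the getopt on your PATH; the function name canonicalise and the program name demo are made up.

```shell
#!/usr/bin/env bash
# canonicalise is a hypothetical helper; it needs GNU getopt to behave as shown.
canonicalise() {
  local parsed
  parsed=$(getopt --options="vnc:h" \
                  --longoptions="verbose,dry-run,config:,help" \
                  --name "demo" -- "$@") || return 2
  eval set -- "$parsed"        # replace this function's $@ with getopt's output
  printf '%s\n' "$@"           # one argument per line
}

canonicalise -vn --config="my app.conf" staging v1.2.3
# -v
# -n
# --config
# my app.conf
# --
# staging
# v1.2.3
```

Note how -vn was split, --config=... became two words, the space inside the value survived, and getopt added the -- that the parsing loop relies on.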

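Why getopt is annoying:

- it is an external program, and the long-option version is missing on macOS and the BSDs by default
- you need the detection step above before you can rely on it
- eval set -- "$PARSED" is easy to get wrong, and dropping the quotes breaks arguments that contain spaces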

For new scripts, manual parsing (Section 3) is usually preferable. Use getopt only if you’re Linux-locked and the script has many options.


5. Sub-commands — the git/kubectl style

Real CLIs have sub-commands: git commit, kubectl apply, aws s3 cp. Build this with a case on the first non-option argument:

#!/usr/bin/env bash
set -Eeuo pipefail

usage_main() {
  cat <<EOF
Usage: ${0##*/} [GLOBAL_OPTIONS] <command> [COMMAND_OPTIONS] [ARGS]

Commands:
  deploy <env> <tag>       deploy a tag to an env
  rollback <env>           rollback the most recent deploy
  status <env>             show current deployed version

Global options:
  -v, --verbose            verbose mode
  -h, --help               show help
EOF
  exit "${1:-0}"
}

# Parse global options
VERBOSE=0
while [[ $# -gt 0 ]]; do
  case "$1" in
    -v|--verbose) VERBOSE=1; shift ;;
    -h|--help) usage_main 0 ;;
    -*) echo "Unknown global option: $1" >&2; usage_main 2 ;;
    *) break ;;     # first non-option arg is the sub-command
  esac
done

[[ $# -ge 1 ]] || usage_main 2
COMMAND="$1"
shift

# Dispatch
case "$COMMAND" in
  deploy)   cmd_deploy "$@" ;;
  rollback) cmd_rollback "$@" ;;
  status)   cmd_status "$@" ;;
  -h|--help|help) usage_main 0 ;;
  *) echo "Unknown command: $COMMAND" >&2; usage_main 2 ;;
esac

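Each cmd_* function does its own argument parsing. In the real script these definitions must appear above the dispatch case, because bash only knows a function once its definition has been executed: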

cmd_deploy() {
  local env="" tag="" dry_run=0
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -n|--dry-run) dry_run=1; shift ;;
      -*) echo "deploy: unknown option $1" >&2; exit 2 ;;
      *) break ;;
    esac
  done
  [[ $# -eq 2 ]] || { echo "deploy: usage: deploy <env> <tag>" >&2; exit 2; }
  env="$1"; tag="$2"
  echo "deploying $tag to $env (dry_run=$dry_run, verbose=$VERBOSE)"
}

cmd_rollback() { … }
cmd_status()   { … }

This scales to dozens of sub-commands. Most CLIs end up with a lib/cmd_*.sh file per command and an entry point that just dispatches.

Auto-discovery of sub-commands

Even cooler — discover sub-commands at runtime by looking for cmd_* functions:

list_commands() {
  declare -F | awk '$NF ~ /^cmd_/ { sub(/^cmd_/, "", $NF); print $NF }'
}

declare -F lists all defined functions; we filter for cmd_* and strip the prefix. Now myapp help can list known commands without hard-coding them.
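The same declare -F lookup can drive the dispatcher, so that adding a cmd_* function is all it takes to add a command. A minimal sketch, with two stub commands and a hypothetical dispatch function:

```shell
#!/usr/bin/env bash
# Stub commands for the demo; real ones would parse their own options.
cmd_deploy() { echo "deploy $*"; }
cmd_status() { echo "status $*"; }

list_commands() {
  declare -F | awk '$NF ~ /^cmd_/ { sub(/^cmd_/, "", $NF); print $NF }'
}

# dispatch is a hypothetical name: run cmd_<name> if such a function exists.
dispatch() {
  local name="${1:-}"
  [[ -n "$name" ]] || { echo "missing command" >&2; return 2; }
  shift
  if declare -F "cmd_${name}" >/dev/null; then
    "cmd_${name}" "$@"
  else
    echo "unknown command: $name" >&2
    return 2
  fi
}

list_commands           # deploy, status (one per line)
dispatch status prod    # status prod
```

declare -F NAME exits non-zero when NAME is not a function, which is what makes the existence check work.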


6. The usage() function pattern

Every CLI needs a usage function. A fuller version of the one used so far:

usage() {
  cat <<EOF
Usage: ${0##*/} [OPTIONS] <env> <tag>

Description:
  Deploy a tag to a Kubernetes namespace.

Options:
  -v, --verbose          verbose output
  -n, --dry-run          show what would happen, but don't deploy
  -c, --config FILE      path to config file (default: \$HOME/.app.conf)
  -h, --help             show this help

Arguments:
  env                    target environment: dev, staging, prod
  tag                    image tag in vMAJOR.MINOR.PATCH form

Examples:
  ${0##*/} staging v1.2.3
  ${0##*/} -v --dry-run prod v1.2.3
  ${0##*/} --config ~/.app.staging.conf staging v1.2.3

Exit status:
  0  success
  1  general error
  2  invalid usage
EOF
  exit "${1:-0}"
}

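Conventions:

- take the exit status as an optional argument, so usage 0 serves --help and usage 2 serves errors
- derive the program name from ${0##*/} rather than hard-coding it
- escape \$HOME (and anything else that should appear literally) inside the unquoted here-doc
- include an example for each common invocation, and document the exit statuses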

For really comprehensive CLIs, generate the usage from a structured definition:

declare -A OPT_DESC=(
  [v|verbose]="verbose output"
  [n|dry-run]="don't actually deploy"
  [c|config FILE]="path to config file"
  [h|help]="show this help"
)

That’s overkill for most scripts but useful for very large ones. Most stop at static here-docs.
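One way to turn such a definition into help text is sketched below. It swaps the associative array for an ordinary indexed array of "flags|description" strings, because bash does not guarantee the iteration order of associative arrays and help text should come out in a fixed order. The names OPTS and print_options are assumptions for the example.

```shell
#!/usr/bin/env bash
# OPTS is a hypothetical structured definition: "flags|description" per entry.
OPTS=(
  "-v, --verbose|verbose output"
  "-n, --dry-run|don't actually deploy"
  "-c, --config FILE|path to config file"
  "-h, --help|show this help"
)

# print_options renders the Options: section of a usage message.
print_options() {
  local entry
  for entry in "${OPTS[@]}"; do
    # ${entry%%|*} is the text before the first |, ${entry#*|} the text after it
    printf '  %-22s %s\n' "${entry%%|*}" "${entry#*|}"
  done
}

echo "Options:"
print_options
```

The usage function can then call print_options instead of carrying a hand-aligned options table.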


7. Default values from environment

Production scripts often allow defaults to come from environment variables (so CI can set them without command-line clutter):

# Default to env var, fall back to literal default
ENV="${TARGET_ENV:-dev}"
NAMESPACE="${KUBE_NAMESPACE:-default}"

# CLI overrides env var; env var overrides hard-coded default
while [[ $# -gt 0 ]]; do
  case "$1" in
    -e|--env) ENV="$2"; shift 2 ;;
    -n|--namespace) NAMESPACE="$2"; shift 2 ;;
    *) break ;;
  esac
done

This precedence (CLI > env > default) is standard for almost every CLI — kubectl, terraform, aws, etc. Implement it consistently.
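The three levels are easy to check with a tiny function. This is a sketch; resolve_env is a hypothetical name and it handles only the -e/--env option.

```shell
#!/usr/bin/env bash
# resolve_env is a hypothetical helper: it prints the effective environment.
resolve_env() {
  local env="${TARGET_ENV:-dev}"        # env var, else hard-coded default
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -e|--env)
        [[ $# -ge 2 ]] || { echo "missing value for $1" >&2; return 2; }
        env="$2"                        # CLI wins over both
        shift 2
        ;;
      *) break ;;
    esac
  done
  echo "$env"
}

( unset TARGET_ENV; resolve_env )            # dev      (default)
TARGET_ENV=staging resolve_env               # staging  (env var)
TARGET_ENV=staging resolve_env --env prod    # prod     (CLI)
```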

--config as a YAML/JSON file

For complex configs, accept a config file:

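CONFIG_FILE=""
CLI_ENV=""          # illustrative names: the parser stores CLI values here,
CLI_NAMESPACE=""    # e.g. -e|--env) CLI_ENV="$2"; shift 2 ;;

# Parse args (CONFIG_FILE, CLI_ENV and CLI_NAMESPACE may be set here)

# 1. Hard-coded defaults
ENV="dev"
NAMESPACE="default"

# 2. Config file, if specified
if [[ -n "$CONFIG_FILE" ]]; then
  [[ -r "$CONFIG_FILE" ]] || die "cannot read config: $CONFIG_FILE"
  ENV=$(yq '.env' "$CONFIG_FILE")
  NAMESPACE=$(yq '.namespace' "$CONFIG_FILE")
fi

# 3. Environment variables override the config file
ENV="${TARGET_ENV:-$ENV}"
NAMESPACE="${KUBE_NAMESPACE:-$NAMESPACE}"

# 4. CLI values override everything. They are kept in separate variables during
#    parsing so that steps 2 and 3 cannot overwrite them.
ENV="${CLI_ENV:-$ENV}"
NAMESPACE="${CLI_NAMESPACE:-$NAMESPACE}"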

The precedence stack: hard-coded default → config file → environment variable → command line.


8. Validation after parsing

After parsing, validate:

[[ -n "$ENV" ]] || die "missing required: env"
[[ -n "$TAG" ]] || die "missing required: tag"
[[ "$ENV" =~ ^(dev|staging|prod)$ ]] || die "invalid env: $ENV"
[[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]] || die "invalid tag: $TAG"
[[ -z "$CONFIG" || -r "$CONFIG" ]] || die "config not readable: $CONFIG"

Validate values after parsing, not during. Parsing should just fill in the variables (rejecting only malformed syntax, such as a flag with no value); a separate validation phase then checks that the values and their combination are sensible. This separates “user typed a bad option” from “config file doesn’t exist.”


9. Common pitfalls

Forgetting shift in a case branch

while [[ $# -gt 0 ]]; do
  case "$1" in
    -v) VERBOSE=1 ;;             # MISSING shift — infinite loop
  esac
done

Always shift (or shift 2 for options-with-values, or break to stop). The loop iterates until $# is 0.

Using $1 after shift without re-checking $#

case "$1" in
  -c|--config)
    shift
    CONFIG="$1"        # if user wrote `-c` with no value, $1 is unset; -u fires
    ;;
esac

Always check $# first:

case "$1" in
  -c|--config)
    [[ $# -ge 2 ]] || die "missing value for $1"
    CONFIG="$2"
    shift 2
    ;;
esac

Not handling --

If your script accepts pass-through args (./script --verbose -- some-other-tool --its-flag), the -- separator is essential. Always include it:

case "$1" in
  --) shift; break ;;
esac

getopt vs getopts confusion

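They are different programs:

- getopts (with an s) is the shell built-in: short options only, available in every POSIX shell
- getopt (no s) is an external command: the GNU version from util-linux adds long options, the BSD version does not

An error such as “getopt: unrecognized option” comes from the external command, so if you see it in a script that was meant to use the built-in, you typed getopt where you wanted getopts. The reverse slip is just as common. Triple-check the spelling.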

$OPTARG empty in your case branch

Inside the case, $OPTARG holds the value of an option that takes one. Forget the colon in the option string and getopts treats c as a plain flag, so OPTARG is empty or unset:

while getopts ":vc" opt; do      # missing colon after c
  case "$opt" in
    c) CONFIG="$OPTARG" ;;       # OPTARG is empty
  esac
done

The fix: getopts ":vc:" opt — the trailing : after c.

Subcommand args being parsed by global parser

./tool deploy -v staging v1.2.3
# If the global parser consumes -v before reaching `deploy`, the sub-command never sees it

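Either:

- stop the global parser at the first non-option argument (as in Section 5) and document that global options go before the sub-command, or
- have each sub-command parser accept the shared options as well.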

Long-option parsing with = mid-value

--config=/etc/app/config.yaml

In your manual parser, ${1#*=} strips everything up to and including the first =, leaving /etc/app/config.yaml. Good.

--config=foo=bar

${1#*=} strips only to the first =, giving foo=bar. Also good.

But:

--config            # no value, no =

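Applied to that, ${1#*=} returns the whole of $1 (there is no = to strip), so CONFIG would become "--config". Wrong. The opposite slip is --config= with nothing after the =, which yields an empty value. Detect both: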

case "$1" in
  --config=*)
    CONFIG="${1#*=}"
    [[ -n "$CONFIG" ]] || die "missing value for --config"
    shift
    ;;
  --config)
    [[ $# -ge 2 ]] || die "missing value for --config"
    CONFIG="$2"
    shift 2
    ;;
esac

Always handle both forms explicitly.

Combined short flags with manual parser

getopts handles -vn (combined). Manual parsers don’t, by default. To support it:

# Decompose -vn into -v -n before parsing
ARGS=()
for arg in "$@"; do
  case "$arg" in
    -[a-zA-Z][a-zA-Z]*)
      # short flag combination — split each character into its own flag
      i=1
      while [[ $i -lt ${#arg} ]]; do
        ARGS+=("-${arg:$i:1}")
        ((i++))
      done
      ;;
    *)
      ARGS+=("$arg")
      ;;
  esac
done
set -- "${ARGS[@]}"

# Now parse normally — combined flags have been split

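This is rarely necessary; many shell-script CLIs simply require separate flags (-v -n). Note that the pre-pass above is naive: it also splits an option value that happens to look like combined flags (the -abc in -c -abc) and any such word after --. Document it if you don’t support combined short flags.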


10. Twelve idioms for daily use

# 1. getopts skeleton
while getopts ":vnc:h" opt; do
  case "$opt" in
    v) VERBOSE=1 ;;
    n) DRY_RUN=1 ;;
    c) CONFIG="$OPTARG" ;;
    h) usage 0 ;;
    \?) usage 2 ;;
    :)  echo "missing arg for -$OPTARG" >&2; usage 2 ;;
  esac
done
shift $((OPTIND - 1))

# 2. Manual long-option parser (template)
while [[ $# -gt 0 ]]; do
  case "$1" in
    -v|--verbose) VERBOSE=1; shift ;;
    -c|--config) [[ $# -ge 2 ]] || die "missing value for $1"; CONFIG="$2"; shift 2 ;;
    --config=*)  CONFIG="${1#*=}"; shift ;;
    -h|--help)   usage 0 ;;
    --)          shift; break ;;
    -*)          echo "Unknown: $1" >&2; usage 2 ;;
    *)           break ;;
  esac
done

# 3. usage with optional exit code
usage() { cat <<EOF
Usage: ${0##*/} [-v] <env> <tag>
EOF
  exit "${1:-0}"
}

# 4. Default from env, override from CLI
ENV="${TARGET_ENV:-dev}"

# 5. Validate option arg
[[ $# -ge 2 ]] || die "missing value for $1"

# 6. Multi-value array
INCLUDE=()
case "$1" in
  -i|--include) INCLUDE+=("$2"); shift 2 ;;
esac

# 7. Boolean with --no- form
case "$1" in
  --color)    COLOR=1; shift ;;
  --no-color) COLOR=0; shift ;;
esac

# 8. Sub-command dispatcher
case "$COMMAND" in
  deploy) cmd_deploy "$@" ;;
  status) cmd_status "$@" ;;
  *)      die "unknown command: $COMMAND" ;;
esac

# 9. Auto-discover commands
list_commands() {
  declare -F | awk '$NF ~ /^cmd_/ { sub(/^cmd_/, "", $NF); print $NF }'
}

# 10. Detect GNU getopt
rc=0; getopt --test >/dev/null 2>&1 || rc=$?   # capture the status without tripping set -e
[[ $rc -eq 4 ]] || die "GNU getopt required"

# 11. Reset $@ from getopt output
PARSED=$(getopt --options="$SHORT" --longoptions="$LONG" -- "$@") || usage 2
eval set -- "$PARSED"

# 12. Validate environment after parsing
[[ "$ENV" =~ ^(dev|staging|prod)$ ]] || die "invalid env: $ENV"

11. What you must internalise before lesson 15


What’s next

Lesson 15: Logging Frameworks — syslog/journald, Structured Logs, Levels & Rotation. We’ll move beyond info/warn/die printing to stderr and look at proper logging: syslog integration, structured (key=value or JSON) logs, log levels with thresholds, output destinations (stderr/file/journald), rotation, and the canonical lib/log.sh you can drop into any script. After L15 your scripts will leave traces that downstream tools (Loki, Splunk, Elastic) can actually parse.

See you there.

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
