Shell Lesson 1 of 42

Shell Anatomy, In Depth: Bash, Zsh, Dash, BusyBox & the Process / Environment Model You Have to Internalise Before Anything Else

If you take only one thing away from this lesson, take this: the shell is a process, not a language, and most shell bugs are bugs of process and environment, not bugs of syntax.

Almost every confusing problem you’ll hit in a forty-year career — “why does my variable disappear when I run the script?”, “why does this work in my terminal but break in cron?”, “why is ~/.bashrc not being read?”, “why does sourcing the file change my prompt but executing it doesn’t?”, “why does the script work as my user but break under sudo?” — every one of these dissolves the moment you have a correct model of how the shell loads, what it inherits from its parent, and what it passes to its children.

This is why the very first lesson in this course is not “how to write a hello-world script.” It is “what is a shell, mechanically, in the operating system.” Get this right and the next forty lessons cost you a fraction of the effort. Get this wrong and you’ll spend the rest of your career writing shell that appears to work and quietly breaks in surprising places.

Read this lesson slowly. Type the examples. Skip nothing.


1. A shell is a process

When you open a terminal, your terminal emulator (iTerm2, Alacritty, GNOME Terminal, Windows Terminal) starts a child process, and that child process is your shell. On modern Linux and macOS, that’s typically /bin/bash, /bin/zsh, or /usr/bin/fish. In Alpine Linux containers it’s /bin/ash, which is BusyBox’s shell. On Debian and Ubuntu, /bin/sh is a symlink to dash. On a BusyBox-based embedded device, /bin/sh is a symlink to /bin/busybox.

You can see this for yourself. Open any terminal and type:

ps -p $$

You’ll get back something like:

  PID TTY           TIME CMD
36481 ttys001    0:00.05 -zsh

That $$ is a magic shell variable that contains the PID — the process ID — of the current shell process. The leading - in -zsh (the command column) is a convention that tells you this shell is a login shell (we’ll come back to that). The fact that there’s a ps entry at all is what matters here: the shell is a real, scheduled, kernel-tracked process, with a PID, a parent PID ($PPID), an environment, open file descriptors, a current working directory, a user/group identity, and a session.

Everything you do interactively or in a script lives inside that process — until it doesn’t. When the shell runs a command like ls, it does not execute the ls code inside itself. It does this:

  1. fork() — creates a copy of itself (a child process)
  2. exec() — replaces the child’s program with /bin/ls
  3. wait() — the parent waits for the child to finish
  4. The child exits with an integer exit code, which the parent reads as $?

This fork + exec + wait cycle is the entire mental model of a shell. Almost every shell concept in this course is some refinement of “what does the shell do before, during, or after that fork+exec, and what gets inherited or not?”
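A minimal sketch of that cycle, using sh -c as a stand-in for any external command. The variable names are made up for the demo.

```shell
# The shell running these lines has one PID...
parent_pid=$$

# ...and a command started via fork+exec has a different one.
# Inside the child, $$ expands to the child's own PID.
child_pid=$(sh -c 'echo $$')

# The child's exit code comes back to the parent as $?.
# The "|| status=$?" form captures it without tripping "set -e" if that is on.
status=0
sh -c 'exit 7' || status=$?

echo "parent=$parent_pid child=$child_pid exit_status=$status"
```

The two PIDs differ on every run, and exit_status is always 7: the only thing the child hands back to the parent is that integer.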

A few non-obvious consequences immediately follow:

Consequence 1. Variables you set in the parent shell are not automatically visible inside the child unless you explicitly export them into the environment. We’ll see this constantly:

my_var=hello
ls            # ls cannot see my_var
export my_var=hello
ls            # ls sees my_var in its environment

Consequence 2. Anything the child does — cd-ing, setting variables, opening file descriptors — is local to the child. When the child exits, those changes vanish. This is why people are confused that cd /tmp inside a script doesn’t change the directory of the calling shell. The script is a child process; when it exits, its cwd change exits with it.
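To see Consequence 2 in isolation, here is a small sketch. It uses / as the target directory only because it exists everywhere; the variable names are illustrative.

```shell
start_dir=$PWD

# A child shell changes ITS working directory, prints it, and exits.
child_dir=$(sh -c 'cd / && pwd')

# Back in the parent, nothing has moved.
end_dir=$PWD

echo "child was in:  $child_dir"
echo "parent before: $start_dir"
echo "parent after:  $end_dir"
```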

Consequence 3. Some commands, called builtins, are not run via fork+exec. They are implemented inside the shell process itself. cd, export, read, set, [, eval, exec, and . (which bash and zsh also spell source) are builtins; [[ is technically a shell keyword in bash, zsh, and ksh, but it too runs inside the shell. If they weren’t, none of them could affect the shell’s own state, because of consequence 2. cd /tmp inside a script can’t change the parent’s directory, but cd /tmp inside the parent shell does change its directory because cd is a builtin running inside that very process.

Consequence 4. Sourcing a file (source ./script.sh or . ./script.sh) is fundamentally different from executing it (./script.sh or bash script.sh). Sourcing reads the file into the current shell process; the file’s commands run as if you typed them at the prompt. Executing it forks a child shell, runs the file in the child, and returns. This single distinction explains 30% of all “why doesn’t this work?” questions about shell.

You should be able to predict the output of every line in this transcript before reading the explanation:

$ x=outer
$ echo "$x"          # prints: outer
$ bash -c 'echo "$x"' # prints empty line — child shell didn't inherit x
$ export x=outer
$ bash -c 'echo "$x"' # prints: outer — now exported, child sees it
$ bash -c 'x=changed' # child changes x, then child exits
$ echo "$x"          # prints: outer — parent's x untouched
$ source <(echo 'x=sourced')
$ echo "$x"          # prints: sourced — sourcing ran in *this* shell

This is the whole game. Walk this line-by-line in your own terminal until you can predict every output.


2. There are many shells, and they are not interchangeable

Newcomers often write “the shell” or “Bash” as if they were the same. They are not, and the differences matter operationally.

The shells you will actually encounter:

Shell | Origin | Where you’ll find it | Notes
bash | Bourne Again SHell, GNU, 1989 | The default on most Linux distros; no longer the default on macOS since 10.15 | Most feature-rich classical shell; the de facto target for “shell scripts”
zsh | Z Shell, 1990 | macOS default since 10.15 (Catalina); popular interactive choice | Largely bash-compatible but with significant divergences (parameter expansion, globbing, arrays)
dash | Debian Almquist SHell | /bin/sh on Debian and Ubuntu | POSIX-strict, deliberately minimal, much faster startup than bash
ash | Almquist SHell | Embedded systems, Alpine Linux, BusyBox | Close to plain POSIX sh; missing many features bash users assume
busybox sh | BusyBox project | Container base images, embedded Linux | BusyBox’s built-in ash variant; how minimal it is depends on build-time configuration
ksh | Korn shell, 1983 | AIX default, some BSDs | Historically influential; many bash features come from ksh
fish | Friendly Interactive SHell | Personal interactive use | NOT POSIX-compatible. Never use as /bin/sh.

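The two practical groupings you should keep in mind:

  1. POSIX-only shells: dash, ash, busybox sh. Assume nothing beyond what POSIX sh specifies.
  2. Extended shells: bash, zsh, ksh. They largely accept POSIX sh syntax and add arrays, [[ … ]], and much more, each in its own dialect.

The single most common bug from this distinction: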

#!/bin/sh
# A "shell script": author tested with bash, deployed to Alpine.
arr=(a b c)            # FAILS in dash/ash: arrays are not POSIX
echo "${arr[1]}"       # FAILS: same reason
[[ -f /etc/foo ]]      # NOT POSIX: dash rejects [[; use [ -f /etc/foo ]

When deployed to an Alpine container where /bin/sh is ash, this script breaks immediately. The fix is one of:

  1. Use the right shebang: #!/bin/bash — then /bin/bash must exist on the target. On stripped-down containers, it might not.
  2. Stay POSIX: use [ -f /etc/foo ] instead of [[, use space-separated lists in for-loops instead of arrays, etc.
  3. Stay POSIX and prove it: run shellcheck --shell=sh script.sh. shellcheck will flag the bash-isms it recognises.

We’ll cover this in detail in the Tier 4 portability lesson. For now, the point is: be deliberate about which shell your script targets, and write the shebang to match.
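For option 2, a POSIX-only rewrite of the failing script might look like the sketch below. The function name list_demo and the variables second_item and item_count are made up for the demo; the technique shown (positional parameters as the stand-in for an array) is one common approach, not the only one.

```shell
#!/bin/sh
# POSIX sh has no arrays. The positional parameters ("$@") are the one
# list-like structure it offers, and each function call gets its own set.
list_demo() {
    set -- a b c              # stands in for arr=(a b c)
    second_item=$2            # stands in for ${arr[1]}; positions are 1-based
    item_count=$#
    for item in "$@"; do
        printf 'item: %s\n' "$item"
    done
}

list_demo

# [ ... ] is the POSIX test command; [[ ... ]] is an extension.
if [ -f /etc/foo ]; then
    echo "/etc/foo exists"
else
    echo "/etc/foo is missing"
fi
```

Doing the set -- inside a function keeps the script’s own arguments intact, because the caller’s positional parameters are restored when the function returns.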


3. Login vs interactive vs non-interactive

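Bash and zsh distinguish two independent properties of a shell: whether it is a login shell, and whether it is interactive. Confusion about these modes is the second-largest source of “it works in my terminal but breaks in cron” bugs.

The modes:

  1. Login: the first shell of a session, started with a leading - in its argv[0] or with the -l / --login flag. ssh, console logins, and su - produce login shells.
  2. Interactive: the shell reads commands from a terminal and prints a prompt.
  3. Non-interactive: the shell runs a script or a -c string and never prompts.

These are orthogonal. A shell can be:

  1. Login and interactive: an ssh session or a console login.
  2. Non-login and interactive: a new terminal tab on most Linux desktops, or typing bash at a prompt.
  3. Non-login and non-interactive: a script, a cron job, bash -c '…'.
  4. Login and non-interactive: rare; bash -l -c '…' or a script started with --login.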

Why does this matter? Because what files the shell sources at startup depends on which mode it’s in. Specifically (for bash):

Mode | Files sourced (in order)
Login interactive | /etc/profile, then the first found of ~/.bash_profile, ~/.bash_login, ~/.profile; nothing else automatic
Non-login interactive | /etc/bash.bashrc (Debian/Ubuntu only), ~/.bashrc
Non-interactive | None of the above. Bash sources whatever $BASH_ENV points at, if anything.

Read that last row again: non-interactive shells do not source ~/.bashrc or ~/.bash_profile. This is why your aliases and functions defined in ~/.bashrc are invisible to scripts. This is why a cron job can’t find a CLI tool you installed via Homebrew — your interactive PATH extension lives in ~/.bashrc (or ~/.zshrc), and cron’s non-interactive shell never sources it.
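You can watch the non-interactive row at work with a throwaway file. This is a sketch that assumes bash is installed; DEMO_MARKER and the temporary file are hypothetical stand-ins for whatever your real rc file would set.

```shell
# Start from a known state in case the caller already set BASH_ENV.
unset BASH_ENV

# A throwaway file playing the role of an rc file.
rc_file=$(mktemp)
echo 'DEMO_MARKER=loaded' > "$rc_file"

# A plain non-interactive bash reads no startup file: the marker is unset.
without_env=$(bash -c 'echo "${DEMO_MARKER:-unset}"')

# Point BASH_ENV at the file and the same non-interactive bash sources it first.
with_env=$(BASH_ENV="$rc_file" bash -c 'echo "${DEMO_MARKER:-unset}"')

rm -f "$rc_file"
echo "without BASH_ENV: $without_env"
echo "with BASH_ENV:    $with_env"
```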

For zsh the table is different and even more fragmented:

Mode | Files sourced
Always (any zsh invocation) | /etc/zshenv, ~/.zshenv
Login only | /etc/zprofile, ~/.zprofile
Interactive only | /etc/zshrc, ~/.zshrc
Login (after zshrc) | /etc/zlogin, ~/.zlogin
Logout | ~/.zlogout, /etc/zlogout

~/.zshenv is the only zsh dotfile read by every ordinary zsh invocation, including non-interactive ones. This is where exported environment variables (PATH, EDITOR, etc.) belong if you want them visible to zsh scripts run from cron or systemd. It helps only when the job actually runs under zsh: cron itself runs commands with /bin/sh by default, which reads no zsh file at all. The mistake everyone makes: putting export PATH=… into ~/.zshrc or ~/.bashrc, then wondering why cron jobs can’t find kubectl.

A common pattern that side-steps this entirely: never rely on rc-file sourcing in production scripts. Set PATH explicitly at the top of the script, or use absolute paths to commands. This is part of the defensive-scripting playbook in Tier 3.
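A quick way to rehearse the “works in my terminal, breaks in cron” failure is to run a command with an emptied environment. This is only an approximation: cron’s real environment is not literally empty, but it is close enough to expose hidden dependencies. DEMO_TOOL_HOME is a made-up variable.

```shell
# Something your interactive session exports.
DEMO_TOOL_HOME=/opt/demo
export DEMO_TOOL_HOME

# A normal child inherits it.
seen_normally=$(/bin/sh -c 'echo "${DEMO_TOOL_HOME:-unset}"')

# env -i starts the child with an empty environment, much like a scheduler would.
seen_in_empty_env=$(env -i /bin/sh -c 'echo "${DEMO_TOOL_HOME:-unset}"')

# PATH in that child is whatever the shell falls back to, not your usual one.
path_in_empty_env=$(env -i /bin/sh -c 'echo "${PATH:-unset}"')

echo "normal child:     $seen_normally"
echo "empty-env child:  $seen_in_empty_env"
echo "empty-env PATH:   $path_in_empty_env"
```

Running a script as env -i /bin/sh ./script.sh before scheduling it is a cheap test that it sets up everything it needs by itself.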

You can confirm what mode any shell is in:

# Inside a bash:
[[ $- == *i* ]] && echo "interactive" || echo "non-interactive"
shopt -q login_shell && echo "login shell" || echo "not login shell"
# Inside a zsh:
[[ -o interactive ]] && echo "interactive" || echo "non-interactive"
[[ -o login ]] && echo "login shell" || echo "not login shell"

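If you cannot remember which dotfile to put what into, this is the rule of thumb that survives all of the tables above:

  1. Environment variables (PATH, EDITOR, and so on): ~/.zshenv for zsh; ~/.bash_profile or ~/.profile for bash.
  2. Interactive conveniences (aliases, functions, prompt settings): ~/.zshrc or ~/.bashrc.
  3. Anything a script depends on: set it inside the script itself and never rely on a dotfile being read.

If you stick to those three buckets, “but it works in my terminal” bugs become rare.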


4. The shebang line

The first two characters of an executable file, #!, are interpreted by the kernel (specifically by execve(2)) as a directive: “to run this file, exec the program named after these characters, passing the file’s path as an argument.” Everything you’ve ever seen at the top of a script — #!/bin/bash, #!/usr/bin/env python3, #!/usr/bin/perl -w — is consumed by the kernel, not by the shell.

This has practical consequences:

The shebang line determines which interpreter runs the script — not the file extension, not the calling shell. A file named foo.sh with #!/usr/bin/env python3 at the top, when executed (./foo.sh), runs as Python. A file named foo.py with #!/bin/bash runs as bash.
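A small sketch that makes the point concrete: a file with a .py name but a /bin/sh shebang. The file name report.py is hypothetical, and the demo assumes your temporary directory allows executing files.

```shell
demo_dir=$(mktemp -d)

# The name says Python; the shebang says sh. The shebang wins.
cat > "$demo_dir/report.py" <<'EOF'
#!/bin/sh
echo "interpreted by sh"
EOF
chmod +x "$demo_dir/report.py"

# Executing the file lets the kernel read the #! line and pick the interpreter.
output=$("$demo_dir/report.py")

rm -rf "$demo_dir"
echo "$output"
```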

#!/bin/sh is not the same as #!/bin/bash. On Debian/Ubuntu, /bin/sh is dash. On Alpine, it’s ash. On RHEL, it’s bash but invoked in POSIX mode. If you write #!/bin/sh and use bash-isms, you’re playing roulette.

#!/usr/bin/env bash vs #!/bin/bash. With the former, the kernel runs /usr/bin/env, which then finds bash via $PATH; the latter is a hardcoded absolute path. Use env if you expect users to have non-standard installs (Homebrew on macOS puts bash 5+ at /opt/homebrew/bin/bash, while /bin/bash on macOS is still bash 3.2 from 2007). Use the absolute path if you want to be explicit and you’re on a controlled fleet.

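Scripts without a shebang are handled by whoever calls them, and the result varies. When the kernel refuses the file (ENOEXEC), bash re-reads it in a copy of itself, zsh hands it to /bin/sh, and a program that calls execve() directly simply gets the error. Separately, naming the interpreter yourself (bash my_script) always runs the file under that interpreter, regardless of any shebang. The no-shebang fallback is fragile and you should never rely on it.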

A common operational mistake worth highlighting: people write #!/bin/bash -e and expect set -e semantics. This works only when the script is invoked directly (./script); if invoked as bash script the -e is ignored. The portable fix is to put set -euo pipefail inside the script, on its own line. We’ll cover this in detail in Tier 3.
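The difference is easy to reproduce. In this sketch (assuming bash is installed; v1.sh and v2.sh are throwaway names), both scripts are started as bash script, so any flags on the shebang line are just part of a comment.

```shell
demo_dir=$(mktemp -d)

# Version 1 relies on the shebang for -e.
printf '%s\n' '#!/bin/bash -e' 'false' 'echo "kept going"' > "$demo_dir/v1.sh"

# Version 2 sets the option in the script body.
printf '%s\n' '#!/bin/bash' 'set -e' 'false' 'echo "kept going"' > "$demo_dir/v2.sh"

# Invoked as "bash FILE": v1 never gets -e and runs past the failing command.
v1_output=$(bash "$demo_dir/v1.sh")

# v2 carries its own set -e, so it stops at "false" with a non-zero status.
v2_status=0
v2_output=$(bash "$demo_dir/v2.sh") || v2_status=$?

rm -rf "$demo_dir"
echo "v1 output: '$v1_output'"
echo "v2 output: '$v2_output' (exit status $v2_status)"
```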

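A subtler issue: Linux does not split the shebang line into multiple arguments. Everything after the interpreter path is passed as one single argument (other kernels have differed on this over the years). So #!/usr/bin/env python3 -u does not pass -u to Python; env receives the literal string python3 -u as the program name and fails. For a shell script, set the options with set inside the script body instead. Where your env supports the -S flag (recent GNU coreutils and the BSD env), #!/usr/bin/env -S python3 -u asks env itself to split the string.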


5. The environment vs the shell’s variable space

A shell process has two spaces of named values:

  1. Shell variables: visible only inside this shell process. Set by name=value (no space around =).
  2. Environment variables: visible to this shell and to every child process forked from it. Promoted from shell variables by export name, or set+exported in one line by export name=value.

This is the same distinction as “local in the shell” vs “in the environment passed to fork+exec.” The kernel call execve(path, argv, envp) takes an envp (environment pointer) which is exactly the set of exported variables. Anything not exported lives only in the parent shell.

You can inspect either set:

# All shell variables (including unexported):
set | head -20
# Only environment variables (exported):
env | head -20
# Or:
printenv | head -20

You can also inspect another process’s environment:

tr '\0' '\n' < "/proc/$pid/environ"   # Linux only; needs permission; set pid to the target process ID first

This is occasionally useful for debugging “what does cron actually pass to my script?” — find the script’s PID while it’s running, then read its /proc/PID/environ.
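One caveat worth seeing for yourself: /proc/PID/environ shows what a process was handed when it was exec’d, not what it has exported since. The sketch below uses a made-up variable, LATE_VAR, and skips the /proc check on systems without /proc (macOS, for example). What the /proc check reports is the usual behaviour on Linux, not something this sketch guarantees.

```shell
# Exported after this shell was already running.
LATE_VAR=added_later
export LATE_VAR

# A child started now receives it through execve's envp.
seen_by_child=$(sh -c 'echo "${LATE_VAR:-unset}"')

# This shell's own /proc entry is a snapshot from its exec time, so a later
# export normally does not appear there.
in_own_environ="not checked (no /proc)"
if [ -r "/proc/$$/environ" ]; then
    if tr '\0' '\n' < "/proc/$$/environ" | grep -q '^LATE_VAR='; then
        in_own_environ=yes
    else
        in_own_environ=no
    fi
fi

echo "child sees LATE_VAR:         $seen_by_child"
echo "in this shell's /proc entry: $in_own_environ"
```

So when debugging a cron job, read the environ of the script’s own process, which reflects what cron passed in, rather than that of a long-lived parent.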

The unset builtin removes a variable from both the shell and the environment. In bash, export -n name un-exports without unsetting (rare, but useful); it is not part of POSIX sh.

A surprising rule that catches people: when you export a variable, you’re exporting the name, not the value. The value at the moment of export is irrelevant; subsequent assignments propagate automatically.

foo=hello
export foo
foo=world
bash -c 'echo "$foo"'   # prints: world — the export marked foo as exported, then later writes to foo update the env automatically

This works because the shell internally maintains a flag per variable, “is this exported?” When the shell forks for bash -c …, it serialises every exported variable’s current value into the child’s env block.


6. The exec builtin: replacing a process with another

The exec builtin does the kernel exec() system call directly on the current shell. It replaces the shell process with the named program — no fork, no return.

exec /usr/bin/htop      # this shell becomes htop; when htop exits, the terminal disconnects
exec </dev/null         # redirect stdin of THIS shell to /dev/null
exec 2>>/var/log/x.log  # redirect stderr of THIS shell to a log file (used a lot in scripts)

The first form (exec PROGRAM) is rare in scripts but common in wrapper-style setups: a wrapper script does some setup (env vars, ulimits, file descriptors) and then execs the real binary, so the wrapper does not stay around as a parent process consuming resources. This pattern is everywhere in container entrypoint scripts:

#!/bin/sh
# entrypoint.sh
set -e
# do some setup
chown -R appuser /var/data
# replace this shell with the real app — pid 1 in the container becomes the app
exec gosu appuser /usr/local/bin/myapp "$@"

If you don’t exec and just write gosu appuser /usr/local/bin/myapp "$@", then the shell stays around as PID 1 and the app runs as its child. Signals from the container runtime go to the shell instead of the app, the shell does not pass them on, and docker stop becomes a 10-second wait (the default grace period) followed by SIGKILL. The exec is what makes this work cleanly.
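You can verify the “no fork, no return” behaviour without a container. In this sketch an outer sh prints its PID, then execs an inner sh that prints its own; the variable names are illustrative.

```shell
# The outer sh prints its PID, then replaces itself with a second sh.
# The trailing echo belongs to the outer sh, which no longer exists by then.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"; echo "never reached"')

first_pid=$(echo "$pids" | sed -n 1p)
second_pid=$(echo "$pids" | sed -n 2p)
line_count=$(echo "$pids" | grep -c '')

echo "before exec: $first_pid"
echo "after exec:  $second_pid"
echo "lines printed: $line_count"
```

Both PIDs are the same, and only two lines come out: the process was replaced in place, and nothing after the exec ran.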

The redirection-only forms exec N>file, exec N<file, and exec N>>file (where N is a file descriptor number; they redirect that descriptor on the current shell) are crucial in script logging, locking, and error-handling. We’ll see these constantly in the I/O-redirection lesson.
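As a small taste of the redirection form, here is a sketch of the logging pattern. The descriptor number 7 and the temporary log file are arbitrary choices for the demo.

```shell
log_file=$(mktemp)

# Open fd 7 on THIS shell, appending to the log. It stays open until closed.
exec 7>>"$log_file"

echo "step one done" >&7
echo "step two done" >&7

# Close fd 7 again.
exec 7>&-

logged_lines=$(grep -c '' "$log_file")
cat "$log_file"
rm -f "$log_file"
```

Opening the file once and writing to the descriptor avoids reopening the log for every line, and every command the script starts while the descriptor is open inherits it too.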


7. Source vs run: the distinction that explains 30% of all bugs

There are exactly four ways to run a shell script, and they have different semantics:

./myscript        # Run as a separate process, with the shebang choosing the interpreter
bash myscript     # Run in a child bash, ignoring any shebang
source myscript   # Read INTO the current shell — no fork
. myscript        # Same as source — POSIX form

The first two fork a child process. The third and fourth do not. They evaluate the file’s contents as if you typed them at the current prompt.

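Operational consequences:

  1. A sourced file can change your current shell: its variables, functions, aliases, working directory, and shell options all persist after it finishes.
  2. An exit inside a sourced file exits the shell that sourced it. Typed at a prompt, that closes your terminal session.
  3. A sourced file ignores its own shebang. It is parsed by whatever shell you are in, so a bash-only file sourced from dash fails.
  4. An executed script cannot change its caller. The only things that come back are its output and its exit code.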

A very common pattern that depends on this:

# project_env.sh — meant to be sourced, not executed.
export PROJECT_ROOT=/opt/myapp
export PATH="$PROJECT_ROOT/bin:$PATH"
alias logs='journalctl -u myapp'

You source this file at the top of your interactive session (source ~/project_env.sh) and your shell now has the project env. If you executed it with ./project_env.sh, the exports would happen in the child, the child would exit, and your prompt would be unchanged.
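The same contrast as a runnable sketch. DEMO_ROOT and the temporary file are stand-ins for the project_env.sh example above.

```shell
env_file=$(mktemp)
echo 'DEMO_ROOT=/opt/myapp' > "$env_file"
unset DEMO_ROOT

# Executed: the assignment happens in a child shell and dies with it.
sh "$env_file"
after_execute=${DEMO_ROOT:-unset}

# Sourced: the assignment happens in this shell and stays.
. "$env_file"
after_source=${DEMO_ROOT:-unset}

rm -f "$env_file"
echo "after executing: $after_execute"
echo "after sourcing:  $after_source"
```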

A useful convention: name files meant to be sourced without an extension, or with .env, and never make them executable. Files meant to be executed should have a #! shebang and be marked +x. This makes the intended usage visible at a glance.


8. The magic variables

Every shell exposes a small set of single-character or short variables that contain runtime metadata. Most are POSIX; a few, marked below, exist only in bash or in the bash/zsh/ksh family. Memorise these:

Variable | Meaning
$0 | Name of the script (or shell, in interactive use). Useful for usage() printing.
$1, $2, …, $9, ${10} | Positional arguments. Note the brace requirement past 9.
$# | Number of positional arguments.
$@ | All positional arguments, as a list, when quoted ("$@").
$* | All positional arguments, concatenated into one string with the first character of $IFS between them, when quoted. Almost always wrong. Use "$@".
$$ | PID of this shell.
$! | PID of the most recently backgrounded job.
$? | Exit code of the most recently completed foreground command.
$- | Current option flags (himBHs-style). Test with [[ $- == *i* ]].
$_ | Last argument of the previous command (interactive use mostly).
$PPID | PID of this shell’s parent.
$RANDOM | A random integer 0–32767 (not cryptographic; not POSIX, available in bash, zsh, ksh).
$SECONDS | Seconds since this shell started (not POSIX; bash, zsh, ksh).
$LINENO | Current line number; useful in error traps.
$BASH_SOURCE | Array of source-file names (bash-specific; ${BASH_SOURCE[0]} is the path of the file currently being executed or sourced).
$FUNCNAME | Array of currently-executing function names (bash-specific).
$IFS | Input Field Separator, the most dangerous variable in shell. We’ll cover it next lesson.
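The "$@" versus "$*" row is the one that bites most often, so here is a sketch that counts the words each form produces. count_args and compare_expansions are hypothetical helper names, and the demo assumes IFS still has its default value.

```shell
# Prints how many arguments it received.
count_args() { echo "$#"; }

compare_expansions() {
    # Two arguments, the first of which contains a space.
    set -- "one two" three

    quoted_at=$(count_args "$@")     # each argument preserved as one word
    quoted_star=$(count_args "$*")   # everything joined into a single word
    # Deliberately unquoted to show the damage: re-split on whitespace.
    unquoted_at=$(count_args $@)
}

compare_expansions
echo "\"\$@\" gives $quoted_at words, \"\$*\" gives $quoted_star, unquoted \$@ gives $unquoted_at"
```

Only the quoted "$@" form hands the callee the same two arguments the caller received.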

A pattern you’ll see in every well-written script:

#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"

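This yields the absolute path of the script’s directory regardless of how it was invoked (./foo, /abs/path/foo, or cd /abs/path && ./foo). It does not resolve symlinks: if the script is run through a symlink, you get the directory containing the link. It’s how you safely reference sibling files (e.g. source "$SCRIPT_DIR/lib.sh") without depending on the user’s cwd.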


9. The PATH and command lookup

When you type ls at the prompt, the shell does a search to find out what to run and, for an external command, which file to fork+exec. The search rules, in order:

  1. Aliases (alias ls='ls --color=auto'), expanded while the line is being parsed
  2. Reserved words (if, for, …), handled by the parser, not by lookup
  3. Functions defined in the current shell
  4. Builtins (cd, read, export, …)
  5. Hashed commands: bash caches the resolved path of recently-used commands in a table; clear with hash -r
  6. $PATH search: for each :-separated directory in $PATH, in order, look for an executable file named ls

The type builtin tells you which of these will be used:

$ type cd
cd is a shell builtin
$ type ls
ls is aliased to `ls --color=auto'
$ type ls    # in a sub-shell where the alias isn't set
ls is /bin/ls
$ type -a ls
ls is aliased to `ls --color=auto'
ls is /bin/ls
$ type -P ls   # only the path, even if it's an alias/function
/bin/ls
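Functions sit ahead of external files in that order, which the following sketch demonstrates by shadowing uname. The function body is made up; command is the POSIX way to skip function lookup for one call.

```shell
# A function with the same name as an external command wins the lookup.
uname() { echo "shadowed by a function"; }
via_function=$(uname)

# "command" bypasses functions and goes on to builtins and $PATH.
via_command=$(command uname)

# Remove the function and the plain name reaches the binary again.
unset -f uname
after_unset=$(uname)

echo "plain call while shadowed: $via_function"
echo "via command:               $via_command"
echo "after unset -f:            $after_unset"
```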

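Why does this matter? Because:

  1. An alias or function can silently shadow the binary you think you are running. Aliases are normally not expanded in scripts, so the same command name can behave differently at your prompt and in a script.
  2. The hash table can go stale: if a newer copy of a command is installed earlier in $PATH, bash may keep using the remembered path until you run hash -r.
  3. The order of directories in $PATH decides which of several same-named executables wins.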

Never put . (the current directory) at the start of your $PATH. It’s a classic security mistake — if an attacker can drop a file named ls in a directory you cd into, they can hijack any command you type. Even putting . at the end is risky in shared systems.

A defensive script pattern: at the top of any production script, do

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
export PATH

This pins PATH to a known-safe set regardless of what the calling environment provided. Cron and systemd timer scripts in particular benefit from this — the PATH they inherit is minimal and surprises people.
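To see why the order of $PATH entries matters, the sketch below plants a fake uname in a throwaway directory and compares a polluted PATH with the pinned one. Everything here is hypothetical demo scaffolding, and it assumes your temporary directory allows executing files.

```shell
# A directory an attacker (or a careless installer) controls.
fake_bin=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "hijacked"' > "$fake_bin/uname"
chmod +x "$fake_bin/uname"

# Command substitution runs in a subshell, so these PATH changes stay contained.
with_bad_path=$( PATH="$fake_bin:$PATH"; uname )
with_pinned_path=$( PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"; uname )

rm -rf "$fake_bin"
echo "polluted PATH ran: $with_bad_path"
echo "pinned PATH ran:   $with_pinned_path"
```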


10. Shell options: set and shopt

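Bash has two parallel option mechanisms:

  1. set -o name (or a short flag such as set -e): the POSIX-style options shared with other shells, for example errexit, nounset, and xtrace, plus bash additions such as pipefail.
  2. shopt -s name: bash-only options such as nullglob, globstar, and extglob. zsh has its own equivalent, setopt.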

You’ll see the canonical “strict mode” header in every well-written bash script:

#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'

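Reading right to left:

  1. IFS=$'\n\t': word splitting happens only on newlines and tabs, not on spaces, so values containing spaces stop being split apart.
  2. -o pipefail: a pipeline fails if any command in it fails, not only the last one.
  3. -u: expanding an unset variable is an error instead of an empty string.
  4. -e: the script exits when a command fails, with a list of exceptions covered in Tier 3.
  5. -E: an ERR trap is inherited by functions, command substitutions, and subshells.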

We will spend an entire lesson on this header and why every flag matters in Tier 3 (“Defensive Scripting”). For now, when you see set -Eeuo pipefail, recognise it as the production-grade preamble.
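Until then, a sketch that shows three of those flags changing observable behaviour. It assumes bash is installed; each case runs in its own bash -c so the options cannot leak into your current shell.

```shell
# Without pipefail, a pipeline's status is that of its LAST command only.
plain_status=0
bash -c 'false | true' || plain_status=$?

# With pipefail, the failing "false" is no longer hidden.
pipefail_status=0
bash -c 'set -o pipefail; false | true' || pipefail_status=$?

# With -u, expanding an unset variable aborts the shell with a non-zero status.
nounset_status=0
bash -c 'set -u; echo "value: $never_defined"' 2>/dev/null || nounset_status=$?

# With -e, the script stops at the first failing command.
errexit_output=$(bash -c 'set -e; false; echo "reached"') || true

echo "false | true, default:  $plain_status"
echo "false | true, pipefail: $pipefail_status"
echo "unset variable, -u:     $nounset_status"
echo "output after false, -e: '$errexit_output'"
```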


11. The complete file-load timeline for a typical session

Let’s walk through what actually happens when you SSH into a Linux server with a default bash setup. Step by step:

  1. sshd accepts your connection, authenticates you, and forks a child for your session.
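  2. The child execs your login shell, as listed in /etc/passwd, with a leading - in argv[0]; that dash is what marks it as a login shell. (For ssh host 'cmd', sshd instead runs the shell with -c 'cmd', which is a non-login, non-interactive shell.)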
  3. bash starts as a login interactive shell.
  4. bash reads /etc/profile. This typically sources files in /etc/profile.d/*.sh (system-wide environment additions).
  5. bash looks for ~/.bash_profile. If it exists, sources it. Otherwise looks for ~/.bash_login, then ~/.profile.
  6. By convention, ~/.bash_profile ends with [ -f ~/.bashrc ] && . ~/.bashrc, so that login shells also get the interactive niceties defined there.
  7. Your prompt appears.
  8. You type bash to enter a sub-shell. This is non-login interactive.
  9. The sub-bash reads /etc/bash.bashrc (Debian/Ubuntu only; doesn’t exist on RHEL by default), then ~/.bashrc. Your aliases and functions are reloaded.
  10. You type a script invocation: ./myscript.sh (with #!/bin/bash).
  11. The sub-bash forks, the child execs bash to run myscript.sh. This is non-login non-interactive.
  12. The child sources nothing automatic — ~/.bashrc is not read. If you’ve defined an alias or function in ~/.bashrc and you used it in the script, the script will fail with “command not found.”

If you can trace this flow without notes, you have the foundation right. Almost every “works in terminal but fails elsewhere” question is somewhere on this path.
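If you want to check a step rather than trust the trace, you can ask bash which mode it thinks it is in. This sketch assumes bash is installed; describe_mode is a made-up variable holding the probe, and --noprofile is passed only so that your own dotfiles cannot add output to the demo.

```shell
# A small probe, kept in single quotes so that the CHILD bash expands $-.
describe_mode='
case $- in *i*) mode=interactive ;; *) mode=non-interactive ;; esac
if shopt -q login_shell; then login=login; else login=non-login; fi
echo "$mode $login"
'

# What a script sees (step 11 above): neither login nor interactive.
script_like=$(bash -c "$describe_mode")

# The same probe in a shell started as a login shell.
login_like=$(bash --noprofile -l -c "$describe_mode")

echo "bash -c:    $script_like"
echo "bash -l -c: $login_like"
```

Dropping the same probe temporarily at the top of a misbehaving cron script or ssh command tells you which row of the startup-file table applies.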


12. A worked example: writing your first hardened script

Now, putting all of the above together, here’s a small script that demonstrates the patterns you should be using from day one. We’ll dissect each line.

#!/usr/bin/env bash
# myscript.sh — example showing defensive shell skeleton.

set -Eeuo pipefail
IFS=$'\n\t'

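SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
# Marked read-only separately: readonly VAR="$(cmd)" would hide a failure of cmd from set -e.
readonly SCRIPT_DIR SCRIPT_NAME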

readonly PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
export PATH

usage() {
    cat <<USAGE
$SCRIPT_NAME — short description
Usage: $SCRIPT_NAME [-h] [-v] <input-file>
USAGE
}

main() {
    local input="${1:-}"
    [[ -z "$input" ]] && { usage >&2; exit 64; }   # usage shown because of an error goes to stderr
    [[ -r "$input" ]] || { echo "ERROR: cannot read $input" >&2; exit 66; }

    echo "Running on $(hostname) as $(id -un); my PID is $$; parent is $PPID"
    echo "Script lives at $SCRIPT_DIR/$SCRIPT_NAME"
    echo "First line of input: $(head -n1 "$input")"
}

main "$@"

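Walk through:

  1. #!/usr/bin/env bash: the script targets bash explicitly and finds it through $PATH (section 4).
  2. set -Eeuo pipefail and IFS=$'\n\t': the strict-mode header from section 10.
  3. SCRIPT_DIR and SCRIPT_NAME: computed from BASH_SOURCE and made read-only, so sibling files can be referenced without depending on the caller’s cwd (section 8).
  4. PATH: pinned to a known set of directories and exported, so every child process resolves commands the same way under cron, systemd, or a terminal (section 9).
  5. usage(): a heredoc that prints the script’s name and synopsis.
  6. main(): takes the first argument with a ${1:-} default so that set -u does not abort on a missing argument, exits 64 (EX_USAGE in sysexits.h) when no file is given and 66 (EX_NOINPUT) when it cannot be read, and sends the error message to stderr.
  7. main "$@": passes the script’s arguments through unchanged, one word per original argument.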

Save this, chmod +x myscript.sh, run ./myscript.sh /etc/hostname, and you have a script that follows every important convention. Every subsequent lesson in the foundation tier will refine some piece of this template.


13. What to do next

This lesson was deliberately theory-heavy. The next five Tier 1 lessons are mechanically focused.

By the end of Tier 1 you’ll be able to read and write any production shell script and understand exactly what it’s doing. Tier 2 then takes you into the I/O, pipeline, signal, and text-processing power tools that turn shell from “calculator with side effects” into “production glue language.”

Read this lesson again when you finish Tier 1. Almost every concept here will have re-surfaced at least once, and the second read will lock the model in for life.

Three diagnostic questions to test that you’ve internalised this lesson:

  1. You write a function my_cd() { cd "$1"; } in ~/.bashrc. You source ~/.bashrc. Then you run bash -c 'my_cd /tmp; pwd'. What does it print, and why?
  2. You set export FOO=1 in a parent shell. The parent runs a script that does unset FOO; bash -c 'echo "${FOO:-empty}"'. What prints? Why?
  3. A cron entry 0 * * * * my-script.sh runs every hour. The script uses kubectl, which is at /opt/homebrew/bin/kubectl. The script fails with “kubectl: command not found.” Why, and what’s the right fix?

If you can answer all three confidently, you’re ready for Lesson 2. If not, re-read the relevant section. The investment compounds.
