Shell Lesson 17 of 42

Network Operations: curl/wget Mastery, /dev/tcp Sockets, Retry-with-Backoff & Idempotent HTTP — When Your Script Talks to Other Machines

If your script is more than a sysadmin one-liner, it almost certainly hits the network. Pulling artefacts, calling APIs, posting webhooks, fetching secrets, polling health endpoints: networking is everywhere.

The problem: networks are unreliable. Connections drop. DNS times out. Servers return 502 mid-deploy. Cloud APIs rate-limit. A script that calls the network without proper handling is a script that fails 1% of the time, mysteriously, and gives no useful error.

By the end of this lesson:


1. The production curl invocation

A bare curl https://example.com works for a one-off request. For scripts, the canonical incantation is:

curl --fail --silent --show-error --location \
     --connect-timeout 10 --max-time 60 \
     "$URL"

Or in short form:

curl -fsSL --connect-timeout 10 --max-time 60 "$URL"

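Each flag earns its keep:

-f, --fail: exit non-zero (code 22) on HTTP 4xx/5xx instead of printing the error page and exiting 0.
-s, --silent: suppress the progress meter.
-S, --show-error: with -s, still print error messages to stderr.
-L, --location: follow redirects.
--connect-timeout 10: give up if the connection is not established within 10 seconds.
--max-time 60: cap the whole transfer at 60 seconds.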

Memorise -fsSL. It goes on every curl in production.

Capturing both output and exit code

if ! out=$(curl -fsSL "$URL" 2>&1); then
  error "fetch failed: $out"
  exit 1
fi
echo "got: $out"

-fsSL ensures $? is non-zero on HTTP errors and the actual error message is on stderr (which we capture too with 2>&1).
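
Because -f turns HTTP errors into exit code 22 and network failures into other codes, the exit status can be mapped to a readable reason for your logs. A minimal sketch, assuming the exit codes listed in curl's man page; curl_exit_reason and fetch are hypothetical helper names, not part of curl:

```shell
#!/usr/bin/env bash
# Translate a curl exit status into a short, log-friendly reason.
# The codes come from the EXIT CODES section of curl's man page.
curl_exit_reason() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "could not resolve host (DNS)" ;;
    7)  echo "failed to connect" ;;
    22) echo "HTTP error status (reported because of --fail)" ;;
    28) echo "timed out" ;;
    35) echo "TLS handshake failed" ;;
    60) echo "server certificate could not be verified" ;;
    *)  echo "curl exit code $1" ;;
  esac
}

# Hypothetical wrapper: print the body on success, explain the failure on stderr.
fetch() {
  local url=$1 out rc
  out=$(curl -fsSL --connect-timeout 10 --max-time 60 "$url" 2>&1) && rc=0 || rc=$?
  if (( rc != 0 )); then
    printf 'fetch failed: %s: %s\n' "$(curl_exit_reason "$rc")" "$out" >&2
    return "$rc"
  fi
  printf '%s\n' "$out"
}
```

Returning curl's own exit code from fetch lets the caller decide, for example, to retry on 28 but give up on 22.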

curl -w for response metadata

curl -fsS -o output.json -w '%{http_code} %{time_total}\n' "$URL"
# 200 0.345

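-w (write-out) prints metadata after the transfer. Useful values:

%{http_code}: HTTP status of the final response (000 if none was received).
%{time_connect}: seconds until the TCP connection was established.
%{time_starttransfer}: seconds until the first byte arrived (TTFB).
%{time_total}: total seconds for the transfer.
%{size_download}: bytes downloaded.
%{url_effective}: the final URL after redirects.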

For a JSON-formatted line (great for structured logs):

curl -fsS -o /dev/null -w '{"code":%{http_code},"ttfb":%{time_starttransfer},"total":%{time_total}}\n' "$URL"

Capturing both body and status code

HTTP_CODE=$(curl -sS -o response.body -w '%{http_code}' "$URL")
case "$HTTP_CODE" in
  2*) info "ok ($HTTP_CODE)" ;;
  4*) error "client error ($HTTP_CODE)"; cat response.body >&2; exit 1 ;;
  5*) error "server error ($HTTP_CODE)" ;;
  *)  error "unexpected ($HTTP_CODE)" ;;
esac

Note: with -w '%{http_code}', we drop -f because we want the response body even on 4xx/5xx.
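
If you would rather not write the body to a temp file, one option is to append the status code on its own final line and split it off afterwards. A sketch under that assumption; split_response, RESP_BODY and RESP_CODE are illustrative names:

```shell
#!/usr/bin/env bash
# Split "<body>\n<status>" (as produced by: curl -sS -w '\n%{http_code}' URL)
# into two global variables. RESP_BODY and RESP_CODE are illustrative names.
split_response() {
  local raw=$1
  RESP_CODE=${raw##*$'\n'}     # text after the last newline
  RESP_BODY=${raw%$'\n'*}      # everything before the last newline
}

# Offline demonstration with a canned response:
split_response $'{"error":"not found"}\n404'
echo "$RESP_CODE"              # prints 404
echo "$RESP_BODY"              # prints {"error":"not found"}

# Real use (needs network, so shown commented out):
# raw=$(curl -sS -w '\n%{http_code}' "$URL")
# split_response "$raw"
```

The split relies on the status being the last line, which holds because -w output is written after the body.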

POST with JSON

curl -fsSL -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  --data '{"name":"alice","age":30}' \
  "https://api.example.com/users"

Or with data from a file:

curl -fsSL -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  --data @payload.json \
  "https://api.example.com/users"

Or via heredoc:

curl -fsSL -X POST \
  -H "Content-Type: application/json" \
  --data @- \
  "https://api.example.com/users" <<EOF
{"name":"alice","age":30}
EOF

--data-raw vs --data vs --data-urlencode

# Common gotcha: --data with content starting with @
curl --data '@hello' ...           # tries to read file "hello"
curl --data-raw '@hello' ...       # sends literal "@hello"

# URL-encode form values
curl --data-urlencode 'q=hello world' --data-urlencode 'lang=en' "$URL"
# sends: q=hello%20world&lang=en
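
To see what --data-urlencode does to a value, the encoding can be reproduced in pure bash. This is an illustrative sketch for ASCII input only (urlencode is a hypothetical helper); in real scripts, let curl do the encoding:

```shell
#!/usr/bin/env bash
# Percent-encode every character except the unreserved set A-Z a-z 0-9 - . _ ~
# Assumption: input is plain ASCII; multi-byte characters are not handled here.
urlencode() {
  local s=$1 out="" c i
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:i:1}
    case $c in
      [a-zA-Z0-9.~_-]) out+=$c ;;
      *) printf -v c '%%%02X' "'$c"   # "'x" makes printf use the character code of x
         out+=$c ;;
    esac
  done
  printf '%s\n' "$out"
}

urlencode 'hello world'    # prints hello%20world
urlencode 'a&b=c'          # prints a%26b%3Dc
```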

Multipart upload (file uploads)

curl -fsSL -X POST \
  -F "name=Alice" \
  -F "avatar=@./avatar.jpg" \
  -F "metadata=@./meta.json;type=application/json" \
  "https://api.example.com/upload"

Each -F is one form field. @ prefix means “file from disk.” ;type=... sets the MIME type.

Headers in bulk

curl -fsSL \
  -H "Authorization: Bearer $TOKEN" \
  -H "User-Agent: kloudvin-deploy/1.0" \
  -H "X-Request-ID: $REQUEST_ID" \
  "$URL"

To remove a default header:

curl -H "User-Agent:" "$URL"     # empty value removes it

Auth — basic, bearer, OAuth

# Basic
curl -u user:pass "$URL"
curl --user user:pass "$URL"

# Bearer (most modern APIs)
curl -H "Authorization: Bearer $TOKEN" "$URL"

# Read auth from netrc
curl -n "$URL"            # uses ~/.netrc

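Avoid writing passwords literally after -u: they end up in shell history and in the script itself. Read the secret from a file or an environment variable:

# API_USER is an illustrative variable name ($USER would be your login name)
curl --user "$API_USER:$(< /run/secrets/api-pass)" "$URL"

This still places the password in curl’s argument list, which other users may be able to see through ps. To keep it out of the process list as well, prefer ~/.netrc (-n) or a config file passed with -K.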

Unix sockets

For docker/containerd/local services:

curl --unix-socket /var/run/docker.sock http://localhost/v1.41/containers/json

Lets you talk to docker.sock directly without the docker CLI.

Stream large responses

# Save to file as it streams
curl -fsSL -o big.tar.gz "$URL"

# Pipe through another command line-by-line
curl -fsSLN "$URL" | jq -r '.events[]'

-N is “no buffering,” useful for streaming server-sent events or chunked responses.


2. wget — when to use it instead

wget is curl’s older cousin. It’s slightly better for file downloads:

wget -q https://example.com/big.tar.gz                # quiet
wget -c https://example.com/big.tar.gz                # continue interrupted download
wget -O renamed.tar.gz https://example.com/big.tar.gz # rename output
wget --tries=5 --timeout=30 https://example.com/file  # built-in retry
wget --no-check-certificate https://...               # skip TLS check (don't, except in tests)

-c (continue) is the killer feature: if the download is interrupted, retry with -c and it resumes from where it stopped. curl -C - does the same but is less polished.

For mirroring whole sites:

wget --recursive --level=2 --no-clobber --convert-links https://docs.example.com

Most scripts pick curl because it’s more flexible for API calls. wget is often pre-installed and excellent for “fetch this big file” tasks.


3. Bash’s /dev/tcp — networking without curl

Bash has a built-in TCP client! You can open /dev/tcp/HOST/PORT like a file:

# Test if port 80 is open on example.com
if (echo > /dev/tcp/example.com/80) 2>/dev/null; then
  echo "open"
else
  echo "closed"
fi

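This doesn’t require curl, netcat, or anything else, just bash. That makes it useful in minimal images that ship bash but no network tools. Note that Alpine’s default shell is BusyBox ash, which has no /dev/tcp, and distroless images have no shell at all, so you need bash installed for this to work.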

TCP port scanner

for port in 22 80 443 3306 5432 6379 8080; do
  if (echo > /dev/tcp/example.com/$port) 2>/dev/null; then
    echo "$port open"
  fi
done

Wait for a port to be open (with timeout)

wait_for_port() {
  local host=$1 port=$2 timeout=${3:-30}
  local elapsed=0
  while ! (echo > "/dev/tcp/$host/$port") 2>/dev/null; do
    (( elapsed >= timeout )) && return 1
    sleep 1
    # Not ((elapsed++)): that returns status 1 when elapsed is 0, which trips set -e
    elapsed=$((elapsed + 1))
  done
  return 0
}

wait_for_port db.internal 5432 60 || die "db never came up"

Better than sleep 30 && go: the script proceeds as soon as the port opens and fails explicitly if it never does.
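
One caveat with these probes: a connect attempt against a filtered port can block for a long time, so the loop's timeout is only approximate. A hedged sketch that bounds each attempt, assuming the coreutils timeout command is available; port_open_timeout is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Probe a TCP port, giving up after a fixed number of seconds per attempt.
# Assumes the coreutils `timeout` command is installed.
port_open_timeout() {
  local host=$1 port=$2 secs=${3:-2}
  # The child bash opens the socket on fd 3 and exits; status 0 means connected,
  # 1 means the connection failed, 124 means `timeout` killed the attempt.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage: succeeds only if the port accepts a connection within 1 second
# port_open_timeout db.internal 5432 1 && echo "db reachable"
```

Passing the host and port as positional arguments to bash -c avoids interpolating them into the command string.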

Crude HTTP request

You can even speak HTTP directly:

exec 3<>/dev/tcp/example.com/80
printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' >&3   # printf is more portable than echo -e
cat <&3
exec 3<&-

You wouldn’t do this in production (TLS is involved, headers are complicated), but it’s neat to know.

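Limitations

/dev/tcp is a bash feature (handled by bash’s redirection code, not a real device file), so it does not work under sh or dash, and bash can be built without it.
There is no TLS, so it cannot speak HTTPS.
There is no built-in connect timeout; an attempt against a filtered port can block until the kernel’s TCP timeout expires.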


4. Retry-with-exponential-backoff

Networks fail intermittently. Retry is essential. The canonical pattern:

retry() {
  local max=${1:-3}; shift
  local delay=1
  local attempt
  for ((attempt=1; attempt<=max; attempt++)); do
    if "$@"; then
      return 0
    fi
    if (( attempt < max )); then
      warn "command failed (attempt $attempt/$max); retrying in ${delay}s"
      sleep "$delay"
      delay=$((delay * 2))         # exponential: 1, 2, 4, 8, ...
    fi
  done
  error "command failed after $max attempts"
  return 1
}

# Usage
retry 5 curl -fsSL https://flaky-api.example.com/data

Improvements:

retry() {
  local max=${1:-3}
  local base_delay=${2:-1}
  local max_delay=${3:-60}
  shift 3
  local delay=$base_delay
  local attempt
  for ((attempt=1; attempt<=max; attempt++)); do
    if "$@"; then
      return 0
    fi
    if (( attempt < max )); then
      # Add jitter (0..delay/2) to avoid thundering herd
      local jitter=$(( RANDOM % (delay / 2 + 1) ))
      local sleep_for=$((delay + jitter))
      (( sleep_for > max_delay )) && sleep_for=$max_delay
      warn "attempt $attempt/$max failed; retrying in ${sleep_for}s"
      sleep "$sleep_for"
      delay=$((delay * 2))
    fi
  done
  return 1
}

retry 5 1 30 curl -fsSL "$URL"

This adds jitter (random delay) to avoid thundering herd when many clients retry simultaneously, and caps the max delay at 30s.
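
To get a feel for the schedule this produces, the sleep ranges can be computed without actually sleeping. A small sketch that mirrors the arithmetic of the retry function above (jitter between 0 and delay/2, then the cap); backoff_ranges is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Print the min-max sleep (in seconds) before each retry, without sleeping.
backoff_ranges() {
  local attempts=$1 base=${2:-1} max_delay=${3:-60}
  local delay=$base lo hi i
  local -a out=()
  for ((i = 1; i < attempts; i++)); do     # no sleep after the final attempt
    lo=$delay                              # jitter of 0
    hi=$((delay + delay / 2))              # maximum jitter of delay/2
    (( lo > max_delay )) && lo=$max_delay
    (( hi > max_delay )) && hi=$max_delay
    out+=("${lo}-${hi}")
    delay=$((delay * 2))
  done
  echo "${out[*]}"
}

backoff_ranges 8 1 30    # prints: 1-1 2-3 4-6 8-12 16-24 30-30 30-30
```

Once the doubled delay passes the cap, every remaining wait is exactly the cap, so the jitter no longer spreads clients out at that point.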

curl --retry — built-in retry

curl has its own retry:

curl -fsSL --retry 5 --retry-delay 2 --retry-max-time 60 "$URL"

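For most cases, curl --retry 5 --retry-delay 2 --retry-max-time 60 is sufficient. Note that --retry-delay sets a fixed pause; without it, curl starts at one second and doubles the wait between attempts.

When does curl --retry not retry? By default it retries only failures it considers transient: timeouts and HTTP 408, 429, 500, 502, 503 and 504. Other 4xx responses mean “you asked wrong”, so retrying doesn’t help, and a refused connection is not retried unless you add --retry-connrefused. --retry-all-errors (curl 7.71.0 and later) retries on every failure, including those 4xx responses when combined with -f, so reserve it for requests that are safe to repeat.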
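
The same "retry only what is transient" rule can be applied in your own retry loop when you capture the status code yourself. A minimal sketch; is_retryable_status, retry_on_status and RETRY_SLEEP are hypothetical names, and treating 000 (no HTTP response) as retryable is an assumption of this example:

```shell
#!/usr/bin/env bash
# Decide whether an HTTP status is worth retrying.
is_retryable_status() {
  case "$1" in
    408|429|500|502|503|504) return 0 ;;   # timeout, rate limit, server-side trouble
    000)                     return 0 ;;   # no HTTP response at all (assumed transient)
    *)                       return 1 ;;
  esac
}

# Run a command that prints an HTTP status; retry only while the status is retryable.
retry_on_status() {
  local max=$1; shift
  local attempt code
  for ((attempt = 1; attempt <= max; attempt++)); do
    code=$("$@") || true
    [[ $code == 2?? ]] && return 0
    is_retryable_status "$code" || return 1      # e.g. 404: give up immediately
    if (( attempt < max )); then sleep "${RETRY_SLEEP:-1}"; fi
  done
  return 1
}

# Usage (needs network, so shown commented out):
# retry_on_status 5 curl -sS -o response.body -w '%{http_code}' "$URL"
```

A fixed sleep keeps the sketch short; in practice you would combine it with the backoff and jitter shown earlier.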


5. HTTP idempotency

Retrying a GET is safe — GETs are idempotent by definition. Retrying a POST is dangerous: the first attempt might have succeeded server-side but the network dropped before the response. Retrying creates the same resource twice.

The cure: idempotency keys. The client generates a unique ID per logical operation; sends it in a header on every retry. The server deduplicates.

# Generate once per logical operation
IDEM_KEY=$(uuidgen)

# Use across all retries
retry 5 curl -fsSL -X POST \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $IDEM_KEY" \
  --data "$PAYLOAD" \
  "$URL"

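If the server sees the same Idempotency-Key twice, it returns the original response without re-creating the resource. This only works when the API supports it: Stripe uses this exact header, and many AWS and GCP APIs offer the same guarantee through client tokens or request IDs. Always use idempotency keys for POSTs you might retry.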
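
What the server does with the key can be illustrated with a toy in-process simulation. This is not how any particular API implements it, just a sketch of the deduplication idea; fake_create_user, SEEN_RESPONSES, CREATED and REPLY_BODY are made-up names, and associative arrays require bash 4 or later:

```shell
#!/usr/bin/env bash
# Toy "server" that deduplicates on an idempotency key (needs bash 4+).
declare -A SEEN_RESPONSES=()
CREATED=0

fake_create_user() {                 # hypothetical stand-in for POST /users
  local key=$1 payload=$2
  if [[ -n ${SEEN_RESPONSES[$key]+set} ]]; then
    REPLY_BODY=${SEEN_RESPONSES[$key]}    # replay the stored response, no side effect
    return 0
  fi
  CREATED=$((CREATED + 1))                # the side effect happens only once per key
  REPLY_BODY="created id=$CREATED payload=$payload"
  SEEN_RESPONSES[$key]=$REPLY_BODY
}

fake_create_user key-1 '{"name":"alice"}'   # first attempt
fake_create_user key-1 '{"name":"alice"}'   # retry with the same key
echo "$CREATED"                              # prints 1
```

A retry with a fresh key would create a second user, which is why the key has to be generated once per logical operation rather than once per attempt.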

Generating UUIDs portably

# Linux + macOS modern: uuidgen
IDEM_KEY=$(uuidgen)

# /proc-based fallback (Linux only)
[[ -f /proc/sys/kernel/random/uuid ]] && IDEM_KEY=$(< /proc/sys/kernel/random/uuid)

# Pure bash (entropy is good enough for idempotency keys)
random_uuid_v4() {
  local h
  h=$(printf '%04x%04x-%04x-%04x-%04x-%04x%04x%04x' \
    $RANDOM $RANDOM $RANDOM \
    $((RANDOM & 0x0fff | 0x4000)) \
    $((RANDOM & 0x3fff | 0x8000)) \
    $RANDOM $RANDOM $RANDOM)
  echo "$h"
}
IDEM_KEY=$(random_uuid_v4)

The pure-bash version isn’t cryptographically random but is fine for idempotency.
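
The three options above can be folded into one helper that picks whichever source is available. A sketch, with gen_idempotency_key as a hypothetical name; lower-casing the uuidgen output is a choice made here for consistency, not a requirement:

```shell
#!/usr/bin/env bash
# Pick the best available UUID source, falling back to $RANDOM.
gen_idempotency_key() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen | tr 'A-F' 'a-f'                  # some platforms print upper case
  elif [[ -r /proc/sys/kernel/random/uuid ]]; then
    cat /proc/sys/kernel/random/uuid          # Linux only
  else
    # Last resort: not cryptographically random, fine for deduplication
    printf '%04x%04x-%04x-%04x-%04x-%04x%04x%04x\n' \
      "$RANDOM" "$RANDOM" "$RANDOM" \
      "$((RANDOM & 0x0fff | 0x4000))" \
      "$((RANDOM & 0x3fff | 0x8000))" \
      "$RANDOM" "$RANDOM" "$RANDOM"
  fi
}

IDEM_KEY=$(gen_idempotency_key)
```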


6. Wait-for-service-up

Common in scripts that depend on infrastructure: “deploy, then wait for the new pod to be healthy.”

wait_for_url() {
  local url=$1
  local timeout=${2:-60}
  local interval=${3:-2}
  local elapsed=0
  while (( elapsed < timeout )); do
    if curl -fsS -o /dev/null --connect-timeout 2 --max-time 5 "$url"; then
      return 0
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
    debug "still waiting for $url ($elapsed/$timeout)"
  done
  error "timeout waiting for $url"
  return 1
}

wait_for_url "https://api.example.com/health" 120 5

Variants:

# Wait until response body matches expected
wait_for_response() {
  local url=$1 expected=$2 timeout=${3:-60} interval=${4:-2}
  local elapsed=0
  while (( elapsed < timeout )); do
    local body
    if body=$(curl -fsS --connect-timeout 2 --max-time 5 "$url" 2>/dev/null); then
      if [[ "$body" == *"$expected"* ]]; then
        return 0
      fi
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

wait_for_response "https://api.example.com/version" '"version":"v1.2.3"' 120 5

# Wait for HTTP 200 specifically
wait_for_status() {
  local url=$1 expected=${2:-200} timeout=${3:-60} interval=${4:-2}
  local elapsed=0
  while (( elapsed < timeout )); do
    local code
    # On a failed connection curl still prints 000 via -w, so echoing another "000" would yield "000000"
    code=$(curl -sS -o /dev/null --connect-timeout 2 --max-time 5 -w '%{http_code}' "$url" 2>/dev/null) || true
    if [[ "$code" == "$expected" ]]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}
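
The wait loops above count only the time spent sleeping, so a probe that itself takes up to 5 seconds per attempt can stretch the real wait well past the timeout. One way around that is a wall-clock deadline based on bash's SECONDS variable. A generic sketch; wait_until is a hypothetical helper and assumes whole-second intervals:

```shell
#!/usr/bin/env bash
# Poll a command until it succeeds or a wall-clock deadline passes.
# SECONDS is a bash variable that counts seconds since the shell started.
wait_until() {
  local timeout=$1 interval=$2; shift 2
  local deadline=$((SECONDS + timeout))
  until "$@"; do
    (( SECONDS >= deadline )) && return 1    # deadline reached: stop polling
    sleep "$interval"
  done
  return 0
}

# Usage (needs network, so shown commented out):
# wait_until 120 5 curl -fsS -o /dev/null --connect-timeout 2 --max-time 5 "$URL"
```

Because the command is passed as arguments, the same helper works for URL checks, port probes, or any other readiness test.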

7. Common patterns

Rate-limiting (sleep between requests)

for id in $(seq 1 1000); do
  curl -fsSL "https://api.example.com/items/$id"
  sleep 0.1   # max 10/s
done

Streaming JSON line-by-line (NDJSON / JSON Lines)

curl -fsSLN "https://api.example.com/stream" \
  | while IFS= read -r line; do
      jq -r '.event' <<<"$line"
    done

Following an SSE stream

curl -fsSLN -H "Accept: text/event-stream" "https://example.com/sse" \
  | while IFS= read -r line; do
      [[ "$line" == data:* ]] && echo "${line#data: }"
    done
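
The loop above assumes every data line is written as "data: " with exactly one space and no carriage return. A slightly more tolerant parser that can be tested offline, as a sketch; sse_data_lines is a hypothetical name and it deliberately ignores event:, id: and multi-line data joining:

```shell
#!/usr/bin/env bash
# Read an SSE stream on stdin and print the payload of each "data:" line.
sse_data_lines() {
  local line
  while IFS= read -r line || [[ -n $line ]]; do
    line=${line%$'\r'}                  # tolerate CRLF line endings
    case $line in
      data:*)
        line=${line#data:}
        printf '%s\n' "${line# }"       # drop the single optional leading space
        ;;
    esac                                # event:, id:, comments and blank lines are skipped
  done
}

# Offline demonstration with a canned stream:
printf 'event: tick\r\ndata: hello\r\n\r\n: keep-alive\ndata:world\n' | sse_data_lines
# prints:
# hello
# world
```

In real use, pipe curl -fsSLN into sse_data_lines in place of the printf.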

Downloading with progress (interactive)

curl -L -o file.tar.gz "$URL"           # progress meter on
wget --progress=dot:giga -O file.tar.gz "$URL"

In scripts, leave progress off (-s).

Conditional GET (cache-aware)

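# Save the ETag from a previous request
ETAG_FILE=/tmp/myresource.etag
# Header names are case-insensitive, so compare the lower-cased name
# (awk has no /regex/i flag; that form would match every line)
curl -fsS -D - "$URL" -o resource.json \
  | awk 'tolower($1) == "etag:" { sub(/\r$/, ""); print $2 }' > "$ETAG_FILE"

# Next time, send If-None-Match and keep the cached copy on 304
if [[ -s "$ETAG_FILE" ]]; then
  code=$(curl -sS -H "If-None-Match: $(< "$ETAG_FILE")" \
    -o resource.json.new -w '%{http_code}' "$URL")
  if [[ "$code" == 200 ]]; then
    mv resource.json.new resource.json    # changed: replace the cached copy
  fi                                      # 304: keep the existing resource.json
fi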

Server returns 304 (Not Modified) if the resource is unchanged — saves bandwidth.

Mutual TLS

curl --cert client.pem --key client-key.pem --cacert ca.pem "$URL"

For services that require client certs (like Kubernetes API directly).

HTTP/2 and HTTP/3

curl --http2 "$URL"      # prefer HTTP/2; falls back to HTTP/1.1 if the server lacks it
curl --http3 "$URL"      # HTTP/3 (needs a curl built with HTTP/3/QUIC support)

Default lets the server negotiate. Use these when you need to test specific behaviour.


8. Common pitfalls

Forgetting -f

Without -f, curl exits 0 on HTTP 500 and you process the error body as data. Always use -f (or capture -w '%{http_code}' and check explicitly).

Forgetting -L for HTTPS sites

Many sites redirect: HTTP to HTTPS, a bare domain to www, or a download link to a CDN. Without -L, you get the 3xx response with an empty or stub body and think the site is broken.

Logging the URL with secrets in it

URL="https://api.example.com/data?token=$TOKEN"
info "fetching $URL"               # leaks $TOKEN to logs

Use Authorization headers instead, or sanitise:

info "fetching ${URL%%\?*}"
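
Credentials can also hide in the userinfo part of a URL (https://user:pass@host/...), which the one-liner above does not remove. A slightly broader sketch; redact_url is a hypothetical helper, and it simply drops the query string, fragment and userinfo rather than masking individual parameters:

```shell
#!/usr/bin/env bash
# Return a URL that is safe to log: no query string, fragment, or user:pass@ part.
redact_url() {
  local url=${1%%\?*}          # drop the query string
  url=${url%%#*}               # drop the fragment
  local re='^([a-zA-Z][a-zA-Z0-9+.-]*://)[^/@]*@(.*)$'
  if [[ $url =~ $re ]]; then   # scheme://userinfo@rest  ->  scheme://rest
    url="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
  fi
  printf '%s\n' "$url"
}

redact_url 'https://user:pw@api.example.com/data?token=abc'
# prints https://api.example.com/data
```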

--data vs --data-raw and the @ problem

curl --data "$INPUT" "$URL"     # if $INPUT starts with @, treats as filename!
curl --data-raw "$INPUT" "$URL" # safe

Use --data-raw defensively.

Reading password from prompt

If the script passes -u user with no colon and no password, curl prompts for one. In a non-interactive script, that hangs. Provide both as user:password, or use env vars / netrc.

TLS mismatches

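If a script worked yesterday but fails today with “SSL certificate problem”, the server’s certificate or its issuing CA may have changed. Don’t add -k/--insecure as a permanent fix. Either update the system CA bundle, or point curl at the correct CA file with --cacert.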

Bash /dev/tcp and DNS caching

/dev/tcp/HOST/PORT resolves HOST every time you reference it. For high-frequency probes, you may want to resolve once:

ip=$(getent hosts example.com | awk '{print $1; exit}')   # first address only
(echo > "/dev/tcp/$ip/80") 2>/dev/null

This avoids DNS overhead per attempt.


9. The lib/net.sh framework

Putting it together — drop this into any project:

# lib/net.sh — network helpers

http_get() {
  local url=$1; shift
  local extra_headers=("$@")
  local headers=() h   # keep the loop variable local too
  for h in "${extra_headers[@]}"; do headers+=(-H "$h"); done
  curl -fsSL --connect-timeout 10 --max-time 60 \
    --retry 3 --retry-delay 2 --retry-max-time 30 \
    "${headers[@]}" "$url"
}

http_post_json() {
  local url=$1
  local payload=$2
  shift 2
  local extra_headers=("$@") h
  local headers=(-H "Content-Type: application/json" -H "Idempotency-Key: $(uuidgen)")
  for h in "${extra_headers[@]}"; do headers+=(-H "$h"); done
  curl -fsSL --connect-timeout 10 --max-time 60 \
    --retry 3 --retry-delay 2 --retry-max-time 30 \
    -X POST --data-raw "$payload" \
    "${headers[@]}" "$url"
}

wait_for_url() {
  local url=$1 timeout=${2:-60} interval=${3:-2}
  local elapsed=0
  while (( elapsed < timeout )); do
    curl -fsS -o /dev/null --connect-timeout 2 --max-time 5 "$url" && return 0
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

port_open() {
  local host=$1 port=$2
  (echo > /dev/tcp/$host/$port) 2>/dev/null
}

retry() {
  local max=${1:-3} base=${2:-1} max_d=${3:-30}
  shift 3
  local delay=$base i   # keep the loop counter from leaking into the caller
  for ((i=1; i<=max; i++)); do
    if "$@"; then return 0; fi
    (( i < max )) || break
    local jitter=$(( RANDOM % (delay/2 + 1) ))
    local sleep_for=$((delay + jitter))
    (( sleep_for > max_d )) && sleep_for=$max_d
    sleep "$sleep_for"
    delay=$((delay * 2))
  done
  return 1
}

Use:

source "$(dirname "${BASH_SOURCE[0]}")/lib/net.sh"

http_get "https://api.example.com/users" "Authorization: Bearer $TOKEN"

# A database port does not speak HTTP, so retry the TCP probe rather than wait_for_url
retry 5 1 30 port_open db.internal 5432

retry 5 1 60 http_get "https://flaky.example.com/data"

10. Twelve idioms for daily use

# 1. Production curl flags
curl -fsSL --connect-timeout 10 --max-time 60 "$URL"

# 2. POST JSON
curl -fsSL -X POST -H 'Content-Type: application/json' --data "$JSON" "$URL"

# 3. Capture status code separately from body
HTTP_CODE=$(curl -sS -o response.body -w '%{http_code}' "$URL")

# 4. Bearer auth
curl -fsSL -H "Authorization: Bearer $TOKEN" "$URL"

# 5. Multipart file upload
curl -fsSL -F 'file=@./payload.bin' -F 'name=test' "$URL"

# 6. Wait for port to be open
while ! (echo > /dev/tcp/$HOST/$PORT) 2>/dev/null; do sleep 1; done

# 7. curl built-in retry
curl -fsSL --retry 5 --retry-delay 2 --retry-max-time 60 --retry-all-errors "$URL"

# 8. Idempotency key for POST retries (generate once, reuse on every attempt)
KEY=$(uuidgen); curl -fsSL --retry 3 -X POST -H "Idempotency-Key: $KEY" -d "$DATA" "$URL"

# 9. Time the request
curl -fsSL -o /dev/null -w '%{time_total}\n' "$URL"

# 10. Talk to docker.sock
curl --unix-socket /var/run/docker.sock http://localhost/v1.41/info

# 11. Stream and process line by line
curl -fsSLN "$URL" | jq -rc '.events[]' | while IFS= read -r evt; do …; done

# 12. Wait-for-URL with timeout
wait_for_url() { local u=$1 t=${2:-60}; for ((e=0; e<t; e+=2)); do curl -fsS "$u" >/dev/null 2>&1 && return 0; sleep 2; done; return 1; }

11. What you must internalise before lesson 18


What’s next

Lesson 18: File Operations at Scale — rsync, find -print0, Parallel-Safe Patterns & Atomic Writes. When you’re moving GBs across machines or processing thousands of files, naive cp -r and for f in * break. We cover rsync (every flag worth knowing), filename-safe patterns (revisited from Wave 1), atomic write/replace patterns that survive interruption, and the canonical “process N files in parallel” idiom. After L18 you’ll handle large filesystems with confidence.

See you there.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
