
Docker, kubectl & Helm: The Practical Command Reference (Basic → Advanced)

A reference you can keep open in a second tab. Grouped by tool, ordered roughly basic → advanced.

Docker — images & containers

# Build & tag
docker build -t myapp:1.0 .
docker build -t myapp:1.0 --build-arg ENV=prod --target runtime .   # multi-stage target
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:1.0 --push .  # multi-arch

# Run
docker run -d --name web -p 8080:80 --restart unless-stopped myapp:1.0
docker run --rm -it --env-file .env myapp:1.0 sh                    # ephemeral debug shell
docker run --rm -v "$(pwd)":/app -w /app node:20 npm test           # bind-mount + workdir (quoted for paths with spaces)

# Inspect & debug
docker ps -a                       # all containers
docker logs -f --tail 100 web      # follow logs
docker exec -it web sh             # shell into a running container
docker stats                       # live resource usage
docker inspect web | jq '.[0].NetworkSettings'

# Images & cleanup
docker images
docker image prune -a              # remove all unused images, not just dangling ones
docker system prune -af --volumes  # reclaim everything (careful)
docker history myapp:1.0           # see layer sizes (find bloat)

A good production Dockerfile (multi-stage, non-root, cached)

# ---- build stage ----
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
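# install all deps (dev deps are needed to build); cached separately from source
RUN npm ci
COPY . .
# build, then drop dev deps so only production modules reach the runtime stage
RUN npm run build && npm prune --omit=dev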

# ---- runtime stage ----
FROM node:20-alpine AS runtime
ENV NODE_ENV=production
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
# never run as root (Dockerfile comments must sit on their own line)
USER app
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]

Dockerfile rules of thumb: order layers least- → most-frequently-changed; copy lock files before source; use multi-stage to keep build tools out of the runtime image; pin base image tags; run as non-root; add a HEALTHCHECK; keep a .dockerignore (node_modules, .git, dist).
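
The .dockerignore mentioned above can be sketched as a small shell helper. This is a minimal sketch: the function name write_dockerignore is hypothetical, and the .env and *.log entries are assumptions added for illustration; only node_modules, .git and dist come from the rules of thumb.

```shell
# Sketch: write a starter .dockerignore into the build context directory.
# Entries beyond node_modules, .git and dist are assumptions; adjust per project.
write_dockerignore() {
  target_dir="${1:-.}"
  cat > "$target_dir/.dockerignore" <<'EOF'
node_modules
.git
dist
.env
*.log
EOF
}

# Usage (run from the directory that holds the Dockerfile):
#   write_dockerignore .
```

Keeping node_modules and dist out of the build context means COPY . . cannot overwrite the dependencies installed inside the image, and the context sent to the daemon stays small.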

kubectl — the daily driver

# Context & config
kubectl config get-contexts
kubectl config use-context aks-prod
kubectl config set-context --current --namespace=payments   # stop typing -n

# Inspect
kubectl get pods -A -o wide
kubectl get pods -l app=payments --watch
kubectl describe pod payments-7d9 -n payments               # events at the bottom = gold
kubectl get events -n payments --sort-by=.lastTimestamp

# Logs & exec
kubectl logs -f deploy/payments -n payments --all-containers
kubectl logs payments-7d9 -n payments --previous            # crashed container's logs
kubectl exec -it deploy/payments -n payments -- sh
kubectl debug -it payments-7d9 -n payments --image=busybox --target=app  # ephemeral debug container

# Apply / diff / rollout
kubectl apply -f k8s/ --recursive
kubectl diff -f k8s/                                        # preview before apply
kubectl rollout status deploy/payments -n payments
kubectl rollout undo deploy/payments -n payments            # roll back
kubectl rollout restart deploy/payments -n payments         # bounce pods (picks up changed Secrets/ConfigMaps)

# Scale & resources
kubectl scale deploy/payments --replicas=5 -n payments
kubectl top pods -n payments                                # needs metrics-server
kubectl get hpa -n payments

# Networking & access
kubectl port-forward svc/payments 8080:80 -n payments
kubectl auth can-i create deployments --as system:serviceaccount:ci:deployer

# Power moves
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
kubectl get pod payments-7d9 -o yaml | kubectl neat        # clean YAML (krew plugin)
kubectl explain ingress.spec.rules                         # schema docs inline

Troubleshooting flow when a pod won’t start:

  1. kubectl get pod → status: ImagePullBackOff? CrashLoopBackOff? Pending?
  2. kubectl describe pod → read Events (image pull auth, scheduling, probes).
  3. Pending → kubectl describe node; check requests vs. capacity, taints, PVCs.
  4. CrashLoopBackOff → kubectl logs --previous; check the liveness probe & command.
  5. ImagePullBackOff → registry auth (imagePullSecrets), tag typo, private registry firewall.
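
The flow above can be sketched as a small lookup that turns a pod status into the next command to run. This is an illustrative helper, not part of kubectl: the function name next_step and its argument order are assumptions.

```shell
# Sketch: map a pod status (the STATUS column of `kubectl get pod`)
# to the next troubleshooting command. It prints the command; it does not run it.
next_step() {
  status="$1"
  pod="${2:-POD}"
  ns="${3:-default}"
  case "$status" in
    Pending)
      # scheduling problem: start from the pod's events
      echo "kubectl describe pod $pod -n $ns" ;;
    CrashLoopBackOff)
      # the container started and died: read the previous container's logs
      echo "kubectl logs $pod -n $ns --previous" ;;
    ImagePullBackOff|ErrImagePull)
      # pull failure: events show auth errors or a missing tag
      echo "kubectl get events -n $ns --sort-by=.lastTimestamp" ;;
    *)
      echo "kubectl describe pod $pod -n $ns" ;;
  esac
}

# Usage:
#   next_step CrashLoopBackOff payments-7d9 payments
```

Printing the command rather than executing it keeps the helper safe to paste into a runbook or wrap in an alias.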

Helm — package & release management

# Repos & search
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo postgres

# Render & inspect before installing (always)
helm template myrel bitnami/postgresql -f values.yaml | less   # see the YAML it will apply
helm install myrel bitnami/postgresql -f values.yaml --dry-run --debug

# Install / upgrade
helm install payments ./chart -n payments --create-namespace -f values.prod.yaml
helm upgrade --install payments ./chart -n payments -f values.prod.yaml --atomic --timeout 5m
#   --install  -> install if absent, else upgrade
#   --atomic   -> auto-rollback on failure
#   --wait     -> block until resources are Ready (set automatically by --atomic)

# Lifecycle
helm list -A
helm history payments -n payments
helm rollback payments 3 -n payments         # revert to revision 3
helm uninstall payments -n payments

# Authoring a chart
helm create mychart        # scaffolds Chart.yaml, values.yaml, templates/
helm lint ./mychart
helm package ./mychart     # -> mychart-0.1.0.tgz

Chart layout:

mychart/
├── Chart.yaml          # name, version, appVersion, dependencies
├── values.yaml         # default config (override per env with -f)
├── templates/
│   ├── deployment.yaml # uses {{ .Values.* }} and {{ .Release.* }}
│   ├── service.yaml
│   ├── ingress.yaml
│   └── _helpers.tpl    # reusable template snippets (labels, names)
└── charts/             # vendored sub-charts (dependencies)

A templating snippet you’ll use constantly:

# templates/deployment.yaml
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          {{- with .Values.resources }}
          resources: {{- toYaml . | nindent 12 }}
          {{- end }}

Enterprise scenario

A payments platform team running EKS pushed a Helm upgrade that silently wedged production. The deploy pipeline ran helm upgrade --install payments ./chart --atomic --timeout 5m. The new revision changed a Deployment readiness probe path, so the new pods never went Ready and --atomic triggered a rollback. The rollback also timed out, because the old ReplicaSet’s pods had already been terminated. From then on Helm rejected every attempt with "another operation (install/upgrade/rollback) is in progress", and the release was stuck in pending-upgrade. No helm upgrade would run again.

The constraint: --atomic rollback is itself a release operation, and if it exceeds --timeout you get a half-applied state plus a lock. The fix had two parts. First, clear the stuck lock and restore the last known-good revision directly:

helm history payments -n payments          # find last DEPLOYED revision (e.g. 41)
helm rollback payments 41 -n payments --wait --timeout 10m
kubectl rollout status deploy/payments -n payments

If helm rollback still refused because of the pending-upgrade status, they deleted the release secret for the failed revision so Helm stopped treating the release as in-flight:

kubectl get secret -n payments -l owner=helm,name=payments \
  --sort-by=.metadata.creationTimestamp
kubectl delete secret sh.helm.release.v1.payments.v42 -n payments  # the failed rev only
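
Before deleting anything, it helps to confirm that the release really is recorded as in flight. The sketch below checks a status string against Helm's three pending statuses. The function name is_pending_status is hypothetical, and the usage lines assume helm and jq are installed and that the status is read from the .info.status field of helm status -o json.

```shell
# Sketch: succeed (exit 0) when a Helm release status means an operation
# is still recorded as in progress, which is what blocks further upgrades.
is_pending_status() {
  case "$1" in
    pending-install|pending-upgrade|pending-rollback) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage sketch (needs helm, jq and cluster access; not executed here):
#   status="$(helm status payments -n payments -o json | jq -r '.info.status')"
#   if is_pending_status "$status"; then
#     echo "release is stuck in $status"
#   fi
```

A deployed or failed status means the lock is not the problem, and a plain helm rollback is the safer first step.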

The durable lesson: never let --timeout be shorter than a realistic rollout, and run helm template | kubectl diff -f - in CI so that a changed probe path shows up for review before it reaches the cluster. They also added --wait-for-jobs and bumped timeouts to 10m on stateful releases.
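
A CI gate built on kubectl diff has to tell "there are changes" apart from "the command failed", because kubectl diff exits with 0 when nothing differs, 1 when differences are found, and a higher code on error. A minimal sketch follows; the function name classify_diff_exit and the CI step in the comments are assumptions, not the team's actual pipeline.

```shell
# Sketch: translate the exit code of `kubectl diff` into a label a CI job can act on.
#   0  -> live state already matches the rendered manifests
#   1  -> differences found (expected for a real change; surface them for review)
#   >1 -> kubectl or the diff program failed
classify_diff_exit() {
  case "$1" in
    0) echo "no-changes" ;;
    1) echo "changes" ;;
    *) echo "error" ;;
  esac
}

# Hypothetical CI step (needs helm, kubectl and cluster access; not executed here):
#   helm template payments ./chart -n payments -f values.prod.yaml \
#     | kubectl diff -n payments -f - > diff.txt
#   result="$(classify_diff_exit "$?")"
#   [ "$result" = "error" ] && exit 1
#   cat diff.txt   # attach to the merge request for review
```

Treating exit code 1 as a failure would block every legitimate change, so the gate fails only on the error case and leaves the diff for a human to read.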

Quick mental map

Keep kubectl diff and helm template/--dry-run in your muscle memory — previewing changes before applying them is the single habit that prevents the most production incidents.
