
The DevOps Architecting Ladder: From a Single Pipeline to an Internal Developer Platform

The most expensive DevOps mistake is not a broken build; it is building the wrong altitude of delivery platform for the team in front of you. One organisation reads too many platform-engineering conference talks and stands up Backstage, golden paths, ephemeral environments and a self-service portal to serve eight engineers who deploy a single monolith twice a week — then spends more effort maintaining the platform than the platform ever saves. Another runs a revenue-bearing product on a hand-clicked deploy from a senior engineer’s laptop, with no tests in the path and no way to roll back, and discovers the gap the Friday afternoon that engineer is on leave and the release is on fire. Same root cause: the delivery architecture was chosen by aspiration or inertia, not derived from how the team actually ships and what failure it can tolerate.

This lesson teaches DevOps delivery as a ladder: six rungs, each adding rigour, safety, or self-service capability over the last. You start with a single continuous-integration workflow that builds and tests on every push, and you climb only as far as the team, the risk, and the rate of change push you. For every rung we walk the same four questions: the scenario and its requirements (team size, deploy frequency, risk tolerance, compliance), the design and its components, the key decisions and trade-offs, and the question most DevOps articles skip, “when is this rung genuinely enough?”, so you know when to stop. We close with an explicit method for choosing your rung, because the right answer for most teams is not the top of the ladder; it is the lowest rung that meets the requirement, operated with discipline.

Five terms recur and are the levers that move you up the ladder. CI (Continuous Integration): every change is automatically built and tested against the mainline, fast and on every push. CD: Continuous Delivery keeps the mainline in a state that is always releasable (a human still clicks deploy), while Continuous Deployment removes even that click and ships every green build to production automatically. The DORA metrics: the four research-backed measures of delivery performance — deployment frequency, lead time for changes, change-failure rate, and time to restore service. GitOps: a model where the desired state of a system lives in Git and a controller continuously reconciles the running system to match it. IDP (Internal Developer Platform): a self-service layer that lets developers provision and ship through a paved road without filing tickets or learning the plumbing underneath.
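
The four DORA measures named above are simple arithmetic once deployments are recorded. The following is a minimal sketch, not an official DORA tool: the Deployment record shape and the dora_metrics function are hypothetical names introduced here, and it assumes time to restore is measured from the moment the failing deployment went live.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict[str, float]:
    """Compute the four DORA measures over a reporting window."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    # Assumption: restore time is counted from the failing deploy, not from detection.
    restore_hours = [
        (d.restored_at - d.deployed_at) / timedelta(hours=1)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(
            (d.deployed_at - d.committed_at) / timedelta(hours=1) for d in deployments
        ),
        "change_failure_rate": len(failures) / len(deployments),
        "median_hours_to_restore": median(restore_hours) if restore_hours else 0.0,
    }
```

Medians are used rather than means so that one unusually slow change does not dominate the picture; that is a design choice of this sketch, not a requirement of the metrics.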

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should already understand the building blocks the ladder assembles: what a CI/CD pipeline is (stages, jobs, steps), what an artifact and an artifact repository are, and the core deployment strategies — rolling, blue/green, and canary. The ladder is fundamentally about how many of those pieces you run, how they are wired together, and how much of the wiring developers have to see. If pipeline anatomy is not yet second nature, work through the DevOps fundamentals and CI/CD pipeline-design lessons first; if deployment strategies are hazy, read the deployment-strategies lesson before rung 3. It also helps to have built at least one pipeline yourself, because the lower rungs will feel familiar and the higher ones will make sense as additions to what you already run. This lesson sits in the Architecture module of the DevOps Zero-to-Hero course: the earlier lessons teach you to build each component well; this one teaches you how many to build, how to compose them, and where to stop.

Core concepts: what actually changes as you climb

Before the rungs, internalise the axes the ladder moves along. Every rung is a point in this space, and naming the axes lets you reason about a delivery setup you have never seen.

Axis | What it means | How it changes up the ladder
Automation depth | How much of build → test → release → deploy → verify runs without a human | Build+test only → add packaging+publish → add automated deploy → continuous reconciliation → security+verification woven in → full self-service
Path to production | How a change actually reaches prod | Manual deploy → one pipeline to one env → promotion across many envs with gates → Git commit reconciled by a controller → same, with security gates → a templated paved road
Safety mechanisms | What stops or limits a bad change | Tests catch obvious breakage → quality gates + artifacts → deploy strategies + approvals + rollback → drift correction + declarative rollback → policy-as-code + supply-chain + SLO gates → guardrails baked into the golden path
Feedback & visibility | How you know delivery and the system are healthy | Red/green build → build + scan reports → deploy notifications → reconciliation status → DORA + observability/SLOs → org-wide dashboards
Developer self-service | How much a developer can do without help | Push code → trigger a pipeline → request a deploy → open a PR to a Git repo → as before → click a template to get a whole conformant service

Two principles cut across all five axes and explain most of the ladder’s shape:

With those in hand, climb.

Rung 1 — A single CI workflow: build and test on every push

Scenario & requirements. A small team — one to a handful of engineers — is building an application and currently integrates by hope: someone merges, something breaks, and an afternoon is lost to “it worked on my machine.” There is no deployment requirement on the table yet, or deployment is still a manual step done rarely. The single concrete need is to stop broken code reaching the mainline: every change must be automatically built and tested so breakage is caught in minutes, not at the next manual release. Deploy frequency is low or irregular, risk tolerance for delivery is high (there may be no production yet), and there is no compliance pressure.

The design & components. One CI workflow triggered on every push and pull request: check out the code, install dependencies (with caching), build/compile, run the unit-test suite and a linter, and report a single red/green status back on the commit and PR. This lives as pipeline-as-code in the repository — a .github/workflows/ci.yml, a .gitlab-ci.yml, an azure-pipelines.yml, or a Jenkinsfile — and runs on hosted runners so there is nothing to operate. A branch-protection rule makes the green check required to merge, which is what turns a passive script into an actual integration gate.

Component | Choice at this rung | Why
Trigger | On push + on pull request | Catch breakage on every change, before and at merge
Runner | Hosted (GitHub/GitLab/Azure-hosted) | Zero infrastructure to operate; pay-per-minute
Steps | Checkout → cache deps → build → unit tests → lint | The minimum that makes “it builds and passes” trustworthy
Gate | Branch protection: green CI required to merge | Turns the workflow into an enforced integration gate, not a suggestion
Speed | Dependency caching + fast unit tests | CI must be fast (minutes) or people route around it

Key decisions & trade-offs. The defining decision is simply to automate integration at all — to make “does it build and pass the tests?” a machine’s answer on every change rather than a human’s at release time. The trade-off you deliberately accept is that this rung does nothing about deployment: getting the artifact to an environment is still manual, unversioned, and unsafe. That is fine when there is no production yet or releases are rare and low-stakes. The quiet trap is a slow or flaky CI: if the suite takes twenty minutes or fails randomly, developers disable the required check or merge around it, and you are back to integrating by hope with extra steps. Keep CI fast and deterministic from day one — it is the foundation every higher rung builds on, and a cracked foundation propagates upward.
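
“Deterministic” can be checked mechanically: a test that both passed and failed against the same commit is flaky by definition, because the code did not change between runs. A minimal sketch, assuming you can export run history as (test name, commit SHA, passed) tuples; the function name and the tuple shape are illustrative, not any CI vendor's API.

```python
from collections import defaultdict


def find_flaky_tests(runs: list[tuple[str, str, bool]]) -> set[str]:
    """Return tests that both passed and failed on the same commit.

    Each run is (test_name, commit_sha, passed). Same code with different
    outcomes means the test is non-deterministic.
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        outcomes[(test_name, commit_sha)].add(passed)
    # Two distinct outcomes ({True, False}) for one (test, commit) pair = flaky.
    return {test for (test, _sha), seen in outcomes.items() if len(seen) == 2}
```

A test that fails on one commit and passes on another is not reported, since the code changed between the runs; only disagreement on identical code counts.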

When this rung is enough. Enough for a brand-new project before it has users, a learning or side project, a library whose “release” is a manual publish done rarely, or any codebase where the only current pain is broken merges. It is never enough the moment a real environment must be kept up to date from the mainline, or a release happens often enough that doing it by hand is error-prone or a bottleneck. The first time someone says “we need to ship this reliably and repeatedly,” you have left rung 1.

Rung 2 — Full CI/CD: quality gates, artifacts, and an automated deploy

Scenario & requirements. The application now has at least one real environment that must be kept current, and releases happen often enough that hand-deploying is slow, inconsistent, or risky. The team wants a single automated path from commit to a deployed environment, with confidence that what ships has passed more than unit tests, and with the ability to redeploy a known-good version if the latest one is bad. Requirements: deploy on the order of weekly-to-daily, a low-but-nonzero tolerance for delivery mistakes, and a need to know exactly what is running (which version, built from which commit).

The design & components. The rung-1 CI workflow is extended into a full CI/CD pipeline — source → build → test → scan → package → publish → deploy → verify:

Capability | Mechanism | What it buys
Catch more than syntax errors | Coverage + integration + SCA/secret gates | Bad changes stop at a gate, not in production
Know exactly what runs | Immutable versioned artifact + repository | Traceability from running version → artifact → commit
Test once, ship the same bits | Build once, promote the artifact | No “rebuilt for prod and now it’s different” surprises
Remove the manual deploy | Automated deploy step (CD) | Repeatable, fast, no laptop heroics
Authenticate without stored secrets | OIDC federated identity to the cloud | No long-lived cloud keys in the CI system
Keep the pipeline maintainable | Reusable workflows / templates / shared libs | The pipeline scales without copy-paste rot

Key decisions & trade-offs. The defining decision is introducing an immutable, promoted artifact and an automated deploy: the shift from “CI tells me it’s green” to “a pipeline gets a known artifact into an environment by itself.” A second decision is Continuous Delivery versus Continuous Deployment: require a human click to release (Continuous Delivery, a safety brake while confidence is still building) or auto-deploy every green build (Continuous Deployment, the fastest feedback, but it demands real test confidence and easy rollback). The trade-off you still accept is that this is typically one environment, or a couple wired ad hoc, with no formal promotion policy across many stages and no sophisticated deploy strategy; a deploy here is often a straightforward replace or rolling update. The classic trap is rebuilding the artifact per environment, which silently breaks the “test once” guarantee; build once, promote the same bits. Another is producing scan reports but never gating on them: a scan that cannot fail the build is documentation, not a control.
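
The “build once, promote the same bits” rule can be made mechanical by identifying an artifact by a digest of its content and refusing to deploy any digest that did not pass the test stage. This is a minimal in-memory sketch under that assumption; PromotionLedger and its methods are hypothetical names, not a registry or CI API.

```python
import hashlib


def artifact_digest(artifact_bytes: bytes) -> str:
    """Content-address an artifact: identical bytes always give the same digest."""
    return "sha256:" + hashlib.sha256(artifact_bytes).hexdigest()


class PromotionLedger:
    """Tracks which digest each environment runs (hypothetical in-memory store)."""

    def __init__(self) -> None:
        self.tested: set[str] = set()      # digests that passed the test stage
        self.running: dict[str, str] = {}  # environment name -> deployed digest

    def record_tested(self, digest: str) -> None:
        self.tested.add(digest)

    def promote(self, digest: str, environment: str) -> None:
        # A rebuilt artifact has different bytes, so a different digest,
        # so it is rejected here even if it came from the same commit.
        if digest not in self.tested:
            raise ValueError(f"{digest} was never tested; refusing to deploy")
        self.running[environment] = digest
```

Promoting the same digest to staging and then to production succeeds; a per-environment rebuild yields a new digest and is refused, which is exactly the trap described above turned into an error.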

When this rung is enough. Enough for a single application with one or two environments, a team shipping weekly-to-daily, where a straightforward automated deploy plus the ability to redeploy a previous artifact is sufficient safety. Most small products and internal services live here productively for a long time. The signal to climb is the arrival of multiple environments that need a governed promotion path (dev → test → staging → prod with approvals), or a risk profile that demands deploying without downtime or to a fraction of users first — both of which are rung 3.

Rung 3 — Multi-environment delivery: promotion, deploy strategies, and approvals

Scenario & requirements. You now serve real users, and a bad deploy is costly — downtime, errors, or a rollback scramble. You have multiple environments (dev, test/UAT, staging, production) and changes must flow through them in a governed, auditable order, with the right humans signing off before production. The deploy itself must avoid downtime and limit blast radius: you cannot take the service offline to release, and you want to expose a new version to a fraction of traffic before all of it. Requirements: daily-or-more deploys, a real (often contractual) expectation of availability, and an audit trail of who approved what to production and when.

The design & components. A multi-stage delivery pipeline that promotes one artifact through a chain of environments, with deploy strategies and approval gates:

Concern | Rung 2 (single env) | Rung 3 (multi-env)
Environments | One, or a couple ad hoc | A governed chain: dev → test → staging → prod
What moves | An artifact to one place | The same artifact promoted along the chain
Deploy style | Replace / simple rolling | Rolling / blue-green / canary, with fast rollback
Production gate | Maybe a manual click | Required-reviewer approval on protected envs (audited)
Release control | Deploy = release | Deploy and release decoupled via feature flags
Failure handling | Redeploy previous artifact | Automated rollback on failed health/canary gates

Key decisions & trade-offs. The defining decision is how each environment deploys and who must approve the jump to production — the move from “a pipeline deploys” to “a pipeline promotes, safely and with sign-off.” A core sub-decision is the deployment strategy (rolling is simplest and cheapest; blue/green gives instant switch and rollback at the cost of double capacity during cutover; canary gives the smallest blast radius and metric-driven confidence at the cost of more tooling and traffic-shaping). The trade-off you deliberately accept is added latency and process at the production boundary — approvals and progressive rollouts make each release a little slower, which is the price of not breaking users; keep lower environments fully automatic so the friction sits only where the risk is. The traps here are approvals that are rubber-stamps (an unread “approve” is theatre, not a control — pair it with automated gates that catch what humans won’t) and a rollback path that has never been tested (a rollback you have not exercised is a hope, not a capability).
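
The “metric-driven confidence” of a canary comes down to comparing the canary's error rate with the baseline's and acting on the difference. A minimal sketch of such a gate follows; the TrafficSample type, the canary_verdict function, and both thresholds are illustrative assumptions, and real canary analysis usually looks at more signals than error rate alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficSample:
    """Request and error counts observed for one version (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(
    baseline: TrafficSample,
    canary: TrafficSample,
    min_requests: int = 500,             # assumed minimum sample before judging
    max_extra_error_rate: float = 0.01,  # assumed tolerated degradation (1 point)
) -> str:
    """Return 'wait', 'rollback', or 'promote' for the canary version."""
    if canary.requests < min_requests:
        return "wait"      # not enough traffic yet to decide either way
    if canary.error_rate > baseline.error_rate + max_extra_error_rate:
        return "rollback"  # canary is measurably worse than the baseline
    return "promote"
```

The "wait" outcome matters as much as the other two: judging a canary on a handful of requests produces both false alarms and false confidence.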

When this rung is enough. Enough for the large majority of production applications: a real service with several environments, governed promotion, a downtime-free deploy strategy, audited production approvals, and tested rollback. Many serious products run here for years and never need more. Do not climb for novelty. The signals that justify the next rungs are specific and different from each other: running on Kubernetes at a scale where push-based deploys and config drift become the problem points to GitOps (rung 4); a security or compliance mandate that the pipeline itself enforce supply-chain and policy controls, or a need to measure and improve delivery performance, points to DevSecOps + DORA (rung 5); and many teams needing to ship independently without the platform team as a bottleneck points to an IDP (rung 6).

Rung 4 — GitOps to Kubernetes: Git as the source of truth, a controller as the deployer

Scenario & requirements. Your workloads run on Kubernetes (one cluster or several), and the push-based model from rung 3 is starting to hurt: the CI system holds cluster credentials and pushes changes in, the actual cluster state drifts from what anyone intended (someone ran kubectl apply to push a hotfix at 2 a.m.), and reconstructing “what is supposed to be running” means reading pipeline logs. You want the desired state of every environment to live in Git, the running cluster to be continuously reconciled to match it, drift to be detected and corrected automatically, and rollback to be a Git revert. Requirements: Kubernetes as the runtime, multiple environments and possibly clusters, and a need for an auditable, declarative, single source of truth.

The design & components. A GitOps delivery model layered on the rung-3 pipeline:

Concern | Push CD (rung 3) | GitOps (rung 4)
Source of truth | The pipeline run / whatever was last applied | Git: the declared desired state
Who deploys | CI pushes into the cluster | An in-cluster controller pulls and reconciles
Cluster credentials | Held by the CI system (external) | Stay inside the cluster (inverted)
Drift | Silent; discovered when something breaks | Detected and (optionally) auto-corrected
Rollback | Re-run pipeline with previous artifact | git revert the commit
Auditability | Pipeline logs | Git history = the deploy history

Key decisions & trade-offs. The defining decision is inverting the deploy model from push to pull and making Git the single source of truth — the system now converges on what Git says rather than on whatever the last pipeline did. A real sub-decision is how strict to make reconciliation (auto-sync and self-heal give the strongest drift guarantees but mean all change must go through Git — no emergency kubectl edit; manual sync is gentler but lets drift persist). The trade-off you accept is a new component to operate and a model shift the team must learn — Argo CD or Flux is one more thing to run and secure, and “you change Git, not the cluster” is a genuine behaviour change. The traps are leaving a back door open (allowing direct kubectl writes alongside GitOps defeats the single-source-of-truth guarantee — drift creeps back) and treating GitOps as only for Kubernetes app manifests while infrastructure and policy still drift; the leverage compounds when everything the cluster needs is reconciled from Git. Note that GitOps is largely a Kubernetes-shaped tool — if you do not run Kubernetes, you typically stay on rung 3’s push model and reach the higher rungs through it.
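
The pull-and-reconcile idea is easiest to see as a diff between two maps: what Git declares and what is actually running. The sketch below is a toy model of one reconciliation pass, assuming state can be reduced to a name-to-version mapping; it is not how Argo CD or Flux are implemented, and the function names are made up for illustration.

```python
def plan_reconcile(desired: dict[str, str], actual: dict[str, str]) -> list[tuple[str, str]]:
    """Return the (action, name) steps that converge `actual` onto `desired`.

    Both arguments map a workload name to its version. `desired` plays the
    role of the state declared in Git; `actual` is what is running.
    """
    actions: list[tuple[str, str]] = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))  # drift or a new commit
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name))      # running but not declared
    return actions


def reconcile(desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Apply the plan to a copy of `actual` and return the converged state."""
    converged = dict(actual)
    for action, name in plan_reconcile(desired, actual):
        if action == "delete":
            del converged[name]
        else:
            converged[name] = desired[name]
    return converged
```

A 2 a.m. hotfix shows up here as an entry in actual that disagrees with desired, so the next pass reverts it. Rollback is the same loop with an older desired map, which is why a Git revert is sufficient.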

When this rung is enough. Enough when you run Kubernetes and want declarative, drift-corrected, Git-audited delivery with credential-inverted security — which is most serious Kubernetes shops. It does not, on its own, add security supply-chain controls or delivery measurement, and it is orthogonal to the platform question: GitOps can serve one team or be the backend of a self-service platform. You climb to rung 5 when security/compliance demands that the pipeline enforce supply-chain and policy controls, or when you need to measure delivery to improve it; you climb to rung 6 when many teams need to self-serve.

Rung 5 — DevSecOps with observability and DORA: security in the path, performance measured

Scenario & requirements. Delivery is automated and (often) GitOps-driven, but two gaps now bite. First, security is not yet a first-class part of the path to production: scanning is ad hoc, the software supply chain is unverified, and there is no machine-enforced policy on what may ship — a real problem under regulatory pressure (SOC 2, ISO 27001, PCI, FedRAMP) or simply for a team that takes supply-chain attacks seriously. Second, you cannot tell whether delivery is getting better or worse, because you do not measure it. Requirements: shift security left into the pipeline as enforced gates, establish supply-chain integrity (signed, attested, provenance-tracked artifacts), enforce policy-as-code, and instrument both the pipeline and the running system so you can track the DORA metrics and SLOs and actually improve.

The design & components. A DevSecOps layer plus observability and DORA measurement woven through the rung-3/4 pipeline:

Concern | Without rung 5 | With rung 5 (DevSecOps + DORA)
Security scanning | Ad hoc, advisory, easy to skip | Severity-gated SAST/SCA/secret/image/IaC/DAST in the path
Supply chain | Unverified artifacts | SBOM + cosign signatures + SLSA provenance, verified at admission
Policy | Documented, manually checked | Policy-as-code (OPA/Conftest + Kyverno/Gatekeeper), unbypassable
Runtime visibility | Logs when something breaks | Metrics/logs/traces + SLOs + error budgets + alerts
Delivery performance | Unknown / anecdotal | Measured DORA metrics, reviewed and improved
Failure response | Improvised | MTTR tracked; rollback + observability shorten it

Key decisions & trade-offs. The defining decision is making security and measurement first-class, automated parts of delivery rather than bolt-ons — the pipeline now enforces trust and reports performance. The critical calibration is gate strictness and signal-to-noise: gate on severity and reachability, fail on real risk, and route the rest to a backlog — because the well-known failure mode of DevSecOps is so much false-positive noise that developers disable the security stage to ship (covered head-on in the DevSecOps lesson). The trade-off is up-front engineering and ongoing tuning — scanners, signing, SBOMs, policy bundles, and an observability stack are real investments that need maintenance, and badly-tuned gates do slow teams down. The traps: scanning without gating (advisory-only scans change nothing), collecting DORA as a vanity metric or a stick (DORA is a team diagnostic to find bottlenecks, not a leaderboard to punish individuals), and adding security gates without easy rollback and observability — security that finds problems but cannot recover from them is half a control.
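
“Gate on severity and reachability, route the rest to a backlog” can be sketched as a small triage function over scanner findings. The Finding shape, the severity names, and the default threshold below are assumptions for illustration; real scanners each have their own output format that would need mapping into something like this.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Finding:
    """One scanner finding (hypothetical normalised shape)."""
    rule_id: str
    severity: str    # one of the SEVERITY_RANK keys
    reachable: bool  # is the vulnerable code actually reachable from the app?


def triage(findings: list[Finding], fail_at: str = "high") -> tuple[list[Finding], list[Finding]]:
    """Split findings into (blocking, backlog).

    A finding blocks the pipeline only if it is reachable AND at or above
    the `fail_at` severity; everything else is routed to the backlog.
    """
    threshold = SEVERITY_RANK[fail_at]
    blocking: list[Finding] = []
    backlog: list[Finding] = []
    for finding in findings:
        if finding.reachable and SEVERITY_RANK[finding.severity] >= threshold:
            blocking.append(finding)
        else:
            backlog.append(finding)
    return blocking, backlog
```

The pipeline fails only when blocking is non-empty. Nothing is silently dropped: an unreachable critical still lands in the backlog, it just does not stop today's release.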

When this rung is enough. Enough for security-conscious and regulated organisations that need enforced supply-chain integrity, policy-as-code, and the ability to prove and improve delivery performance, which is where most mature single- or few-team engineering organisations should sit. It is the right ceiling for many. The single signal that justifies the top rung is organisational scale: when many teams must each get a secure, observable, multi-environment, GitOps-or-CD pipeline, and replicating all of rungs 1–5 by hand for every new team or service becomes the bottleneck. Productising that is rung 6.

Rung 6 — An Internal Developer Platform: golden paths, self-service, ephemeral environments

Scenario & requirements. The driver here is organisational scale and developer cognitive load, not a single missing capability. You have many teams and many services, and although you know how to build a secure, observable, GitOps-driven pipeline (rungs 1–5), doing it by hand for every new service is the bottleneck: every team reinvents the same pipeline, slightly differently; onboarding a new service is a multi-week ticket-driven slog; and developers spend their time wiring CI, registries, environments, and policy instead of writing features. The requirement is self-service with guardrails baked in — a developer should get a complete, conformant, production-ready service through a paved road in minutes, without the platform team in the loop and without being able to skip the controls. This rung is for platform-engineering at organisational scale; most organisations should be sure they belong here before climbing.

The design & components. An Internal Developer Platform that productises rungs 1–5 as self-service:

Concern | Rungs 1–5 per team | Rung 6 (IDP)
New service onboarding | Hand-built pipeline; multi-week ticket slog | A scaffolder template; minutes, self-service
Consistency across teams | Each team’s pipeline is a snowflake | Conformant by construction from one template
Infra provisioning | Tickets to the platform/ops team | Self-service behind policy-as-code guardrails
Test environments | Shared, contended, or hand-made | Ephemeral per-PR environments, auto-torn-down
Guardrails | Re-applied per team, inconsistently | Baked into the golden path; unbypassable
Org-wide visibility | Per-team, fragmented | Catalog + org-wide DORA/SLO dashboards

Key decisions & trade-offs. The defining decision is to productise delivery as a self-service platform — to treat the paved road as an internal product with the platform team as its owner and developers as its customers. The dominant trade-off is large up-front and ongoing investment for long-run leverage: an IDP is a real product needing a dedicated team, a roadmap, and maintenance, and it only pays off at scale — below roughly five or six teams (or a handful of services), the platform machinery costs more than the toil it removes, and you should stay on rung 5 and copy a good pipeline template by hand. The traps are building the platform nobody asked for (a portal demos beautifully and rots quietly into a catalog of stale entities if it does not solve real developer pain — let demand pull each capability), golden paths so rigid that teams route around them (a paved road must be genuinely easier than the alternative and allow escape hatches, or it becomes shelfware), and mistaking the portal for the platform (Backstage is a UI; the platform is the templates, the GitOps backend, the policy, and the provisioning underneath — the frontend is the least important part).
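
A golden path with escape hatches but unbypassable guardrails can be pictured as a template merge with two classes of setting: defaults a team may override, and controls it may not. The sketch below assumes that split; every setting name in it is invented for illustration and does not correspond to Backstage or any other scaffolder's schema.

```python
from typing import Optional

# Overridable defaults: the escape hatches (hypothetical settings).
GOLDEN_PATH_DEFAULTS = {"replicas": 2, "language": "python", "deploy_strategy": "rolling"}
# Non-negotiable controls baked into every service (hypothetical settings).
GUARDRAILS = {"image_signing": True, "security_scan_gate": True}


def scaffold_service(name: str, team: str, overrides: Optional[dict] = None) -> dict:
    """Render a conformant service definition from the golden-path template."""
    overrides = overrides or {}
    blocked = sorted(set(overrides) & set(GUARDRAILS))
    if blocked:
        raise ValueError(f"guardrails cannot be overridden: {blocked}")
    unknown = sorted(set(overrides) - set(GOLDEN_PATH_DEFAULTS))
    if unknown:
        raise ValueError(f"unknown settings: {unknown}")
    # Guardrails are merged last so they always win.
    return {"name": name, "owner": team, **GOLDEN_PATH_DEFAULTS, **overrides, **GUARDRAILS}
```

Calling scaffold_service("payments", "team-a", {"replicas": 4}) yields a service with four replicas and both guardrails on, while passing {"image_signing": False} raises. That is the difference between a paved road that bends where teams need it to and one that can simply be driven around.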

When this rung is enough — and when it is too much. The right rung only when many teams shipping many services makes the per-team cost of rungs 1–5 the binding constraint, and a self-service paved road demonstrably removes more toil than it creates. Otherwise it is over-engineering — which has its own failure mode: a platform a small organisation cannot staff or evolve becomes a half-built, abandoned tab that is worse than letting each team copy a solid pipeline. The honest test: if you cannot name the teams that will use the golden path next quarter and the specific toil it removes for them, you do not yet need rung 6.

How to choose your rung

[Figure: The DevOps architecting ladder]

The diagram above shows the six rungs side by side — what each adds, the problem it solves, and roughly when it fits — so you can locate your situation on the ladder at a glance rather than defaulting to the top.

The method is deliberately boring, because boring is how you avoid both expensive failure modes — over-building and under-building:

  1. Describe how you actually ship today, with numbers, before designing anything. State your team and service count, deploy frequency, the manual steps still in the path, your tolerance for a bad release, your runtime (Kubernetes or not), and any compliance constraint. If you cannot describe the current path concretely, you are not ready to choose a rung — go map it.
  2. Map each requirement to the lowest rung that satisfies it. “Stop broken merges” → rung 1; “one automated, traceable path to an environment” → rung 2; “governed multi-env promotion with safe deploys and approvals” → rung 3; “declarative, drift-corrected delivery on Kubernetes” → rung 4; “enforced security/supply-chain and measured delivery” → rung 5; “self-service for many teams” → rung 6. Take the highest rung any single hard requirement forces — but no higher.
  3. Default to the lowest sufficient rung, operated with discipline. A rung-3 pipeline run well beats a rung-6 platform a team cannot maintain. The lowest sufficient rung minimises engineering cost, operational toil, and the complexity that is itself a source of delivery failures.
  4. Separate the independent climbs. Three things move semi-independently: delivery rigour (rungs 1→2→3), the runtime model (rung 4 GitOps, which mostly applies if you run Kubernetes), and organisational scale (rung 6 IDP, driven by team count). Security and measurement (rung 5) layer onto whichever delivery model you run. Do not let “we should do GitOps” or “we need a platform” stampede you past the rigour you actually lack.
  5. Climb only on a concrete signal. The signals are specific: broken merges → rung 1; a real environment to keep current → rung 2; multiple environments, a costly bad deploy, or an audit requirement → rung 3; Kubernetes plus painful drift and credential exposure → rung 4; a compliance/supply-chain mandate or a need to measure delivery → rung 5; many teams bottlenecked on hand-built pipelines → rung 6. Re-run the assessment on material change (a compliance regime, a move to Kubernetes, a jump in team count), not continuously.
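
The signal-to-rung mapping in the steps above is mechanical enough to write down. This is a minimal sketch with hypothetical field names, one boolean per signal from step 5; it returns the highest rung that any present signal forces, and 0 when there is no signal at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Concrete signals observed today (hypothetical field names)."""
    broken_merges: bool = False
    environment_to_keep_current: bool = False
    multi_env_costly_deploy_or_audit: bool = False
    kubernetes_drift_or_credential_exposure: bool = False
    compliance_mandate_or_need_to_measure: bool = False
    many_teams_bottlenecked: bool = False


def lowest_sufficient_rung(s: Situation) -> int:
    """Return the highest rung any present signal forces, or 0 if none."""
    signals = [
        (1, s.broken_merges),
        (2, s.environment_to_keep_current),
        (3, s.multi_env_costly_deploy_or_audit),
        (4, s.kubernetes_drift_or_credential_exposure),
        (5, s.compliance_mandate_or_need_to_measure),
        (6, s.many_teams_bottlenecked),
    ]
    return max((rung for rung, present in signals if present), default=0)
```

A single number is a simplification of step 4: because the climbs are semi-independent, a result of 5 for a team without Kubernetes means security and measurement layered onto a rung-3 push pipeline, not that rung 4 must be built first.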

The unifying idea is that the component set is stable and only the composition and ownership change as you climb. Build, test, artifact, deploy, gate, observe — the same primitives appear at every rung; what changes is how many environments they span, whether a controller or a human drives them, whether security and measurement are woven in, and whether developers self-serve the whole thing. That stability of shape is what makes climbing incremental rather than a rewrite.

Hands-on lab: feel the bottom two rungs locally

You cannot stand up a full IDP on a laptop, but you can viscerally experience the difference between rung 1 (build-and-test only) and the rung-2 idea of an immutable, versioned artifact that you build once and could then promote, using only Git, Docker, Python, and a tiny app. This lab is free, runs in minutes, and makes the abstract concrete. Requirements: Docker, Git, and Python 3 with pytest installed (the CI script calls python; use python3 if that is the name on your system). No cloud account or CI service is needed, because we simulate the CI logic locally.

Step 1 — Rung 1: a local CI gate (build + test)

Create a trivial project with a test, and a script that is a rung-1 CI workflow — it builds and tests, and exits non-zero on failure (exactly what a required status check enforces).

mkdir -p /tmp/ladder-lab && cd /tmp/ladder-lab && git init -q
cat > app.py <<'EOF'
def add(a, b):
    return a + b
EOF
cat > test_app.py <<'EOF'
from app import add
def test_add():
    assert add(2, 3) == 5
EOF
cat > ci.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
echo "== CI: build/syntax check =="
python -m py_compile app.py test_app.py
echo "== CI: unit tests =="
python -m pytest -q
echo "CI PASSED (green) — this is what a required status check enforces"
EOF
chmod +x ci.sh
./ci.sh

Expected output: the syntax check passes, pytest reports 1 passed, and you see CI PASSED (green). Now break the test (assert add(2, 3) == 6) and re-run ./ci.sh: it exits non-zero and prints the failure. That red/green exit code is rung 1’s entire value proposition: a machine, not a human, decides whether the change is safe to merge. Restore the assertion to == 5 before moving on to step 2.

Step 2 — Rung 2: build an immutable, versioned artifact

Package the app into a versioned container image tagged with the Git commit — the rung-2 idea that you build once and then promote the same bits, with full traceability from running version back to commit.

cd /tmp/ladder-lab
git add -A && git commit -q -m "rung-2 lab app"
cat > Dockerfile <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY app.py .
CMD ["python", "-c", "from app import add; print('add(2,3)=', add(2,3))"]
EOF

SHA=$(git rev-parse --short HEAD)
docker build -t ladder-app:"$SHA" -t ladder-app:latest .
echo "Built immutable artifact ladder-app:$SHA (also tagged :latest)"
docker images ladder-app

Expected output: a Docker build that finishes by tagging the image (the classic builder prints Successfully tagged ladder-app:<sha>; BuildKit, the default builder in recent Docker releases, instead prints a line naming the image docker.io/library/ladder-app:<sha>), and docker images listing the image under two tags pointing at one image ID: the commit SHA (the immutable, promotable identity) and latest (a moving pointer). That single artifact, identified by commit, is what a real rung-2 pipeline would publish to a registry and then promote unchanged through environments.

Step 3 — Validation: prove “build once, run the same bits”

“Deploy” the artifact by running it, then confirm the tagged image is the exact thing built — the guarantee that what you tested is what runs.

cd /tmp/ladder-lab
SHA=$(git rev-parse --short HEAD)
echo "== 'Deploying' the promoted artifact by digest-identity =="
docker run --rm ladder-app:"$SHA"
echo "== Proof the SHA tag and latest are the SAME image (built once) =="
docker inspect --format '{{.Id}}' ladder-app:"$SHA"
docker inspect --format '{{.Id}}' ladder-app:latest

Validation criterion: the container prints add(2,3)= 5, and the two docker inspect image IDs are identical — proving the commit-tagged artifact and the deployed one are byte-for-byte the same image, built once. That is the rung-2 promise (test once, ship the same bits) made concrete; a rung-1 setup has no artifact at all, so this guarantee simply does not exist below rung 2.

Cleanup

docker rmi ladder-app:latest ladder-app:"$(cd /tmp/ladder-lab && git rev-parse --short HEAD)" 2>/dev/null || true
rm -rf /tmp/ladder-lab

Cost note. This lab is entirely free: it runs in local Docker with no cloud resources, no hosted CI minutes, no registry, and nothing to bill. The real rungs cost money and effort — hosted CI/CD minutes and a registry (rung 2), multiple environments and load balancers for deploy strategies (rung 3), an Argo CD/Flux controller to operate (rung 4), scanners and an observability stack (rung 5), and a dedicated platform team for an IDP (rung 6) — which is exactly why you climb only as far as the requirement demands.

Common mistakes & troubleshooting

| Symptom / mistake | Cause | Fix |
| --- | --- | --- |
| A full IDP (Backstage, golden paths, ephemeral envs) built for one small team | Architecture chosen by aspiration, not team scale; the platform now costs more than it saves | Drop to the lowest rung that fits the team count; below ~5–6 teams, ship a solid rung-5 pipeline template by hand |
| Revenue-bearing product deployed by hand from one engineer’s laptop | Under-built; “it works when I do it” mistaken for a delivery process | Move to rung 2 (automated, artifact-based deploy) and then rung 3 (governed multi-env promotion); never single-person, unrepeatable releases |
| CI is slow or flaky, so developers merge around the required check | Cracked foundation: long suites, no caching, non-deterministic tests | Make rung-1 CI fast and deterministic (cache deps, parallelise, quarantine flaky tests); every higher rung builds on it |
| Artifact rebuilt per environment, so prod differs from what was tested | “Test once, ship the same bits” violated; rebuild instead of promote | Build the immutable artifact once, publish it, and promote the same artifact through environments |
| Scanners and approvals exist but never block anything | Advisory-only gates and rubber-stamp approvals: controls in name only | Gate the pipeline on real (severity + reachability) findings; pair human approval with automated gates that catch what people won’t |
| “We do GitOps” but people still kubectl apply hotfixes | A back door left open defeats the single-source-of-truth guarantee | Lock down direct cluster writes; make Git the only path and enable drift detection / self-heal |
| DevSecOps stage commented out within a week of rollout | False-positive noise burying developers; gates too blunt | Calibrate by severity and reachability, route low findings to a backlog, fix the noisy rules; keep the build green for non-risk |
| DORA metrics tracked but delivery never improves (or morale drops) | DORA used as a vanity number or a stick on individuals | Use DORA as a team diagnostic to find and remove bottlenecks, not a leaderboard; review trends, act on them |
| Golden path exists but teams route around it | The paved road is harder than the alternative, or has no escape hatch | Make the golden path genuinely the easiest option; provide escape hatches; let real demand pull each capability in |

Best practices

Security notes

The ladder has a security dimension that tracks its automation dimension, and it strengthens as you climb:

Interview & exam questions

Q1. What is the single most common DevOps delivery mistake, and how do you avoid it? Choosing the wrong altitude for the team — over-building (an IDP for one small team) or under-building (revenue deployed by hand from a laptop). Avoid it by describing how you actually ship today in concrete terms (team count, deploy frequency, manual steps, risk tolerance, runtime, compliance) and picking the lowest rung that satisfies those facts, operated well.

Q2. Distinguish Continuous Integration, Continuous Delivery, and Continuous Deployment. CI automatically builds and tests every change against the mainline, fast and on every push (rung 1). Continuous Delivery keeps the mainline always releasable and automates deploy up to a human click. Continuous Deployment removes the click and ships every green build to production automatically. Delivery requires real test confidence and easy rollback before it is safe; Deployment requires even more.

Q3. Why is an immutable, versioned artifact (built once, promoted) so important, and what breaks without it? It guarantees that what you tested is byte-for-byte what you deploy, with traceability from the running version back to a commit. Without it — if you rebuild per environment — production can silently differ from what passed the gates (different dependency versions, base images, build context), so your tests no longer mean what you think. Build once, publish, and promote the same artifact.
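One way to make "build once, promote the same artifact" checkable is to record a digest at build time and refuse to deploy anything that does not match it. The sketch below assumes the artifact is a plain file and that sha256sum (or shasum on macOS) is installed; artifact_digest and verify_promotion are hypothetical helper names, not part of any CI product.

```shell
#!/usr/bin/env bash
# artifact_digest prints the SHA-256 of a file, using whichever of the two
# common tools is available on the machine.
artifact_digest() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# verify_promotion compares the digest recorded at build time with the
# digest of the file about to be deployed, and fails if they differ.
verify_promotion() {
  local recorded="$1" artifact="$2" actual
  actual=$(artifact_digest "$artifact")
  if [ "$actual" = "$recorded" ]; then
    echo "promote: digest matches build record"
  else
    echo "refuse: artifact differs from what was tested" >&2
    return 1
  fi
}
```

A rebuilt artifact almost never reproduces the same digest, so a per-environment rebuild would trip this check immediately. Container registries give the same guarantee natively when you deploy by image digest rather than by a moving tag.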

Q4. A scan runs in the pipeline but never fails the build. Is that a security control? No — it is documentation. A control must be able to stop a bad change. The fix is to gate on real risk (by severity and reachability) so exploitable findings block while low-signal ones go to a backlog. The same logic applies to rubber-stamp approvals: an unread “approve” is theatre, so pair human approval with automated gates.
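To illustrate the difference between an advisory scan and a gate, here is a minimal sketch of a blocking step. The input format ("severity reachable id", one finding per line) is an assumption made for this example, not the output of any real scanner, and gate_findings is a hypothetical helper.

```shell
#!/usr/bin/env bash
# gate_findings reads findings on stdin, one per line, in the assumed form
#   <severity> <reachable: yes|no> <id>
# It blocks only on critical/high findings that are reachable and sends
# everything else to the backlog. A non-zero return fails the pipeline step.
gate_findings() {
  local severity reachable id blocked=0
  while read -r severity reachable id; do
    [ -z "$severity" ] && continue   # skip blank lines
    case "$severity:$reachable" in
      critical:yes|high:yes)
        echo "BLOCK $id ($severity, reachable)"
        blocked=1
        ;;
      *)
        echo "BACKLOG $id ($severity, reachable=$reachable)"
        ;;
    esac
  done
  return "$blocked"
}
```

The point of the sketch is the return code: the step fails the build on exploitable findings and stays green for low-signal ones, which is what keeps the gate from being commented out.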

Q5. What problem does GitOps solve that push-based CD does not, and what does it cost? GitOps makes Git the single source of truth and has an in-cluster controller continuously reconcile the cluster to it — giving drift detection/correction, git revert rollback, Git-as-audit-log, and credentials that stay inside the cluster (push CD holds cluster creds externally and lets drift go silent). It costs a new component to operate (Argo CD/Flux) and a behaviour change (“change Git, not the cluster”), and it leaks value if direct kubectl back doors remain open. It is largely a Kubernetes-shaped tool.
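The reconcile idea can be shown with a toy model that uses two files in place of Git and a cluster. This is only an illustration of drift detection and self-heal: reconcile_once is a hypothetical function, and real controllers such as Argo CD or Flux work against the Kubernetes API rather than comparing files.

```shell
#!/usr/bin/env bash
# reconcile_once: "desired" stands in for the manifest committed to Git,
# "live" for the state currently running. If they differ (or live is
# missing), the live state is overwritten from desired.
reconcile_once() {
  local desired="$1" live="$2"
  if [ -f "$live" ] && cmp -s "$desired" "$live"; then
    echo "in sync"
  else
    echo "drift detected: restoring live state from desired"
    cp "$desired" "$live"
  fi
}

# Example (paths are placeholders):
#   reconcile_once ./git/app.yaml ./cluster/app.yaml
```

A hand-edit to the live file plays the role of a kubectl apply hotfix: the next pass detects it and reverts it, which is why the only durable way to change the system is to change the desired state.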

Q6. Name the four DORA metrics and explain how they should and should not be used. Deployment frequency, lead time for changes, change-failure rate, and time to restore service. They should be used as a team diagnostic to locate and remove delivery bottlenecks and to show that speed and stability rise together. They should not be used as a vanity number or as a stick to rank or punish individuals — that drives gaming and erodes the very behaviour you want.
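As a rough illustration of how the four numbers fall out of a deploy log, the sketch below computes them with awk. The log format (lead time in hours, outcome, minutes to restore) and the function name dora_summary are assumptions for this example, and it uses simple averages where a production version would likely prefer medians.

```shell
#!/usr/bin/env bash
# dora_summary <days> reads one deploy per line on stdin, in the assumed form
#   <lead_time_hours> <ok|fail> <minutes_to_restore>
# and prints the four DORA measures for a period of <days> days.
dora_summary() {
  local days="$1"
  awk -v days="$days" '
    NF >= 2 {
      total++
      lead += $1
      if ($2 == "fail") { failed++; restore += $3 }
    }
    END {
      if (days <= 0) { print "days must be positive"; exit 1 }
      if (total == 0) { print "no deploys recorded"; exit }
      printf "deploys_per_day=%.2f\n", total / days
      printf "mean_lead_time_hours=%.1f\n", lead / total
      printf "change_failure_rate_pct=%.1f\n", 100 * failed / total
      if (failed > 0) {
        printf "mean_time_to_restore_min=%.1f\n", restore / failed
      } else {
        print "mean_time_to_restore_min=n/a"
      }
    }'
}

# Example: four deploys over two days, two of which failed.
#   printf '2 ok 0\n4 fail 30\n6 ok 0\n8 fail 90\n' | dora_summary 2
```

Fed per-team logs, the output is a trend to review together; nothing in it identifies an individual, which matches how the metrics are meant to be used.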

Q7. Why is it a myth that adding gates and approvals necessarily slows delivery down? Because the DORA research shows elite performers deploy more often and fail less. Good automation makes each change small, tested, and reversible, which is both faster and safer; the gates catch risk cheaply rather than via slow manual review or production incidents. A gate only slows you down if it is implemented badly (noisy, blunt, or a rubber stamp); when gates are built well, safety and speed rise together.

Q8. Compare rolling, blue/green, and canary deployments and when you’d choose each. Rolling replaces instances incrementally — simplest and cheapest, but a bad version reaches some users and rollback is gradual. Blue/green runs two full environments and switches traffic at once — instant cutover and instant rollback, at the cost of roughly double capacity during the switch. Canary sends a small fraction of traffic to the new version and promotes on healthy metrics — smallest blast radius and metric-driven confidence, at the cost of more tooling and traffic-shaping. Choose by how costly a bad release is and how much infrastructure/tooling you can invest.
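The "promotes on healthy metrics" part of a canary can be reduced to a small comparison. The sketch below is an assumption-laden illustration: canary_decision is a hypothetical function, the inputs are raw error and request counts for the baseline and the canary, and the default margin of 50 basis points (0.5%) is an arbitrary example value, not a recommendation.

```shell
#!/usr/bin/env bash
# canary_decision <base_errors> <base_requests> <canary_errors> <canary_requests> [margin_bp]
# Compares error rates in basis points (1 bp = 0.01%) using integer maths.
# Prints promote, rollback, or hold (when there is no traffic to judge by).
canary_decision() {
  local base_err="$1" base_req="$2" can_err="$3" can_req="$4"
  local margin_bp="${5:-50}"   # allowed extra error rate; assumed default
  if [ "$base_req" -le 0 ] || [ "$can_req" -le 0 ]; then
    echo "hold"
    return 0
  fi
  local base_bp=$(( base_err * 10000 / base_req ))
  local can_bp=$(( can_err * 10000 / can_req ))
  if [ "$can_bp" -le $(( base_bp + margin_bp )) ]; then
    echo "promote"
  else
    echo "rollback"
  fi
}

# Example: baseline 10 errors in 10000 requests, canary 20 errors in 1000.
#   canary_decision 10 10000 20 1000
```

Real canary tooling also weighs latency, saturation, and sample size before deciding; this only shows the shape of the decision that rolling and blue/green deployments do not make.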

Q9. When does building an Internal Developer Platform pay off, and how is it different from climbing for delivery rigour? It pays off at organisational scale — many teams and services where hand-building a secure, observable pipeline per team is the bottleneck (roughly above five or six teams). It is a different axis from rigour: rungs 1–3 add delivery rigour, rung 4 is a runtime/GitOps choice, and the IDP (rung 6) productises rungs 1–5 as self-service. Team count drives it, not how good any single pipeline is — and below the threshold it costs more than it saves.

Q10. Why can a too-early platform or a fancy delivery setup be worse than a simpler one? Because its capability is bought with a large jump in complexity, and complexity is itself a leading cause of delivery failures. A small team that cannot staff or evolve an IDP ends up with a half-built, abandoned portal that is worse than a solid copied pipeline; a team that bolts on noisy security gates ends up disabling them. Reliability and speed come from operating a fit-for-purpose design well, not from owning a complex one.

Q11. How does the path to production change as you climb the ladder? Manual deploy (rung 1) → one automated pipeline to one environment with an immutable artifact (rung 2) → governed promotion of that artifact across multiple environments with deploy strategies and audited approvals (rung 3) → a Git commit reconciled into the cluster by a controller (rung 4) → the same path with enforced security/supply-chain gates and DORA measurement (rung 5) → a templated, self-service paved road any team can use (rung 6). The primitives are stable; what changes is automation depth, who drives the deploy, and self-service.

Q12. How do you decide when to stop climbing the ladder? Stop at the lowest rung whose capabilities and failure model satisfy your written situation. Climb only on a concrete signal — broken merges → rung 1; an env to keep current → rung 2; multiple envs / costly bad deploy / audit need → rung 3; Kubernetes + drift pain → rung 4; compliance/supply-chain or measurement need → rung 5; many teams bottlenecked → rung 6 — and re-assess on material change, not continuously. If you cannot name the requirement and the cost of not meeting it, you do not yet need the next rung.
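The signal-to-rung mapping in this answer can be written down as a lookup that returns the highest rung any named signal forces. The signal names below are labels invented for this sketch, and rung_for_signal and lowest_sufficient_rung are hypothetical helpers; treat it as a way to make the rule explicit, not as a decision tool.

```shell
#!/usr/bin/env bash
# rung_for_signal maps one concrete signal to the rung it justifies.
# Unknown signals map to 0, meaning "no named requirement".
rung_for_signal() {
  case "$1" in
    broken-merges) echo 1 ;;
    env-to-keep-current) echo 2 ;;
    multiple-envs|costly-bad-deploy|audit-need) echo 3 ;;
    kubernetes-drift) echo 4 ;;
    compliance|supply-chain|measurement-need) echo 5 ;;
    many-teams-bottlenecked) echo 6 ;;
    *) echo 0 ;;
  esac
}

# lowest_sufficient_rung prints the highest rung forced by any given signal,
# which is the lowest rung that satisfies all of them.
lowest_sufficient_rung() {
  local highest=0 signal rung
  for signal in "$@"; do
    rung=$(rung_for_signal "$signal")
    if [ "$rung" -eq 0 ]; then
      echo "unknown signal: $signal (name the requirement first)" >&2
      continue
    fi
    if [ "$rung" -gt "$highest" ]; then
      highest="$rung"
    fi
  done
  echo "$highest"
}

# Example:
#   lowest_sufficient_rung broken-merges audit-need
```

With no recognised signal the function prints 0, which mirrors the last sentence of the answer: if you cannot name the requirement, there is nothing yet to climb for.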

Quick check

  1. Which three climbs on the ladder move semi-independently of each other, such that you can need one without the others?
  2. At which rung do you first get an immutable, versioned artifact that is built once and promoted, and why does it matter?
  3. What does GitOps (rung 4) change about who deploys and where the cluster credentials live, compared with push-based CD?
  4. Name the four DORA metrics, and state the one rule for using them well.
  5. Above roughly how many teams does building an Internal Developer Platform typically start to pay off, and what should you do below that threshold?

Answers

  1. Delivery rigour (rungs 1→2→3), the runtime/GitOps model (rung 4, mostly relevant if you run Kubernetes), and organisational scale / self-service (rung 6, the IDP). Security and measurement (rung 5) layer onto whichever delivery model you run.
  2. Rung 2. It guarantees that what you tested is byte-for-byte what you deploy (traceable from running version back to commit), because the artifact is built once and promoted unchanged rather than rebuilt per environment — the guarantee the entire upper ladder relies on.
  3. GitOps inverts the model from push to pull: an in-cluster controller (Argo CD/Flux) deploys by reconciling the cluster to Git, instead of the CI system pushing changes in. Consequently the cluster credentials stay inside the cluster rather than being held by the external pipeline — a real security improvement.
  4. Deployment frequency, lead time for changes, change-failure rate, and time to restore service. Use them as a team diagnostic to find and remove bottlenecks — never as a vanity number or a stick to rank or punish individuals.
  5. Around five or six teams (or many services); below that, skip the platform and ship a solid rung-5 pipeline template that teams copy by hand — the IDP machinery would cost more than the toil it removes.

Exercise

Take a delivery setup you know — at work, a side project, or a hypothetical “B2B SaaS with three product teams” — and produce a one-page rung-selection memo:

  1. Describe the current path first. State how a change reaches production today, the manual steps still in it, your team and service count, deploy frequency, your tolerance for a bad release, your runtime (Kubernetes or not), and any compliance constraint. If you have to guess, mark the guess — that itself is a finding.
  2. Place it on the ladder. For each requirement, name the lowest rung that satisfies it, then take the highest rung any single hard requirement forces. Note separately whether the GitOps (rung 4) and IDP (rung 6) climbs apply, since they are independent axes.
  3. Find the weakest link. Identify the single biggest gap between where you are and where you should be (e.g. “no immutable artifact,” “approvals are rubber stamps,” “no rollback path,” “security scans don’t gate”) and the one change that would close it.
  4. Justify the stop. Write one paragraph on why you are not climbing higher — which next-rung capability you are deliberately forgoing and why the residual cost is acceptable.
  5. Name the climb trigger. State the single concrete signal that would justify moving up one rung (e.g. “we move workloads to Kubernetes,” “a SOC 2 audit,” “team count crosses six”).

The goal is to practise the discipline the lesson teaches: facts first, lowest sufficient rung, fix the weakest link before adding altitude, an explicit reason to stop, and a defined trigger to climb.

Certification mapping

This lesson is delivery-architecture reasoning that underpins the practical DevOps exams rather than a single exam objective, and it maps across the cloud DevOps professional certifications:

The exams test the services and mechanisms; this lesson teaches the delivery-architecture judgement that decides which mechanisms a given situation needs — exactly the kind of “design a delivery process for these constraints” question senior DevOps interviews open with.

Glossary

Next steps

You can now place a delivery requirement on the ladder and defend where you stopped. Turn that judgement into something hiring managers can see by building the matching portfolio: continue to Real-World DevOps Portfolio Projects: From a First Pipeline to a Platform, whose project ladder mirrors this one rung for rung.

Then deepen the rungs that matter most for your situation:
