DevOps is the one discipline where the certifications and the day job overlap almost perfectly. A Kubernetes administrator who can pass the CKA can genuinely operate a cluster; an engineer who clears AZ-400 has, by necessity, designed real pipelines, branching strategies, and release gates. That is the good news. The awkward news is that “DevOps certification” is not one exam but a constellation — a foundation cert from the DevOps Institute, three rival cloud professional exams, three Kubernetes credentials, a Terraform exam, and a clutch of tool-vendor certs from GitHub and GitLab. Choosing the wrong order, or treating them as interchangeable, wastes months. Treated as a deliberate ladder, they compound: the Terraform Associate feeds every cloud DevOps exam, the cloud DevOps exam you sit feeds your interview story, and the Kubernetes certs are the differentiator that turns “I know CI/CD” into “I run the platform”.
This lesson is the capstone of the KloudVin DevOps Zero-to-Hero course: the consolidated, exam-by-exam prep kit. It is not a re-teach of DevOps from scratch — the rest of the course does that, from DevOps fundamentals through pipelines, deployment strategies, GitOps, and DevSecOps. It is the layer on top: for each credential, the objective-domain checklist with realistic weightings so your study time lands on the marks; a bank of practice scenario questions written in each exam’s idiom, with the reasoning for every right answer and why the tempting wrong ones are wrong; the topics candidates most reliably confuse across exams; and — the part most “cram guides” skip — a study order and plan that sequences the credentials so each one makes the next cheaper. The goal is to pass each on the first attempt and walk out genuinely more capable, not merely badged.
Learning objectives
By the end of this lesson you will be able to:
- Lay out the full DevOps certification landscape — vendor-neutral, cloud-specific, Kubernetes, IaC, and tool certs — and judge which belong on your ladder.
- Recite the objective domains and weightings for the headline exams (DOP-C02, AZ-400, GCP DevOps, CKA/CKAD/CKS, Terraform Associate, DevOps Foundation) and prioritise study where the marks cluster.
- Answer scenario-style questions in each exam’s idiom and reason about why distractors are wrong, not just recognise the right answer.
- Separate the commonly-confused topics that cut across exams — blue/green vs canary, CKA vs CKAD vs CKS, AZ-400’s branch policies vs environments, DOP-C02’s CodeDeploy vs CodePipeline, declarative vs imperative kubectl.
- Distinguish the performance-based Kubernetes exams (hands-on terminal) from the multiple-choice/scenario cloud exams, and prepare for each format correctly.
- Sequence a credential path with a realistic timeline so each certification reinforces the next instead of starting cold.
Prerequisites
This is the final lesson of the KloudVin DevOps Zero-to-Hero course, so it assumes you have worked through (or already know) the practice it certifies: CI/CD pipeline design, deployment strategies, GitOps, observability and DORA, and DevSecOps. If terms like quality gate, canary, OIDC service connection, GitOps reconciliation, or SBOM are unfamiliar, study the matching course lesson first — every exam here rewards hands-on muscle memory over flashcards, and a fortnight of building beats a month of reading. You should also have sat, or be comfortable with, the HashiCorp Terraform Associate prep kit from the IaC course, because Terraform fluency underpins the IaC domain of every cloud DevOps exam. A working knowledge of Git, containers, and at least one cloud is the floor; everything above that this lesson helps you certify.
The DevOps certification landscape: what exists and what it’s for
Before choosing, see the whole board. DevOps certs fall into five families, and they answer different questions an employer or interviewer might ask.
| Family | Certifications | What it proves | Format |
|---|---|---|---|
| Vendor-neutral foundation | DevOps Institute: DevOps Foundation, DevOps Leader, DevSecOps Practitioner, SRE Foundation | You understand DevOps culture, practices, and vocabulary (CALMS, DORA, value-stream, three ways) | Multiple choice |
| Cloud DevOps (professional) | AWS Certified DevOps Engineer – Professional (DOP-C02), Microsoft Certified: DevOps Engineer Expert (AZ-400), Google Cloud Professional Cloud DevOps Engineer | You can design and run CI/CD, IaC, monitoring, and incident response on that cloud | Multiple choice / multiple response / case-style |
| Kubernetes (CNCF) | CKA (Administrator), CKAD (Application Developer), CKS (Security Specialist), plus KCNA/KCSA entry-level | You can operate / develop on / secure Kubernetes — hands-on | Performance-based (live terminal) |
| Infrastructure as Code | HashiCorp Certified: Terraform Associate (003), Terraform Authoring & Operations Professional | You can write and operate Terraform | Multiple choice / fill-in |
| Tooling | GitHub Actions, GitHub Foundations, GitHub Advanced Security; GitLab Certified CI/CD Associate / Associate / Professional Services Engineer | You can build pipelines and use the platform’s features | Multiple choice (GitHub) / multiple choice + hands-on (GitLab) |
Two structural facts drive the whole study strategy. First, the cloud DevOps exams assume associate-level knowledge of that cloud: DOP-C02 expects SysOps/Developer Associate depth, AZ-400 formally requires an associate prerequisite (AZ-104 or the Developer AZ-204) before the Expert title is awarded, and the GCP DevOps exam assumes Associate Cloud Engineer fluency. You do not bolt a professional DevOps cert onto zero cloud knowledge. Second, the Kubernetes exams are a different sport: they are live, hands-on, kubectl-at-a-terminal exams with no multiple choice at all, so they need a fundamentally different prep style (drilling commands against a real cluster) than the recognition-and-recall cloud exams.
How to read the per-exam tables
For each exam below you get the objective domains with a weight band (a realistic “where the marks cluster” — published percentages where the vendor gives them, well-established bands where it doesn’t), the format and logistics, and the highest-leverage focus. Treat weightings as priorities for revision time, not gospel. The practice-question bank that follows is cross-cutting and ordered by exam.
DevOps Institute: DevOps Foundation & DevSecOps Practitioner
These vendor-neutral exams certify vocabulary and principles, not tooling. They are the easiest entry point and the right first rung if your background is light on DevOps theory, and their language (CALMS, the three ways, DORA, value-stream mapping) appears verbatim in interviews.
DevOps Foundation — objective areas and weight:
| Objective area | What it covers | Weight band |
|---|---|---|
| DevOps principles & the Three Ways | Flow, feedback, continual learning; systems thinking | High |
| Culture & CALMS | Culture, Automation, Lean, Measurement, Sharing | High |
| DevOps practices | CI/CD, IaC, ChatOps, Kanban, value-stream mapping | Medium |
| Business & technology frameworks | Agile, Lean, ITSM, safety culture, Theory of Constraints | Medium |
| Measurement & metrics | DORA’s four keys, the importance of feedback | Medium |
| People, teams & automation | Topologies, shared ownership, toolchains | Low |
Format: ~40 multiple-choice questions, ~60 minutes, ~65% pass mark, no formal prerequisite. Focus: the Three Ways (flow, feedback, continual learning) and CALMS are the spine — most questions trace back to one of them.
DevSecOps Practitioner layers security into that model: shifting security left, threat modelling, secure pipelines, SAST/DAST/SCA, supply-chain security, secrets management, and continuous compliance. It maps directly onto the DevSecOps pipeline lesson — if you have built a pipeline with SAST/DAST/SCA gates, you have seen most of the exam.
AWS Certified DevOps Engineer – Professional (DOP-C02)
The heavyweight of the AWS path: a professional-level, scenario-dense exam that assumes you already know AWS to associate depth and tests whether you can automate, deploy, monitor, and recover on it. AWS publishes the domain weightings, which makes prioritising easy.
| # | Domain | Weight | What it really tests |
|---|---|---|---|
| 1 | SDLC automation | 22% | CodePipeline/CodeBuild/CodeDeploy, CI/CD patterns, artifact management, testing in pipelines |
| 2 | Configuration management & IaC | 17% | CloudFormation (and Terraform/CDK), drift, reusable templates, Systems Manager |
| 3 | Resilient cloud solutions | 15% | Multi-AZ/Region, auto scaling, self-healing, deployment strategies (blue/green, canary) |
| 4 | Monitoring & logging | 15% | CloudWatch, X-Ray, EventBridge, log aggregation, metrics & alarms |
| 5 | Incident & event response | 14% | EventBridge rules, automated remediation, on-call, runbooks/SSM Automation |
| 6 | Security & compliance | 17% | IAM, secrets (Secrets Manager/SSM), Config rules, GuardDuty, automated compliance |
Format: ~75 questions (multiple choice + multiple response), 180 minutes, scaled score 750/1000 to pass. Prerequisite knowledge: SysOps Administrator or Developer Associate depth (no longer a hard requirement to sit, but assumed). Focus: SDLC automation (22%) is the biggest single domain — own the AWS Code* suite cold (especially the CodeDeploy deployment configurations: in-place vs blue/green, AllAtOnce/HalfAtATime/OneAtATime/Canary/Linear). The exam loves “choose the most operationally efficient / least operational overhead” — when two answers both work, pick the more managed, less custom one.
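As a rough illustration of treating the weightings as revision priorities, the sketch below splits a study budget across the six DOP-C02 domains in proportion to the published weights from the table. The 60-hour budget and the `allocate_hours` helper are assumptions made for the example, not a recommendation; shift hours toward your weakest domains once you have practice scores.

```python
# Published DOP-C02 domain weights from the table above (percent).
DOP_C02_WEIGHTS = {
    "SDLC automation": 22,
    "Configuration management & IaC": 17,
    "Resilient cloud solutions": 15,
    "Monitoring & logging": 15,
    "Incident & event response": 14,
    "Security & compliance": 17,
}


def allocate_hours(weights, total_hours):
    """Split a study budget across domains in proportion to exam weight."""
    total_weight = sum(weights.values())
    return {
        domain: round(total_hours * weight / total_weight, 1)
        for domain, weight in weights.items()
    }


# Hypothetical budget: 60 hours of revision in total.
plan = allocate_hours(DOP_C02_WEIGHTS, total_hours=60)

# Print the plan, heaviest domain first.
for domain, hours in sorted(plan.items(), key=lambda item: -item[1]):
    print(f"{domain:<32} {hours:>5.1f} h")
```

With these numbers SDLC automation gets about 13 hours and incident response about 8, which matches the advice to own the Code* suite first.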
Microsoft Certified: DevOps Engineer Expert (AZ-400)
The Azure path’s capstone. Unusually, it is an Expert cert that requires a prerequisite associate — you must also hold AZ-104 (Administrator) or AZ-204 (Developer) for the Expert title to be awarded. AZ-400 itself is heavily weighted toward CI/CD and source control, and it is comfortably tool-aware of both Azure DevOps and GitHub (Microsoft owns both).
| Objective domain | Weight band | What it covers |
|---|---|---|
| Design & implement processes & communications | 10–15% | Traceability, work-item tracking, dashboards/metrics, stakeholder integration, ChatOps |
| Design & implement a source control strategy | 15–20% | Git branching (trunk-based, GitFlow), branch policies, pull requests, monorepo vs multi-repo, large files |
| Design & implement build & release pipelines | 40–45% | YAML multi-stage pipelines, agents/runners, templates, artifacts, environments, approvals & gates, deployment strategies, IaC (Bicep/ARM/Terraform), database deployment, secrets |
| Develop a security & compliance plan | 10–15% | Secrets (Key Vault), dependency/credential scanning, GitHub Advanced Security, governance/policy |
| Implement an instrumentation strategy | 5–10% | App Insights, Azure Monitor/Log Analytics, alerts, feedback loops |
Format: 40–60 questions, ~100 minutes, scaled 700/1000 to pass, often including a case study and drag-and-drop ordering items. Focus: pipelines dominate (~40%+) — be fluent in multi-stage YAML pipelines with environments and approvals, the difference between branch policies (enforced on the PR — required reviewers, build validation, linked work items) and environment approvals/gates (enforced at deploy time), and the self-hosted vs Microsoft-hosted agent trade-offs. Source-control strategy is the second pillar and easy marks if you know the branching models.
Google Cloud Professional Cloud DevOps Engineer
Google’s exam is distinctive: it is steeped in Site Reliability Engineering (SRE) — SLIs/SLOs/error budgets, toil reduction, blameless postmortems — far more than the AWS or Azure equivalents. If you can speak SRE fluently you are halfway there.
| Objective area | Weight band | What it covers |
|---|---|---|
| Bootstrapping & maintaining a DevOps culture (SRE) | High | SLIs/SLOs/error budgets, toil, blameless postmortems, reducing organisational friction |
| Building & implementing CI/CD pipelines | High | Cloud Build, Artifact Registry, Cloud Deploy, deployment strategies, GKE/Cloud Run targets |
| Applying SRE practices to services | High | Capacity, autoscaling, incident response, on-call, deployment safety |
| Implementing observability | Medium | Cloud Monitoring/Logging/Trace/Profiler, the four golden signals, dashboards & alerting |
| Optimising service performance & managing incidents | Medium | Debugging, root cause, mitigations, runbooks |
Format: ~50–60 multiple-choice/multiple-select questions, 120 minutes, no published pass percentage (≈70% typical). Focus: error budgets and SLO maths are the signature topic — know that an SLO is a target, an SLI is the measurement, and the error budget (1 − SLO) governs release pace (budget left → ship; budget spent → freeze and stabilise). The four golden signals (latency, traffic, errors, saturation) and Google’s specific tools (Cloud Build, Cloud Deploy, Artifact Registry, Cloud Monitoring) round out the marks.
Kubernetes: CKA, CKAD & CKS (the performance-based trio)
These CNCF exams are a different category entirely: fully hands-on, sat in a browser terminal against real clusters, with kubectl and the open Kubernetes documentation available. There is no multiple choice. Speed and command muscle memory decide passes, not recognition. All three are open-book (you may use kubernetes.io/docs), two hours each, ~66% to pass, with one free retake included.
| Exam | Audience | Signature domains | The line that separates it |
|---|---|---|---|
| CKA (Administrator) | Cluster operators | Cluster architecture/install/upgrade (kubeadm), etcd backup/restore, workloads & scheduling, services & networking, troubleshooting (~30%), storage, RBAC | Run and fix the cluster |
| CKAD (Application Developer) | App developers | Application design & build (multi-container pods, init/sidecar), deployment (rollouts/strategies), config (ConfigMaps/Secrets), observability (probes), services & networking | Build and configure workloads on a cluster |
| CKS (Security Specialist) | Security engineers | Cluster hardening, system hardening, minimise microservice vulnerabilities, supply-chain security, monitoring/logging/runtime (Falco), network policies, Pod Security | Secure the cluster (requires a current CKA to sit) |
CKA domain weightings (the published blueprint): Troubleshooting 30%, Cluster architecture/installation/configuration 25%, Workloads & scheduling 15%, Services & networking 20%, Storage 10%. Read that off: troubleshooting is the single biggest domain — a CKA who cannot diagnose a CrashLoopBackOff, a failed node, or a broken kubelet under time pressure will not pass.
The non-negotiable prep tactics for all three:
- Set up aliases and context immediately. `alias k=kubectl`, `export do="--dry-run=client -o yaml"` so you can generate manifests fast: `k run nginx --image=nginx $do > pod.yaml`.
- Generate, don’t hand-write, YAML. Use `kubectl create`/`kubectl run` with `--dry-run=client -o yaml` then edit; typing manifests from scratch is too slow.
- Master `kubectl explain` for field names you forget, and the docs search for the rest (the exam expects you to use the docs).
- For CKA: drill `etcdctl snapshot save`/`restore` and a `kubeadm upgrade` until automatic; these are guaranteed, high-mark, all-or-nothing tasks.
- Always `kubectl config use-context <ctx>` at the start of each question: every question may be on a different cluster, and solving the right task on the wrong cluster scores zero.
HashiCorp Terraform Associate (003)
Covered in full in its own prep kit, so here is the executive summary in the context of the ladder: it is the canonical IaC credential, it sits underneath every cloud DevOps exam (all three test IaC), and it is genuinely useful. The marks cluster in the mechanics — state (local vs remote, locking, plaintext secrets), the configuration constructs (count vs for_each, variables, functions, the dependency graph), and the outside-core-workflow commands (import, state mv/rm, -replace, -refresh-only). Format: ~57 questions, 1 hour, recognition/recall (no labs). Sit this first on the cloud-DevOps path — it pays off in the IaC domains of DOP-C02, AZ-400, and the GCP exam.
GitHub & GitLab tool certifications
The lightest-weight family, but increasingly asked for. GitHub Actions certifies pipeline authorship: workflow syntax, events/triggers, jobs/steps, runners (hosted vs self-hosted), reusable workflows and composite actions, secrets/variables and environments, the marketplace, and OIDC for keyless cloud auth (see the OIDC lesson). GitHub also offers Foundations (the platform basics) and Advanced Security (code/secret/dependency scanning). GitLab offers a tiered path, from Certified CI/CD Associate (.gitlab-ci.yml, stages/jobs, runners, artifacts, environments, CI/CD variables) up through professional and services-engineer credentials, several of which include a hands-on component. The GitHub exams are short and multiple-choice; you can clear one in a focused week if you already build pipelines daily.
Practice question bank (scenarios, with explained answers)
Written in each exam’s idiom and ordered by exam. Commit to an answer before reading the explanation, and pay as much attention to why the wrong answers are wrong as to the right one — that habit is what every one of these exams rewards.
Q1 — DevOps Foundation (culture)
A team measures deployment frequency and change failure rate but ignores lead time and MTTR. According to DORA, what is the risk?
- A. They are measuring too much and should drop two metrics.
- B. They have an incomplete picture — the four keys balance throughput (frequency, lead time) against stability (change failure rate, MTTR); ignoring lead time and MTTR hides whether speed is costing reliability.
- C. The four metrics are independent, so two are sufficient.
- D. DORA only requires deployment frequency.
Answer: B. DORA’s four keys come as two pairs — throughput (deployment frequency, lead time for changes) and stability (change failure rate, time to restore/MTTR). The whole point is the balance: optimising throughput while ignoring stability (or vice versa) produces a distorted, gameable picture. A inverts the principle — the metrics are deliberately minimal. C is wrong because the pairs are explicitly meant to be read together. D is simply false. The exam wants you to know the four keys and why there are four.
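To make the two pairs concrete, here is a minimal sketch that derives all four keys from a handful of deployment records. The records, the one-week window, and the `four_keys` helper are hypothetical, and using the median for lead time and the mean for restore time is a simplification chosen for the example rather than DORA's formal method.

```python
from datetime import datetime
from statistics import median

# Hypothetical records for one week:
# (commit time, deploy time, failed in production?, minutes to restore)
deployments = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15), False, 0),
    (datetime(2024, 1, 2, 10), datetime(2024, 1, 3, 10), True, 45),
    (datetime(2024, 1, 3, 8), datetime(2024, 1, 3, 12), False, 0),
    (datetime(2024, 1, 4, 9), datetime(2024, 1, 5, 9), True, 15),
]


def four_keys(records, period_days):
    """Compute simplified versions of the four DORA keys."""
    lead_times_h = [
        (deployed - committed).total_seconds() / 3600
        for committed, deployed, _, _ in records
    ]
    restore_minutes = [mins for _, _, failed, mins in records if failed]
    return {
        # Throughput pair
        "deployment_frequency_per_day": len(records) / period_days,
        "median_lead_time_hours": median(lead_times_h),
        # Stability pair
        "change_failure_rate": len(restore_minutes) / len(records),
        "mean_time_to_restore_minutes": (
            sum(restore_minutes) / len(restore_minutes) if restore_minutes else 0.0
        ),
    }


keys = four_keys(deployments, period_days=7)
for name, value in keys.items():
    print(f"{name}: {value:.2f}")
```

A team reporting only the first and third values would see a healthy deployment rate and a 50% failure rate, but not that changes take 15 hours to reach production or that outages last half an hour.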
Q2 — AWS DOP-C02 (SDLC automation)
A web app on EC2 behind an ALB must deploy with zero downtime and the ability to instantly roll back to the previous version if errors spike. Which CodeDeploy configuration fits best?
- A. In-place deployment, `AllAtOnce`.
- B. Blue/green deployment, shifting traffic to a new fleet, with automatic rollback on CloudWatch alarm.
- C. In-place deployment, `OneAtATime`.
- D. Blue/green with `AllAtOnce` and no alarms.
Answer: B. Blue/green stands up a new (green) fleet, shifts the ALB’s traffic to it, and lets you roll back instantly by shifting traffic back to the still-running blue fleet — exactly “zero downtime + instant rollback”. Wiring a CloudWatch alarm to trigger automatic rollback closes the loop. A takes the app down (AllAtOnce in place) — fails zero-downtime. C is zero-downtime-ish but in-place rollback means redeploying the old version (slow), not an instant switch. D is the trap: blue/green is right, but AllAtOnce with no alarm gives you no automatic rollback signal. The distinction the exam draws: blue/green = instant rollback via traffic shift; in-place = slower redeploy rollback.
Q3 — AWS DOP-C02 (incident response)
You need to automatically remediate any S3 bucket that becomes publicly readable, with the least operational overhead. What do you build?
- A. A nightly Lambda that scans all buckets.
- B. An EventBridge rule on the Config rule’s non-compliance event that triggers an SSM Automation document (or Lambda) to remediate.
- C. A CloudWatch dashboard for an engineer to watch.
- D. A GuardDuty finding emailed to the team.
Answer: B. The low-overhead, event-driven pattern is AWS Config (managed rule detects the public bucket) → EventBridge (fires on the non-compliance event) → SSM Automation/Lambda (remediates). It is real-time and fully automated. A works but is polling (a nightly scan), which is higher overhead and slower — the exam penalises polling when an event exists. C and D are detection/notification, not remediation — a human still has to act. The exam’s recurring lesson: prefer event-driven, managed, automatic over polling or manual.
Q4 — Azure AZ-400 (source control)
Your team must ensure every change to main is reviewed by two people, passes a build, and is linked to a work item, with no direct pushes. Where do you configure this?
- A. An environment approval check.
- B. A branch policy on `main` (require minimum reviewers = 2, build validation, linked work items, no direct commits).
- C. A YAML pipeline gate.
- D. A pull request template.
Answer: B. These are all properties of a branch policy in Azure Repos: minimum number of reviewers, build validation, check for linked work items, and blocking direct pushes to the protected branch — enforced on the pull request before merge. A is the trap that catches people: environment approvals/gates govern deployments at release time, not merges to a branch. C confuses a pipeline-level deploy gate with a repo policy. D is just boilerplate — a template does not enforce anything. AZ-400 leans hard on branch policy (merge-time) vs environment approval (deploy-time) — keep them separate.
Q5 — Azure AZ-400 (pipelines)
You want the same deployment steps reused across ten microservice pipelines, parameterised per service, with one place to update. What Azure Pipelines feature?
- A. Copy the YAML into each repo.
- B. A pipeline template (`extends`/`template` with parameters) referenced by each pipeline.
- C. A variable group.
- D. A separate classic release pipeline per service.
Answer: B. Templates (a shared YAML file with parameters, consumed via template: or extends:) are the DRY mechanism — one definition, parameterised per service, single point of update. A is the anti-pattern the question is steering you away from. C (variable groups) shares values, not steps — useful but doesn’t solve reuse of logic. D is the legacy classic approach AZ-400 expects you to move away from in favour of YAML-as-code. The exam rewards pipeline-as-code and templating over duplication.
Q6 — GCP DevOps (SRE)
A service has a 99.9% availability SLO over 30 days. Three weeks in, it has already consumed 90% of its error budget. What is the SRE-correct response?
- A. Raise the SLO to 99.95% to create more budget.
- B. Slow or freeze risky releases and prioritise reliability work until the budget recovers; resume normal release pace once it does.
- C. Ignore it — SLOs are advisory.
- D. Immediately page the entire team.
Answer: B. The error budget governs release pace: budget remaining means you can spend it on shipping features; budget nearly exhausted means you slow down, freeze risky changes, and invest in reliability until the budget recovers. That feedback loop is the core of Google’s SRE model and the heart of the exam. A is wrong twice over: raising the SLO to 99.95% shrinks the budget (0.05% instead of 0.1%) rather than creating more, and moving the target to dodge the signal defeats the purpose. C misunderstands SLOs as advisory; the budget is meant to drive decisions. D is disproportionate, because budget burn is a prioritisation signal, not necessarily an active incident. Know the chain: SLI (measure) → SLO (target) → error budget (1 − SLO, governs release decisions).
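The arithmetic behind this scenario is worth doing once by hand. The sketch below works it through under the assumption that availability is measured as time-based uptime; the `error_budget_minutes` helper and the "pace" ratio are illustrative names, not part of any Google tooling.

```python
def error_budget_minutes(slo, window_days):
    """Allowed unavailability in minutes: (1 - SLO) x window length."""
    return (1 - slo) * window_days * 24 * 60


# Q6 scenario: 99.9% availability SLO over a 30-day window.
total = error_budget_minutes(slo=0.999, window_days=30)

# Three weeks in, 90% of the budget is already spent.
consumed = 0.90 * total
remaining = total - consumed

# Pace: share of budget used divided by share of window elapsed.
# Above 1.0 means the budget runs out before the window ends.
pace = 0.90 / (21 / 30)

print(f"total budget:   {total:.2f} min")      # 43.20 min
print(f"consumed:       {consumed:.2f} min")   # 38.88 min
print(f"remaining:      {remaining:.2f} min")  # 4.32 min
print(f"pace vs window: {pace:.2f}x")          # 1.29x
```

Just over four minutes of allowed downtime remain for the final nine days, which is why slowing risky releases is the proportionate response.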
Q7 — CKA (etcd — performance-based, described)
You must take a snapshot of etcd and then restore the cluster from it. Which command takes the snapshot?
- A. `kubectl etcd snapshot save snapshot.db`
- B. `etcdctl snapshot save snapshot.db` (with the CA cert, cert, and key flags, and `ETCDCTL_API=3`)
- C. `kubeadm etcd backup`
- D. `etcdctl backup --data-dir`
Answer: B. etcd is backed up with etcdctl snapshot save <file>, supplying --endpoints, --cacert, --cert, and --key (and ensuring API v3). Restore is etcdctl snapshot restore <file> --data-dir <new-dir>, then point etcd’s static pod manifest at the restored data dir. A is not a real kubectl subcommand — etcd is managed with etcdctl, not kubectl. C is invented. D uses the deprecated v2 backup form, not the snapshot mechanism the exam tests. This is a near-guaranteed CKA task; the marks are all-or-nothing, so drill the exact flags until automatic.
Q8 — CKAD vs CKA (which exam — concept)
You need to certify that an engineer can build a multi-container pod with an init container and a sidecar, configure it via a ConfigMap, and add readiness/liveness probes — but not that they can upgrade a cluster or restore etcd. Which exam fits?
- A. CKA
- B. CKAD
- C. CKS
- D. KCNA
Answer: B. That description — designing/building and configuring workloads (multi-container pods, init/sidecar, ConfigMaps/Secrets, probes/observability) — is squarely CKAD (Application Developer). A (CKA) centres on operating the cluster (install/upgrade, etcd, nodes, troubleshooting) — explicitly excluded by “not upgrade or restore etcd”. C (CKS) is securing the cluster (and requires a current CKA). D (KCNA) is the entry-level multiple-choice associate, not a hands-on workload-building exam. The load-bearing distinction: CKA = run the cluster, CKAD = build workloads on it, CKS = secure it.
Q9 — CKS (security — concept)
You must block every pod in a namespace from running as root and from using privileged containers, using built-in Kubernetes mechanisms (no third-party admission controller). What do you apply?
- A. A NetworkPolicy.
- B. Pod Security Admission set to the `restricted` standard on the namespace (label `pod-security.kubernetes.io/enforce: restricted`).
- C. An RBAC Role.
- D. A ResourceQuota.
Answer: B. Built-in Pod Security Admission enforces the Pod Security Standards (privileged/baseline/restricted); labelling a namespace pod-security.kubernetes.io/enforce: restricted blocks running as root, privileged containers, host namespaces, and more, with no external controller needed. A controls network traffic, not pod privileges. C controls who can call the API, not what a pod may run. D caps resource consumption, not security context. CKS expects you to know PSA/PSS as the built-in replacement for PodSecurityPolicy, which was deprecated in v1.21 and removed in v1.25.
Q10 — Terraform Associate (state)
Three engineers share a Terraform config. Why is a remote backend strongly preferred over local state? (See the Terraform prep kit for the full set.)
- A. It makes `apply` faster.
- B. It provides a shared source of truth and state locking, preventing concurrent writes from corrupting state.
- C. It is required to use modules.
- D. Local state can’t store outputs.
Answer: B. The defining reasons are a shared state and locking so two applies can’t corrupt the file. A is false (remote state can add latency). C is false (modules work with local state). D is false. The exam wants “team → remote + locking” without being tempted by plausible-but-wrong benefits. The Terraform Associate prep kit has a full bank of these.
Q11 — GitHub Actions (OIDC)
Your workflow deploys to AWS. To avoid long-lived secrets, how should it authenticate?
- A. Store an IAM access key and secret in repository secrets.
- B. Use OIDC: configure an IAM role trusting GitHub’s OIDC provider, and have the workflow request a short-lived token via `permissions: id-token: write` and the `aws-actions/configure-aws-credentials` action.
- C. Hard-code credentials in the workflow file.
- D. Use a personal access token.
Answer: B. GitHub Actions can present an OIDC token to the cloud; AWS (or Azure/GCP) trusts GitHub’s OIDC provider via a role/federation, and the job assumes the role for short-lived, keyless credentials — no static secrets to leak or rotate. The job needs permissions: id-token: write. A is the old, riskier pattern OIDC replaces. C is a serious anti-pattern (and a likely “obviously wrong” distractor). D is for Git/API auth, not cloud deploys. See the keyless deploys lesson for the full mechanism.
Q12 — Cross-cutting (deployment strategy)
You want to release a new version to 5% of users, watch error and latency metrics, and automatically promote or roll back based on those metrics. Which strategy is this?
- A. Blue/green.
- B. Recreate.
- C. Canary (progressive delivery) with automated metric analysis.
- D. Rolling update.
Answer: C. Routing a small percentage to the new version, evaluating metrics, and automatically promoting or rolling back is canary / progressive delivery (e.g. Argo Rollouts or Flagger). A (blue/green) shifts all traffic at once after validating the green environment — not a 5% subset with metric-driven promotion. B (recreate) tears down old then starts new (downtime). D (rolling) gradually replaces instances but without the metric-gated percentage control of a canary. The exam’s distinction: canary = small % + metric analysis + auto promote/rollback; blue/green = full switch with instant rollback.
That is a representative cross-section spanning every exam family. If you can answer these and explain the distractors, you are tracking well across the board. For each exam you actually sit, do at least one full-length timed practice paper (and for the Kubernetes exams, time yourself against a real cluster — pacing is everything).
Commonly-confused topics (the marks candidates drop across exams)
A small set of distinctions cut across multiple DevOps exams. Internalise these and you reclaim marks on every paper.
CKA vs CKAD vs CKS
| | CKA | CKAD | CKS |
|---|---|---|---|
| Role | Cluster administrator | Application developer | Security specialist |
| Centre of gravity | Install/upgrade, etcd, nodes, troubleshooting, RBAC | Building/configuring workloads, probes, ConfigMaps | Hardening, PSA/PSS, network policy, supply chain, runtime (Falco) |
| Prerequisite | None | None | Current CKA required |
| Signature task | etcd backup/restore, `kubeadm upgrade` | Multi-container pod, rollout strategy | Restrict a pod, network policy, image scanning |
Mnemonic: Administer, build Deployments (apps), Secure. You must hold a current CKA to sit CKS — a frequently missed prerequisite.
Blue/green vs canary vs rolling
| | Blue/green | Canary | Rolling |
|---|---|---|---|
| Traffic | 100% switch after green is verified | Small % first, increase gradually | Replace instances batch by batch |
| Rollback | Instant (switch back to blue) | Shift % back / abort | Re-roll previous version (slower) |
| Cost | Two full environments briefly | One extra version, partial | No extra environment |
| Metric-gated promotion | No (manual cutover) | Yes (the defining feature) | Not inherently |
Every cloud DevOps exam tests this. The two facts that decide questions: blue/green = instant rollback via traffic switch, canary = small percentage with metric-driven auto-promote/rollback. Don’t conflate them.
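The metric-gated promotion that defines a canary can be sketched as a small decision loop. This is an illustration of the logic only: the thresholds, the traffic steps, and the `observe` callback are assumptions for the example, and it does not reproduce the API of Argo Rollouts, Flagger, or any cloud service.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float      # fraction of failed requests, e.g. 0.01 = 1%
    p95_latency_ms: float


def canary_verdict(baseline, canary,
                   max_error_increase=0.005, max_latency_ratio=1.2):
    """Compare canary metrics to the baseline (hypothetical thresholds)."""
    if canary.error_rate - baseline.error_rate > max_error_increase:
        return "rollback"
    if canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


def run_canary(steps, baseline, observe):
    """Raise the traffic share step by step; abort on the first bad analysis."""
    for weight in steps:
        if canary_verdict(baseline, observe(weight)) == "rollback":
            return f"rolled back at {weight}%"
    return "promoted to 100%"


baseline = Metrics(error_rate=0.002, p95_latency_ms=180)
healthy = Metrics(error_rate=0.003, p95_latency_ms=190)
degraded = Metrics(error_rate=0.020, p95_latency_ms=185)

# Canary stays healthy at every step.
print(run_canary([5, 25, 50], baseline, lambda weight: healthy))
# Canary degrades once it receives 25% of traffic.
print(run_canary([5, 25, 50], baseline,
                 lambda weight: degraded if weight >= 25 else healthy))
```

Blue/green has no equivalent of this loop: it validates green once, then switches all traffic in a single step.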
AWS Code* services (DOP-C02)
| Service | Does | Confused with |
|---|---|---|
| CodeCommit | Managed Git repos | (being retired in favour of third-party Git) |
| CodeBuild | Build & test (compile, unit tests, package) | CodeDeploy |
| CodeDeploy | Deploy to EC2/Lambda/ECS (in-place, blue/green) | CodeBuild |
| CodePipeline | Orchestrates the stages end-to-end | CodeDeploy |
| CodeArtifact | Artifact/package repository | S3 |
The classic trap: CodePipeline orchestrates; CodeDeploy does the deployment; CodeBuild builds. Know which one owns deployment configurations (CodeDeploy) and which one sequences the workflow (CodePipeline).
AZ-400: branch policy vs environment approval
- Branch policy — enforced on the pull request, before merge: required reviewers, build validation, linked work items, comment resolution, blocking direct pushes. It protects the source.
- Environment approval / check / gate — enforced at deploy time: pre-deployment approvals, business-hours gates, query-a-monitor gates, exclusive lock. It protects the target.
Expect at least one AZ-400 question that hinges on putting the control in the right place (merge vs deploy).
SLI vs SLO vs error budget (GCP)
- SLI — the measurement (e.g. % of requests under 300 ms).
- SLO — the target for that SLI (e.g. 99.9%).
- Error budget — 1 − SLO (e.g. 0.1%), the allowed unreliability that governs release pace: budget left → ship; budget spent → freeze and stabilise.
If you can recite this chain and what the budget drives, you own the GCP exam’s signature topic.
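The arithmetic behind the budget is worth doing once by hand. The sketch below assumes a time-based SLO measured over a rolling window (the 30-day window and the function name `error_budget_minutes` are assumptions for illustration) and converts 1 − SLO into minutes of allowed unreliability.

```shell
#!/usr/bin/env bash
# Error budget = 1 - SLO, expressed as minutes of allowed downtime per window.
error_budget_minutes() {
  local slo_percent="$1"   # e.g. 99.9
  local window_days="$2"   # e.g. 30 (assumed rolling window)
  awk -v slo="$slo_percent" -v days="$window_days" \
    'BEGIN { printf "%.1f\n", (100 - slo) / 100 * days * 24 * 60 }'
}

error_budget_minutes 99.9 30   # prints: 43.2
```

A 99.9% SLO over 30 days therefore leaves about 43 minutes of budget; once incidents have consumed it, the model says to freeze risky releases.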
Declarative vs imperative kubectl (CKAD/CKA)
- Imperative: kubectl run, kubectl create, kubectl expose, kubectl scale. Fast, one-off, and great under exam time pressure (especially with --dry-run=client -o yaml to generate a manifest).
- Declarative: kubectl apply -f against version-controlled manifests. The production-correct, idempotent, GitOps-friendly approach.
The exam-savvy move: use imperative commands with --dry-run=client -o yaml to generate YAML quickly, then apply it — best of both, and the fastest way to score the performance-based exams.
The map shows the whole ladder at a glance: the vendor-neutral DevOps Foundation at the base, the Terraform Associate feeding the three cloud DevOps professional exams (DOP-C02, AZ-400, GCP DevOps), and the performance-based Kubernetes trio (CKA → CKS, with CKAD alongside) — with the prerequisite arrows that decide what you can sit when, so you read your own path off it rather than chasing badges in a random order.
Hands-on lab: build your study tracker and a free Kubernetes drill
You cannot practise the multiple-choice cloud exams in a lab (they are recognition/recall), but two things move the needle measurably and cost nothing: a structured study tracker so your prep is gap-driven, and a local Kubernetes cluster so the performance-based exams get the only practice that works — drilling commands against a real cluster.
Step 1 — a self-assessment tracker (any cloud exam). Create a simple table (a spreadsheet or a Markdown file) with one row per objective domain of the exam you’re targeting and columns for baseline score, target, and current. Populate the domains from the tables above. This turns “study DevOps” into “raise domain 5 from 60% to 85%”, which is the difference between drifting and passing.
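If you prefer a Markdown file to a spreadsheet, a tracker like the one described in Step 1 can be generated with a short script. This is a minimal sketch: the function name `make_tracker`, the file name, and the domain names are placeholders, and the 85% target simply mirrors the example above. Substitute the real domains from your exam's blueprint.

```shell
#!/usr/bin/env bash
# Write a Markdown study tracker with one row per objective domain.
# The domain names passed in are placeholders; use your exam's blueprint.
make_tracker() {
  local outfile="$1"; shift
  local domain
  {
    echo "| Domain | Baseline | Target | Current |"
    echo "|---|---|---|---|"
    for domain in "$@"; do
      echo "| $domain | | 85% | |"
    done
  } > "$outfile"
}

make_tracker tracker.md "Domain 1 (placeholder)" "Domain 2 (placeholder)"
cat tracker.md
```

Fill in the Baseline column after your first cold practice paper and update Current after each subsequent one.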
Step 2 — a free local Kubernetes cluster (for CKA/CKAD/CKS). Install a local cluster with kind (Kubernetes-in-Docker) or minikube — both free, both run on your laptop:
```shell
# kind: spin up a throwaway cluster
kind create cluster --name cka-drill
# confirm and set up the exam aliases you'll rely on
kubectl cluster-info --context kind-cka-drill
alias k=kubectl
export do='--dry-run=client -o yaml'
```
Step 3 — drill the highest-yield CKA/CKAD patterns. Time yourself; the real exams are won on speed:
```shell
# Generate a pod manifest the fast (exam-correct) way, then edit + apply:
k run nginx --image=nginx $do > pod.yaml
# Create a deployment, scale it, expose it imperatively:
k create deployment web --image=nginx --replicas=3
k scale deployment web --replicas=5
k expose deployment web --port=80 --target-port=80
# A multi-container / probe edit (CKAD bread-and-butter): generate then add a readinessProbe
k create deployment api --image=nginx $do > api.yaml   # then edit to add probes, apply
# Troubleshoot (CKA's biggest domain): inspect a broken pod
k get pods -A
k describe pod <name>
k logs <name> --previous
k get events --sort-by=.lastTimestamp
```
Step 4 — drill the CKA guaranteed tasks. Practise an etcd snapshot and a static-pod inspection until they are automatic (on a kind cluster the etcd pod lives in kube-system):
```shell
k -n kube-system get pod -l component=etcd
# Read the etcd flags you'd pass to etcdctl (endpoints, cacert, cert, key):
k -n kube-system describe pod -l component=etcd | grep -E 'listen-client|cert|key|trusted-ca'
```
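With those flags in hand, the snapshot itself can be practised by running etcdctl inside the etcd pod. The sketch below assumes a kubeadm-style layout (which kind uses), where the etcd certificates sit under /etc/kubernetes/pki/etcd and /var/lib/etcd is writable inside the pod. Confirm both against the output of the describe command above. The function name `etcd_snapshot` is hypothetical, and on the real exam you would more likely run etcdctl directly on the control-plane node.

```shell
#!/usr/bin/env bash
# Take an etcd snapshot by running etcdctl inside the etcd static pod.
# Paths below are kubeadm defaults (an assumption; verify on your cluster).
etcd_snapshot() {
  local pod="$1"                                  # e.g. etcd-cka-drill-control-plane
  local target="${2:-/var/lib/etcd/snapshot.db}"  # path inside the pod
  kubectl -n kube-system exec "$pod" -- etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save "$target"
}

# Usage, once the cluster from Step 2 is running:
#   etcd_snapshot etcd-cka-drill-control-plane
```

Wrapping the command in a function keeps the four TLS flags in one place, which is the part candidates most often mistype under time pressure.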
Step 5 — Cleanup.
```shell
kind delete cluster --name cka-drill
```
Cost note: zero. kind/minikube run entirely on your machine — no cloud account, no spend. This is the only practice that prepares you for the performance-based Kubernetes exams, and it is free; keep a cluster around for the weeks before the exam and drill daily.
Common mistakes & troubleshooting
| Symptom (in study or on the day) | Cause | Fix |
|---|---|---|
| Studying a cloud DevOps exam with no cloud associate knowledge | Skipping the assumed prerequisite | Hold/learn SysOps-or-Developer (AWS), AZ-104/AZ-204 (Azure), ACE (GCP) first |
| Hand-typing YAML in a CKA/CKAD exam | Treating it like a written exam | Generate with --dry-run=client -o yaml, then edit — speed is the exam |
| Solving the right task on the wrong cluster (Kubernetes) | Forgetting to switch context | kubectl config use-context <ctx> at the start of every question |
| Confusing CodePipeline and CodeDeploy (DOP-C02) | Treating the Code* suite as one thing | Pipeline orchestrates; Deploy deploys; Build builds |
| Putting a merge control in an environment gate (AZ-400) | Conflating source and deploy controls | Branch policy = PR/merge; environment approval = deploy |
| Gaming the SLO to dodge a budget burn (GCP) | Misunderstanding error budgets | The budget drives release decisions; don’t change the target to hide the signal |
| Picking blue/green for a “5% + metrics” scenario | Conflating strategies | 5% + metric-gated auto-promote = canary, not blue/green |
| Sitting CKS without a current CKA | Missing the prerequisite | CKS requires an active CKA — schedule CKA first |
| Running out of time on any exam | No pacing practice | Time full-length practice papers; flag-and-return; for K8s, time against a real cluster |
| Memorising tools but not trade-offs | Surface-level prep | The pro exams test “most efficient / least overhead” — learn why one option wins |
Best practices for passing (and for the job)
- Build, then certify. Every exam here rewards hands-on experience: a real pipeline for AZ-400/DOP-C02, a real cluster for CKA/CKAD/CKS, a real Terraform project for the Associate. The portfolio you build in Real-World DevOps Portfolio Projects doubles as exam prep and interview evidence.
- Sequence the credentials (see the study order below) so each makes the next cheaper — don’t sit them in a random order.
- For multiple-choice exams, learn trade-offs, not trivia. The professional exams phrase questions as “most operationally efficient” / “least overhead”; when two answers work, the more managed/automated one usually wins.
- For performance-based exams, drill commands against a real cluster daily in the final fortnight. Set up your aliases and --dry-run workflow until they’re reflex, and practise the guaranteed tasks (etcd, upgrades) to automaticity.
- Time-box every practice paper. Pacing removes day-of anxiety; flag-and-return is your friend on the recall exams, and ruthless task-triage (skip the hard ones, bank the easy ones first) is your friend on the K8s exams.
- Don’t neglect the “boring” domains. Monitoring/observability and incident response are 25–30% of the cloud exams combined and easy to under-study.
Security notes
Security is now a first-class domain in every DevOps exam, and several facts are both exam-testable and job-critical:
- Prefer keyless, short-lived credentials. OIDC federation (GitHub Actions/Azure DevOps/GitLab to the cloud) replaces long-lived static keys; this appears in DOP-C02, AZ-400, the GCP exam, and the GitHub Actions cert. See keyless deploys with OIDC.
- Secrets belong in a manager, never in code or logs. Secrets Manager/SSM (AWS), Key Vault (Azure), Secret Manager (GCP), and Kubernetes Secrets backed by encryption-at-rest (CKS) — all exams test that you don’t hard-code secrets and that you scope access tightly.
- CKS expects defence in depth. Pod Security Admission (the PodSecurityPolicy replacement), NetworkPolicies (default-deny), image scanning and supply-chain provenance, runtime detection (Falco), and least-privilege RBAC — know each layer.
- Terraform state holds secrets in plaintext. Generated passwords and connection strings land in state regardless of intent; use an encrypted, access-controlled remote backend (Terraform Associate + the IaC domains of every cloud exam).
- Shift security left, but gate at deploy too. SAST/DAST/SCA in the pipeline (DevSecOps Practitioner, AZ-400’s security domain, DOP-C02’s compliance domain) plus admission/policy controls at deploy — the DevSecOps lesson covers the full chain.
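Two of the CKS layers named above, Pod Security Admission and a default-deny NetworkPolicy, can be drilled on a local cluster with a helper like the following. It is a sketch under assumptions: the function name `harden_namespace` and the namespace in the usage comment are hypothetical, and whether the NetworkPolicy is actually enforced depends on the cluster's CNI plugin.

```shell
#!/usr/bin/env bash
# Apply two CKS-style controls to one namespace:
#  1. enforce the "restricted" Pod Security Standard via the built-in label
#  2. add a default-deny NetworkPolicy covering ingress and egress
harden_namespace() {
  local ns="$1"
  kubectl label namespace "$ns" \
    pod-security.kubernetes.io/enforce=restricted --overwrite
  kubectl apply -n "$ns" -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
}

# Usage against a drill cluster (namespace name is hypothetical):
#   kubectl create namespace drill && harden_namespace drill
```

After running it, try creating a privileged pod in the namespace. The API server should reject it, which is a quick way to see the restricted standard at work.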
Interview & exam questions
A blend of likely exam items and the interview questions that probe the same knowledge — with concise answers.
- Name DORA’s four keys and why there are four. Deployment frequency and lead time (throughput) balanced against change failure rate and time-to-restore/MTTR (stability) — the pairing prevents optimising speed at the cost of reliability.
- Blue/green vs canary — when do you pick each? Blue/green for an instant, all-at-once cutover with instant rollback (two environments); canary for a gradual, metric-gated rollout with automatic promote/rollback (progressive delivery).
- In AWS, what’s the difference between CodePipeline, CodeBuild, and CodeDeploy? CodePipeline orchestrates the stages; CodeBuild builds/tests; CodeDeploy deploys (and owns the in-place vs blue/green deployment configurations).
- AZ-400: branch policy vs environment approval? Branch policy enforces controls on the pull request/merge (reviewers, build validation, work-item links); environment approvals/gates enforce controls at deploy time.
- What governs release pace in Google’s SRE model? The error budget (1 − SLO): budget remaining means you can ship; budget exhausted means freeze risky changes and prioritise reliability.
- CKA vs CKAD vs CKS? Administer the cluster (CKA), build/configure workloads on it (CKAD), secure it (CKS, which requires a current CKA). All three are hands-on/performance-based.
- Fastest way to author a manifest in a Kubernetes exam? An imperative kubectl create/run with --dry-run=client -o yaml to generate the YAML, then edit and apply, which is far faster than hand-writing.
- How do you back up and restore etcd? etcdctl snapshot save (with cacert/cert/key, API v3) to back up; etcdctl snapshot restore --data-dir then repoint etcd’s static-pod manifest to restore.
- How should a pipeline authenticate to the cloud without long-lived secrets? OIDC federation: the platform presents a short-lived OIDC token; the cloud trusts the provider and the job assumes a role. The result is keyless, with no static keys to rotate.
- In Kubernetes, how do you block privileged/root pods without a third-party admission controller? Pod Security Admission set to the restricted standard on the namespace (the built-in replacement for the removed PodSecurityPolicy).
- Why are cloud DevOps professional exams “most efficient” exams? Multiple answers often work; the exam wants the most operationally efficient / least-overhead / most-managed option, so learn trade-offs, not just feature existence.
- Where does the Terraform Associate fit in a DevOps credential path? Underneath the cloud DevOps exams — Terraform/IaC is a core domain in DOP-C02, AZ-400, and the GCP exam — so it’s the efficient first IaC credential to bank.
Quick check
- Which Kubernetes certification requires you to already hold a current CKA?
- A scenario routes 5% of traffic to a new version and auto-promotes based on metrics — which deployment strategy is it?
- In Google’s SRE model, what does the error budget govern?
- In AWS, which service orchestrates the CI/CD stages, and which one performs the deployment?
- What is the fastest exam-correct way to produce a Kubernetes manifest under time pressure?
Answers
- CKS (Certified Kubernetes Security Specialist) requires an active CKA to sit.
- Canary / progressive delivery — a small percentage with metric-gated automatic promotion or rollback (blue/green is a full switch, not a 5% subset).
- The release pace: budget remaining → ship features; budget exhausted → freeze risky changes and prioritise reliability work. (Error budget = 1 − SLO.)
- CodePipeline orchestrates the stages; CodeDeploy performs the deployment (CodeBuild builds/tests).
- An imperative kubectl create/run with --dry-run=client -o yaml to generate the YAML, then edit and kubectl apply it.
Exercise
Plan a deliberate credential ladder rather than chasing badges, then execute the first rung:
- Pick your ladder (day 1). Choose your primary cloud and write your sequence from the study-order table below — e.g. Terraform Associate → AZ-400 → CKA → CKS. Put target dates on each.
- Self-assess the first exam (day 2). Build the domain tracker from the Lab for your first target exam and sit a cold practice paper (or, for a Kubernetes exam, attempt a timed task set on a kind cluster). Record score per domain.
- Target the gaps (week 1–2). For every domain under ~80%, do the matching KloudVin course lesson and the relevant hands-on drill. For cloud exams, focus on the heaviest-weighted domains; for K8s, focus on troubleshooting and the guaranteed tasks.
- Drill the confusions (final week). Re-derive the “commonly-confused” tables from memory: CKA/CKAD/CKS, blue/green/canary/rolling, the AWS Code* suite, branch policy vs environment approval, SLI/SLO/error budget. If you can reproduce them, you own them.
- Dress rehearsal & book. Sit a second timed practice (same time of day as your booked slot). Aim for ≥85% (cloud) or finishing the task set with time to spare (K8s). Book the exam.
Deliverable: a written credential-ladder plan with dates, plus a filled-in score-by-domain tracker for your first exam from steps 2 and 5 — proof your prep is gap-driven, not vibes-driven.
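The gap-targeting step can be made mechanical. The sketch below assumes you keep scores in a simple "domain,score" CSV and lists every domain under a threshold (80 by default, matching the ~80% cut-off in the exercise). The function name `gaps`, the file name, and the rows are made-up sample data.

```shell
#!/usr/bin/env bash
# List every domain scoring under a threshold (default 80) from a
# "domain,score" CSV. The file name and rows below are sample data only.
gaps() {
  local file="$1" threshold="${2:-80}"
  awk -F',' -v t="$threshold" \
    'NR > 1 && $2 + 0 < t { print $1 " (" $2 "%)" }' "$file"
}

cat > scores.csv <<'EOF'
domain,score
Domain A,60
Domain B,88
Domain C,79
EOF

gaps scores.csv   # lists Domain A (60%) and Domain C (79%), one per line
```

Re-run it after the dress rehearsal; an empty result at the 80 threshold means no domain is still below the cut-off.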
Certification mapping
This whole course leads here. The table places each credential on the DevOps ladder and notes what it certifies and its prerequisite.
| Certification | Tier | Certifies | Prerequisite (assumed or required) |
|---|---|---|---|
| DevOps Institute — DevOps Foundation | Foundation | DevOps culture, practices, DORA, the Three Ways | None |
| DevOps Institute — DevSecOps Practitioner | Foundation+ | Shift-left security, secure pipelines, supply chain | DevOps Foundation helpful |
| HashiCorp Terraform Associate (003) | Associate (IaC) | Writing & operating Terraform | None (feeds every cloud exam) |
| GitHub Actions / GitLab CI/CD Associate | Tool | Pipeline authorship on that platform | Git fluency |
| AWS Certified DevOps Engineer – Professional (DOP-C02) | Professional | CI/CD, IaC, monitoring, incident response on AWS | SysOps/Developer Associate depth |
| Microsoft Certified: DevOps Engineer Expert (AZ-400) | Expert | DevOps on Azure (+GitHub) — pipelines, source control, security | AZ-104 or AZ-204 required for the Expert title |
| Google Cloud Professional Cloud DevOps Engineer | Professional | SRE, CI/CD, observability on GCP | Associate Cloud Engineer depth |
| CKA — Certified Kubernetes Administrator | Specialist (hands-on) | Operating a Kubernetes cluster | None |
| CKAD — Certified Kubernetes Application Developer | Specialist (hands-on) | Building/configuring workloads | None |
| CKS — Certified Kubernetes Security Specialist | Specialist (hands-on) | Securing Kubernetes | Current CKA required |
For the deep, single-exam treatment of the Terraform Associate (full objective checklist, practice bank, and cheat sheet), see the Terraform Associate prep kit in the IaC course.
Glossary
- Performance-based exam — a hands-on exam sat at a live terminal (CKA/CKAD/CKS), graded on tasks completed, not multiple choice.
- Objective domain — a published area the exam blueprint is organised around, often with a percentage weighting.
- Distractor — a deliberately plausible but incorrect option; reasoning about why it’s wrong is the skill these exams test.
- DORA four keys — deployment frequency, lead time for changes, change failure rate, time to restore (MTTR).
- CALMS — Culture, Automation, Lean, Measurement, Sharing — the DevOps culture model.
- SLI / SLO / error budget — the measurement, its target, and the allowed unreliability (1 − SLO) that governs release pace.
- Blue/green — release by standing up a parallel environment and switching all traffic, enabling instant rollback.
- Canary / progressive delivery — release to a small percentage, evaluate metrics, then auto-promote or roll back.
- CKA / CKAD / CKS — Kubernetes Administrator / Application Developer / Security Specialist certifications (CKS requires a current CKA).
- etcd snapshot: a point-in-time backup of the Kubernetes control-plane datastore (etcdctl snapshot save / restore).
- Pod Security Admission (PSA): built-in enforcement of the Pod Security Standards, the replacement for PodSecurityPolicy.
- OIDC federation — keyless authentication where a pipeline presents a short-lived token a cloud trusts, replacing static keys.
- Branch policy — Azure DevOps controls enforced on a pull request before merge (reviewers, build validation, work-item links).
- Environment approval/gate — controls enforced at deploy time (approvals, business-hours gates, monitor queries).
Next steps
This is the final lesson of the KloudVin DevOps Zero-to-Hero course — congratulations on reaching it. From here:
- Pick your credential ladder from the study-order table and run the one-exam sprint in the Exercise. Book the first exam while the material is fresh.
- If you want a portfolio to point at in interviews — and the best possible exam practice — build the projects in Real-World DevOps Portfolio Projects.
- For the deep, single-exam treatment of the IaC credential that underpins every cloud DevOps exam, work the Terraform Associate prep kit.
- To reinforce the security domain that now appears on every exam, revisit the DevSecOps pipeline lesson and keyless OIDC deploys; for the SRE/observability themes the GCP exam loves, see DORA metrics in the pipeline.