DevOps Lesson 3 of 56

CI/CD Anatomy, In Depth: Pipelines, Triggers, Stages, Jobs, Agents, Artifacts & Environments

Every CI/CD tool you will ever touch — GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, CircleCI, Tekton, Buildkite, Drone, Bitbucket Pipelines — is a different dialect of the same language. They all take an event (“someone pushed”), spin up a machine, check out your code, run a sequence of commands, save the outputs, and report green or red. Once you can see that shared skeleton, learning a new tool stops being “memorise a new YAML schema” and becomes “find where this tool spells the concept I already know”. An engineer who has internalised the anatomy can move from Jenkins to GitHub Actions in an afternoon; one who only memorised steps: keys has to start over.

This lesson teaches that anatomy — the universal, vendor-neutral mental model — one part at a time, exhaustively. We will define CI, CD and CD precisely; dissect the pipeline → stage → job → step hierarchy; enumerate every kind of trigger; explain the agent/runner executor model that actually does the work; and work through workspaces, variables and secrets, artifacts versus caching, fan-out/fan-in, matrix builds, conditions, environments and approvals. Throughout, a running concept-mapping table shows exactly how each idea is spelled in GitHub Actions, GitLab CI, Azure Pipelines and Jenkins, so the vocabulary transfers immediately.

This is the anatomy lesson. Its companion, CI/CD Pipeline Design: Stages, Quality Gates, Artifacts & Security Scans, is the design lesson — how to architect a production pipeline (the stage flow, which gates to place where, artifact promotion strategy, OIDC, supply-chain hardening). This one gives you the parts and how they fit; that one tells you how to assemble them well. Read this first.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should be comfortable with Git — commits, branches, tags, and pull/merge requests — because triggers fire on Git events and the anatomy assumes you know what “a push to a branch” or “a tag” is; the companion Git, In Depth lesson covers exactly that. A basic reading knowledge of YAML helps, since most pipelines are defined in it (see YAML for DevOps for the syntax, anchors and gotchas). You do not need a cloud account or any tool installed: the lab runs on the free tier of GitHub Actions in the browser. This lesson sits early in the Fundamentals / CI/CD track of the DevOps Zero-to-Hero course — after Git and YAML, and before the tool-specific deep dives (GitHub Actions, In Depth) and the pipeline design lesson. Get the anatomy here; specialise next.

Core concepts: CI vs CD vs CD, precisely

Three terms are thrown around interchangeably and they are not the same thing. Pin them down, because the distinction is a near-guaranteed interview question.

| Term | What it automates | Human still decides… | Ends at |
| --- | --- | --- | --- |
| Continuous Integration (CI) | Merging every change to a shared mainline often, and automatically building + testing it | n/a (it is fully automatic up to here) | A built, tested, publishable artifact |
| Continuous Delivery (CD) | Everything CI does, plus automatically preparing the artifact for release so it is always deployable | When to release to production (a push-button / approval) | A change sitting one approval away from production |
| Continuous Deployment (CD) | Everything continuous delivery does, plus the release itself: every change that passes all gates ships with no human step | Nothing (fully automatic) | Running in production, automatically |

Two traps to avoid. First, both CDs share the initials, so always disambiguate in conversation (“delivery” vs “deployment”). The only difference between them is a single manual gate: delivery keeps a human (or policy) in the loop to choose when to ship; deployment removes even that. Second, CI is a practice, not a product: “we use a CI tool” does not mean you do CI. Genuine CI means everyone integrates to mainline at least daily and the build stays green; a tool that builds long-lived feature branches which merge only monthly is automating something, but it is not continuous integration.

The other foundational idea, which the whole anatomy serves: a pipeline is automation expressed as code, living in your repository. The definition file (.github/workflows/*.yml, .gitlab-ci.yml, azure-pipelines.yml, Jenkinsfile) is versioned with the code it builds, reviewed in pull requests, and rolls back with git revert. Everything below is a building block of that file.

The pipeline hierarchy: pipeline → stage → job → step

This is the single most important structure to internalise, because every tool implements some version of it. From the outside in:

| Level | What it is | Runs where | Isolation & parallelism | Fails how |
| --- | --- | --- | --- | --- |
| Pipeline / Workflow | The whole automated process triggered by an event; the top-level unit | Spans many machines | The entire run for one trigger | One pipeline run succeeds or fails as a whole |
| Stage | A named phase grouping related jobs (e.g. build, test, deploy); a sequencing/gating boundary | Spans the machines of its jobs | Stages usually run in order; a stage starts only when the previous one succeeds | A failed stage normally stops later stages |
| Job | A set of steps that runs together on one agent, in one workspace | One agent/runner (one machine/container) | Jobs are the unit of parallelism and the unit of isolation: different jobs get different, fresh machines | A failed step fails its job |
| Step / Task | A single unit of work: a shell command or a pre-packaged action/task | Inside its job’s agent, sharing that workspace | Steps run sequentially within a job, sharing files and (often) shell state | A failing step fails the job (unless told to continue) |

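Three consequences fall out of this structure, and they explain most “why doesn’t my pipeline work?” confusion:

  1. Steps in the same job share one workspace, so a file written by one step is visible to the next.
  2. Jobs do not share a filesystem: each job runs on its own fresh agent, so files cross job boundaries only as artifacts and small values only as job outputs.
  3. Jobs are the unit of parallelism, so anything that must happen in order has to be sequenced explicitly through stages or job dependencies.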

A note on naming: what one tool calls a step, another calls a task — they are identical (one unit of work inside a job). And what runs the job is the agent (Azure Pipelines, Jenkins) or runner (GitHub Actions, GitLab CI) — again the same concept. We map all of this explicitly later.
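As a minimal sketch of the hierarchy in GitHub Actions syntax (the workflow name, job ids and file name are made up for illustration), the two jobs below run on separate runners. The file written by the first job is visible to later steps in that same job, but not to the second job:

```yaml
name: hierarchy-sketch              # the pipeline (a "workflow" in GitHub Actions)
on: push                            # the trigger

jobs:
  producer:                         # job 1: one runner, one workspace
    runs-on: ubuntu-latest
    steps:
      - run: echo "hello" > note.txt      # step 1 writes into the workspace
      - run: cat note.txt                 # step 2 sees it: same job, same workspace

  consumer:                         # job 2: a different, fresh runner
    needs: producer                 # ordering only; no files are carried over
    runs-on: ubuntu-latest
    steps:
      - run: test ! -f note.txt && echo "note.txt is not here"
```

GitHub Actions has no stage keyword, so `needs:` is what provides the ordering a stage would give in other tools.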

Triggers: every way a pipeline starts

A pipeline does nothing until an event starts it. The set of events a pipeline listens for is its trigger configuration. This is the entry point of the whole system, and there are more kinds than beginners expect. Here is the full taxonomy:

| Trigger | Fires when… | Typical use | Watch out for |
| --- | --- | --- | --- |
| Push (branch) | Commits are pushed to a branch (often filtered to main, or by changed path) | Build/test the mainline; deploy on push to main | Path/branch filters matter: an unfiltered push trigger runs on every branch |
| Pull/Merge request | A PR/MR is opened, updated (new commits), reopened or its target changes | Pre-merge validation, the gate that keeps mainline green | Fork PRs run with reduced permissions and (by design) no access to secrets; this is a security boundary, not a bug |
| Tag | A Git tag is pushed (often v* for releases) | Release pipelines: build & publish a versioned release on tag | Tag triggers are separate from branch triggers; you must opt in |
| Schedule / cron | A clock time matches a cron expression | Nightly builds, dependency scans, cleanup, periodic e2e suites | Cron is usually UTC; scheduled runs typically use the default branch’s pipeline definition |
| Manual | A human clicks “Run” (optionally supplying input parameters) | On-demand deploys, one-off ops jobs, “run with these inputs” | Needs explicit support (workflow_dispatch, when: manual, parameters) and permission control |
| API / webhook (repository_dispatch) | An external system POSTs to the CI API with a custom event name + payload | Trigger from a chatops bot, an external service, or another system | The endpoint is privileged; protect the token that can fire it |
| Upstream / pipeline trigger | Another pipeline finishes (chaining pipeline B after pipeline A) | Multi-repo / multi-stage delivery: app build triggers infra deploy | Creates cross-pipeline coupling; pass context explicitly (commit, version) |
| Resource change (container/package/PR review/issue, etc.) | A non-Git resource changes: a new base image, a published package, a comment | Rebuild when a base image updates; respond to a /deploy comment | Tool-specific; availability varies |

Two cross-cutting ideas sit on top of triggers:

Mentally, the trigger answers “what woke the pipeline up, and with what context?” — and that context (the commit SHA, the branch, the PR number, the actor, the event payload) is then available to every job through variables.
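A sketch of how several of these triggers and their context look in GitHub Actions. The `src/**` path, the `deploy-requested` event type and the job id are hypothetical; the environment variables are ones the platform injects:

```yaml
on:
  push:
    branches: [main]                # branch filter
    tags: ["v*"]                    # tag trigger: opted in explicitly
    paths: ["src/**"]               # path filter (applies to branch pushes)
  pull_request:
    branches: [main]                # PRs that target main
  schedule:
    - cron: "30 2 * * 1-5"          # 02:30 UTC, Monday to Friday
  repository_dispatch:
    types: [deploy-requested]       # custom event sent through the API

jobs:
  show-context:
    runs-on: ubuntu-latest
    steps:
      # Built-in variables carry the context the trigger captured
      - run: echo "event=$GITHUB_EVENT_NAME ref=$GITHUB_REF sha=$GITHUB_SHA actor=$GITHUB_ACTOR"
```

Printing the context like this at the top of a job is a cheap way to answer “what woke this run up?” when debugging.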

Agents and runners: the executor model

A pipeline definition is just instructions. Something has to actually run them — and that something is the agent (Azure Pipelines, Jenkins) or runner (GitHub Actions, GitLab CI). Understanding this executor model is what separates people who can debug pipelines from those who cannot, because “it works locally but fails in CI” is almost always a property of where and how the job ran.

The lifecycle of one job on an agent:

  1. The pipeline emits a job and a set of requirements (which OS, which labels/tags, which pool).
  2. The CI system matches the job to an eligible agent — a free machine that advertises the required labels/capabilities.
  3. The agent prepares a workspace (a working directory) and checks out the code (or you do, as a step).
  4. The agent runs the steps in order on that machine, streaming logs back.
  5. The agent uploads artifacts/caches as instructed and reports the result (pass/fail), then is cleaned up or returned to the pool.

The big architectural choice is who owns and runs the agent:

| Model | What it is | Pros | Cons / when to use |
| --- | --- | --- | --- |
| Hosted (cloud-provided) | A fresh, managed VM or container per job, run by the CI vendor (GitHub-hosted runners, GitLab SaaS runners, Microsoft-hosted agents) | Zero maintenance; clean machine every run; instant scale; multiple OS images available | Per-minute cost; no access to your private network by default; fixed hardware specs; queue/concurrency limits |
| Self-hosted | A machine you own and register with the CI system (a VM, a bare-metal box, your laptop) | Reaches private networks/databases; custom hardware (GPU, lots of RAM); pre-warmed caches; cost control at high volume | You patch, secure and scale it; state can leak between jobs unless cleaned; idle capacity costs money even unused |
| Ephemeral / autoscaling | Self-hosted agents that are created fresh per job and destroyed after, typically as Kubernetes pods (GitHub Actions Runner Controller, GitLab Kubernetes executor, Azure scale-set agents) | Private-network reach and clean-per-job isolation; scales to zero (no idle cost) | Needs a cluster and operational know-how to run |
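In GitHub Actions the choice between these models shows up as the `runs-on:` value of each job. The fragment below is a sketch only: the job ids are made up, and `gpu` stands for a hypothetical custom label you would have assigned to your own runner:

```yaml
jobs:
  on-hosted:
    runs-on: ubuntu-latest                # a GitHub-hosted image, fresh per job
    steps:
      - run: uname -a

  on-self-hosted:
    runs-on: [self-hosted, linux, gpu]    # a runner must carry ALL of these labels
    steps:
      - run: uname -a
```

If no registered runner advertises every requested label, the job simply waits in the queue, which is a common cause of “my job never starts”.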

How a job finds its agent — the matching mechanism — also has a shared shape with tool-specific names:

Three executor truths worth committing to memory:

Workspaces and checkout

When a job starts on its agent, it gets a workspace — a working directory that all of that job’s steps share. Two things matter:

The workspace is wiped between jobs (on hosted/ephemeral agents). Within a job it persists across steps — which is the whole reason multi-step jobs are useful: step 1 installs dependencies into the workspace, step 2 compiles using them, step 3 packages the result.
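A small sketch of both points in GitHub Actions syntax, assuming a repository with at least one commit (the job id and file name are illustrative). Checkout is an explicit step, it is shallow by default, and later steps see what earlier steps left in the workspace:

```yaml
jobs:
  workspace-demo:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0                    # full history and tags; the default fetches one commit
      - run: git describe --tags --always   # relies on that history being present
      - run: echo "build output" > build.log
      - run: cat build.log                  # same job, so the file is still there
```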

Variables and secrets: scopes and masking

Pipelines are parameterised by variables (non-sensitive configuration) and secrets (sensitive values like tokens and passwords). The mental model has two axes: scope (where the value is visible) and sensitivity (whether it is masked and protected).

Scope — variables can be defined at several levels, and a narrower scope usually overrides a broader one:

| Scope | Visible to | Typical use |
| --- | --- | --- |
| Organisation / global | Every pipeline in the org/instance | Company-wide config, shared registry URL |
| Project / repository | Every pipeline in that repo/project | Repo-wide settings, default region |
| Pipeline / workflow | One pipeline definition | A value used across that pipeline’s jobs |
| Stage | One stage | Phase-specific config |
| Job | One job | Job-local config |
| Step | One step | A value used by a single command |
| Environment-scoped | Only when deploying to a named environment | Per-environment secrets (the prod DB password only exists for the prod deploy) |

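Sensitivity: the difference between a variable and a secret. A variable is plain configuration that is stored and displayed in clear text. A secret is encrypted at rest, masked in logs (the literal string is replaced with ***), and withheld from fork-PR runs. Masking is literal string matching, so a transformed value (for example a base64-decoded secret) is not masked.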

There is also a crucial distinction between predefined/built-in variables the platform injects (the commit SHA, branch name, build number, repository, the actor who triggered the run, a temporary auth token) and user-defined ones you set. The built-ins are how a step learns the context the trigger captured. And modern pipelines increasingly replace stored cloud secrets entirely with OIDC short-lived federated credentials — covered in the design lesson and the OIDC deep dive.
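The scopes and the variable/secret split can be sketched in GitHub Actions as follows. `REGION`, `API_TOKEN` and `LOG_LEVEL` are hypothetical names chosen for illustration; the `GITHUB_*` variables are built-ins:

```yaml
env:
  REGION: eu-west-1                         # workflow-level variable

jobs:
  show-scopes:
    runs-on: ubuntu-latest
    env:
      REGION: us-east-1                     # job-level value overrides the workflow-level one
    steps:
      - run: echo "region=$REGION run=$GITHUB_RUN_NUMBER sha=$GITHUB_SHA"
      - name: Use a secret without printing it
        env:
          API_TOKEN: ${{ secrets.API_TOKEN }}   # step-scoped: only this step can read it
        run: |
          if [ -n "$API_TOKEN" ]; then echo "token is set"; else echo "token is empty"; fi
      - run: echo "log level is ${{ vars.LOG_LEVEL }}"   # non-secret configuration variable
```

Mapping a secret into the environment of the single step that needs it keeps its scope as narrow as the platform allows.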

Artifacts vs caching: the distinction everyone confuses

Both “save files from one job and use them later”, so beginners conflate them. They are completely different mechanisms with opposite guarantees, and confusing them causes both broken deploys and slow pipelines.

| Aspect | Artifacts | Caching |
| --- | --- | --- |
| Purpose | Pass deliverable outputs between jobs, or keep them after the run (the build output, test reports, the packaged binary/image) | Speed up rebuilds by restoring expensive, reproducible inputs (dependency directories, build/layer caches) |
| Guarantee | Durable: if you published it, it is there; correctness can depend on it | Best-effort: a cache miss is normal and must be safe; never depend on cache contents for correctness |
| Keying | Named explicitly; retrieved by name | Keyed on a hash (usually a lockfile) so it invalidates when inputs change |
| Lifetime | Retention you set (days/weeks); release artifacts often kept long | Evicted on size/age; transient by nature |
| If missing | The consuming job fails (the thing it needed is gone) | The job just rebuilds from scratch (slower, still correct) |

The rule of thumb: if losing it would make the pipeline produce a wrong result, it is an artifact; if losing it only makes the pipeline slower, it is a cache. Your compiled binary, the container image, the test/coverage report you publish, the Terraform plan you hand to the apply job — artifacts. Your node_modules, ~/.m2, the pip cache, Docker layer cache — caches. (Promoting artifacts between environments — “build once, deploy many” — is a design topic covered in the companion lesson; here we only care that artifacts are the durable inter-job hand-off mechanism.)

The mechanics: a job publishes/uploads a named artifact; a later job consumes/downloads it by name. For tiny values (a version string, a computed tag) you do not need a file artifact — you use a job output, a small key/value a downstream job reads via the dependency.
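For the cache side, a lockfile-keyed cache might look like this in GitHub Actions. This is a sketch that assumes a Node.js project with a committed `package-lock.json`; the job id is made up:

```yaml
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm                      # npm's download cache, not node_modules
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            npm-${{ runner.os }}-
      - run: npm ci                         # correct on a miss, just slower
```

The key changes whenever the lockfile changes, and the `restore-keys` prefix allows a partial match so a near-miss still restores most of the downloads. Nothing downstream depends on the cache being present.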

Fan-out / fan-in, matrix and parallelism

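Once you have jobs and dependencies, you can shape the graph of execution. Three patterns cover almost everything:

  1. Fan-out: one job leads to many jobs that run in parallel.
  2. Fan-in: many jobs converge on one job that waits for all of them, which makes it a natural gate.
  3. Matrix: a single job definition expands into many parallel jobs over parameter combinations (versions, operating systems) or shards of a test suite; it is a DRY form of fan-out.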

These build the dependency graph that the stage/job ordering executes. Parallelism is the main lever on pipeline lead time — and lead time is a DORA metric, so this is not academic. Two cautions: caches sharing a key across parallel jobs can race, and unbounded parallelism can blow past your hosted-runner concurrency limit or self-hosted capacity.
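A sketch of a two-dimensional matrix with a fan-in job in GitHub Actions; the job ids and the chosen OS/Node versions are illustrative assumptions:

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false                      # let every leg finish and report
      max-parallel: 4                       # bound the parallelism
      matrix:
        os: [ubuntu-latest, windows-latest]
        node: [18, 20]                      # 2 x 2 = four parallel jobs
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: node --version

  report:
    needs: test                             # fan-in: waits for all four legs
    runs-on: ubuntu-latest
    steps:
      - run: echo "all matrix legs passed"
```

`max-parallel` is the knob that keeps a large matrix from exhausting runner concurrency.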

Conditions: running steps and jobs only when they should

Real pipelines are not straight lines — they branch. Conditional execution (if: / when: / condition: / Jenkins when {}) decides whether a step, job or stage runs, based on context: the branch (only deploy from main), the event type (only on a tag), the result of a previous step (run cleanup even if the build failed), a variable’s value, or a manual approval.

The two subtle, must-know behaviours:

Conditions are also how a single pipeline serves many situations: the same file auto-deploys to dev on every push, while the prod-deploy job carries a condition (for example, only for a release tag cut from main) and sits behind an approval. One definition, many behaviours, all visible in version control.
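A few of these conditions, sketched in GitHub Actions syntax. The script `./run-tests.sh`, the `reports/` directory and the job ids are hypothetical:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: ./run-tests.sh
      - name: Upload test report
        if: always()                        # runs even when the test step failed
        uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: reports/
      - name: Notify on failure
        if: failure()                       # runs only when an earlier step failed
        run: echo "build failed on $GITHUB_REF"

  release:
    needs: build
    if: startsWith(github.ref, 'refs/tags/v')   # event context: only for v* tags
    runs-on: ubuntu-latest
    steps:
      - run: echo "releasing $GITHUB_REF_NAME"
```

Without an explicit `if:`, a step is skipped as soon as an earlier step in its job has failed, which is why the report upload needs `always()`.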

Environments and approvals/gates

An environment is a named deployment target — dev, staging, production — that you deploy to. Treating environments as first-class objects (rather than just a variable) unlocks the controls that make deployment safe:

Critically, approvals belong to the environment, not the pipeline. Configuring the gate on the environment (GitHub Environments, GitLab protected environments, Azure environment checks/approvals, Jenkins input plus folder permissions) keeps one pipeline definition serving every stage with different protection levels. This is the anatomical home of “manual approval before prod”.
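In GitHub Actions the pipeline only names the environment; the reviewers and other protection rules are configured on the environment itself in the repository settings. A sketch, with made-up job ids and a placeholder URL:

```yaml
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging                    # no protection rules: deploys straight away
    concurrency:
      group: deploy-staging
      cancel-in-progress: false             # never interrupt a running deploy
    steps:
      - run: echo "deploying to staging"

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment:
      name: production                      # pauses here if the environment requires reviewers
      url: https://example.com              # placeholder; shown on the deployment record
    concurrency:
      group: deploy-production
      cancel-in-progress: false
    steps:
      - run: echo "deploying to production"
```

Both jobs have the same shape; the difference in protection lives entirely outside the YAML.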

Idempotent and ephemeral builds

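Two properties make a pipeline trustworthy, and both follow from the executor model:

  1. Ephemeral: each run starts clean and leaves nothing behind, which removes “passes on re-run” flakiness and the risk of leftover credentials.
  2. Idempotent (reproducible): the same commit yields the same result every time, which requires pinned versions, lockfile-keyed caches and no reliance on mutable inputs.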

Put together: the ideal job runs on a fresh ephemeral agent, checks out an exact commit, restores a lockfile-keyed cache (safe to miss), runs deterministic steps, and publishes durable artifacts — so the same input always gives the same, trustworthy output.

The build → test → package → deploy flow

Tie the anatomy together with the canonical flow a change travels — the shape every pipeline approximates (the design lesson covers how to engineer each phase well; here we name the phases so the parts have a home):

  1. Build / compile — turn source into runnable form (compile, transpile, bundle), restoring dependencies (from cache).
  2. Test — prove correctness: unit → integration → end-to-end, fanned out in parallel, collecting reports as artifacts.
  3. Package — produce the deployable artifact (a container image, a .jar/.whl, a zip), versioned and published to a registry.
  4. Deploy — place that published artifact into an environment, gated by approvals where needed.

Build → test → package is the CI half (it ends with a trustworthy artifact and never touches a live system); deploy is the CD half (it moves that artifact through environments). The boundary is the artifact registry — exactly the build-once line from the design lesson.

Mapping the concepts across the four major tools

This is the payoff. Every concept above, spelled in the four most common tools. Learn the left column once; this table translates it anywhere.

| Universal concept | GitHub Actions | GitLab CI | Azure Pipelines | Jenkins (declarative) |
| --- | --- | --- | --- | --- |
| Definition file | .github/workflows/*.yml | .gitlab-ci.yml | azure-pipelines.yml | Jenkinsfile |
| Pipeline / top level | Workflow | Pipeline | Pipeline | Pipeline |
| Stage | (no keyword; order via needs:) | stages: + stage: | stages: + - stage: | stages { stage('…') } |
| Job | jobs.<id>: | top-level job key | - job: (under a stage) | stage body / parallel stages |
| Step / task | steps: (run or uses) | script: lines | steps: (- script / - task) | steps { sh '…' } |
| Executor name | Runner | Runner | Agent | Agent / node |
| Select an agent | runs-on: (labels) | tags: | pool: + demands: | agent { label '…' } |
| Hosted executor | GitHub-hosted runners | GitLab SaaS runners | Microsoft-hosted agents | (none; you host) |
| Push trigger | on: push (+ branches, paths) | rules/only on push | trigger: (+ branches, paths) | triggers { } / SCM webhook |
| PR/MR trigger | on: pull_request | merge_request_event rule | pr: trigger | Multibranch / GH-PR plugin |
| Tag trigger | on: push: tags: | rules on $CI_COMMIT_TAG | trigger: tags: | tag condition in when |
| Schedule / cron | on: schedule: cron | rules + pipeline schedules | schedules: - cron | triggers { cron('…') } |
| Manual run | workflow_dispatch (+ inputs) | when: manual (+ pipeline run) | manual / parameters | parameters {} + Build button |
| API / external trigger | repository_dispatch / API | pipeline trigger token / API | REST API / webhook | build API / Generic Webhook |
| Upstream pipeline | workflow_run / reusable call | trigger: (child/multi-project) | pipeline resource trigger | build job: step |
| Job dependency (order) | needs: | needs: (or stage order) | dependsOn: | stage order / parallel |
| Concurrency control | concurrency: | resource_group: / interruptible | (queueing settings) | disableConcurrentBuilds / lock |
| Variable | env: / vars | variables: | variables: | environment {} |
| Secret | secrets.* (repo/env/org) | masked/protected CI variables | secret variables / Key Vault | Credentials + credentials() |
| Built-in context | ${{ github.* }} | CI_* predefined vars | $(Build.*) / predefined | env.* (e.g. BUILD_NUMBER) |
| Artifact (durable) | actions/upload-artifact / actions/download-artifact | artifacts: (+ dependencies) | PublishPipelineArtifact task | archiveArtifacts / stash-unstash |
| Cache (speed) | actions/cache / cache: input | cache: (keyed) | Cache@2 task | plugin / scripted cache |
| Job output (small value) | outputs: + needs.*.outputs | dotenv artifact | output variables isOutput=true | script { } returns / env |
| Matrix / fan-out | strategy.matrix | parallel: matrix: | strategy.matrix | matrix {} / parallel {} |
| Fail-fast / max parallel | fail-fast, max-parallel | parallel: count | maxParallel | matrix failFast |
| Condition | if: (+ always(), success()) | rules: / when: | condition: (+ always()) | when {} / post {} |
| Environment | Environments (+ required reviewers) | Environments (protected) | Environments (+ checks) | folder + input approval |
| Approval gate | environment required reviewers | protected env + approvals | environment approvals/checks | input step |
| Reuse / templating | reusable workflows, composite actions | include: / extends: | template: / extends: | Shared libraries |

A few translation notes that trip people up when they switch tools:

Figure: CI/CD pipeline anatomy (pipeline, stages, jobs, steps, triggers, agents, artifacts and environments)

The diagram lays the anatomy out top to bottom: a trigger (push/PR/tag/schedule/manual/API) starts a pipeline, which contains ordered stages; each stage holds jobs that run in parallel on agents/runners, each job a sequence of steps in a shared workspace; artifacts flow forward from build jobs into deploy jobs while caches restore dependencies side-on, and the final deploy jobs target gated environments (dev → staging → prod) with approvals on the prod gate.

Hands-on lab

We will build a small pipeline on GitHub Actions (free tier, runs entirely in the browser — no installs) that demonstrates every core part of the anatomy in one place: multiple triggers, a fan-out/fan-in job graph, a matrix, an artifact hand-off, a cache, a job output, a condition, and an environment gate. The point is to see the anatomy, not to ship anything.

1. Create a repo. On GitHub, create a new public repository (e.g. cicd-anatomy-lab) with a README so it has a default branch.

2. Add an environment with an approval. In the repo: Settings → Environments → New environment, name it production, and under Deployment protection rules tick Required reviewers and add yourself. This is the approval gate, attached to the environment.

3. Add the pipeline. Create the file .github/workflows/anatomy.yml (use Add file → Create new file in the web UI):

name: anatomy-demo

# --- TRIGGERS: several kinds at once ---
on:
  push:
    branches: [main]          # push trigger, branch-filtered
  pull_request:               # PR trigger (validation)
  workflow_dispatch:          # manual trigger with an input
    inputs:
      note: { description: "why are you running this?", default: "manual run" }
  schedule:
    - cron: "0 3 * * *"       # nightly at 03:00 UTC

# Only one run per branch at a time; cancel the older one.
concurrency:
  group: anatomy-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  # --- BUILD job: produces an ARTIFACT and a job OUTPUT ---
  build:
    runs-on: ubuntu-latest                 # select a HOSTED runner by label
    outputs:
      version: ${{ steps.ver.outputs.v }}  # a small value passed downstream
    steps:
      - uses: actions/checkout@v4          # CHECKOUT is an explicit step
      - id: ver
        run: echo "v=1.0.${GITHUB_RUN_NUMBER}" >> "$GITHUB_OUTPUT"   # built-in context
      - name: Produce a build artifact
        run: |
          mkdir -p out
          echo "built version ${{ steps.ver.outputs.v }} from ${GITHUB_SHA::7}" > out/app.txt
      - uses: actions/upload-artifact@v4   # publish a DURABLE artifact
        with: { name: app, path: out/ }

  # --- TEST job: a MATRIX (fan-out) that also uses a CACHE ---
  test:
    needs: build                            # runs after build (ordering)
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false                      # see every leg's result
      matrix:
        suite: [unit, integration, lint]    # one job def -> three parallel jobs
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4              # CACHE keyed on a lockfile-like key
        with:
          path: ~/.cache/demo
          key: demo-${{ runner.os }}-${{ hashFiles('**/README.md') }}
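      - run: |
          # Create the cached path; if it does not exist, nothing is saved and no later run gets a cache hit
          mkdir -p ~/.cache/demo && date > ~/.cache/demo/stamp
          echo "running ${{ matrix.suite }} tests for ${{ needs.build.outputs.version }}"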

  # --- DEPLOY job: FAN-IN + CONDITION + ENVIRONMENT gate ---
  deploy:
    needs: [build, test]                    # fan-in: waits for build AND all test legs
    if: github.ref == 'refs/heads/main'     # CONDITION: only from main
    runs-on: ubuntu-latest
    environment: production                  # the APPROVAL gate fires here
    steps:
      - uses: actions/download-artifact@v4  # CONSUME the artifact from build
        with: { name: app }
      - run: |
          echo "Deploying $(cat app.txt)"
          echo "version=${{ needs.build.outputs.version }}"

4. Run it and watch the anatomy. Commit to main. Open the Actions tab and click the run. You will see:

5. Validate each concept.

Validation checklist: a parallel matrix of test jobs; a deploy that waits on fan-in and a manual approval; an artifact produced by one job and consumed by another; a cache hit on the second run; the deploy job skipped on a non-main trigger.

Cleanup. Delete the workflow file (or the whole repo: Settings → Delete this repository). Artifacts auto-expire; you can delete them sooner from the run’s summary page. The production environment disappears with the repo.

Cost note. Public-repo Actions minutes and storage are free. Private repos get a monthly free allotment of minutes and artifact storage; beyond it, hosted minutes bill per minute (Linux cheapest; Windows/macOS multiplied) and artifacts bill on storage. The cost levers are the same anywhere: parallelism (more concurrent minutes), artifact retention, and hosted vs self-hosted runners.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| “File from the build job is missing in deploy” | Expecting jobs to share a filesystem; they run on different, clean machines | Publish an artifact in the producer job and download it in the consumer (or use a job output for small values) |
| Step works locally, fails in CI with “command not found” / wrong version | The agent is a different machine: different OS, tool versions, no env vars | Pin tool versions in the pipeline; install what you need per job; treat the agent as the real environment |
| “fatal: not a git repository” or “no such commit” | Checkout skipped, or shallow clone lacks needed history (tags, base branch) | Add/keep the checkout step; request a deeper/full fetch when you need history (version/changelog/diff) |
| Test report / notification doesn’t appear when the build fails | The reporting step was skipped because an earlier step failed | Mark it to run with always() / “on failure” so it runs regardless |
| Pipeline deploys to prod from a feature branch | Missing branch condition on the deploy job/environment | Add a condition so the deploy job runs only from main, and restrict the environment’s allowed source branches |
| Two deploys to the same environment race | No concurrency control on overlapping runs | Add a concurrency group (per branch/environment) with cancel-in-progress or queueing |
| Pipeline is reliably slow | Everything serial; no caching; no parallelism | Fan out independent jobs; add lockfile-keyed caches; shard big test suites via a matrix |
| Secret appears in the logs | The value was transformed (so masking missed it) or echo-ed for “debugging” | Never print secrets; remember masking is literal string replacement, so a decoded/derived secret is not masked |
| Fork PR can’t see secrets / can’t deploy | By design: fork PR runs get reduced permissions and no secrets | Don’t rely on secrets in fork-PR validation; gate privileged work behind approval on the protected branch |

Best practices

Security notes

The pipeline runs your code on a machine that holds tokens and can write to production, so it is a high-value target. Treat secrets carefully: store them in the platform’s encrypted secret store (never in the YAML), scope them as narrowly as possible (environment-scoped for prod), and never print them, because masking is best-effort string replacement that a transformed value defeats. Remember the fork-PR boundary: pull requests from forks deliberately run with reduced permissions and no secrets, so an attacker’s PR cannot exfiltrate them. Do not “fix” this by loosening it. Never run untrusted code on a persistent self-hosted agent: code from a fork PR could read other jobs’ files, cached credentials and the agent token. Use ephemeral agents and require approval for fork PRs. Pin third-party actions/tasks to a commit SHA so a hijacked tag cannot silently run attacker code with your tokens. Grant each job least-privilege permissions (a read-only token unless it genuinely needs to write). Where you authenticate to a cloud, prefer OIDC short-lived federated credentials over a stored static key. And make the pipeline auditable: who approved which deploy to which environment, and when. The companion design lesson goes deeper on OIDC, supply-chain signing and SBOMs.
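Least-privilege permissions can be sketched in GitHub Actions like this. The job id is made up, the `id-token: write` line is only needed if the job requests an OIDC token, and the cloud login step is deliberately left out:

```yaml
permissions:
  contents: read                    # workflow default: read-only token

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production         # environment-scoped secrets and approvals apply here
    permissions:                    # job-level block replaces the workflow default for this job
      contents: read
      id-token: write               # allows the job to request a short-lived OIDC token
    steps:
      # In real use, pin third-party actions to a full commit SHA rather than a tag
      - uses: actions/checkout@v4
      - run: echo "authenticate to the cloud via OIDC here, then deploy"
```

Starting from a read-only default and widening per job keeps the blast radius of a compromised step small.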

Interview & exam questions

  1. What is the difference between continuous integration, continuous delivery and continuous deployment? CI merges every change to mainline often and validates it automatically, ending at a tested, publishable artifact. Continuous delivery keeps that artifact always-releasable but a human/policy decides when to release (an approval). Continuous deployment removes that last gate — every change passing all automated gates ships to production automatically.

  2. Walk me through the pipeline hierarchy. A pipeline/workflow (the whole run for one trigger) contains stages (ordered phases), which contain jobs (the unit of parallelism and isolation — each runs on one agent in one workspace), which contain steps/tasks (single commands/actions that run sequentially and share the job’s filesystem). GitHub Actions omits an explicit stage keyword and uses needs: for ordering instead.

  3. Why can’t a later job see files a previous job created, and what do you do about it? Because each job typically runs on a different, clean machine with its own workspace — nothing is shared across jobs. To pass a file you publish an artifact and download it; to pass a small value you use a job output.

  4. Name the trigger types and give a use for each. Push (build/deploy mainline), pull/merge request (pre-merge validation), tag (release pipelines), schedule/cron (nightly scans/cleanup), manual (on-demand deploys with inputs), API/webhook repository_dispatch (trigger from external systems/chatops), and upstream/pipeline triggers (chain one pipeline after another).

  5. Hosted vs self-hosted vs ephemeral runners — when each? Hosted: zero maintenance, clean per job, pay per minute — the default. Self-hosted: when you need private-network reach, special hardware or cost control at scale — but you secure/scale it and risk state leakage. Ephemeral/autoscaling (e.g. ARC, Kubernetes executors): self-hosted reach with hosted-style clean-per-job isolation and scale-to-zero.

  6. Explain artifacts vs caching. Artifacts are durable, named outputs passed between jobs or kept after the run; correctness can depend on them, and a missing one fails the consumer. Caches are a best-effort speed optimisation (dependency/build outputs keyed on a lockfile hash); a miss is normal and the job just rebuilds. Rule: if losing it makes the result wrong it’s an artifact; if it only makes the run slower it’s a cache.

  7. What is fan-out/fan-in, and how is a matrix related? Fan-out is one job triggering many parallel jobs; fan-in is many jobs converging on one that waits for all (a natural gate). A matrix expands a single job definition into many parallel jobs over parameter combinations (versions/OSes) or to shard a test suite — a DRY form of fan-out.

  8. How do you make a step run even when an earlier step failed, and why would you? Give it an always-run condition such as always(); use a failure-only condition such as failure() if it should run only when something broke. You need this for steps that must run regardless of outcome: uploading test reports, sending notifications, tearing down test infrastructure. Without it they’re skipped the moment something fails.

  9. Where do deployment approvals belong, and why there? On the environment (GitHub Environments, GitLab protected environments, Azure environment checks), not in the pipeline body. That way one pipeline definition auto-deploys to dev, requires one approver for staging and two for prod, with environment-scoped secrets and a clean audit trail — no forking the YAML.

  10. What does “ephemeral and idempotent build” mean, and why does it matter? Ephemeral = each run starts clean and leaves nothing behind (kills “passes on re-run” flakiness and leftover-credential risk). Idempotent/reproducible = the same commit yields the same result every time (requires pinned versions, lockfile-keyed caches, no reliance on mutable inputs). Together they make the pipeline trustworthy.

  11. How are secrets protected, and how do they still leak? They’re encrypted at rest, masked in logs (the string is replaced with ***), and withheld from fork-PR runs. They leak when you transform a secret (e.g. base64-decode it) so the new value isn’t masked, or when you print one “to debug” — masking is literal string matching, not magic.

  12. What is concurrency control in a pipeline and when is it essential? A mechanism that limits overlapping runs in the same group (per branch or per environment), cancelling or queuing the rest. It’s essential for deploy pipelines so two deploys never race against the same environment.

  13. Give the GitLab/Azure/Jenkins equivalents of: runs-on, needs, uses, secrets.*. runs-on → GitLab tags:, Azure pool:/demands, Jenkins agent { label }. needs → GitLab needs:, Azure dependsOn, Jenkins stage order/parallel. uses (an action) → there’s no direct equivalent (GitLab include/templates, Azure task, Jenkins shared-library steps fill the role). secrets.* → GitLab masked/protected variables, Azure secret variables/Key Vault, Jenkins Credentials + credentials().
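
Several of the answers above (triggers, job outputs and artifacts, caching, matrix fan-out/fan-in, concurrency) can be seen together in one GitHub Actions workflow. The following is a minimal sketch rather than the lab pipeline: the workflow name, job names, npm cache path, version scheme and echo commands are all assumptions chosen for illustration.

```yaml
# Hypothetical workflow: names, paths and commands are placeholders.
name: anatomy-sketch

on:
  push:
    branches: [main]          # mainline build
    tags: ["v*"]              # release path
  pull_request:               # pre-merge validation
  schedule:
    - cron: "0 3 * * *"       # nightly at 03:00 UTC
  workflow_dispatch:          # manual run from the UI or API

# At most one active run per ref; a newer run cancels the older one.
concurrency:
  group: ci-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.meta.outputs.version }}   # small value: job output
    steps:
      - uses: actions/checkout@v4
      - id: meta
        run: echo "version=1.0.${{ github.run_number }}" >> "$GITHUB_OUTPUT"
      # Cache: best-effort speed-up keyed on the lockfile hash; a miss is fine.
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: npm-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
      - run: mkdir -p dist && echo "build ${{ steps.meta.outputs.version }}" > dist/app.txt
      # Artifact: the deliverable that later jobs depend on.
      - uses: actions/upload-artifact@v4
        with:
          name: app
          path: dist/

  test:                        # fan-out: one definition, three parallel jobs
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3]
    steps:
      # Fresh machine: the build output exists here only after downloading it.
      - uses: actions/download-artifact@v4
        with:
          name: app
          path: dist/
      - run: echo "testing shard ${{ matrix.shard }} of $(cat dist/app.txt)"

  package:                     # fan-in: waits for build and every matrix leg
    needs: [build, test]
    runs-on: ubuntu-latest
    steps:
      - run: echo "all shards green for ${{ needs.build.outputs.version }}"
```

For a deploy pipeline you would more likely set cancel-in-progress: false on a per-environment group, so that a second deploy waits for the first instead of cancelling it mid-flight.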

Quick check

  1. Which level of the pipeline hierarchy is the unit of parallelism and isolation, and what does that imply about sharing files?
  2. You need to pass a compiled binary from a build job to a deploy job. Artifact or cache — and why?
  3. Name three distinct triggers and a use for each.
  4. A teammate says “just put the test-report upload as the last step and it’ll always run”. Why are they wrong, and what’s the fix?
  5. Where should a “require two approvers before production” rule be configured, and why there rather than in the pipeline body?

Answers

  1. The job — each job runs on its own clean machine/workspace, so nothing is shared between jobs automatically; you must publish artifacts or declare outputs to pass anything across the job boundary.
  2. Artifact. The binary is a durable deliverable the deploy job needs — if it’s missing, deploy must fail, not silently rebuild. Caches are best-effort and safe to miss, which is the wrong guarantee for a deliverable.
  3. Any three of: push (build/deploy mainline), pull/merge request (pre-merge validation), tag (release pipeline), schedule/cron (nightly scan/cleanup), manual (on-demand deploy with inputs), API/webhook (trigger from an external system).
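  4. By default a step is skipped once an earlier step fails, so if tests fail the upload never runs, wherever it sits in the job. Mark the upload step with always() so it runs regardless of the outcome; a failure-only condition is not enough, because the report should also be uploaded on a green run.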
  5. On the environment (e.g. GitHub Environments / GitLab protected environments / Azure environment checks). Putting it there lets one pipeline definition serve dev/staging/prod with different protection levels and environment-scoped secrets, and gives a clean audit trail — no forked YAML.
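
Answers 4 and 5 can be sketched in GitHub Actions syntax as follows. This is an illustrative fragment under stated assumptions: the make test command, the reports/ path and the environment name production are hypothetical, and the approval rule itself lives in the repository's environment settings, so it does not appear in the YAML at all.

```yaml
# Hypothetical workflow: test command, report path and environment name are placeholders.
name: gates-sketch

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test          # assumed to write its results into reports/
      # Without this condition the upload would be skipped whenever the
      # test step above fails, which is exactly when the report matters.
      - if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-reports
          path: reports/

  deploy-prod:
    needs: test
    runs-on: ubuntu-latest
    # Required reviewers and environment-scoped secrets are configured on the
    # "production" environment, not here; the job pauses until they are satisfied.
    environment: production
    steps:
      - run: echo "deploying to production"
```

The same job definition pointed at a dev environment with no protection rules would deploy immediately, which is why the rule belongs on the environment and not in the pipeline body.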

Exercise

Take the lab pipeline and deepen the anatomy:

  1. Add an upstream/pipeline trigger. Create a second tiny workflow that triggers when the first one completes (workflow_run), and have it print the version the first run produced — proving cross-pipeline chaining and context passing.
  2. Add a tag-triggered release path. Add a job that runs only on a pushed tag (use on: push: tags: for the trigger and, if the workflow also runs on branch pushes, an if: condition that checks the ref is a tag), builds the artifact, and “releases” it (just print, for the lab). Push a v1.0.0 tag and confirm only that path runs.
  3. Shard the test matrix. Convert the suite matrix into a sharded unit-test matrix (shard: [1, 2, 3, 4]) and add max-parallel: 2 — observe two legs running at a time. Note the effect on total time.
  4. Add a failure-only notify job. Add a job that runs only if the build failed (a “rollback/notify” placeholder) using a failure condition — and verify it stays skipped on a green run and fires on a red one (break a step to test).
  5. Re-map it. Pick one other tool (GitLab CI, Azure Pipelines or Jenkins) and translate your workflow into it using the mapping table — even just on paper. The act of translation is what cements the anatomy.
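
As a possible starting point for step 1, here is a skeleton of the second workflow. It is a sketch only: the file name and the upstream workflow name anatomy-lab are assumptions, and the name must be changed to match the name: of your lab workflow. Note that a workflow_run workflow is only picked up once its file exists on the default branch.

```yaml
# .github/workflows/downstream.yml (hypothetical file name)
name: downstream

on:
  workflow_run:
    workflows: ["anatomy-lab"]   # assumed name of the first workflow
    types: [completed]

jobs:
  report:
    # Continue only when the upstream run finished green.
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: |
          echo "upstream run id: ${{ github.event.workflow_run.id }}"
          echo "upstream commit: ${{ github.event.workflow_run.head_sha }}"
```

Printing the version itself is left to the exercise; one route is to have the first workflow publish it as an artifact and fetch it here using the upstream run id.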

Record in your notes: the run graph showing fan-out → fan-in → gated deploy, and your one-page translation of the same pipeline into a second tool.

Certification mapping

Exam / certification | Relevant objectives
Microsoft Azure DevOps Engineer Expert (AZ-400) | Designing and implementing pipelines with Azure Pipelines: stages, jobs, steps/tasks, triggers, agents/pools, variables & secret variables, artifacts, environments and approvals; the CI/CD concepts underpinning all of it
AWS Certified DevOps Engineer – Professional (DOP-C02) | CI/CD concepts and pipeline structure (CodePipeline stages/actions, CodeBuild jobs), triggers, artifacts, environment/stage gating and approvals
Google Cloud Professional DevOps Engineer | CI/CD fundamentals, Cloud Build triggers/steps, build artifacts, and release/promotion concepts
GitHub Actions certification | Workflow/event/job/step model, runners, contexts & variables, secrets, artifacts vs caching, matrix, environments; this lesson is the conceptual core
GitLab certifications | Pipeline/stage/job structure, runners & tags, rules-based triggers, artifacts/cache, environments and protected environments
DevOps Foundation / DevSecOps Foundation | CI vs CD vs continuous deployment, pipeline flow and feedback loops, the build→test→package→deploy lifecycle

Glossary

Next steps

You now hold the vendor-neutral mental model behind every CI/CD tool — the parts and how they fit. Next, specialise on the most popular tool with GitHub Actions, In Depth: Workflow Syntax, Events, Jobs, Runners, Contexts & Secrets, where each concept here gets its concrete GitHub spelling. Then move from anatomy to architecture with the companion CI/CD Pipeline Design: Stages, Quality Gates, Artifacts & Security Scans — how to place gates, promote artifacts and harden the supply chain. The Git and YAML for DevOps lessons underpin everything here if you need to shore up the foundations.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
