Certifications prove you can pass an exam. A portfolio proves you can build, and with Terraform that distinction is sharper than almost anywhere else — because Infrastructure as Code is, by definition, the artefact. The code is the work. After twenty-two years of hiring cloud and platform engineers, I can tell you what separates the candidates who get the offer: not the badge count, but a GitHub profile full of working, well-documented Terraform that a hiring manager reads in five minutes and thinks, this person could own our infrastructure tomorrow.
This lesson is the map to that profile: a deliberately ordered ladder of six Terraform projects, each rung harder than the last and each chosen because it demonstrates a cluster of skills hiring managers screen for — module design, state at scale, DRY multi-account topologies, policy-as-code, release engineering, and running a self-service platform. For every project you get the brief (what you build and why it matters), the tools, a build outline, the GitHub deliverable that makes it legible to a recruiter, and a copy-paste, quantified résumé bullet. We close with a GitHub presentation standard — an undocumented Terraform repo is, for hiring, almost worthless — and a mapping from each project to the certifications and roles it supports.
Build these in order. By the top you will have a portfolio that tells a coherent story: I can author reusable modules, manage state safely, keep multi-account environments DRY, enforce policy automatically, ship versioned releases, and operate a governed platform other engineers self-serve from. That story is the job.
Learning objectives
By the end of this lesson you can:
- Explain why a Terraform portfolio beats a certification in a hiring conversation, and what a reviewer inspects when they open your `.tf` files.
- Build the six ladder projects: a published module, a multi-env 3-tier app, a Terragrunt multi-account setup, policy-as-code gates, a versioned registry, and an enterprise platform.
- Produce, for each project, a GitHub deliverable that is self-explanatory: a README, a `terraform-docs` table, an architecture diagram, a CI badge, and a cost note.
- Write quantified résumé bullets that survive a recruiter’s six-second scan and give an interviewer something concrete to probe.
- Present a Terraform repo to a GitHub presentation standard so a stranger can understand, trust, and reproduce it in minutes.
- Map each project to the certifications and roles it supports, so your portfolio and exam plan reinforce each other.
Prerequisites & where this fits
You should have worked through the bulk of the Terraform & DevOps Zero-to-Hero track (or have equivalent hands-on time): you are comfortable with HCL, providers and version pinning, the init → plan → apply → destroy workflow, remote state and locking, module authoring, and at least the basics of Terragrunt and CI/CD. This is the portfolio capstone of the course’s career track, sitting just before the final Terraform Associate certification prep kit, because the projects you build here become the raw material for your interview stories — and several of them double as exam practice.
You will need a free GitHub account, a cloud account (AWS, Azure, or GCP — every project notes free-tier-friendly choices and a teardown step), Terraform 1.x (or OpenTofu, the open-source fork — every project here runs unchanged on either; noting that in the README signals awareness of the post-1.6 licensing landscape), and the discipline to document as you go. Terragrunt is needed from rung 3 onward.
Core concepts: what a Terraform portfolio actually proves
Before the projects, internalise three ideas — they determine whether your effort converts into offers.
The code is the evidence — and reviewers read it. Unlike a generic app portfolio where recruiters only skim the README, a Terraform reviewer opens the HCL. Clean variable typing with validation blocks, sensible for_each, a pinned provider, no secrets, a tidy state strategy — these read directly as competence. So a plain module that is beautifully written beats a sprawling repo that is not.
Quantify everything you can and document as you build — both covered in the presentation standard below. With that settled, each rung adds a distinct, hireable skill cluster while reusing what came before:
| Rung | Project | New skill cluster it proves | Builds on |
|---|---|---|---|
| 1 | Published reusable module | HCL design, typed inputs, tests, docs, SemVer | — |
| 2 | Multi-env 3-tier app | Remote state, environment isolation, composition | Module from rung 1 |
| 3 | Terragrunt DRY multi-account | DRY config, dependencies, multi-account state | Modules + remote state |
| 4 | Policy-as-code gates in CI | Governance, OPA/Checkov/tfsec, plan gating | Modules + CI |
| 5 | Versioned module registry | Release engineering, terraform-docs, consumption | Module + tests |
| 6 | Enterprise IaC platform | Spacelift/Atlantis, OIDC, drift, observability | Everything above |
Reuse across rungs. Rung 1’s module becomes a building block for rung 2; rung 2’s environments are what Terragrunt makes DRY in rung 3; rung 4’s policies gate those plans; rung 5 publishes rung 1 properly; rung 6 orchestrates it all. A connected portfolio — one estate growing in maturity — tells a far stronger story than six disconnected toys, and mirrors how real infrastructure evolves.
Project 1 — Author and Publish a Reusable Module
The brief. Write one genuinely reusable Terraform module — a self-contained, well-typed, tested, documented unit that does one thing well (a VPC/VNet, a secure-by-default S3 bucket, a managed database) — and publish it with SemVer tags. This is the foundational rung because module authoring is the single most-tested Terraform skill in interviews and the daily work of any platform engineer. It forces you through typed inputs with validation, outputs, provider pinning inside a module, for_each/count/dynamic blocks, automated tests, and generated docs — the craft of writing HCL others can safely consume.
Tools. Terraform 1.x / OpenTofu · a cloud provider (AWS/Azure/GCP) · terraform-docs (auto-generated inputs/outputs table) · tflint + tfsec (lint and static security scan) · Terratest (Go) or the native terraform test framework (.tftest.hcl) · Git with SemVer tags · GitHub Actions for CI.
Build outline.
- Pick one resource with real configuration surface (a VPC/VNet with subnets is the classic). Lay out the standard module anatomy: `main.tf`, `variables.tf`, `outputs.tf`, `versions.tf`, `README.md`, and an `examples/` directory.
- Make the inputs typed and validated, with object/list types, defaults, `validation` blocks (e.g. reject a non-CIDR), and `sensitive = true` where appropriate, and expose useful outputs (IDs, ARNs).
- Pin the provider inside the module with `required_providers`, and use `for_each`/`dynamic` blocks so it scales to N subnets/rules rather than being hard-coded.
- Add a runnable example under `examples/` and write tests: a `terraform test` suite (`.tftest.hcl`) for plan-time assertions, or Terratest for a real apply-and-verify.
- Generate the inputs/outputs table with terraform-docs (injected via README markers), run tflint and tfsec in CI, and tag `v1.0.0` (then `v1.1.0` on a feature) so consumers can pin a version.
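The outline above can be sketched as a very small AWS network module. This is a minimal illustration rather than a production design: the variable names (`vpc_cidr`, `subnets`), the version floors, and the choice of AWS are all assumptions made for the example.

```hcl
# versions.tf: declare (do not configure) the provider inside the module.
terraform {
  required_version = ">= 1.5.0" # assumed floor; adjust to what you actually test

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed floor
    }
  }
}

# variables.tf: typed inputs with validation.
variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the VPC."

  validation {
    # cidrhost() errors on anything that is not a valid CIDR, so can() turns
    # that into a true/false check.
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "vpc_cidr must be a valid CIDR block, e.g. 10.0.0.0/16."
  }
}

variable "subnets" {
  type = map(object({
    cidr = string
    az   = string
  }))
  description = "Subnets keyed by a stable name."
  default     = {}
}

# main.tf: for_each scales to N subnets without hard-coding any of them.
resource "aws_vpc" "this" {
  cidr_block = var.vpc_cidr
}

resource "aws_subnet" "this" {
  for_each = var.subnets

  vpc_id            = aws_vpc.this.id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az

  tags = {
    Name = each.key
  }
}

# outputs.tf: expose the IDs consumers will need.
output "vpc_id" {
  value = aws_vpc.this.id
}

output "subnet_ids" {
  value = { for name, subnet in aws_subnet.this : name => subnet.id }
}
```

Keying subnets by name rather than by list position means removing one entry leaves the others untouched in the plan, which is the property reviewers look for when they see `for_each`.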
GitHub deliverable. One repo named terraform-<provider>-<thing> (the registry convention, e.g. terraform-aws-vpc) with the module at the root, an examples/ folder, the test suite, and CI. The README leads with a one-line description, a usage snippet (a copy-paste module {} block), the terraform-docs table, an architecture diagram, and a cost note. A green CI badge (fmt/validate/tflint/tfsec/test) signals quality at a glance, and SemVer tags show you understand versioned releases.
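The test suite in that repo can start very small. The sketch below assumes a hypothetical module with a validated `vpc_cidr` variable and an `aws_subnet.this` resource that uses `for_each` over a `subnets` map; rename to match your own module. It also assumes a Terraform release new enough to support `mock_provider` (1.7 or later), so the plan needs no cloud credentials.

```hcl
# tests/plan.tftest.hcl (hypothetical file name)

# Replace the real AWS provider with a mock so no credentials are needed.
mock_provider "aws" {}

# Default inputs shared by every run block in this file.
variables {
  vpc_cidr = "10.0.0.0/16"
  subnets = {
    a = { cidr = "10.0.1.0/24", az = "eu-west-1a" }
  }
}

run "one_subnet_per_map_entry" {
  command = plan

  assert {
    condition     = length(aws_subnet.this) == 1
    error_message = "Expected exactly one subnet for a single map entry."
  }
}

run "rejects_invalid_cidr" {
  command = plan

  # Override only the input under test.
  variables {
    vpc_cidr = "not-a-cidr"
  }

  # The variable's validation block is expected to fail for this input.
  expect_failures = [var.vpc_cidr]
}
```

Running `terraform test` from the module root picks up files under `tests/`. Plan-only tests like these are cheap enough to run on every pull request.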
Résumé bullet (copy-paste, then adapt the numbers).
Authored and published a production-grade, reusable Terraform module (typed/validated inputs, `for_each` scaling, provider pinning) with a `terraform test` suite and tflint/tfsec gates in CI, auto-generated terraform-docs, and SemVer releases, cutting the boilerplate to provision a network from ~200 lines of HCL to a single 8-line module block.
Project 2 — A Multi-Environment 3-Tier App with Remote State
The brief. Deploy a three-tier application (load-balancer → compute → managed database) across dev, staging, and prod, each with isolated remote state and locking, composed from the rung-1 module plus a few others. This is the leap from “I can write a module” to “I can manage a multi-environment estate without stepping on my own state” — the most common gap in junior portfolios. It exercises the load-bearing operational skills: backend configuration, per-environment state isolation, root-module composition, and promoting the same code from dev to prod with only inputs changing.
Tools. Terraform 1.x / OpenTofu · a remote backend (S3 + DynamoDB lock, Azure Storage blob lease, or GCS) · the rung-1 module + community modules (terraform-aws-modules, Azure Verified Modules, terraform-google-modules) for the LB, compute, and database · tfvars per environment · GitHub Actions running fmt/validate/plan on PRs.
Build outline.
- Define a three-tier app from modules — a load balancer → an autoscaling group / VM scale set → a managed relational database (RDS / Azure Database / Cloud SQL). Reuse your rung-1 network module underneath.
- Configure a remote backend with state locking so concurrent applies can’t corrupt state, and choose separate state files per environment via distinct backend keys — explaining in the README why you avoided a single shared state blob.
- Keep the root configuration identical and vary only `dev.tfvars` / `staging.tfvars` / `prod.tfvars` (sizes, counts, retention, tags), passing `-var-file` and `-backend-config` per environment so promotion is “same code, different inputs.”
- Mark sensitive values (DB passwords) `sensitive` and source them from the cloud’s secret store, never hard-coded.
- Add a CI workflow running `terraform fmt -check`, `validate`, and a `plan` on pull requests so reviewers see the diff before merge.
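A minimal sketch of the "same code, different inputs" layout on AWS follows. The bucket, table, and secret names, the file paths, and the `db_instance_class` variable are all placeholders invented for illustration.

```hcl
# backend.tf: partial configuration; the per-environment values are supplied
# at init time via -backend-config.
terraform {
  backend "s3" {}
}

# environments/dev/backend.hcl (one such file per environment)
#   bucket         = "example-tfstate-bucket"
#   key            = "three-tier/dev/terraform.tfstate"   # distinct key per env
#   region         = "eu-west-1"
#   dynamodb_table = "example-tf-locks"                   # state locking
#   encrypt        = true

# environments/dev/dev.tfvars (only the inputs that differ between environments)
#   db_instance_class = "db.t4g.micro"
#   db_secret_id      = "dev/app/db-password"

# variables.tf
variable "db_instance_class" {
  type        = string
  description = "Instance class for the managed database."
}

variable "db_secret_id" {
  type        = string
  description = "ID of the Secrets Manager secret holding the DB password."
}

# secrets.tf: read the password from the secret store instead of hard-coding it.
data "aws_secretsmanager_secret_version" "db" {
  secret_id = var.db_secret_id
}

locals {
  # Marked sensitive so it is redacted in plan output.
  db_password = sensitive(data.aws_secretsmanager_secret_version.db.secret_string)
}

# Per-environment usage:
#   terraform init -reconfigure -backend-config=environments/dev/backend.hcl
#   terraform plan -var-file=environments/dev/dev.tfvars
```

The secret value still ends up in state, which is one more reason to keep state in an encrypted remote backend. Recent Terraform releases also offer S3-native locking (`use_lockfile`) as an alternative to the DynamoDB table; check the backend documentation for the version you pin.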
GitHub deliverable. A repo with a clear layout (modules/, environments/dev|staging|prod/), the backend configuration, and CI. README with an architecture diagram (LB → compute → DB across the network), a state-isolation explanation (the senior signal — why separate state per env), “deploy it yourself” steps, and a cost note (the database and compute are the spend — give an hourly figure and a terraform destroy reminder). State the number of environments you support from one codebase.
Résumé bullet.
Built and deployed a 3-tier application across dev/staging/prod on AWS using Terraform with isolated remote state (S3 backend + DynamoDB locking) and per-environment tfvars, composing reusable modules so the identical codebase promotes from dev to production by changing inputs only, provisioning a complete, state-safe environment in a single `apply`.
Project 3 — A Terragrunt DRY Multi-Account Setup
The brief. Take the multi-environment estate from rung 2 and make it truly DRY across multiple cloud accounts using Terragrunt: generate the backend and provider blocks once instead of copy-pasting them per environment, wire dependencies so one unit consumes another’s outputs, and run the whole estate with a single run-all. This rung signals “I can operate Terraform at organisational scale,” because repetition is the enemy of a large estate and Terragrunt is how serious teams kill it. It exercises exactly what a platform interviewer probes: how you avoid backend/provider duplication, pass outputs between stacks, and manage many accounts without a wall of boilerplate.
Tools. Terraform 1.x / OpenTofu · Terragrunt (current 2026 release — note the move toward stacks/units) · separate cloud accounts/subscriptions (e.g. sandbox-dev and sandbox-prod) · remote state per account · the rung-1/2 modules as terraform { source = ... } references.
Build outline.
- Restructure into a Terragrunt hierarchy: a root `terragrunt.hcl` holding shared config, then `accounts/<account>/<env>/<component>/terragrunt.hcl` leaf units that each set `terraform { source = "..." }` pointing at a versioned module.
- Generate the backend and provider blocks once in the root via `remote_state {}` and `generate "provider" {}`, with per-environment state keys from `path_relative_to_include()`, so no leaf repeats backend/provider config.
- Use `include { path = find_in_parent_folders() }` in every leaf to inherit the root, and `inputs = {}` to supply only what differs: the DRY payoff made concrete.
- Wire `dependency` blocks so the app unit consumes the network unit’s `vpc_id`/`subnet_ids`, demonstrating values flowing between stacks.
- Drive the whole estate with `terragrunt run-all plan` / `run-all apply` across both accounts, pinning module sources to SemVer Git tags so an account upgrades deliberately.
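The hierarchy described above might look like the following pair of files. This is a sketch under assumptions: the bucket and table names, the region, the Git URL, and the `sandbox-dev` path are placeholders, and the mock values exist only so a whole-estate plan can run before the network unit has been applied.

```hcl
# Root terragrunt.hcl: backend and provider are written once, here.
remote_state {
  backend = "s3"

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-tfstate-bucket"
    key            = "${path_relative_to_include()}/terraform.tfstate" # unique per leaf
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-tf-locks"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "eu-west-1"
}
EOF
}

# accounts/sandbox-dev/dev/app/terragrunt.hcl: a leaf unit (separate file).
include "root" {
  path = find_in_parent_folders()
}

terraform {
  # Pinned to a SemVer tag so this account upgrades deliberately.
  source = "git::https://github.com/example-org/terraform-aws-app.git?ref=v1.0.0"
}

dependency "network" {
  config_path = "../network"

  # Placeholder outputs used only while planning or validating.
  mock_outputs = {
    vpc_id     = "vpc-00000000"
    subnet_ids = ["subnet-00000000"]
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id     = dependency.network.outputs.vpc_id
  subnet_ids = dependency.network.outputs.subnet_ids
}
```

Newer Terragrunt releases are moving toward a root file named `root.hcl` (found via `find_in_parent_folders("root.hcl")`) and `terragrunt run --all` in place of `run-all`, so check the release you pin and note the choice in the README.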
GitHub deliverable. A repo with the Terragrunt hierarchy, the root terragrunt.hcl, leaf units per account/env/component, and a README showing the folder tree prominently (the structure is the architecture here). Include an architecture diagram of the accounts and the dependency graph, a DRY before/after note (lines of duplicated config eliminated), a cost note, and the run-all destroy teardown. State the number of accounts and environments managed from one DRY codebase.
Résumé bullet.
Re-architected a multi-environment Terraform estate into a DRY Terragrunt hierarchy across 2 cloud accounts (generating backend/provider config once, deriving per-environment state keys, and wiring inter-stack `dependency` outputs), eliminating ~70% of duplicated configuration and enabling whole-estate plan/apply via a single `run-all`.
Project 4 — Policy-as-Code Gates in CI
The brief. Put automated governance in front of every Terraform change: a CI pipeline that, on each pull request, generates a plan and runs policy-as-code checks that block the merge if the change is non-compliant — no public S3 buckets, no security groups open to 0.0.0.0/0, mandatory tags, allowed regions, no unencrypted volumes. This is the DevSecOps rung, and it is gold in interviews because most engineers can write infrastructure but far fewer can enforce standards on it automatically. It proves you think about guardrails, not just resources — the difference between an engineer and a platform engineer.
Tools. Terraform 1.x / OpenTofu · Checkov and tfsec/Trivy (out-of-the-box misconfiguration scanning of HCL and the plan) · OPA / Conftest with Rego (custom org-specific policy on the JSON plan) · GitHub Actions (or GitLab CI / Azure Pipelines) · the rung-2/3 estate as the thing under test.
Build outline.
- Stand up a CI workflow on every PR: `terraform init`, `validate`, then `terraform plan -out=tfplan` and `terraform show -json tfplan > plan.json` to produce a machine-readable plan.
- Run Checkov and tfsec/Trivy against the config and plan for known misconfigurations; fail the build on `HIGH`/`CRITICAL` findings.
- Author custom policies in Rego and evaluate them with Conftest/OPA against `plan.json`, encoding a rule the scanners don’t cover (e.g. “every resource must carry `owner` and `cost-center` tags,” or “RDS must have `storage_encrypted = true`”).
- Make the gate blocking: configure the job as a required status check so a non-compliant plan cannot merge, and surface findings as a PR comment.
- Deliberately commit a violating change (a public bucket) on a branch and screenshot CI rejecting it, then the fix passing — proof the gate bites.
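The custom rules themselves would be written in Rego. The sketch below stays in HCL and shows only the test fixtures for the final step: one deliberately non-compliant bucket for the throwaway branch, and the compliant version for the fix. Bucket names and tag values are placeholders, and exactly which checks fire depends on the scanner versions and rules you configure.

```hcl
# VIOLATING fixture: commit on a demo branch only, never apply it.
resource "aws_s3_bucket" "violating" {
  bucket = "example-policy-gate-violating"
  # No owner / cost-center tags, so a mandatory-tags rule should reject it.
}

resource "aws_s3_bucket_public_access_block" "violating" {
  bucket = aws_s3_bucket.violating.id

  # All four public-access guards switched off: the "public bucket" case.
  block_public_acls       = false
  block_public_policy     = false
  ignore_public_acls      = false
  restrict_public_buckets = false
}

# COMPLIANT fixture: the follow-up commit that should pass the gate.
resource "aws_s3_bucket" "compliant" {
  bucket = "example-policy-gate-compliant"

  tags = {
    owner         = "platform-team"
    "cost-center" = "cc-0000"
  }
}

resource "aws_s3_bucket_public_access_block" "compliant" {
  bucket = aws_s3_bucket.compliant.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

Because the gate evaluates the plan, neither fixture needs to be applied to produce the blocked and passing screenshots.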
GitHub deliverable. A repo with the Terraform under test, a policy/ directory of Rego rules (and any tuned Checkov/tfsec config), and the pipeline YAML. README with a policy catalogue table (rule → tool → severity → what it prevents), an architecture diagram of the plan-then-gate flow, the before/after screenshots of a blocked vs passing PR, and the number of policies enforced. This repo is unusually high-signal — lead the README with the blocked-PR screenshot.
Résumé bullet.
Implemented policy-as-code governance for Terraform in CI — Checkov and tfsec/Trivy misconfiguration scanning plus 10+ custom OPA/Rego rules evaluated against the JSON plan — as a required, blocking PR gate, preventing public storage, open security groups, untagged, and unencrypted resources from ever reaching apply.
Project 5 — A Versioned Module Registry
The brief. Turn rung 1’s single module into a properly released, versioned, consumable registry — a small library of modules published with SemVer, auto-generated terraform-docs, an automated release pipeline (changelog + tag), and at least one consumer pinning a specific version. This is the release-engineering rung: it proves you understand modules not as one-off files but as versioned products with a publishing lifecycle, exactly how internal platform teams operate. It exercises the registry pattern (public Terraform Registry, private registry, or Git-ref pinning), SemVer discipline, and the automation that makes releases repeatable.
Tools. Terraform 1.x / OpenTofu · the public Terraform Registry (publish from a GitHub repo) or a private registry (HCP Terraform, formerly Terraform Cloud, or one served from Git tags) · terraform-docs · semantic-release / conventional-commit tooling for automated SemVer + changelog · GitHub Actions · a consumer repo pinning version = "~> 1.2".
Build outline.
- Take your rung-1 module (ideally add a sibling to make it a library), ensure each follows registry naming/structure rules and has an `examples/` directory the registry can surface.
- Wire terraform-docs into CI so the README inputs/outputs table regenerates on every change and never drifts.
- Add an automated release workflow: conventional commits drive semantic-release (or tag-on-merge) to bump the SemVer version, write a changelog, and push a Git tag, so `feat:`/`fix:` map deterministically to minor/patch bumps.
- Publish: connect the repo to the public Terraform Registry (it indexes your tags), or stand up a private registry / document the Git-ref convention, so consumers reference `source = "<namespace>/<name>/<provider>"` with a pinned `version`.
- Create a separate consumer repo using the module with a pinned version constraint, and upgrade it one minor version, proving the release-and-consume loop end to end.
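On the consumer side, the pinned reference could look like the sketch below. The `example-org` namespace, the module inputs, and the Git URL are hypothetical; substitute whatever your registry entry and module actually expose.

```hcl
# Consumer repo: main.tf

# Registry source with a version constraint.
# "~> 1.2" accepts 1.2.0 and any later 1.x release, but never 2.0.0.
module "network" {
  source  = "example-org/vpc/aws"
  version = "~> 1.2"

  vpc_cidr = "10.0.0.0/16"
}

# Git-ref alternative when no registry is available: the tag is part of the
# source URL, and the version argument is not used with Git sources.
module "network_from_git" {
  source = "git::https://github.com/example-org/terraform-aws-vpc.git?ref=v1.2.0"

  vpc_cidr = "10.1.0.0/16"
}
```

After publishing the next minor release, running `terraform init -upgrade` in the consumer repo picks up the newest version the constraint allows, which is the upgrade step worth capturing in the README.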
GitHub deliverable. A registry repo (or small mono-repo of modules) with terraform-docs tables, a generated CHANGELOG.md, the release workflow, and SemVer tags in the releases tab — plus a companion consumer repo showing real version-pinned usage. README with a versioning policy (what triggers major/minor/patch), a usage snippet with a pinned version, an architecture diagram of the publish→consume flow, and the number of modules and releases shipped.
Résumé bullet.
Established a versioned Terraform module registry — modules published to the Terraform Registry with auto-generated terraform-docs and a semantic-release pipeline mapping conventional commits to SemVer bumps and changelog entries — enabling consumer teams to pin and upgrade infrastructure modules deterministically across 5+ tagged releases.
Project 6 — An Enterprise IaC Platform
The brief. Tie the ladder together into a self-service, governed IaC platform: a managed Terraform automation backend (Spacelift or Atlantis) that runs plans on PRs and applies on merge, authenticates with OIDC keyless federation (no long-lived credentials), enforces the rung-4 policies as platform-level gates, runs drift detection on a schedule, and uses a Dynatrace or Datadog Terraform provider to manage monitors/dashboards as code so the platform observes itself. This is the summit: it proves you can not only build infrastructure but operate the system other engineers build infrastructure through — the staff/platform-lead remit.
Tools. Terraform 1.x / OpenTofu · Spacelift or Atlantis (PR-driven Terraform automation; Terraform Cloud is the managed alternative) · cloud OIDC federation (→ AWS IAM role / Azure federated credential / GCP Workload Identity Federation — no stored keys) · the rung-4 OPA/Checkov policies as platform policies · drift detection (Spacelift native, or scheduled plan) · the Datadog or Dynatrace Terraform provider for monitors/dashboards as code.
Build outline.
- Connect Spacelift (stacks tracking your repos) or self-host Atlantis (`atlantis plan`/`apply` via PR comment) so every change runs a plan on the PR and is applied only through that reviewed flow (on merge in Spacelift, from an approved PR comment in Atlantis): changes flow through review, not laptops.
- Replace all static cloud credentials with OIDC: configure the platform to assume a cloud role via federated identity, so there are zero long-lived secrets anywhere in the system, and say so loudly in the README.
- Wire the rung-4 policy-as-code checks as platform-level gates (Spacelift policies / OPA in the Atlantis workflow) so non-compliant plans are rejected centrally for every stack, not per-repo.
- Enable scheduled drift detection so the platform re-plans tracked stacks and alerts on divergence between code and live cloud — the operational-maturity signal most portfolios lack.
- Use the Datadog/Dynatrace Terraform provider to manage monitors and a dashboard as code (deployment frequency, plan/apply outcomes, drift alerts), closing the loop: the platform managing infrastructure also manages its own observability declaratively.
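For the OIDC step, the AWS side of the trust can be sketched as follows. The example uses GitHub Actions as the token issuer; Spacelift and other platforms publish their own issuer URL and claim format, so treat the issuer, the repository pattern, and the role name as assumptions. It also assumes the OIDC identity provider is already registered in the account.

```hcl
# Look up the existing OIDC identity provider (assumed to be registered already).
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Trust policy: only tokens from the named repository may assume the role.
data "aws_iam_policy_document" "trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-infra:*"] # hypothetical repository
    }
  }
}

# The role the pipeline assumes; attach narrowly scoped permissions separately.
resource "aws_iam_role" "terraform_ci" {
  name               = "terraform-ci"
  assume_role_policy = data.aws_iam_policy_document.trust.json
}
```

Scoping the `sub` condition to a single repository (or tighter, to a branch or environment) is what keeps the "zero long-lived credentials" claim from turning into "any workflow can assume the role."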
GitHub deliverable. A repo (or small set) with the Spacelift/Atlantis configuration, the OIDC federation setup, the platform policies, the drift-detection schedule, and the monitoring-as-code provider definitions. README with an architecture diagram of the full platform (Git → plan/apply automation → OIDC → cloud, with policy gate, drift loop, and observability provider), an explicit “zero long-lived credentials” statement, a screenshot of a drift alert or policy-blocked apply, and a cost note. Quantify the number of stacks managed, the drift-check cadence, and credentials eliminated.
Résumé bullet.
Built a self-service Terraform platform on Spacelift with PR-driven plan/apply, OIDC keyless cloud authentication (eliminating all long-lived credentials), OPA/Checkov policy gates enforced platform-wide, scheduled drift detection with alerting, and Datadog monitors managed as code — giving engineering teams governed, audited self-service across 10+ infrastructure stacks.
The diagram above shows the six projects as a rising ladder, with each rung’s headline tooling and the skill it proves — read it as the order to build in and the story your finished GitHub profile will tell, from a single tested module at the bottom to a governed multi-cloud platform at the top.
How to present each Terraform repo on GitHub: the presentation standard
A Terraform repo that is not legible to a stranger in five minutes is, for hiring purposes, half-finished. And because reviewers open the HCL, the standard here is stricter than for a generic app: the code must be as presentable as the README.
The README is the product, and the code is the proof. A reviewer reads the README, looks at the diagram, then opens main.tf and variables.tf. Lead with a one-sentence description of what the module/stack does, then a copy-paste usage snippet, the diagram, the terraform-docs table, how to run it, and a cost note — with the most impressive thing (a green CI badge, a SemVer release, a drift-alert screenshot) at the very top.
| README element | Why it matters | What “good” looks like |
|---|---|---|
| One-line summary | The six-second scan | “Reusable AWS VPC module — typed inputs, tested, SemVer-released.” |
| Usage snippet | Proves it’s consumable | A copy-paste module {} / terragrunt.hcl block with a pinned version |
| `terraform-docs` table | Shows discipline; never drifts | An auto-generated inputs/outputs table injected via markers |
| Architecture diagram | Shows you think in systems | A clean PNG/SVG of resources and data flow, embedded inline |
| CI status badge | Instant credibility | A green badge for fmt/validate/tflint/tfsec/plan/test |
| “Deploy it yourself” | Shows reproducibility | Exact init/plan/apply (or run-all) commands; assume a clean machine |
| Cost note + teardown | Signals cost-awareness (rare, valued) | “≈₹X/month; run terraform destroy (or run-all destroy) to stop the meter.” |
| SemVer tags / releases | Shows release engineering | Tagged releases with a changelog in the releases tab |
A few points deserve emphasis for Terraform specifically. Run terraform fmt and commit the result — unformatted HCL is the first red flag a reviewer notices; gate it with fmt -check in CI. Pin everything — provider versions, module sources to SemVer tags, the required Terraform version — because pinning reads as “this person has been burned by an unpinned upgrade and learned.” Include the terraform-docs table so inputs/outputs stay provably in sync with the code. Keep a .gitignore that excludes .terraform/, *.tfstate*, and crash logs — and never commit state or secrets: a terraform.tfstate in Git history often contains plaintext secrets and is an instant rejection. These projects teach remote state, OIDC, and secret managers precisely so there is nothing to leak. Finally, keep commits meaningful and pin your best repos to your profile.
Common mistakes & troubleshooting
| Symptom / mistake | Likely cause | Fix |
|---|---|---|
| Great HCL but no interview traction | No README, no diagram, no usage snippet — illegible to a non-author | Apply the presentation standard; lead with summary, usage block, diagram |
| `terraform.tfstate` (or a secret) committed to Git | No `.gitignore`; local state in repo | Add `.gitignore`, purge the file from history, rotate any exposed secret, move to a remote backend |
| Résumé bullet reads “built infra with Terraform” and lands flat | No quantification | Add a number: lines of boilerplate cut, environments, policies, releases, stacks, credentials eliminated |
| Reviewer dings the code on sight | Unformatted HCL, unpinned providers | terraform fmt; pin required_providers and module sources; gate fmt -check in CI |
| Module isn’t actually reusable | Hard-coded names/regions, untyped vars | Parameterise with typed variables + validation; use for_each/dynamic; add an examples/ dir |
| Multi-env apply clobbers another environment | Shared single state across environments | Isolate state per environment (distinct backend keys); explain why in the README |
| Terragrunt repo still full of duplicated config | Backend/provider not generated centrally | Move remote_state/generate "provider" to the root and include it everywhere |
| Policy gate exists but doesn’t block | CI job not a required status check | Mark the policy job required; surface findings as a PR comment; prove with a blocked-PR screenshot |
| Trying to build all six at once and finishing none | Scope overload | Build strictly in ladder order; ship and document rung n before starting n+1 |
Best practices
- Build in order, finish each rung. A finished, documented, tested module beats four half-built repos — and the ladder is designed so each rung reuses the last.
- Automate and gate from project one. Every project runs `fmt`/`validate`/`plan` in CI; from rung 4 the plan is policy-gated. Click-ops teaches the console; automation teaches the job.
- Pin relentlessly. Provider versions, the Terraform version, and module sources to SemVer tags: unpinned infrastructure is how estates break on a Friday, and reviewers know it.
- Document as you build. Write the README and run terraform-docs while the decisions are fresh; you will forget why you isolated state per environment a week later.
- Reuse across rungs. Rung 1 underpins rung 2; rung 2 goes DRY in rung 3; rung 4 gates them; rung 5 publishes the module; rung 6 orchestrates everything. One growing estate beats six toys.
- Quantify everything. Boilerplate eliminated, environments, policies, releases, drift cadence, credentials removed — put the number in the README and the résumé bullet.
- Note Terraform and OpenTofu. Stating a project runs on either signals you understand the fork that followed Terraform’s 2023 move to the BSL, a current, senior detail.
Security notes
These projects are also a chance to demonstrate secure-by-default habits, which hiring managers weight heavily — and IaC is where insecure defaults do the most damage because they replicate across every environment:
- Never commit state or secrets. Remote state in a locked backend, secrets from a secret manager, a `.gitignore` excluding `*.tfstate*`: a state file in public Git history is a hard, permanent rejection unless you rewrite history and rotate. This is why rung 2 sources DB passwords from a vault and rung 6 uses OIDC.
- Prefer OIDC over stored credentials. Rung 6 makes this the centrepiece: no long-lived cloud keys anywhere, only short-lived federated tokens. Use it from rung 2 onward.
- Enforce policy-as-code, not policy-by-hope. Rung 4 makes “no public buckets, no open security groups, mandatory tags, encryption on” a blocking CI gate, not a wiki page nobody reads.
- Apply least privilege and secure module defaults. Scope cloud roles to the resource/account, not the org; make modules secure by default (encryption on, public access off) so consumers inherit safety unless they opt out.
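Secure module defaults can be as simple as making the safe value the default and the unsafe one an explicit opt-out. The sketch below assumes a hypothetical S3 bucket module; the variable names are invented for illustration.

```hcl
variable "bucket_name" {
  type        = string
  description = "Name of the bucket to create."
}

variable "block_public_access" {
  type        = bool
  description = "Set to false only with a documented reason."
  default     = true # safe unless the consumer opts out explicitly
}

resource "aws_s3_bucket" "this" {
  bucket = var.bucket_name
}

# Public access is blocked by default via the variable above.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket = aws_s3_bucket.this.id

  block_public_acls       = var.block_public_access
  block_public_policy     = var.block_public_access
  ignore_public_acls      = var.block_public_access
  restrict_public_buckets = var.block_public_access
}

# Encryption is not configurable at all: every consumer gets it.
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```

A consumer who writes `block_public_access = false` leaves a visible line in a pull request, which is exactly the kind of change a rung-4 policy gate can then question.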
Cost & sizing
The whole ladder is buildable for a modest amount if you are disciplined — and being disciplined is itself the lesson a cost-aware hiring manager looks for. The highest-leverage habit is to put a destroy / run-all destroy command in every README and actually run it between demos: every project bills on the resources that exist right now. Several rungs (the module, the policy/CI work, terraform-docs, OPA) cost essentially nothing because they are code and pipeline; the spend is the cloud resources you apply.
| Project | Main cost driver | Keep it cheap by |
|---|---|---|
| 1 — Reusable module | Near zero (apply only the example briefly) | Run the examples/ apply only to test, then destroy |
| 2 — 3-tier app | Managed database + compute | Smallest SKUs; destroy between sessions; burstable DB |
| 3 — Terragrunt multi-account | Same resources, now ×accounts | run-all destroy; keep accounts small; tear down at rest |
| 4 — Policy gates in CI | Near zero (CI minutes only) | Free CI tiers; plan-only, no apply needed to test policies |
| 5 — Module registry | Near zero (registry + CI) | Public registry is free; consumer apply only to verify |
| 6 — Enterprise platform | Spacelift tier / Atlantis host + cloud resources | Free Spacelift tier or a tiny Atlantis VM; minimal stacks |
Interview & exam questions
- “Why should I hire you over someone with more certifications?” Certifications show I learned the material; my portfolio shows I can apply it. I’ve shipped six rungs (a tested published module, a multi-env estate with isolated state, a DRY Terragrunt multi-account setup, policy-as-code gates, a versioned registry, and a self-service platform), all in readable, pinned HCL you can open right now.
- “How do you keep state safe across environments?” Each environment gets its own state file via a distinct backend key, with locking (DynamoDB / blob lease) so concurrent applies can’t corrupt it; the same root code promotes dev→prod by changing only the tfvars. Never use one shared state for three environments.
- “What makes a Terraform module genuinely reusable?” Typed, validated inputs with sensible defaults, useful outputs, provider pinning inside the module, for_each/dynamic for N-scaling, a runnable examples/ dir, tests (terraform test/Terratest), terraform-docs, and SemVer releases so consumers can pin a version.
- “What does Terragrunt give you that plain Terraform doesn’t?” It removes duplication: generate the backend and provider blocks once and include them everywhere, derive per-environment state keys from the path, wire dependency outputs between units, and drive the estate with run-all.
- “How do you stop a non-compliant change reaching production?” A policy-as-code gate in CI: generate the JSON plan, run Checkov/tfsec for known misconfigurations and OPA/Conftest for custom rules, and make the job a required, blocking status check so a public bucket can never merge.
- “How does your automation authenticate to the cloud without stored keys?” OIDC federation: the CI/platform exchanges a short-lived token for a cloud role via a federated credential (GitHub→IAM role / Azure federated credential / GCP Workload Identity Federation), so there are no long-lived secrets to leak or rotate.
- “What is configuration drift and how do you handle it?” Divergence between code/state and the live cloud (someone clicked a button). I run scheduled plan-based drift detection that alerts on divergence, then reconcile deliberately (import or re-apply) rather than letting the next apply clobber it.
- “count vs for_each: when each?” Use for_each over a map/set when instances have stable identities (keyed by name) so removing one doesn’t reindex the others; use count for interchangeable N copies or an on/off toggle. Reindexing under count causes spurious destroy/recreate, the classic gotcha.
- “How do you version and publish modules safely?” Treat them as versioned products: conventional commits drive semantic-release to bump SemVer and write a changelog, terraform-docs keeps the table in sync, and the module is published to the Terraform Registry (or pinned Git tags) so consumers pin a version and upgrade deliberately.
- “Spacelift/Atlantis/Terraform Cloud: what problem do they solve?” They make changes flow through PR-driven plan/apply with review, OIDC auth, policy gates, state management, and drift detection, so applies happen in an audited pipeline, not on laptops. That’s the rung-6 platform layer.
- “Terraform vs OpenTofu: does it matter?” OpenTofu is the open-source (MPL-2.0) fork created after Terraform moved to the BSL. My projects run unchanged on either; for some teams the licence matters, so fluency in both is the pragmatic stance.
Quick check
- Why is “the code is the evidence” especially true for a Terraform portfolio compared with a generic app portfolio?
- In the multi-environment project, what is the recommended way to keep dev, staging, and prod from corrupting each other’s state?
- What does Terragrunt let you generate once that you would otherwise copy into every environment?
- In the policy-as-code rung, what makes the gate actually prevent a bad change from merging?
- Name two things a Terraform reviewer reads as “this person has been burned and learned.”
Answers
- Because the deliverable is HCL and reviewers open and read the .tf files: clean typing, validation, pinning, no secrets, and a sound state strategy are read directly as competence, so code quality matters as much as the README.
- Give each environment its own remote state file via a distinct backend key, with state locking, and promote the identical root code across environments by changing only tfvars/backend config. Never use a single shared state.
- The backend (remote_state) and provider (generate "provider") blocks, defined once in the root terragrunt.hcl and included everywhere, with per-environment state keys derived from the path.
- Making the policy job a required, blocking status check (and surfacing findings on the PR), so a plan that fails Checkov/tfsec/OPA cannot be merged: policy enforced, not merely suggested.
- Any two of: pinned provider/module/Terraform versions, terraform fmt-clean code (ideally fmt -check in CI), a .gitignore excluding state/secrets, remote state with locking, and OIDC instead of stored credentials.
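The Terragrunt answer above can be sketched as a root file plus one child unit. This is an illustration under stated assumptions, not a reference layout: the bucket, lock table, region, directory structure (live/dev/app), module path, and the vpc_id output are all hypothetical, and the S3/DynamoDB backend is just one of the options this lesson mentions. Newer Terragrunt releases prefer a differently named root file and newer Terraform releases offer other S3 locking options, so check the versions you pin.

```hcl
# Root terragrunt.hcl (hypothetical location: live/terragrunt.hcl)

# Backend defined once; Terragrunt writes backend.tf into every unit.
remote_state {
  backend = "s3"

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-tfstate-bucket" # assumption: created beforehand
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-tf-locks" # assumption: lock table exists
  }
}

# Provider defined once; Terragrunt writes provider.tf into every unit.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "eu-west-1"
}
EOF
}
```

```hcl
# Child unit (hypothetical location: live/dev/app/terragrunt.hcl)

# Pull in the root backend and provider definitions.
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../../modules/app" # hypothetical local module path
}

# Read outputs from a sibling unit instead of hard-coding IDs.
dependency "network" {
  config_path = "../network"
}

inputs = {
  vpc_id = dependency.network.outputs.vpc_id # assumes the network unit exports vpc_id
}
```

With this layout the state key for the child resolves to dev/app/terraform.tfstate, so each environment's state is isolated by its directory path rather than by hand-edited backend blocks.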
Exercise
Build Project 1, the published reusable module, end to end — and present it to the standard in this lesson. Follow the build outline: a single well-typed module (a VPC/VNet is the classic) with main.tf/variables.tf/outputs.tf/versions.tf, typed inputs with validation, useful outputs, provider pinning inside the module, for_each/dynamic for N-scaling, a runnable examples/ directory, and a terraform test (or Terratest) suite. Then do the things that turn working HCL into a hiring asset: wire CI to run fmt -check/validate/tflint/tfsec/the tests, inject a terraform-docs table into the README, write the README to the presentation standard (one-line summary, usage snippet, diagram, cost note with teardown, green CI badge), tag v1.0.0, and draft your quantified résumé bullet with real numbers.
When you can hand a stranger that repo URL and they understand, trust, and apply it in five minutes — and the HCL reads cleanly when they open it — you have completed the first rung, and a template for the other five.
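As a starting point for the exercise, here is a minimal sketch of the four files, shown in one block with the file names as comments. It assumes the AWS provider and a VPC module; the variable names, version constraints, and tag choice are illustrative assumptions, and a real module would add more inputs, an examples/ directory, and tests.

```hcl
# versions.tf: pin Terraform and the provider inside the module
terraform {
  required_version = ">= 1.6.0" # 1.6+ so `terraform test` is available

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumption: pick the major version you have tested
    }
  }
}

# variables.tf: typed inputs with validation
variable "cidr_block" {
  description = "CIDR range for the VPC."
  type        = string

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "cidr_block must be a valid CIDR, e.g. 10.0.0.0/16."
  }
}

variable "subnets" {
  description = "Subnets keyed by a stable name."
  type = map(object({
    cidr_block        = string
    availability_zone = string
  }))
  default = {}
}

# main.tf: for_each keyed by name, so removing one subnet leaves the rest alone
resource "aws_vpc" "this" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
}

resource "aws_subnet" "this" {
  for_each = var.subnets

  vpc_id            = aws_vpc.this.id
  cidr_block        = each.value.cidr_block
  availability_zone = each.value.availability_zone

  tags = { Name = each.key }
}

# outputs.tf: useful outputs for consumers
output "vpc_id" {
  description = "ID of the created VPC."
  value       = aws_vpc.this.id
}

output "subnet_ids" {
  description = "Subnet IDs keyed by subnet name."
  value       = { for name, s in aws_subnet.this : name => s.id }
}
```

Keying subnets by name rather than by list position is the design choice a reviewer will look for: consumers can add or remove a subnet without Terraform re-addressing the others.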
Certification mapping
This portfolio is the practical complement to the Terraform and cloud-DevOps certification ladder; each project reinforces specific exams and roles.
| Project | Reinforces certification(s) | Target roles it supports |
|---|---|---|
| 1 — Reusable module | HashiCorp Terraform Associate (003) | Junior cloud / IaC engineer |
| 2 — Multi-env 3-tier app | Terraform Associate, AWS/Azure/GCP Associate | Cloud / DevOps engineer |
| 3 — Terragrunt multi-account | Terraform Associate, cloud DevOps pro (AZ-400 / DOP-C02) | Platform / DevOps engineer |
| 4 — Policy-as-code gates | cloud DevOps pro, DevSecOps practice | DevSecOps / platform engineer |
| 5 — Module registry | Terraform Associate (modules/registry domain) | Platform engineer, IaC lead |
| 6 — Enterprise platform | AZ-400 / DOP-C02 / GCP DevOps, Terraform Associate | Staff / platform / SRE lead |
Taken together, the six projects give you concrete, demonstrable evidence for the HashiCorp Terraform Associate (003) and branch credibly into the cloud DevOps professional exams (AWS DevOps Engineer DOP-C02, Azure DevOps Engineer AZ-400, Google Cloud DevOps Engineer) — exactly the spread a versatile IaC/platform engineer needs.
Glossary
- Module (Terraform): a reusable, self-contained collection of .tf files with typed inputs and outputs that provisions a unit of infrastructure; the core building block of a portfolio.
- Remote state: state stored in a shared backend (S3, Azure Storage, GCS) rather than locally, enabling collaboration and locking to prevent concurrent corruption.
- State isolation: keeping each environment’s state in a separate file/key so an apply in dev cannot affect staging or prod.
- Terragrunt: a thin wrapper over Terraform that keeps config DRY by generating backend/provider blocks, wiring dependencies between units, and running many units with run-all.
- Policy-as-code: encoding governance rules (no public buckets, mandatory tags, encryption) as code (OPA/Rego, Checkov, tfsec) and enforcing them automatically in CI.
- Checkov / tfsec (Trivy) / OPA-Conftest: static analysis for known misconfigurations (Checkov, tfsec) plus custom Rego policy on the JSON plan (OPA/Conftest).
- terraform-docs: auto-generates a module’s inputs/outputs table so the README never drifts from the code.
- SemVer: Semantic Versioning (MAJOR.MINOR.PATCH); lets consumers pin a module version and upgrade deliberately.
- OIDC federation: authenticating CI/automation to a cloud with a short-lived federated token instead of a stored long-lived credential.
- Drift detection: scheduled re-planning that surfaces divergence between code/state and the live cloud, so out-of-band changes are reconciled rather than clobbered.
- Spacelift / Atlantis / Terraform Cloud: platforms that run Terraform via PR-driven plan/apply with review, OIDC, policy gates, and drift detection; the self-service IaC platform layer.
- OpenTofu: the open-source (MPL-2.0) fork of Terraform created after the move to the BSL; broadly drop-in compatible with the 1.x workflow.
Next steps
You now have a build plan for a Terraform portfolio that tells a complete story — author, isolate, DRY-up, govern, release, operate — and the standard to make every repo legible to a hiring manager. Build the six in order, document and fmt as you go, and pin them to your profile.
- Next lesson: HashiCorp Terraform Associate (003) Prep Kit: Objectives, Practice Questions & Cheat Sheet — turn these projects into a passing score, with the objective-domain checklist, a practice-question bank with explained answers, and a one-page command/function cheat sheet.
Related reading to go deeper on individual rungs:
- Authoring Terraform Modules: Structure, Inputs/Outputs, Versioning & Publishing — the full craft behind rungs 1 and 5.
- Multi-Environment 3-Tier Infrastructure with Terragrunt & CI/CD Approval Gates — the deep version of rungs 2 and 3.
- OPA/Conftest Terraform Plan Policy Gates — the policy-as-code mechanics behind rung 4.
- The Terraform Architecting Ladder: From a Single Module to an Enterprise IaC Platform — the architecture rationale that mirrors this portfolio’s rise to rung 6.