Terragrunt Fundamentals: DRY Configurations, Remote State & Dependencies

You have learnt to write Terraform and to package it into reusable modules. Then you stood up your second environment, and your third, and somewhere around the fourth you noticed that every environment directory contains a near-identical backend block, a near-identical provider block, and a wall of variable wiring that has already started to drift apart because someone fixed a tag in prod and forgot staging. You are now copy-pasting infrastructure, which is the precise sin Infrastructure as Code was meant to abolish. The day you need to change the state-bucket naming convention, or bump the AWS provider, or add a default tag, you will edit it once per environment and miss one. Terragrunt exists to delete that duplication — to let you define the boilerplate once, generate it per environment, keep each environment’s inputs in a tiny file, and run a change across a whole tree of environments with one command and a correct dependency order.

This lesson is the on-ramp to Terragrunt for someone who already knows Terraform. We will be precise about what Terragrunt is (a thin orchestration wrapper around the terraform/tofu binary — not a separate IaC engine, not a new language to provision clouds), walk every block and function you will actually use, show how backend and provider config get generated rather than copied, explain the dependency graph and run --all, and pass outputs cleanly from one unit to another. We will also be honest about Terragrunt’s current direction — the move towards Stacks and units — so what you learn here is not stale the day after you read it. Throughout, a fictional regional logistics company, Frachtline, provides the running example: a four-engineer platform team running a fleet-tracking and routing platform across dev, staging, and prod, in two AWS accounts, who have just hit the copy-paste wall.

Learning objectives

By the end of this lesson you will be able to:

Explain what Terragrunt solves and, just as importantly, when not to reach for it.
Read and write every core Terragrunt block — terraform, include, remote_state, generate, inputs, dependency/dependencies, and before_hook/after_hook/error_hook — and the configuration functions (find_in_parent_folders, path_relative_to_include, get_env, read_terragrunt_config, and the get_* family).
Generate DRY backend and provider configuration once and have it materialise correctly in every unit.
Wire dependencies between units, pass outputs between them, and survive plan-time and greenfield applies with mock_outputs.
Drive a whole tree of units with terragrunt run --all and read the dependency graph it builds.
Lay out a repository that separates reusable modules from the live environment tree.
Describe Terragrunt’s current Stacks/units direction and run Terragrunt against OpenTofu.

Prerequisites

You should be comfortable with Terraform’s core workflow (init/plan/apply/destroy), with modules (inputs, outputs, calling a child module, pinning a Git or registry source), and with the idea of remote state and state locking — what a backend is and why concurrent writes corrupt state. If those are shaky, read Terraform Fundamentals: HCL, Providers, State & the Core Workflow and Authoring Terraform Modules: Structure, Inputs/Outputs, Versioning & Publishing first; this lesson assumes them. In the KloudVin Terraform & DevOps Zero-to-Hero course this is the Terragrunt module — the bridge between writing modules and the multi-environment 3-tier build that comes next. You need only a free local toolchain: Terraform (or OpenTofu) and the Terragrunt binary; the hands-on lab runs entirely with local-state stand-ins so it costs nothing and touches no cloud.

What Terragrunt is — and is not

Terragrunt is a thin wrapper that calls Terraform (or OpenTofu) under the hood. It is distributed as a single binary, terragrunt, written in Go by Gruntwork. When you run terragrunt apply, Terragrunt reads a terragrunt.hcl configuration file, does some preparation (generates files, downloads remote modules, resolves dependencies), and then shells out to terraform apply in a working directory it controls. Every Terraform concept you know is still in play underneath. Terragrunt adds orchestration on top; it does not replace the engine.

It is worth stating the negatives plainly, because newcomers over-attribute power to it:

Terragrunt is	Terragrunt is not
An orchestrator that calls `terraform`/`tofu`	A separate provisioning engine — it has no providers of its own
A way to keep backend/provider/input config DRY across many units	A new language for describing cloud resources (that is still HCL in your modules)
A dependency runner (`run --all`, `dependency` blocks)	A state store — state still lives in your Terraform backend (S3, GCS, Azure Blob, etc.)
A tool that generates Terraform files at runtime	A replacement for modules — you still write/consume normal Terraform modules
Useful once environment count makes repetition painful	Worth adding to a three-resource, single-environment project

The mental model to hold: modules are the reusable definition of infrastructure; the live tree is the per-environment instantiation; Terragrunt is the glue that instantiates a module per environment without copy-paste. Reach for it when the number of environments (or accounts, or regions) makes Terraform’s own repetition genuinely painful — Frachtline’s three environments across two accounts is right at the threshold; a single-environment side project is firmly below it and Terragrunt would be over-engineering.

Versions (2026). This lesson targets Terraform 1.x (1.13 at time of writing) and a current Terragrunt release. Two things have changed recently and matter: the orchestration command is now terragrunt run --all <cmd> (the older terragrunt run-all <cmd> still works but is deprecated, and the very old terragrunt apply-all style is gone), and Terragrunt is moving towards Stacks (terragrunt.stack.hcl with unit/stack blocks). Both are covered below. Everything here works identically against OpenTofu, the open fork of Terraform: set terraform_binary = "tofu" as a top-level attribute in your Terragrunt configuration (it is not an argument of the terraform block), or export TG_TF_PATH=tofu, and every command is unchanged.
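A minimal sketch of the OpenTofu switch. As far as the Terragrunt documentation describes it, terraform_binary sits at the top level of the configuration file; putting it in root.hcl so that including units pick it up is an assumption about your layout, and it presumes a tofu binary is on your PATH.

```hcl
# live/root.hcl (sketch)
# Top-level attribute, not nested inside a terraform { } block.
# Assumption: `tofu` is installed and resolvable on PATH.
terraform_binary = "tofu"
```

The environment-variable form (TG_TF_PATH=tofu) achieves the same thing without touching the repository, which can be handy for trying OpenTofu on one machine first.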

The two problems, concretely

Before the blocks, see the duplication Terragrunt removes. In plain Terraform with a directory-per-environment layout, every leaf repeats this:

# live/prod/vpc/backend.tf  — and again in dev/, staging/, with one key changed
terraform {
  backend "s3" {
    bucket         = "frachtline-tfstate-prod"
    key            = "vpc/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "frachtline-tflock"
    encrypt        = true
  }
}

# live/prod/vpc/provider.tf  — identical in every environment except the role ARN
provider "aws" {
  region = "ap-south-1"
  assume_role { role_arn = "arn:aws:iam::111111111111:role/terraform" }
  default_tags { tags = { managed_by = "terraform", env = "prod" } }
}

Two failure modes follow. (1) Boilerplate drift: there is no single source of truth for the backend or provider, so changing the bucket-naming scheme, the lock table, the provider version, or a default tag is an N-place edit and you will miss one. (2) Manual orchestration: to apply the whole prod environment in the right order (network before database before app), you cd into each directory and run apply by hand, remembering the order. Terragrunt’s generate/remote_state blocks fix (1); its dependency blocks and run --all fix (2).

Repository layout: live vs modules

Separate the definition of infrastructure (reusable modules) from its instantiation (the live tree). Modules can live in this repo or, better, in a versioned registry/Git ref; the live tree is environment-specific and changes constantly.

infra/
  modules/                         # reusable Terraform modules (or a separate versioned repo)
    vpc/                           #   main.tf / variables.tf / outputs.tf / versions.tf
    rds/
    app/
  live/
    root.hcl                       # the one root config every unit includes
    accounts.hcl                   # account-level locals (account id, role)  [optional]
    dev/
      env.hcl                      # env-level locals (env name, region, sizes)
      vpc/
        terragrunt.hcl             # a "unit": points at modules/vpc, declares inputs
      rds/
        terragrunt.hcl             # depends on vpc
      app/
        terragrunt.hcl             # depends on vpc + rds
    staging/
      env.hcl
      vpc/   …  rds/  …  app/  …
    prod/
      env.hcl
      vpc/   …  rds/  …  app/  …

Each leaf directory (dev/vpc, prod/rds, …) is a unit: one terragrunt.hcl that points at a module, supplies inputs, and includes the shared root. The path itself encodes the identity — live/prod/rds is “the RDS state for prod” — which is what lets a single root config compute the right backend key and the right environment inputs for every unit automatically. (Older versions named the root terragrunt.hcl; current guidance is to name it root.hcl to avoid the parent being mistaken for a unit. Either works; we use root.hcl.)

The blocks, one by one

Everything Terragrunt does is expressed in a small set of top-level blocks inside terragrunt.hcl. Here is the whole vocabulary, with what each is for.

Block	Purpose	Appears in
`terraform { source = … }`	Point this unit at a Terraform module (local path, Git ref, or registry)	Unit (and can be set in root)
`include "<name>" { path = … }`	Inherit configuration from a parent `terragrunt.hcl`/`root.hcl`	Unit
`remote_state { … }`	Declare the backend once; Terragrunt generates the `backend` block per unit	Root (inherited)
`generate "<name>" { … }`	Write an arbitrary `.tf` file into the unit at runtime (commonly the `provider`)	Root or unit
`inputs = { … }`	Supply variable values to the module (equivalent to a `.tfvars`)	Root and/or unit (merged)
`dependency "<name>" { config_path = … }`	Reference another unit and read its outputs	Unit
`dependencies { paths = [...] }`	Declare ordering-only edges (no output passing)	Unit
`before_hook` / `after_hook` / `error_hook`	Run commands around (or on failure of) a Terraform command; nested inside the `terraform` block rather than top-level	Root or unit
`locals { … }`	Local values, often loaded from shared `.hcl` files	Any

`terraform` — where the module comes from

The terraform block tells the unit which module to run and can wrap that module with hooks and extra arguments.

# live/prod/vpc/terragrunt.hcl
terraform {
  # Local path during development …
  source = "../../../modules//vpc"
  # … or a pinned remote ref in production (note the // separating repo from subdir):
  # source = "git::git@github.com:frachtline/infra-modules.git//vpc?ref=v1.4.0"
  # … or a registry module:
  # source = "tfr:///terraform-aws-modules/vpc/aws?version=5.8.1"
}

The double slash // matters: everything before it is the repository/archive Terragrunt downloads and caches; everything after is the subdirectory within it that is the actual module. Pin remote sources with ?ref= (Git tag/commit) or ?version= (registry) exactly as you would module versions — this is your reproducibility guarantee. The terraform block also accepts extra_arguments (inject -var-file, -lock-timeout, etc. into specific commands) and the hook sub-blocks covered below.
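As an illustration of extra_arguments, the sketch below injects a lock timeout into every command that takes a state lock. The block label lock_timeout and the five-minute value are illustrative choices; get_terraform_commands_that_need_locking() is a Terragrunt built-in that returns the list of locking commands.

```hcl
# live/prod/vpc/terragrunt.hcl (sketch)
terraform {
  source = "../../../modules//vpc"

  # Append -lock-timeout=5m to plan, apply, destroy and the other
  # commands that acquire a state lock.
  extra_arguments "lock_timeout" {
    commands  = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=5m"]
  }
}
```

Because the arguments are attached per command, a command such as validate, which takes no lock, is left untouched.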

`include` — inheriting the root

include is how a unit pulls in the shared root configuration so it does not repeat backend, provider, or common inputs.

# live/prod/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders("root.hcl")
}

find_in_parent_folders("root.hcl") walks up the directory tree from the current unit until it finds root.hcl and returns its path, so the same include line works in every unit regardless of depth. The merge_strategy attribute (shallow by default, or no_merge/deep) controls how the parent’s inputs/generate/etc. combine with the child’s. You can have multiple named includes (e.g. a root include plus an env include or a region include); this is how Terragrunt composes layered configuration. A child can also read values from a parent by setting expose = true on the include and referencing include.<name>.
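A short sketch of layered includes with an exposed parent. It assumes env.hcl defines a local named environment (as the root configuration later in this lesson expects); merge_strategy = "no_merge" on the env include is an illustrative choice so that the file is only read, not merged.

```hcl
# live/prod/vpc/terragrunt.hcl (sketch)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Second include: expose env.hcl so its locals are readable here,
# without merging any of its blocks into this unit.
include "env" {
  path           = find_in_parent_folders("env.hcl")
  expose         = true
  merge_strategy = "no_merge"
}

inputs = {
  # Read a local from the exposed include.
  environment = include.env.locals.environment
}
```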

`remote_state` — generate the backend once

This is the block that deletes backend duplication. Declare the backend once in the root; Terragrunt computes the per-unit key from the path and writes a backend block into each unit at init time.

# live/root.hcl
locals {
  account = read_terragrunt_config(find_in_parent_folders("accounts.hcl"))
  env     = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "frachtline-tfstate-${local.env.locals.environment}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-south-1"
    encrypt        = true
    dynamodb_table = "frachtline-tflock"   # state locking (use_lockfile = true for S3-native locking)
  }
}

The magic is key = "${path_relative_to_include()}/terraform.tfstate". path_relative_to_include() returns the unit’s path relative to the included parent: for live/prod/vpc that is prod/vpc, so the state key becomes prod/vpc/terraform.tfstate automatically. No unit hard-codes its own key; the path is the key. Terragrunt can also create the bucket and lock table if they do not exist (handy for bootstrap; older releases did this automatically on first run, recent releases are moving it behind an explicit --backend-bootstrap flag, and some teams provision the backend explicitly instead). The if_exists = "overwrite_terragrunt" setting means Terragrunt overwrites a file it previously generated but refuses to clobber a hand-written backend.tf, failing with an error instead.
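To make the result concrete, this is roughly what Terragrunt would write into the working directory for live/prod/vpc, given the root configuration above and an env.hcl whose environment local is "prod". Treat it as an approximation: attribute order and spacing in the real generated file may differ.

```hcl
# backend.tf as generated for live/prod/vpc (approximate)
# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa
terraform {
  backend "s3" {
    bucket         = "frachtline-tfstate-prod"
    key            = "prod/vpc/terraform.tfstate"
    region         = "ap-south-1"
    encrypt        = true
    dynamodb_table = "frachtline-tflock"
  }
}
```

The signature comment on the first line is how overwrite_terragrunt recognises a file as its own on the next run.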

remote_state supports every Terraform backend (s3, gcs, azurerm, local, …) and a disable_init/disable_dependency_optimization set of toggles for edge cases. For the modern S3 backend you can drop the DynamoDB table and set use_lockfile = true to use S3-native conditional-write locking.
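A sketch of the same root remote_state block using S3-native locking instead of DynamoDB. It assumes a Terraform (or OpenTofu) version whose S3 backend accepts use_lockfile and a Terragrunt release that passes it through; check both before adopting it.

```hcl
# live/root.hcl (sketch, S3-native locking variant)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket       = "frachtline-tfstate-${local.env.locals.environment}"
    key          = "${path_relative_to_include()}/terraform.tfstate"
    region       = "ap-south-1"
    encrypt      = true
    use_lockfile = true # replaces dynamodb_table; the lock lives next to the state object
  }
}
```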

`generate` — DRY provider (and anything else)

remote_state is really a specialised generate. The general generate block writes any file into the unit at runtime — most commonly the provider, so it too lives in exactly one place.

# live/root.hcl  (continued)
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    provider "aws" {
      region = "${local.env.locals.aws_region}"
      assume_role { role_arn = "${local.account.locals.role_arn}" }
      default_tags {
        tags = {
          managed_by  = "terragrunt"
          environment = "${local.env.locals.environment}"
        }
      }
    }
    terraform {
      required_providers {
        aws = { source = "hashicorp/aws", version = "~> 5.60" }
      }
    }
  EOF
}

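Now the provider, its version pin, and the default tags are defined once. Change the AWS provider version here and every unit picks it up on its next init. One caveat: Terraform allows only one required_providers block per module, so if a module already declares its own (for example in a versions.tf), keep the pin there and drop the terraform block from the generated file. The if_exists choices are worth knowing: overwrite_terragrunt (overwrite only files Terragrunt generated and error on anything else, which makes it the safe choice), overwrite (clobber any file), skip (never touch an existing file), and error (fail if the file exists). disable_signature and comment_prefix tune the header Terragrunt stamps on generated files.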
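The root configuration reads accounts.hcl and env.hcl but their contents are not shown, so here is a minimal sketch of what they could look like. The local names role_arn, environment and aws_region are the ones the root references; account_id is a hypothetical extra, and the values are taken from the plain-Terraform example earlier in the lesson.

```hcl
# live/accounts.hcl (sketch)
locals {
  account_id = "111111111111" # hypothetical local, not referenced by the root above
  role_arn   = "arn:aws:iam::111111111111:role/terraform"
}
```

```hcl
# live/prod/env.hcl (sketch)
locals {
  environment = "prod"
  aws_region  = "ap-south-1"
}
```

Because find_in_parent_folders searches upward from the unit being run, a unit under live/prod picks up live/prod/env.hcl while a unit under live/dev picks up live/dev/env.hcl, with no change to the root.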

`inputs` — supplying variables

inputs is a map that Terragrunt turns into TF_VAR_* environment variables for the underlying module — it is the Terragrunt equivalent of a .tfvars file. Inputs from an included root and from the unit merge, with the unit winning, so you put common defaults in the root (or an env.hcl) and per-unit specifics in the unit.

# live/prod/rds/terragrunt.hcl
include "root" { path = find_in_parent_folders("root.hcl") }
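# Expose env.hcl so its locals can be read as include.env.locals.*
# (HCL has no ';' separator: a block with two arguments must span lines.)
include "env" {
  path   = find_in_parent_folders("env.hcl")
  expose = true
}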

terraform { source = "../../../modules//rds" }

inputs = {
  instance_class    = "db.r6g.large"     # prod-only override
  multi_az          = true
  allocated_storage = 200
}
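For the inputs above to land anywhere, the module must declare matching variables. The file below is a hypothetical modules/rds/variables.tf written to match those three keys; the defaults and descriptions are assumptions, not part of the lesson's module.

```hcl
# modules/rds/variables.tf (hypothetical declarations matching the inputs above)
variable "instance_class" {
  type        = string
  description = "RDS instance class; arrives as TF_VAR_instance_class"
}

variable "multi_az" {
  type        = bool
  default     = false
  description = "Whether to deploy a standby in a second AZ"
}

variable "allocated_storage" {
  type        = number
  default     = 50
  description = "Storage in GiB"
}
```

Terraform silently ignores a TF_VAR_* value for a variable the module does not declare, so a typo in an inputs key shows up as a missing or defaulted value rather than an error.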

`dependencies` and `dependency` — ordering and outputs

These two are easy to confuse and do different jobs:

	`dependencies`	`dependency "<name>"`
Shape	`dependencies { paths = ["../vpc"] }`	`dependency "vpc" { config_path = "../vpc" }`
Gives you	Ordering only — run that unit first	Ordering plus the unit’s outputs (`dependency.vpc.outputs.*`)
Use when	A unit must run after another but needs none of its outputs	You need to pass an output (VPC id, subnet ids, security-group id) into this unit

In practice you reach for dependency almost always, because the reason one unit follows another is usually that it consumes the first’s output:

# live/prod/app/terragrunt.hcl
include "root" { path = find_in_parent_folders("root.hcl") }
terraform { source = "../../../modules//app" }

dependency "vpc" {
  config_path = "../vpc"
}
dependency "rds" {
  config_path = "../rds"
  # Let plan/validate succeed before rds has ever been applied:
  mock_outputs = {
    endpoint = "mock-endpoint:5432"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id          = dependency.vpc.outputs.vpc_id
  private_subnets = dependency.vpc.outputs.private_subnet_ids
  db_endpoint     = dependency.rds.outputs.endpoint
}

To read a dependency’s outputs, Terragrunt runs terraform output on that unit’s state — which means the dependency must already be applied. That breaks two situations: a fresh greenfield apply (the dependency has no state yet) and a plan-only CI run on a brand-new unit. mock_outputs solves both: it supplies placeholder values that satisfy the configuration during the commands you list in mock_outputs_allowed_terraform_commands (typically validate and plan), while a real apply uses the real outputs. Use mocks for shape, not for values you actually depend on at apply time.
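The same treatment can be given to the vpc dependency so that a greenfield plan of the app unit does not stop on it. The sketch below mirrors the output names used in the inputs above; the placeholder IDs are made up and only need to have the right shape (a string and a list of strings).

```hcl
# live/prod/app/terragrunt.hcl (sketch: vpc dependency with mocks)
dependency "vpc" {
  config_path = "../vpc"

  # Placeholders used only while vpc has no state yet.
  mock_outputs = {
    vpc_id             = "vpc-00000000"
    private_subnet_ids = ["subnet-00000000", "subnet-11111111"]
  }
  # Never include "apply" here: real applies must read real outputs.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}
```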

Hooks — `before_hook`, `after_hook`, `error_hook`

Hooks run shell commands around a Terraform command. They live in the terraform block and are perfect for cross-cutting concerns: a tflint pass before plan, a Slack ping after apply, a cleanup on error.

terraform {
  source = "../../../modules//app"

  before_hook "fmt_check" {
    commands = ["plan", "apply"]
    execute  = ["terraform", "fmt", "-check"]
  }
  after_hook "notify" {
    commands     = ["apply"]
    execute      = ["bash", "-c", "echo applied ${path_relative_to_include()}"]
    run_on_error = false
  }
  error_hook "diagnose" {
    commands = ["plan", "apply"]
    execute  = ["bash", "-c", "echo 'failed — capturing logs'"]
    on_errors = [".*"]
  }
}

commands selects which Terraform commands trigger the hook; run_on_error controls whether an after_hook still fires on failure; error_hook runs only on failure and matches on_errors regexes.

The configuration function reference

Terragrunt configuration is dynamic because of its built-in functions (it also supports all of Terraform’s HCL functions). These are the ones you will actually use:

Function	Returns / does	Typical use
`find_in_parent_folders("name")`	Path to the nearest ancestor file of that name	`include { path = find_in_parent_folders("root.hcl") }`
`path_relative_to_include()`	This unit’s path relative to the included parent	Compute the per-unit state `key`
`path_relative_from_include()`	The inverse — parent’s path relative to the unit	Build relative `source` paths
`get_env("VAR", "default")`	An environment variable (with default)	Inject CI-provided values/secrets without hard-coding
`read_terragrunt_config(path)`	Parse another `.hcl` file into an object	Load shared `accounts.hcl`/`env.hcl` locals
`get_terragrunt_dir()`	Absolute path of the current unit’s dir	Reference files next to the unit
`get_parent_terragrunt_dir()`	Absolute path of the dir holding the included parent	Anchor paths to the live-tree root
`get_aws_account_id()` / `get_aws_caller_identity_arn()`	The caller’s AWS account / ARN at runtime	Guardrails: assert you are in the right account
`get_terraform_command()` / `get_terraform_cli_args()`	The command being run / its args	Conditional hooks
`run_cmd("cmd", "args"...)`	Shell out and capture output (cached per args)	Pull a value from an external tool
`sops_decrypt_file(path)`	Decrypt a SOPS-encrypted file	Bring secrets in safely

Two cautions. First, prefer find_in_parent_folders with an explicit filename argument — recent Terragrunt deprecated the no-argument form that implicitly looked for terragrunt.hcl. Second, get_env and run_cmd make configuration depend on the environment it runs in; that is powerful for CI but means “the same code” can behave differently per machine, so document those dependencies.
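A small sketch of the environment-dependent functions in use. IMAGE_TAG and the app-config.yaml file are hypothetical names chosen for illustration; the point is the difference between get_env with a default (safe locally) and without one (fails fast when the variable is missing).

```hcl
# live/dev/app/terragrunt.hcl (sketch)
locals {
  # Falls back to "latest" when IMAGE_TAG is not set, e.g. on a laptop.
  image_tag = get_env("IMAGE_TAG", "latest")

  # Absolute path of this unit's directory, for files that sit next to it.
  config_file = "${get_terragrunt_dir()}/app-config.yaml"
}

inputs = {
  image_tag   = local.image_tag
  config_file = local.config_file
}
```

Calling get_env("IMAGE_TAG") with no default would instead make Terragrunt error out when the variable is unset, which is often the better behaviour in CI.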

Architecture overview

[Figure: Terragrunt DRY architecture]

The diagram shows the whole shape at once: a single root.hcl carrying the remote_state and generate blocks, a layered set of *.hcl locals files (accounts.hcl, env.hcl), the per-environment tree of units each include-ing that root, the dependency edges that order vpc → rds → app, and the generated backend.tf/provider.tf that Terragrunt materialises into each unit’s working directory before shelling out to Terraform against the remote state backend.

Passing outputs between units

The payoff of dependency is clean output passing without the brittle alternative — a Terraform remote_state data source hand-wired in every consumer. With Terragrunt the consumer simply reads dependency.<name>.outputs.<output>, and Terragrunt guarantees the producer ran first. A few rules keep this healthy:

Only expose what consumers need. A dependency’s entire output set is readable, but treat the outputs you actually consume as the unit’s public contract; keep it small and stable.
Mock for shape, apply for truth. mock_outputs exists to make validate/plan pass before the producer has state. Never list apply in mock_outputs_allowed_terraform_commands for a value you genuinely need, or you will apply against fake data.
Watch the optimisation. Terragrunt caches dependency outputs during a run --all for speed. If a producer changed in the same run, dependents see the new outputs because Terragrunt applies in graph order; outside run --all, a stale apply of only the consumer reads whatever is currently in the producer’s state.
Cross-environment reads are a smell. A prod unit reading a dev unit’s outputs almost always means a layering mistake; keep dependency edges within an environment.
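For contrast, this is the hand-wired alternative that dependency replaces: a terraform_remote_state data source inside the consuming module. The file name and the local are illustrative; the bucket and key are the prod VPC values used earlier in the lesson.

```hcl
# modules/app/data.tf (the brittle alternative, shown for contrast)
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "frachtline-tfstate-prod"      # hard-coded per environment
    key    = "prod/vpc/terraform.tfstate"   # must track the producer's key by hand
    region = "ap-south-1"
  }
}

locals {
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
}
```

Every consumer has to repeat the bucket and key, nothing enforces that the producer ran first, and there is no equivalent of mock_outputs for a plan against an empty state.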

run --all and the dependency graph

terragrunt run --all <command> is the orchestration headline. Point it at a directory and Terragrunt discovers every unit beneath it, builds a directed acyclic graph from the dependency/dependencies edges, and runs the command across the whole tree in dependency order (and in parallel where the graph allows).

# From live/prod — plan/apply the entire environment in the right order
terragrunt run --all plan
terragrunt run --all apply

# Visualise the graph Terragrunt computed (pipe to Graphviz)
terragrunt dag graph | dot -Tsvg > graph.svg

For apply/plan Terragrunt walks the graph dependencies-first (producers before consumers: vpc, then rds, then app); for destroy it reverses the order automatically (app, then rds, then vpc) so nothing is torn down while something still depends on it. Useful flags:

Flag	Effect
`--queue-include-dir` / `--queue-exclude-dir` (formerly `--terragrunt-include-dir` / `--terragrunt-exclude-dir`)	Restrict the run to (or skip) specific units
`--parallelism N` (formerly `--terragrunt-parallelism N`)	Cap how many units run concurrently
`--queue-ignore-errors` (formerly `--terragrunt-ignore-dependency-errors`)	Keep going past a failed unit (use with care)
`-- <args>` (after `--`)	Pass raw args through to Terraform (e.g. `-- -lock-timeout=5m`)

run --all is the deprecated run-all’s successor; for a single unit you still just cd into it and run terragrunt plan/apply normally. A caution on run --all apply: because it applies many units non-interactively, treat it as a CI primitive with a reviewed plan, not a casual local command — an unreviewed run --all apply across prod is how accidents happen.

Terragrunt’s current direction: Stacks and units

The classic model above — a hand-built tree of terragrunt.hcl units wired by dependency — is stable and widely used, and is what most teams run today. Terragrunt is, however, evolving towards Stacks: a terragrunt.stack.hcl file declares a set of unit (and nested stack) blocks that Terragrunt generates into a .terragrunt-stack directory, so you describe a reusable bundle of units (a “VPC + RDS + app” stack) once and stamp it out per environment from values, rather than maintaining the directory tree by hand. The vocabulary you have learnt — terraform, include, remote_state, generate, dependency, the functions — carries straight over; Stacks add a higher-level packaging layer on top. It is worth knowing the term and the unit/stack/values shape so you recognise it in newer repos, but the unit-and-dependency fundamentals in this lesson remain the foundation and are not going away.
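To help recognise the shape in newer repositories, here is a rough sketch of a stack file. The catalog paths are hypothetical, and each source is assumed to point at a unit template (a directory containing a terragrunt.hcl), not at a raw Terraform module; verify attribute names against the Stacks documentation for your Terragrunt version.

```hcl
# live/prod/terragrunt.stack.hcl (sketch)
unit "vpc" {
  source = "../../catalog/units/vpc" # hypothetical unit template
  path   = "vpc"                     # where the unit is generated inside the stack directory
  values = {
    cidr = "10.20.0.0/16"
  }
}

unit "rds" {
  source = "../../catalog/units/rds" # hypothetical unit template
  path   = "rds"
  values = {
    instance_class = "db.r6g.large"
  }
}
```

Inside a unit template the supplied values are read as values.<name> (for example values.cidr), and the stack is materialised and driven with terragrunt stack generate and terragrunt stack run <command>.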

Hands-on lab

This lab uses local state and the null/random providers so it runs offline, costs nothing, and needs no cloud account — yet exercises every Terragrunt mechanism: include, remote_state (local backend), generate, inputs, dependency, mock_outputs, and run --all. You need terragrunt and terraform (or tofu) on your PATH.

1. Scaffold the modules and the live tree.

mkdir -p tg-lab/modules/network tg-lab/modules/app
mkdir -p tg-lab/live/dev/network tg-lab/live/dev/app
cd tg-lab

2. A network module that produces an output. Create modules/network/main.tf:

variable "cidr" { type = string }
resource "random_id" "vpc" { byte_length = 4 }
output "vpc_id" { value = "vpc-${random_id.vpc.hex}" }
output "cidr"   { value = var.cidr }

3. An app module that consumes it. Create modules/app/main.tf:

variable "vpc_id"   { type = string }
variable "replicas" { type = number }
resource "null_resource" "app" {
  triggers = { vpc_id = var.vpc_id, replicas = var.replicas }
}
output "summary" { value = "app in ${var.vpc_id} x${var.replicas}" }

4. The DRY root. Create live/root.hcl — backend and provider generated once:

remote_state {
  backend = "local"
  generate = { path = "backend.tf", if_exists = "overwrite_terragrunt" }
  config  = { path = "${get_terragrunt_dir()}/terraform.tfstate" }
}
generate "provider" {
  path      = "versions.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    terraform {
      required_providers {
        random = { source = "hashicorp/random" }
        null   = { source = "hashicorp/null" }
      }
    }
  EOF
}
inputs = { environment = "dev" }

5. The two units. Create live/dev/network/terragrunt.hcl:

include "root" { path = find_in_parent_folders("root.hcl") }
terraform { source = "../../../modules//network" }
inputs = { cidr = "10.10.0.0/16" }

Create live/dev/app/terragrunt.hcl — note the dependency and mock_outputs:

include "root" { path = find_in_parent_folders("root.hcl") }
terraform { source = "../../../modules//app" }

dependency "network" {
  config_path  = "../network"
  mock_outputs = { vpc_id = "vpc-mock0000" }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id   = dependency.network.outputs.vpc_id
  replicas = 3
}

6. Plan the whole environment. From live/dev, watch Terragrunt build the graph and plan network before app, using the mock vpc_id for app because network has no state yet:

cd live/dev
terragrunt run --all plan

Expected: two plans; the app plan shows vpc_id = "vpc-mock0000" (the mock), proving plan-time mocking works before any apply.

7. Apply the whole environment in dependency order.

terragrunt run --all apply --non-interactive

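Expected: network applies first; then app applies reading the real vpc-... output (not the mock). Confirm a backend.tf and versions.tf were generated for each unit. Because the units set a source, Terragrunt writes these files into the working copy under .terragrunt-cache rather than next to terragrunt.hcl, and since you are still in live/dev the paths are relative to it:

find network/.terragrunt-cache app/.terragrunt-cache -name backend.tf -o -name versions.tf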

8. Read the dependency graph (optional; Graphviz is needed only if you want to render the DOT output as an image).

terragrunt dag graph

Expected: an edge from app to network, confirming the order Terragrunt enforced.

9. Cleanup. Destroy in reverse order, then delete the lab:

terragrunt run --all destroy --non-interactive
cd ../../..        # live/dev -> live -> tg-lab -> parent of tg-lab
rm -rf tg-lab

Cost note: zero. The lab uses the local backend and the null/random providers — nothing is created in any cloud, so there is nothing to bill and the only cleanup is deleting the directory.

Common mistakes & troubleshooting

Symptom	Cause	Fix
`Could not find any terragrunt.hcl / root.hcl in parent folders`	`include` path wrong, or root file misnamed	Pass the exact filename to `find_in_parent_folders("root.hcl")`; ensure the root actually sits in an ancestor directory
`dependency ... has not been applied yet` on plan	Reading a dependency’s outputs before it has state	Add `mock_outputs` + `mock_outputs_allowed_terraform_commands = ["validate","plan"]`
Generated `backend.tf`/`provider.tf` keeps getting overwritten unexpectedly	A hand-written file collides with a `generate`/`remote_state` block	Use `if_exists = "skip"` to protect a hand-written file, or delete it and let Terragrunt own it
Two units write to the same state key	`key` hard-coded instead of derived	Use `key = "${path_relative_to_include()}/terraform.tfstate"` so the path drives the key
`Error: Cycle:` during `run --all`	Circular `dependency` edges	Break the cycle; dependencies must form a DAG — re-layer so producers never depend on consumers
`run --all apply` applies things in the wrong order	An edge expressed as a comment/`inputs` reference Terragrunt can’t see	Make the edge explicit with a `dependency` or `dependencies` block
Stale outputs after changing a producer	Dependency-output caching	Re-run via `run --all` (graph order refreshes), or apply the producer then the consumer
Works locally, fails in CI with auth/region differences	Config depends on `get_env`/local creds	Document and set the required env vars in CI; assert account with `get_aws_account_id()`

Best practices

Name the root root.hcl, not terragrunt.hcl. It prevents the parent being discovered as a unit and makes find_in_parent_folders("root.hcl") unambiguous.
Generate backend and provider once, in the root. That single source of truth is the entire point — never copy a backend or provider into a unit.
Let the path be the identity. Derive state keys with path_relative_to_include(); never hard-code a key.
Pin module sources. Use ?ref=<tag>/?version= on source; treat an unpinned source as a production incident waiting to happen.
Layer locals. Put account-level facts in accounts.hcl, environment facts in env.hcl, and read them with read_terragrunt_config; keep units tiny (a source, a dependency or two, and overrides).
Prefer dependency over hand-wired remote_state data sources for cross-unit values — it gives you ordering for free and one place to mock.
Treat run --all apply as a CI primitive. Apply per-environment from a pipeline with a reviewed plan; avoid casual whole-tree applies against prod.
Keep dependency edges within an environment. Cross-environment reads are almost always a layering bug.
Format and validate. terragrunt hcl fmt (terragrunt hclfmt in older releases) formats .hcl files, and terraform fmt/validate run via hooks keep the rest of the tree clean.

Security notes

State is sensitive. Terragrunt does not change where state lives — it is still your Terraform backend, and it can contain resource metadata and sometimes secrets. Use an encrypted, access-controlled, locked backend (S3 + use_lockfile/DynamoDB, GCS, or Azure Blob); never the local backend for anything real (the lab’s local backend is for offline learning only).
Generated files can leak secrets. Anything you interpolate into a generate "provider" block’s contents is written to a .tf file on disk in the working directory, and anything in inputs is exported to Terraform as TF_VAR_* environment variables. Pull secrets from a secrets manager or sops_decrypt_file/get_env at runtime; never commit them, and add .terragrunt-cache/ and generated files to .gitignore.
run_cmd and hooks execute arbitrary shell. A malicious or careless terragrunt.hcl can run anything during init/plan. Review changes to root/unit config like you review code, and be wary of running untrusted Terragrunt repos.
Assert the blast radius. Use get_aws_account_id()/get_aws_caller_identity_arn() to fail fast if a unit is being applied against the wrong account — a cheap guardrail against fat-fingered prod applies.
Least-privilege per environment. Generate a per-environment assume_role in the provider so dev credentials cannot touch prod state or resources.
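The last two notes can be sketched together in a generated provider. This is one possible shape, assuming each env.hcl defines aws_region and account_id locals and that a role named terragrunt-deploy exists in every account (both are hypothetical). It uses the AWS provider's own allowed_account_ids argument as the guardrail, which is a different mechanism from calling get_aws_account_id() in Terragrunt but serves the same purpose.

```hcl
# root.hcl (sketch; the locals keys and the role name are assumptions)

locals {
  env        = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  aws_region = local.env.locals.aws_region
  account_id = local.env.locals.account_id
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"

  # The provider refuses to run against any other account.
  allowed_account_ids = ["${local.account_id}"]

  # Each environment assumes its own role, so dev credentials
  # cannot reach prod resources.
  assume_role {
    role_arn = "arn:aws:iam::${local.account_id}:role/terragrunt-deploy"
  }
}
EOF
}
```

Keep account_id as a quoted string in env.hcl; AWS account IDs can start with a zero.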

Interview & exam questions

Is Terragrunt a replacement for Terraform? No. It is a thin wrapper that orchestrates the terraform/tofu binary — no providers, no state store of its own. It adds DRY config generation and dependency-aware multi-unit runs on top of Terraform.
What problem does remote_state solve and how does it stay DRY? It declares the backend once (usually in the root) and generates a backend.tf into each unit’s working directory before Terraform runs, computing the per-unit key from path_relative_to_include(). One source of truth replaces an N-place edit.
dependencies vs dependency — what’s the difference? dependencies { paths = [...] } declares ordering only. dependency "<name>" { config_path = ... } declares ordering and exposes the target unit’s outputs as dependency.<name>.outputs.*. Use dependency when you need outputs (almost always).
Why would a plan fail with “dependency has not been applied yet,” and how do you fix it? Terragrunt reads a dependency’s outputs from its state, which doesn’t exist before the producer is applied. Add mock_outputs plus mock_outputs_allowed_terraform_commands = ["validate","plan"] so plan/validate use placeholders while apply uses real outputs.
What does path_relative_to_include() return and why is it load-bearing? The current unit’s path relative to the included parent (e.g. prod/vpc). It lets a single root config derive a unique state key per unit so no unit hard-codes its own key.
What does find_in_parent_folders do, and what changed recently? It returns the path to the nearest ancestor file of the given name. Recent Terragrunt deprecated the no-argument form — always pass the filename, e.g. find_in_parent_folders("root.hcl").
How does run --all decide order, and what happens on destroy? It builds a DAG from dependency/dependencies edges and runs producers before consumers; for destroy it reverses the order so nothing is destroyed while something still depends on it.
How do you generate a DRY provider, and what does if_exists control? With a generate "provider" block in the root whose contents is the provider HCL. if_exists controls what happens when the target file already exists: overwrite_terragrunt (replace the file only if Terragrunt generated it; the safe usual choice), overwrite, skip, or error.
When is Terragrunt the wrong tool? For a single environment / small project where Terraform’s own repetition isn’t yet painful — Terragrunt adds a layer and a learning curve that buys nothing there. It is justified by environment/account/region multiplicity.
Does Terragrunt work with OpenTofu? Yes — set terraform_binary = "tofu" (or TG_TF_PATH=tofu). Terragrunt orchestrates either engine identically.
What are Terragrunt Stacks? The newer direction: a terragrunt.stack.hcl declares unit/stack blocks that Terragrunt generates into a .terragrunt-stack tree, letting you stamp out a reusable bundle of units per environment from values, on top of the same block/function fundamentals.
How do hooks work, and name the three kinds. Hooks run shell commands around Terraform commands inside the terraform block: before_hook (before a command), after_hook (after, with optional run_on_error), and error_hook (only on failure, matching on_errors).
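The dependencies/dependency and mock_outputs answers are easier to remember side by side. A minimal sketch, assuming sibling units named monitoring and network (the monitoring unit and the mock value are hypothetical):

```hcl
# live/dev/app/terragrunt.hcl (hypothetical path)

include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Ordering only: run after monitoring, read nothing from it.
dependencies {
  paths = ["../monitoring"]
}

# Ordering plus outputs: run after network and read its vpc_id.
dependency "network" {
  config_path = "../network"

  # Placeholder used only for the commands listed below,
  # so plan works before network has any state.
  mock_outputs = {
    vpc_id = "vpc-mock"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id = dependency.network.outputs.vpc_id
}
```

Because apply is not in the allowed list, an apply against an unapplied network unit still fails rather than silently using the mock.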

Quick check

True or false: Terragrunt stores Terraform state in its own database.
Which function gives a unit its path relative to the included parent, so you can build the state key?
You need a unit to run after another and read its vpc_id — which block?
What two things must you set so terragrunt run --all plan succeeds before a dependency has ever been applied?
What is the current, non-deprecated command to apply a whole tree of units in dependency order?

Answers

False. State lives in your Terraform backend (S3/GCS/Azure Blob/etc.); Terragrunt only orchestrates and generates the backend config.
path_relative_to_include() — used as key = "${path_relative_to_include()}/terraform.tfstate".
dependency "<name>" { config_path = ... } — it gives ordering and dependency.<name>.outputs.vpc_id. (dependencies would give ordering only.)
mock_outputs = { ... } and mock_outputs_allowed_terraform_commands = ["validate","plan"] on the dependency block.
terragrunt run --all apply (the older terragrunt run-all apply is deprecated).
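Answers 1 and 2 in config form: a sketch of a root remote_state block for S3, assuming env.hcl provides environment and aws_region locals. The bucket and lock-table names are hypothetical naming conventions.

```hcl
# root.hcl (sketch)

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

remote_state {
  backend = "s3"

  # Write backend.tf into each unit instead of copying it by hand.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket  = "frachtline-tfstate-${local.env.locals.environment}"
    region  = local.env.locals.aws_region
    encrypt = true

    # One key per unit, derived from its path, e.g. prod/vpc/terraform.tfstate.
    key = "${path_relative_to_include()}/terraform.tfstate"

    # DynamoDB locking; S3-native locking via use_lockfile is the
    # alternative where your Terraform and Terragrunt versions support it.
    dynamodb_table = "frachtline-tfstate-locks"
  }
}
```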

Exercise

Take the lab’s dev tree and promote it to a real multi-environment shape:

Add a staging and a prod copy of the network+app units, and introduce an env.hcl per environment holding environment, aws_region, and an app replicas value (e.g. dev=1, staging=2, prod=3). Have the root read it with read_terragrunt_config and feed replicas from there so the only per-environment difference lives in env.hcl.
Switch the remote_state backend from local to s3 (or your cloud’s backend), deriving the key from path_relative_to_include() and the bucket name from the env locals — confirm each unit lands at a distinct, path-derived state key.
Add a third unit, db, between network and app; wire app to depend on both network and db, give db a mock_outputs.endpoint, and prove with terragrunt dag graph that the order is network → db → app and that destroy reverses it.
Add a before_hook that runs terraform fmt -check on plan/apply, and an after_hook that prints the applied unit’s relative path. Confirm both fire during run --all apply.

Success looks like: one root.hcl, one env.hcl per environment, tiny units, path-derived state keys, a correct three-node graph, and not a single copied backend/provider block anywhere.
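If you want a starting point for the hooks step, the following sketch shows one way to write them in root.hcl so that path_relative_to_include() resolves to each unit's path. The hook names are arbitrary, and it assumes the terraform binary is on PATH (swap in tofu if that is your engine) and that the unit's include merges the root's terraform block.

```hcl
# root.hcl (sketch)

terraform {
  # Fails the command if any file in the working directory
  # is not canonically formatted.
  before_hook "fmt_check" {
    commands = ["plan", "apply"]
    execute  = ["terraform", "fmt", "-check"]
  }

  # Prints the unit's path relative to the root after a successful apply.
  after_hook "announce" {
    commands     = ["apply"]
    execute      = ["echo", "applied: ${path_relative_to_include()}"]
    run_on_error = false
  }
}
```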

Certification mapping

This lesson supports the HashiCorp Certified: Terraform Associate (003) objectives — though note Terragrunt is a third-party tool and the exam tests Terraform concepts; Terragrunt is the production wrapper that exercises those concepts at scale:

Objective 5 (interact with Terraform modules) and Objective 8 (read, generate, and modify configuration): Terragrunt is module consumption and config DRY-ness taken to production, with source pinning, inputs, and composition.
Objective 7 (implement and maintain state): remote_state generation, per-unit state keys, locking, and backend choice are the exam’s remote-state/locking topics applied for real.
Objective 6 (use the core Terraform workflow): run --all plan/apply/destroy is the core workflow orchestrated across many units; understand how it differs from a single-directory apply.
Cloud DevOps certs (AWS DevOps Engineer DOP-C02, Azure DevOps Engineer AZ-400, Google Cloud DevOps Engineer) test multi-environment IaC and promotion patterns where this Terragrunt layout is directly applicable.

For the exam itself, be crisp on the plain-Terraform equivalents Terragrunt wraps: remote backends and locking, the terraform_remote_state data source (the manual alternative to dependency), workspaces vs directory-per-environment, and module source pinning.

Glossary

Terragrunt — a thin Go wrapper that orchestrates Terraform/OpenTofu, adding DRY config generation and dependency-aware multi-unit runs.
Unit — one leaf directory with a terragrunt.hcl that points at a module, supplies inputs, and includes the root; the smallest thing Terragrunt applies.
Root config (root.hcl) — the shared parent config (backend, provider, common inputs) that every unit includes.
remote_state block — declares the backend once and generates a backend.tf per unit, with a path-derived state key.
generate block — writes an arbitrary .tf file (commonly the provider) into a unit at runtime; remote_state is a specialised form of it.
dependency block — references another unit and exposes its outputs as dependency.<name>.outputs.*, with ordering implied.
dependencies block — declares run-order edges only, no output passing.
mock_outputs — placeholder outputs that satisfy a dependency during listed commands (usually validate/plan) before the producer has state.
path_relative_to_include() — a unit’s path relative to its included parent; used to derive a unique state key.
find_in_parent_folders("name") — returns the path to the nearest ancestor file of that name.
run --all — runs a command across every unit in a tree in dependency-graph order (reverse for destroy).
DAG — directed acyclic graph; the ordering Terragrunt builds from dependency edges. Cycles are an error.
Stacks / unit/stack blocks — Terragrunt’s newer packaging layer (terragrunt.stack.hcl) that generates a reusable bundle of units per environment.
OpenTofu — the open-source fork of Terraform; Terragrunt drives it via terraform_binary = "tofu".

Next steps

You can now keep a multi-environment Terraform estate DRY: backend and provider generated once, units wired by dependency, and a whole tree applied in order with run --all. Next, put it to work end to end in Multi-Environment 3-Tier Infrastructure with Terragrunt & CI/CD Approval Gates, where you compose app modules from a shared library and promote dev → uat → staging → prod behind approval gates. For the failure modes, see Terraform Troubleshooting: State, Providers, Drift, Dependencies & Debugging, and to place Terragrunt on the broader maturity curve read The Terraform Architecting Ladder: From a Single Module to an Enterprise IaC Platform. If you are deciding whether to adopt it at all, Terraform vs Terragrunt vs Ansible vs Pulumi: Which IaC Tool, When? frames the trade-off.

Written by Vinod
