Terraform Lesson 20 of 57

Terragrunt Configuration, In Depth: Every Block, Function & Hook in terragrunt.hcl

You already know why Terragrunt exists: to delete the copy-pasted backend, provider, and input wiring that breeds across environments. You have stood up a live tree, wired a couple of dependency blocks, and executed run --all against it. This lesson is the other half of that knowledge: a precise, field-by-field reference for the terragrunt.hcl configuration language itself. Not “here is the block you usually use,” but every block Terragrunt understands, every attribute inside each block, what its default is, what each accepted value does, and the gotcha that bites people. The built-in function catalogue gets the same exhaustive treatment (exact signatures, every argument, what each returns), as do hooks, the errors retry/ignore machinery, and the generate-versus-init mechanics that make the magic work.

Think of this as the terragrunt.hcl equivalent of a language spec written for working engineers. The two companion lessons stay deliberately at a working altitude: Terragrunt Fundamentals teaches the blocks well enough to be productive, and Scaling Terragrunt Monorepos teaches orchestration at 200 units. Neither enumerates every attribute, because that would bury the narrative. Here we do exactly that enumeration, so when you hit include_in_copy, mock_outputs_merge_strategy_with_state, if_disabled, the errors block, or read_tfvars_file in a real repository — or an interviewer asks “what does path_relative_from_include return and when would you use it?” — you have the complete map. Everything targets a current Terragrunt release (2026, on the road to 1.0), Terraform 1.9+/OpenTofu, and works identically against either engine.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites

This is an advanced reference, so it assumes the working knowledge from Terragrunt Fundamentals: DRY Configurations, Remote State & Dependencies — what a unit is, the live/modules split, that remote_state and generate write .tf files Terragrunt then hands to the engine, and the basic shape of include and dependency. It also assumes solid Terraform: HCL syntax (blocks, arguments, expressions, for, the type system), backends and state locking, and module sources/versioning. If the orchestration side is what you are after — the DAG, run --all/run --graph, --filter-affected, parallelism, CI — read Scaling Terragrunt Monorepos with Dependency Graphs and run-all instead; this lesson deliberately stays inside the config file and points there for execution. In the KloudVin Terraform & DevOps Zero-to-Hero course this sits in the Terragrunt module as the deep reference between the fundamentals and the multi-environment capstone. You need only a free local toolchain — terragrunt plus terraform or tofu — and the hands-on lab runs entirely on the local backend, so it costs nothing and touches no cloud.

Core concepts: how Terragrunt reads a config

Before the field tables, hold three mechanics in your head, because every attribute below makes sense only against them.

1. Terragrunt parses HCL, then shells out. A terragrunt.hcl is HCL: same lexer, same expression language, same type system as Terraform. Terragrunt evaluates it (resolving includes, locals, functions, and dependency outputs), uses the result to prepare a working directory (download the module named by source, write generated files, fetch dependency outputs for use in inputs, which reach the engine as TF_VAR_* environment variables), and then runs terraform/tofu inside that directory. The config language is therefore a preparation language; it never provisions anything itself.

2. The working directory is a copy, in a cache. When terraform { source = ... } points at a module, Terragrunt copies that module into .terragrunt-cache/<hash>/<hash>/ and runs there — not in your unit directory. This is why generated backend.tf/provider.tf land in the cache, why include_in_copy/exclude_from_copy exist (to control what extra files come along), and why get_terragrunt_dir() (your unit) differs from get_working_dir() (the cache).

3. Evaluation order matters. locals are computed first and can read functions and other locals. include brings in a parent and (with expose = true) its evaluated contents. dependency blocks run output against the target unit to fetch its values. inputs is assembled last and exported as TF_VAR_*. Knowing this order explains why you can reference local.x in inputs but must read cross-unit values as dependency.x.outputs.* rather than capturing them in a local (locals are evaluated before dependencies are resolved), and why mock_outputs exists at all (the producer may have no state yet when its outputs are read).
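
The ordering described above can be seen in one small unit. This is a minimal sketch, not a canonical layout: the env.hcl file, the ../vpc unit, and its vpc_id output are hypothetical names chosen for illustration.

```hcl
# terragrunt.hcl (hypothetical unit)

# locals may call functions and read other locals, but not dependency outputs.
locals {
  env  = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  name = "${local.env.locals.environment}-app"
}

# include pulls in the shared parent config.
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# dependency outputs are fetched from the other unit.
dependency "vpc" {
  config_path = "../vpc"

  # Shape-only placeholder, used while ../vpc has no state yet.
  mock_outputs                            = { vpc_id = "vpc-00000000" }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

# inputs is assembled last, so it can read both locals and dependency outputs.
inputs = {
  name   = local.name
  vpc_id = dependency.vpc.outputs.vpc_id
}
```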

A quick map of the whole vocabulary, grouped by what it does, before we take each in turn:

| Group | Blocks |
| --- | --- |
| Point at and wrap the module | terraform (+ before_hook/after_hook/error_hook, extra_arguments) |
| Generate files | remote_state, generate |
| Compose config | include, locals, inputs |
| Wire units together | dependency, dependencies |
| Control runs & errors | errors (retry/ignore), exclude, feature |
| Package units (Stacks) | unit, stack (in terragrunt.stack.hcl) |
| Discovery & engine | catalog, engine |

The terraform block: every attribute

The terraform block tells a unit which module to run and wraps that module with hooks and extra CLI arguments. It is the one block almost every unit has.

| Attribute / sub-block | Type | Default | What it does · gotcha |
| --- | --- | --- | --- |
| source | string | | The module to run: local path, Git (git::...//subdir?ref=), registry (tfr:///ns/name/aws?version=), or generic getter (S3/GCS/HTTP). The // splits the downloaded archive from the subdir inside it. Pin remote sources with ?ref=/?version=; an unpinned source is a reproducibility incident. |
| extra_arguments | block(s) | | Inject CLI args into specific commands (a -var-file, -lock-timeout, -parallelism). Has its own attributes (below). |
| before_hook "<name>" | block(s) | | Run a command before the listed Terraform commands. |
| after_hook "<name>" | block(s) | | Run a command after the listed commands. |
| error_hook "<name>" | block(s) | | Run a command only when a command errors and the error matches on_errors. |
| include_in_copy | list(string) | [] | Extra glob patterns of files to copy from the unit dir into the working dir alongside the module (e.g. a .tflint.hcl or a *.tpl the module reads). Terragrunt copies a limited set by default; this widens it. |
| exclude_from_copy | list(string) | [] | Glob patterns to skip when copying: the inverse, to keep big or secret files out of the cache. |
| copy_terraform_lock_file | bool | true | Whether to copy .terraform.lock.hcl into the working dir. Set false for remote sources where you do not want the provider-hash lock travelling with the copy. |
| mutable | bool | false | Allow the cached working-dir copy to be edited in place between runs (an experimental performance/iteration aid). Leave off for reproducible runs. |

extra_arguments — its own attributes

extra_arguments is a named sub-block that conditionally appends CLI flags. It is how you attach a -var-file to only plan/apply, or a default -lock-timeout everywhere.

| Attribute | What it does |
| --- | --- |
| commands | The Terraform subcommands these args apply to (e.g. ["plan", "apply"], or get_terraform_commands_that_need_vars() to cover all var-taking commands). |
| arguments | A static list of flags to append (e.g. ["-lock-timeout=20m"]). |
| required_var_files | .tfvars files appended as -var-file=... that must exist (error if missing). |
| optional_var_files | .tfvars files appended only if they exist (great for an optional per-region overrides file). |
| env_vars | A map of environment variables to set for those commands. |

terraform {
  source = "git::git@github.com:frachtline/infra-modules.git//rds?ref=v1.4.0"

  extra_arguments "common_vars" {
    commands           = get_terraform_commands_that_need_vars()  # all var-taking cmds
    required_var_files = ["${get_parent_terragrunt_dir()}/common.tfvars"]
    optional_var_files = ["${get_terragrunt_dir()}/override.tfvars"]
    arguments          = ["-lock-timeout=20m"]
  }
}
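
The example above leaves env_vars and the locking helper unused. The following sketch, assuming you want a lock timeout on every lock-taking command and extra logging on plan/apply only, shows both; the block names and the TF_LOG value are illustrative choices, not requirements.

```hcl
terraform {
  # Applies to every command that accepts -lock-timeout.
  extra_arguments "lock_timeout" {
    commands  = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=20m"]
  }

  # Sets an environment variable for plan and apply only.
  extra_arguments "debug_env" {
    commands = ["plan", "apply"]
    env_vars = {
      TF_LOG = "INFO" # illustrative value
    }
  }
}
```

Splitting the two concerns into separate named blocks keeps each commands list as narrow as the flag or variable it carries.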

Hooks: before_hook, after_hook, error_hook — exhaustively

Hooks run arbitrary shell commands around a Terraform command. They live inside the terraform block. Use them for cross-cutting concerns: a tflint/policy check before plan, a notification after apply, a diagnostic dump on failure. All three kinds share the same attributes (with on_errors being meaningful only on error_hook).

| Attribute | Applies to | Default | What it does · gotcha |
| --- | --- | --- | --- |
| commands | all | (required) | List of Terraform subcommands that trigger the hook: ["plan"], ["apply"], ["init"], etc. A before_hook on ["plan","apply"] fires before each. |
| execute | all | (required) | The command + args as a list: ["bash", "-c", "..."] or ["tflint"]. It is exec-style, not a shell string; wrap shell features in ["bash","-c", "..."]. |
| working_dir | all | the working dir (cache) | Directory to run the command in. Default is the module’s working copy; set it to get_terragrunt_dir() to run against your actual unit. |
| run_on_error | before/after | false | If true, the hook still runs even when a previous hook or the Terraform command failed. The usual way to make an after_hook fire on both success and failure. |
| suppress_stdout | all | false | Swallow the hook command’s stdout (keep noisy linters out of the run log). |
| if | all | true | A boolean expression; the hook is skipped when it evaluates false. Gate hooks by environment, command, or a feature flag. |
| on_errors | error_hook only | | List of regex patterns; the error_hook runs only when the error message matches one (use [".*"] for “any error”). |

terraform {
  source = "../../../modules//app"

  # Lint the module before any plan/apply; suppress its chatter.
  before_hook "tflint" {
    commands        = ["plan", "apply"]
    execute         = ["tflint", "--chdir", get_terragrunt_dir()]
    suppress_stdout = true
  }

  # Notify only on a successful apply (run_on_error defaults to false).
  after_hook "notify_success" {
    commands     = ["apply"]
    execute      = ["bash", "-c", "echo applied ${path_relative_to_include()}"]
    run_on_error = false
  }

  # Always run cleanup after apply, success or failure.
  after_hook "cleanup" {
    commands     = ["apply"]
    execute      = ["bash", "-c", "rm -f /tmp/${path_relative_to_include()}.lock"]
    run_on_error = true
  }

  # Capture diagnostics only when plan/apply errors.
  error_hook "diagnose" {
    commands  = ["plan", "apply"]
    execute   = ["bash", "-c", "echo 'failed — dumping TF_LOG context'"]
    on_errors = [".*"]
  }
}

Two subtleties. First, multiple hooks of the same kind run in declaration order (and before_hooks all run before the command, after_hooks all after). Second, hooks execute arbitrary shell during plan — that is power and risk; treat a terragrunt.hcl’s hooks as code to review, and never run untrusted Terragrunt repositories blindly.

remote_state: generate the backend once

remote_state declares the backend once (usually in the root) and writes a backend block into every unit at init time, computing the per-unit state key from the path. It is the block that deletes backend duplication.

| Attribute / sub-block | Type | Default | What it does · gotcha |
| --- | --- | --- | --- |
| backend | string | | The Terraform backend name: s3, gcs, azurerm, local, etc. Must match a real backend. |
| config | map | {} | The backend’s settings (bucket, key, region, …). Derive key from path_relative_to_include() so no unit hard-codes its key. |
| generate | object | | For example { path = "backend.tf", if_exists = "overwrite_terragrunt" }: where to write the generated backend block and how to handle collisions. If omitted, Terragrunt injects backend config via init -backend-config flags instead of writing a file. |
| disable_init | bool | false | Skip Terragrunt’s auto-creation of backend resources (the S3 bucket / lock table). Turn on when the backend is provisioned separately and you do not want Terragrunt bootstrapping it. |
| disable_dependency_optimization | bool | false | By default Terragrunt skips re-running a dependency’s init when fetching its outputs (an optimisation). Disable only if that optimisation causes a stale-output edge case. |
| encryption | map | | OpenTofu state & plan encryption config: key_provider (pbkdf2, aws_kms, gcp_kms, openbao) plus provider-specific keys. Generates the terraform { encryption {...} } block so encryption-at-rest is DRY too. |

# live/root.hcl
remote_state {
  backend      = "s3"
  disable_init = false                     # let TG create the bucket on first run
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"      # manage only TG-generated files
  }
  config = {
    bucket       = "frachtline-tfstate-${local.env.locals.environment}"
    key          = "${path_relative_to_include()}/terraform.tfstate"   # path IS the key
    region       = "ap-south-1"
    encrypt      = true
    use_lockfile = true                      # S3-native locking (Terraform 1.10+ / OpenTofu 1.10+); no DynamoDB table
  }

  # Optional: OpenTofu state encryption, defined once for every unit.
  encryption = {
    key_provider = "aws_kms"
    kms_key_id   = local.account.locals.state_kms_key_arn
  }
}

The headline is key = "${path_relative_to_include()}/terraform.tfstate": the unit’s path relative to the included parent becomes its unique state key, so live/prod/vpc lands at prod/vpc/terraform.tfstate with zero hard-coding. The fundamentals lesson covers why this DRY-s the backend; the point here is the full attribute set — particularly disable_init (stop auto-bootstrap), disable_dependency_optimization (a stale-output escape hatch), and encryption (DRY state encryption), which the working-altitude lessons do not enumerate.

generate: write any file — every attribute

remote_state is really a specialised generate. The general block writes any file into the working directory at runtime — most commonly the provider, so it too lives in one place. This block has the richest collision/lifecycle surface, and it is where people get surprised.

| Attribute | Type | Default | What it does · gotcha |
| --- | --- | --- | --- |
| path | string | (required) | Filename to write into the working dir (provider.tf, versions.tf). |
| contents | string | (required) | The file body. Usually a heredoc with interpolations. |
| if_exists | string | (required) | Collision behaviour when the file already exists: overwrite_terragrunt (overwrite only files TG generated, the safe choice), overwrite (clobber anything), skip (never touch an existing file), error (fail). |
| if_disabled | string | skip | What to do with a previously-generated file when this block is disabled: remove, remove_terragrunt (remove only TG-generated), or skip. Pairs with disable. |
| disable | bool | false | Turn this generate block off (e.g. behind a feature flag or an if-style local) without deleting the block. |
| comment_prefix | string | # | The comment marker for the signature header TG stamps on generated files (use // for languages where # is invalid). |
| disable_signature | bool | false | Omit the “generated by Terragrunt” header entirely. |
| hcl_fmt | bool | true | Run hcl fmt on generated .hcl/.tf so the output is tidy. Turn off if formatting mangles intentional content. |

# live/root.hcl  (continued)
generate "provider" {
  path              = "provider.tf"
  if_exists         = "overwrite_terragrunt"
  if_disabled       = "remove_terragrunt"
  comment_prefix    = "# "
  disable_signature = false
  contents          = <<-EOF
    provider "aws" {
      region = "${local.env.locals.aws_region}"
      assume_role { role_arn = "${local.account.locals.role_arn}" }
      default_tags { tags = { managed_by = "terragrunt", env = "${local.env.locals.environment}" } }
    }
    terraform {
      required_providers {
        aws = { source = "hashicorp/aws", version = "~> 5.60" }
      }
    }
  EOF
}

The two attributes worth memorising beyond the fundamentals are if_disabled and disable: together they let you turn a generated file off and clean up the stale file it left behind. With disable alone (if_disabled left at its default of skip), the previously generated .tf stays in the cache as an orphan that can break the next run. hcl_fmt is the other quiet one: it formats generated HCL so a terragrunt hcl fmt --check (formerly hclfmt) in CI does not trip over machine-written files.

generate vs init: when files actually appear

A frequent confusion: when do generated files get written, and where? The sequence on any command that prepares a working dir (init, plan, apply, …):

  1. Terragrunt resolves config and copies the source module into .terragrunt-cache/<hash>/<hash>/.
  2. It writes every generate block’s file (and the remote_state-generated backend.tf) into that cache copy, applying if_exists/if_disabled rules.
  3. It runs terraform init in the cache (creating backend resources unless disable_init), then your command.

So generated files live in the cache, not your unit directory — which is why you .gitignore .terragrunt-cache/ and why a remote_state with no generate attribute still works (it passes backend settings as -backend-config CLI flags to init rather than writing a file). If you want the backend as an on-disk file you can read, use the generate form; if you only need init to succeed, the flag form is enough.
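
The flag form has one consequence worth seeing side by side: because no backend.tf is written, the module itself must already declare an (empty) backend block for init to configure. A minimal sketch, assuming an S3 backend; the bucket name and module path are hypothetical.

```hcl
# --- root.hcl: remote_state WITHOUT a generate attribute ---
remote_state {
  backend = "s3"
  config = {
    bucket = "example-tfstate" # hypothetical bucket name
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "ap-south-1"
  }
}

# --- modules/app/backend.tf: plain Terraform inside the module (hypothetical path) ---
terraform {
  # Left empty on purpose; Terragrunt supplies the values above
  # as -backend-config flags when it runs init.
  backend "s3" {}
}
```

With the generate form the second file is unnecessary, which is one reason the generate form is the more common choice when modules are shared with non-Terragrunt callers.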

include: inheriting and merging parents

include is how a unit pulls in shared parent config (the root, and often an env/region layer) so it does not repeat backend, provider, or common inputs. A unit may have several named includes.

| Attribute | Type | Default | What it does · gotcha |
| --- | --- | --- | --- |
| path | string | (required) | Path to the parent config. Almost always find_in_parent_folders("root.hcl") so the same line works at any depth. |
| expose | bool | false | If true, the parent’s evaluated config is readable here as include.<name>.* (e.g. include.env.locals.aws_region). Required to read a parent’s locals. |
| merge_strategy | string | shallow | How the parent’s inputs/generate/etc. combine with the child’s: no_merge (child only; parent ignored for merging), shallow (top-level keys merged, child wins), deep (recursive merge of maps). |

# live/prod/rds/terragrunt.hcl
include "root" {
  path = find_in_parent_folders("root.hcl")
}
include "env" {
  path           = find_in_parent_folders("env.hcl")
  expose         = true            # so we can read include.env.locals.*
  merge_strategy = "deep"          # deep-merge env inputs under root inputs
}

terraform { source = "../../../modules//rds" }

inputs = {
  instance_class = "db.r6g.large"  # overrides any inherited default (child wins)
}

The three things the fundamentals lesson does not spell out: the default merge_strategy is shallow (not no_merge); expose is what makes a parent’s locals visible (without it you only inherit the parent’s merged config and cannot read its values); and you can layer multiple includes (a root plus an env plus a region), each merged in turn, which is the idiomatic way to build per-account/per-region/per-env configuration without repetition.
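
What expose buys you is easiest to see in use. A minimal sketch, assuming an env.hcl parent that defines locals named aws_region and environment (both hypothetical names in this context):

```hcl
include "env" {
  path   = find_in_parent_folders("env.hcl")
  expose = true # without this, include.env.* is not readable below
}

inputs = {
  # Values read from the parent's evaluated locals.
  region      = include.env.locals.aws_region
  environment = include.env.locals.environment
}
```

The alternative, calling read_terragrunt_config on the same file inside locals, works without expose but does not merge the parent's inputs or generate blocks into the unit.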

dependency and dependencies: wiring units

These two build the DAG and pass data. The distinction is load-bearing: dependencies is ordering only; dependency is ordering plus the target’s outputs. The orchestration of that graph — how run --all walks it, mock_outputs_merge_strategy_with_state at scale, run --graph — lives in the monorepo lesson; here is the complete attribute table for the blocks themselves.

dependencies (plural) has exactly one attribute:

| Attribute | What it does |
| --- | --- |
| paths | A list of unit paths that must run before this one. No data crosses; pure sequencing. |

dependency "<name>" (singular) exposes the target’s outputs as dependency.<name>.outputs.* and accepts:

| Attribute | Type | Default | What it does · gotcha |
| --- | --- | --- | --- |
| config_path | string | (required) | Path to the other unit’s directory (the one holding its terragrunt.hcl). |
| enabled | bool | true | If false, the dependency is dropped: no ordering edge, no outputs. Use behind a feature flag to make an edge conditional. |
| skip_outputs | bool | false | Keep the ordering edge but never call output on the target. Combine with mock_outputs to always use mocks (e.g. a unit you order against but whose outputs you do not consume). |
| mock_outputs | map | | Placeholder outputs used as a fallback when the real outputs are unavailable (producer not yet applied). |
| mock_outputs_allowed_terraform_commands | list(string) | all | The commands during which mocks may be used, typically ["validate","plan","init"]. Must exclude apply/destroy so a real apply never runs on fake data. |
| mock_outputs_merge_strategy_with_state | string | no_merge | How mocks combine with partial real state (after you add a new output to an applied module): no_merge, shallow, deep_map_only. shallow is the usual choice. |
| mock_outputs_merge_with_state | bool | | Deprecated boolean predecessor of the strategy attribute; use mock_outputs_merge_strategy_with_state instead. |

# live/prod/app/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
}
dependency "rds" {
  config_path  = "../rds"
  mock_outputs = { endpoint = "mock-endpoint:5432" }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
  mock_outputs_merge_strategy_with_state  = "shallow"
}

# Ordering only — wait for the IAM baseline, consume nothing from it.
dependencies {
  paths = ["../../_baseline/iam"]
}

inputs = {
  vpc_id      = dependency.vpc.outputs.vpc_id
  db_endpoint = dependency.rds.outputs.endpoint
}

The attributes the working lessons gloss: enabled (drop an edge conditionally — pairs beautifully with feature flags), skip_outputs (“always mock,” distinct from mock_outputs’s “mock only as fallback”), and the deprecation of the boolean mock_outputs_merge_with_state in favour of the three-valued mock_outputs_merge_strategy_with_state. Mocks are for shape (so validate/plan parse), never for values you depend on at apply time.
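
The enabled and skip_outputs attributes do not appear in the example above, so here is a hedged sketch of both. The feature name with_cache, the ../cache and ../audit units, and the endpoint output are all hypothetical; try() is used so the input still evaluates when the edge is switched off.

```hcl
feature "with_cache" {
  default = false
}

# Conditional edge: dropped entirely (no ordering, no outputs) when the flag is false.
dependency "cache" {
  config_path = "../cache"
  enabled     = feature.with_cache.value

  mock_outputs                            = { endpoint = "mock-cache:6379" }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

# Ordering edge only: output is never called on the target.
dependency "audit" {
  config_path  = "../audit"
  skip_outputs = true
}

inputs = {
  # Falls back to an empty string when the cache dependency is disabled.
  cache_endpoint = try(dependency.cache.outputs.endpoint, "")
}
```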

inputs and locals

Two small but constant blocks.

inputs is a single map (not a labelled block) that Terragrunt converts into TF_VAR_* environment variables for the module — the Terragrunt equivalent of a .tfvars. Inputs from an included parent and from the unit merge per the include’s merge_strategy, child winning. Because it becomes env vars, every value must be expressible as a string/JSON — complex objects are JSON-encoded automatically.

locals declares local values, exactly like Terraform’s locals, evaluated before the rest of the config. They are where you call read_terragrunt_config(...) to load shared .hcl files and where you compute derived values. A unit can read its own local.*; to read a parent’s locals you must include it with expose = true and reference include.<name>.locals.*.

locals {
  account = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  env     = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  name    = "${local.env.locals.environment}-frachtline"   # derived
}

inputs = {
  name_prefix = local.name
  tags        = { environment = local.env.locals.environment }
}

The errors block: retry and ignore (and what it replaced)

Transient failures (an API throttle, an eventually-consistent IAM role, a flaky registry) used to be handled by the top-level retryable_errors, retry_max_attempts, and retry_sleep_interval_sec attributes. Those are deprecated (slated for removal in Terragrunt 1.0) in favour of a structured errors block with retry and ignore sub-blocks. The old skip flag is deprecated as well, but its replacement is the exclude block covered in the next section, not errors. The new model is strictly more capable: multiple retry rules, per-rule attempt/sleep settings, ignoring expected errors, and signalling external systems.

errors contains any number of retry "<name>" and ignore "<name>" sub-blocks:

| Sub-block | Attribute | What it does |
| --- | --- | --- |
| retry "<name>" | retryable_errors | List of regex patterns; a matching error triggers a retry. |
| retry "<name>" | max_attempts | Maximum number of attempts for this rule. |
| retry "<name>" | sleep_interval_sec | Seconds to wait between attempts. |
| ignore "<name>" | ignorable_errors | List of regex patterns whose matching errors are swallowed (treated as non-fatal). |
| ignore "<name>" | message | Optional warning printed when an error is ignored. |
| ignore "<name>" | signals | Map of key/values written to a signals file to notify external systems that an ignore fired. |

# live/root.hcl — applies to every unit that includes it
errors {
  retry "transient_cloud" {
    retryable_errors   = [".*RequestLimitExceeded.*", ".*throttl.*", ".*timeout.*"]
    max_attempts       = 3
    sleep_interval_sec = 5
  }
  retry "eventual_iam" {
    retryable_errors   = [".*NoSuchEntity.*", ".*role .* does not exist.*"]
    max_attempts       = 4
    sleep_interval_sec = 10
  }
  ignore "known_benign" {
    ignorable_errors = [".*does not need to be updated.*"]
    message          = "Ignoring benign no-op error"
    signals          = { ignored = "true" }
  }
}

When several retry blocks are present, Terragrunt collects all their retryable_errors patterns for matching and applies the matching block’s own max_attempts/sleep_interval_sec. You can seed the patterns with Terragrunt’s built-in defaults via get_default_retryable_errors() and append your own. Migrate any legacy retryable_errors = [...] you find to a retry block now — it will stop working at 1.0, and terragrunt info / deprecation warnings in your logs are flagging exactly this.
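
Seeding a rule with the built-in defaults looks like the sketch below. The rule name and the extra pattern are hypothetical; concat is the standard HCL list function.

```hcl
errors {
  retry "defaults_plus_custom" {
    # Start from Terragrunt's built-in list, then append project-specific patterns.
    retryable_errors = concat(
      get_default_retryable_errors(),
      [".*ProvisionedThroughputExceeded.*"] # hypothetical extra pattern
    )
    max_attempts       = 3
    sleep_interval_sec = 5
  }
}
```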

feature and exclude: conditional configuration

Two newer blocks (part of the road to 1.0) that make config conditional without resorting to clever locals gymnastics.

feature "<name>" declares a feature flag with a default, overridable at the CLI (--feature name=value) or via TG_FEATURE. Read it as feature.<name>.value. It is the clean way to toggle a hook, a generate block, or a dependency edge per run.

| Block | Attribute | What it does |
| --- | --- | --- |
| feature "<name>" | default | The flag’s default value (any type via expression); override at runtime with --feature <name>=<value>. |

exclude dynamically removes a unit from a run based on a condition — the modern, in-config replacement for scattering --queue-exclude-dir flags or the deprecated skip = true.

| Block | Attribute | Default | What it does |
| --- | --- | --- | --- |
| exclude | if | (required) | Boolean condition; when true the unit is excluded. |
| exclude | actions | | Which actions to exclude: ["plan"], ["apply"], ["all"], or ["all_except_output"] (still readable as a dependency). |
| exclude | exclude_dependencies | false | Also exclude this unit’s dependencies (the units it depends on). |
| exclude | no_run | false | Prevent the unit running for single (non-run --all) commands too. |

feature "enable_waf" {
  default = false
}

# Skip this unit's apply in ephemeral preview environments, but keep its
# outputs readable so dependents can still plan.
exclude {
  if      = local.env.locals.environment == "preview"
  actions = ["all_except_output"]
}

# Use the flag to toggle a generate block (via disable):
generate "waf" {
  path      = "waf.tf"
  if_exists = "overwrite_terragrunt"   # if_exists is required on every generate block
  disable   = !feature.enable_waf.value
  contents  = "# ... WAF resources ..."
}

exclude’s all_except_output is the subtle, valuable mode: the unit will not apply in this run, but its existing outputs remain readable so dependency consumers still get real values — exactly what you want when freezing one tier while iterating on another.
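
Besides a generate block, a feature flag can gate a hook through the hook's if attribute. A minimal sketch, assuming a flag named run_lint (hypothetical) and tflint on the PATH:

```hcl
feature "run_lint" {
  default = true
}

terraform {
  before_hook "tflint" {
    commands = ["plan"]
    execute  = ["tflint"]
    if       = feature.run_lint.value # hook is skipped when the flag is false
  }
}

# Override for a single run, using the CLI form described above:
#   terragrunt plan --feature run_lint=false
```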

Stacks: unit and stack (and vs Terraform Stacks)

The classic model is a hand-built tree of terragrunt.hcl units wired by dependency. Terragrunt’s newer Stacks let you generate that tree from a declaration: a terragrunt.stack.hcl file lists unit (and nested stack) blocks, and terragrunt stack generate stamps them into a .terragrunt-stack/ directory from values — so a reusable “VPC + RDS + app” bundle is described once and instantiated per environment.

| Block | Attribute | Default | What it does |
| --- | --- | --- | --- |
| unit "<name>" | source | | Where the unit’s config comes from (a local path or remote template). |
| unit "<name>" | path | | Where to generate it (relative deploy path under .terragrunt-stack/). |
| unit "<name>" | values | | A map fed into the unit to customise it. |
| unit "<name>" | no_dot_terragrunt_stack | false | Generate outside .terragrunt-stack/. |
| unit "<name>" | no_validation | false | Skip validation of the generated unit. |
| stack "<name>" | source / path / values | | Same shape, but instantiates a whole nested stack. |

# terragrunt.stack.hcl
unit "vpc" {
  source = "${get_repo_root()}/units/vpc"
  path   = "vpc"
  values = { cidr = "10.20.0.0/16" }
}
unit "app" {
  source = "${get_repo_root()}/units/app"
  path   = "app"
  values = { replicas = 3 }
}
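
On the receiving side, the unit template reads what the stack file passed in through the values object. The sketch below assumes a units/vpc/terragrunt.hcl template, a root.hcl somewhere above the generated tree, and a module variable called cidr_block; all three are hypothetical.

```hcl
# units/vpc/terragrunt.hcl (the template that unit "vpc" points at)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  source = "${get_repo_root()}/modules//vpc" # hypothetical module location
}

inputs = {
  # Supplied by `values = { cidr = "10.20.0.0/16" }` in terragrunt.stack.hcl.
  cidr_block = values.cidr
}
```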

Crucially, do not confuse Terragrunt Stacks with HashiCorp’s Terraform Stacks. They are different layers from different vendors: Terragrunt Stacks (terragrunt.stack.hcl, unit/stack blocks, a Gruntwork feature) generate Terragrunt units; Terraform Stacks (*.tfstack.hcl + *.tfdeploy.hcl, deployments/components, an HCP feature) are a native Terraform construct for multi-deployment orchestration. They solve overlapping problems by different means; an interviewer asking “Terragrunt vs Terraform Stacks” wants you to know they are not the same thing. The block/function fundamentals in this lesson carry straight into Terragrunt Stacks; Terraform Stacks are a separate model entirely.

The built-in function catalogue

Terragrunt config is dynamic because of its built-in functions — and it supports every Terraform/OpenTofu built-in function too (merge, lookup, jsonencode, format, try, the for machinery, startswith/endswith/strcontains, etc.), so you have both vocabularies available. Below is the Terragrunt-specific catalogue with exact signatures.

Path and directory functions

| Function | Signature | Returns / does |
| --- | --- | --- |
| find_in_parent_folders | find_in_parent_folders(name, [fallback]) | Walks up from the current dir to the nearest ancestor file named name; returns its path (or fallback if none). Always pass a name; the no-arg form is deprecated. |
| path_relative_to_include | path_relative_to_include([name]) | This unit’s path relative to the included parent (e.g. prod/vpc). The state-key workhorse. Optional name selects which include when several exist. |
| path_relative_from_include | path_relative_from_include([name]) | The inverse: the included parent’s path relative to this unit (e.g. ../../..). Use to build relative source paths back to a shared modules dir. |
| get_terragrunt_dir | get_terragrunt_dir() | Absolute path of the current unit’s directory (where its terragrunt.hcl is). Reference files next to the unit. |
| get_original_terragrunt_dir | get_original_terragrunt_dir() | Absolute path of the dir of the originally invoked terragrunt.hcl (differs from the above inside read_terragrunt_config/includes). |
| get_parent_terragrunt_dir | get_parent_terragrunt_dir([name]) | Absolute path of the dir holding the included parent (the live-tree root). Anchor paths to the root. |
| get_working_dir | get_working_dir() | Absolute path of the cache working dir where Terragrunt actually runs the engine (not your unit dir). |
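
The two path_relative_* functions are easiest to keep apart in a concrete layout. The sketch below assumes root.hcl lives in live/, the unit in live/prod/vpc/, and shared modules in a modules/ directory beside live/; that layout is an assumption for illustration.

```hcl
# live/prod/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  # path_relative_from_include() walks from this unit back up to live/;
  # one more ".." reaches the directory that holds modules/.
  source = "${path_relative_from_include()}/../modules//vpc"
}

inputs = {
  # path_relative_to_include() is the unit's path below live/,
  # the same value the root uses to build the state key.
  unit_path = path_relative_to_include()

  # Absolute path of this unit, handy for files kept next to terragrunt.hcl.
  policy_file = "${get_terragrunt_dir()}/policy.json" # hypothetical file
}
```

Because the source line never names prod or vpc's depth explicitly, the same line can be pasted into a unit at any depth under live/.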

Repository functions

| Function | Signature | Returns / does |
| --- | --- | --- |
| get_repo_root | get_repo_root() | Absolute path to the Git repo root. |
| get_path_from_repo_root | get_path_from_repo_root() | Path from the repo root to the current dir. |
| get_path_to_repo_root | get_path_to_repo_root() | Relative path from the current dir back to the repo root (e.g. ../../..). |
| get_platform | get_platform() | The OS identifier Terragrunt is running on (linux, darwin, windows). |

Environment, command, and execution functions

| Function | Signature | Returns / does |
| --- | --- | --- |
| get_env | get_env(name, [default]) | An environment variable’s value, or default (error if absent and no default). Inject CI-provided values without hard-coding. |
| get_terraform_command | get_terraform_command() | The Terraform subcommand currently running (plan, apply, …). Useful for conditional hooks. |
| get_terraform_cli_args | get_terraform_cli_args() | The CLI args passed to the current command. |
| get_terraform_commands_that_need_vars | get_terraform_commands_that_need_vars() | List of commands that accept -var; handy for extra_arguments.commands. |
| get_terraform_commands_that_need_input | get_terraform_commands_that_need_input() | List of commands that accept -input. |
| get_terraform_commands_that_need_locking | get_terraform_commands_that_need_locking() | List of commands that accept -lock-timeout. |
| get_terraform_commands_that_need_parallelism | get_terraform_commands_that_need_parallelism() | List of commands that accept -parallelism. |
| run_cmd | run_cmd(command, [args...]) | Shells out and returns stdout (cached per identical invocation). Flags --terragrunt-quiet, --terragrunt-global-cache, --terragrunt-no-cache tune logging/caching. |
| get_default_retryable_errors | get_default_retryable_errors() | Terragrunt’s built-in list of retryable-error regexes; seed your errors/retry patterns with these and append. |
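
A short sketch of get_env and run_cmd together, assuming a CI system that may export a variable named TF_ENVIRONMENT (a hypothetical name) and that git is available on the PATH:

```hcl
locals {
  # Falls back to "dev" when TF_ENVIRONMENT is unset.
  environment = get_env("TF_ENVIRONMENT", "dev")

  # Short commit hash for tagging; --terragrunt-quiet keeps the
  # command's output out of the Terragrunt log.
  git_sha = run_cmd("--terragrunt-quiet", "git", "rev-parse", "--short", "HEAD")
}

inputs = {
  environment = local.environment
  tags        = { commit = local.git_sha }
}
```

Both values now depend on the machine and checkout the config runs in, which is exactly the environment coupling the cautions at the end of this catalogue are about.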

AWS helper functions

Function Signature Returns / does
get_aws_account_id get_aws_account_id() The account ID of the current AWS credentials. Use to assert the right account or build account-scoped names.
get_aws_account_alias get_aws_account_alias() The account alias (or empty string).
get_aws_caller_identity_arn get_aws_caller_identity_arn() The ARN of the current identity.
get_aws_caller_identity_user_id get_aws_caller_identity_user_id() The UserId of the current identity.
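
One common use of the AWS helpers is building account-scoped names in the root configuration. The sketch below assumes an S3 backend; the bucket naming scheme and the region are placeholders, not values from this lesson, and the functions need valid AWS credentials at parse time.

```hcl
# root.hcl (illustrative; bucket naming scheme and region are assumptions)
locals {
  # Resolved from whatever AWS credentials are active when Terragrunt parses
  # the config, so the value differs per account.
  account_id = get_aws_account_id()
}

remote_state {
  backend  = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite_terragrunt" }
  config = {
    bucket  = "tfstate-${local.account_id}" # hypothetical naming scheme
    key     = "${path_relative_to_include()}/terraform.tfstate"
    region  = "eu-west-1" # placeholder region
    encrypt = true
  }
}
```

Because the bucket name embeds the account ID, running with the wrong credentials points at a different (probably nonexistent) bucket rather than silently writing into another account's state.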

Config-reading, secrets, and version functions

Function Signature Returns / does
read_terragrunt_config read_terragrunt_config(path, [default]) Parse another .hcl file into an object; access its .locals, .inputs, etc. The way you load shared account.hcl/env.hcl.
read_tfvars_file read_tfvars_file(path) Read a .tfvars / .tfvars.json file and return its variables as a map — reuse Terraform-format vars in Terragrunt config.
sops_decrypt_file sops_decrypt_file(path) Decrypt a SOPS-encrypted file at config time; return its contents. Bring secrets in safely (do not commit plaintext).
get_terragrunt_source_cli_flag get_terragrunt_source_cli_flag() The value of the --source CLI flag / TG_SOURCE (override module source globally, e.g. to a local checkout).
mark_as_read mark_as_read(path) Mark a file as “read” by this unit so --queue-include-units-reading <file> fans the change out to it (for files Terragrunt cannot auto-detect).
constraint_check constraint_check(version, constraint) Boolean: does version satisfy constraint (e.g. ">= 1.9.0")? Guard config on tool versions.
deep_merge deep_merge(m1, m2, ...) Recursively merge maps (behind the deep-merge experiment) — deeper than HCL’s merge.
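
A short sketch of the config-reading functions working together, under stated assumptions: env.hcl follows the shape used in this lesson's lab, while overrides.hcl, secrets.enc.yaml, and the db_password key are hypothetical names introduced only for illustration.

```hcl
locals {
  # Parse the shared env file found by walking up the tree.
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))

  # Optional per-unit overrides: if the (hypothetical) file does not exist,
  # the second argument is returned instead.
  overrides = read_terragrunt_config(
    "${get_terragrunt_dir()}/overrides.hcl",
    { inputs = {} }
  )

  # sops_decrypt_file returns the decrypted text; decode it to get a map.
  # secrets.enc.yaml is a hypothetical SOPS-encrypted file beside the unit.
  secrets = yamldecode(sops_decrypt_file("${get_terragrunt_dir()}/secrets.enc.yaml"))
}

inputs = merge(
  {
    environment = local.env.locals.environment
    db_password = local.secrets.db_password # hypothetical key
  },
  local.overrides.inputs
)
```

Keeping the decode step explicit makes it obvious which format the secret file uses, and the default on read_terragrunt_config lets units without an overrides file share the same template.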

Two standing cautions for the dynamic functions. First, get_env, run_cmd, and the AWS/sops functions make a config’s behaviour depend on the environment it runs in — powerful for CI, but “the same code” can differ per machine, so document those dependencies and assert the blast radius with get_aws_account_id()/get_aws_caller_identity_arn(). Second, prefer find_in_parent_folders("<name>") with an explicit filename — the no-argument form is deprecated.

Architecture overview

Terragrunt config-language map: blocks, functions and hooks

The diagram lays the whole config surface out at once: a single root.hcl carrying remote_state, generate, and errors; a unit’s terragrunt.hcl with its terraform block (and the before_hook/after_hook/error_hook ring around the Terraform command), include arrows up to the root and an env layer, dependency edges to sibling units feeding inputs, and the built-in functions (find_in_parent_folders, path_relative_to_include, read_terragrunt_config, get_env) annotated where each is used. It is the mental index for everything tabulated above — which block holds which attribute, and which function feeds which field.

Hands-on lab

This lab exercises the config language itself: include with expose/merge_strategy, generate with if_disabled/disable, a full set of hooks, the errors block, a feature flag, and a handful of functions. Everything runs on the local backend with the null/random providers, so it costs nothing and needs no cloud account (the first init still downloads the two providers from the registry). You need terragrunt and terraform (or tofu) on your PATH.

1. Scaffold.

mkdir -p tg-cfg-lab/modules/app tg-cfg-lab/live/dev/app
cd tg-cfg-lab

2. A tiny module. Create modules/app/main.tf:

variable "name"     { type = string }
variable "replicas" { type = number }
resource "random_id" "id" { byte_length = 4 }
resource "null_resource" "app" { triggers = { name = var.name, replicas = var.replicas } }
output "app_id" { value = "app-${random_id.id.hex}" }

3. Shared env locals. Create live/dev/env.hcl:

locals {
  environment = "dev"
  replicas    = 1
}

4. The DRY root: a generate block for provider requirements, an errors retry rule, and a feature flag. Create live/root.hcl:

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

remote_state {
  backend  = "local"
  generate = { path = "backend.tf", if_exists = "overwrite_terragrunt" }
  config   = { path = "${get_terragrunt_dir()}/terraform.tfstate" }
}

generate "versions" {
  path        = "versions.tf"
  if_exists   = "overwrite_terragrunt"
  if_disabled = "remove_terragrunt"
  contents    = <<-EOF
    terraform {
      required_providers {
        random = { source = "hashicorp/random" }
        null   = { source = "hashicorp/null" }
      }
    }
  EOF
}

feature "noisy_hooks" {
  default = true
}

errors {
  retry "transient" {
    retryable_errors   = [".*timeout.*", ".*temporarily unavailable.*"]
    max_attempts       = 2
    sleep_interval_sec = 1
  }
}

inputs = {
  name = "frachtline-${local.env.locals.environment}"
}

5. The unit — two includes (one exposed), hooks using if/suppress_stdout/run_on_error, and inputs that mix inherited and derived values. Create live/dev/app/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders("root.hcl")
}
include "env" {
  path           = find_in_parent_folders("env.hcl")
  expose         = true
  merge_strategy = "deep"
}

terraform {
  source = "../../../modules//app"

  before_hook "announce" {
    commands        = ["plan", "apply"]
    # Two include blocks exist in this unit, so name the one to measure against.
    execute         = ["bash", "-c", "echo '>> preparing ${path_relative_to_include("root")}'"]
    suppress_stdout = false
    if              = feature.noisy_hooks.value          # gated by the flag
  }
  after_hook "done" {
    commands     = ["apply"]
    execute      = ["bash", "-c", "echo '<< applied, run_on_error=true'"]
    run_on_error = true
  }
  error_hook "oops" {
    commands  = ["plan", "apply"]
    execute   = ["bash", "-c", "echo 'an error occurred'"]
    on_errors = [".*"]
  }
}

inputs = {
  replicas = include.env.locals.replicas    # read the exposed parent local
}

6. Plan — watch the hook, the generated files, and the functions resolve.

cd live/dev/app
terragrunt plan

Expected: the before_hook prints >> preparing dev/app (proving path_relative_to_include() and the feature-gated if), and the plan shows replicas = 1 (the exposed include.env.locals.replicas) and name = "frachtline-dev" (inherited from the root’s inputs).

7. Apply — confirm after_hook and the generated files in the cache.

terragrunt apply -auto-approve
find .terragrunt-cache \( -name backend.tf -o -name versions.tf \)

Expected: the after_hook prints << applied..., and both generated files exist inside the cache (not the unit dir) — the generate-vs-init mechanic made visible.

8. Toggle the feature flag off — the hook disappears.

terragrunt plan --feature noisy_hooks=false

Expected: the >> preparing ... line is gone, because the before_hook’s if = feature.noisy_hooks.value is now false.

9. Cleanup.

terragrunt destroy -auto-approve
cd ../../../..
rm -rf tg-cfg-lab

Cost note: zero. The lab uses the local backend and the null/random providers — nothing is created in any cloud, so there is nothing to bill; cleanup is deleting the directory.

Common mistakes & troubleshooting

Symptom | Cause | Fix
find_in_parent_folders: no arguments deprecation warning | Calling the no-arg form | Always pass the filename: find_in_parent_folders("root.hcl").
Generated .tf not where you expected | Files are written to .terragrunt-cache/..., not the unit dir | Look in the cache; .gitignore it. Use generate { path = ... } only to control the filename, not the location.
Disabling a generate block breaks the next run | The previously-generated file is orphaned in the cache | Set if_disabled = "remove_terragrunt" (with disable = true) so the stale file is cleaned up.
Can’t read a parent’s locals | include lacks expose = true | Add expose = true and reference include.<name>.locals.*.
Child inputs unexpectedly replaced/merged with parent | merge_strategy mismatch (default is shallow) | Set merge_strategy explicitly: no_merge (child only), shallow, or deep.
retryable_errors/retry_max_attempts warns as deprecated | Legacy top-level retry attributes | Move them into an errors { retry "..." { retryable_errors=... max_attempts=... } } block.
dependency ... has not been applied yet on plan | Reading outputs before the producer has state | Add mock_outputs + mock_outputs_allowed_terraform_commands = ["validate","plan"].
A new output on an applied dependency breaks every downstream plan | mock_outputs_merge_strategy_with_state = "no_merge" (default) | Set it to shallow so mocks fill only the absent new key while real values are used elsewhere.
Hook shell features (pipes, &&, $VAR) don’t work | execute is exec-style, not a shell | Wrap the command in a shell, as the lab’s hooks do: execute = ["bash", "-c", "..."].
A shared .hcl/template change isn’t detected by --queue-include-units-reading | Terragrunt can’t auto-detect the read | Call mark_as_read("<path>") in the units that consume it.
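
The two dependency rows above can be sketched together as follows. The sibling unit ../net, the vpc_id output, and its placeholder value are hypothetical; only the attribute names come from this lesson.

```hcl
dependency "net" {
  config_path = "../net" # hypothetical sibling unit

  # Fallback values used when the producer has no real outputs yet.
  mock_outputs = {
    vpc_id = "vpc-00000000" # placeholder, never a real ID
  }

  # Only allow the mocks for read-only commands, never for apply.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]

  # When the producer is applied but lacks a newly added output,
  # fill just the missing keys from the mocks and keep real values elsewhere.
  mock_outputs_merge_strategy_with_state = "shallow"
}

inputs = {
  vpc_id = dependency.net.outputs.vpc_id
}
```

Restricting the allowed commands is the safety net: an apply that would otherwise proceed on a mock value fails instead.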

Best practices

Security notes

Interview & exam questions

  1. What is the difference between get_terragrunt_dir() and get_working_dir()? get_terragrunt_dir() is your unit’s directory (where its terragrunt.hcl lives); get_working_dir() is the .terragrunt-cache/... directory Terragrunt copies the module into and actually runs the engine in. Generated files land in the latter.

  2. path_relative_to_include() vs path_relative_from_include()? The first returns the unit’s path relative to the included parent (e.g. prod/vpc) — used to derive the state key. The second is the inverse, the parent’s path relative to the unit (e.g. ../../..) — used to build relative source paths back to a shared modules directory.

  3. Name the three hook kinds and the attribute that makes an after_hook fire on failure. before_hook, after_hook, error_hook. run_on_error = true makes an after_hook run even when the command (or a prior hook) failed. error_hook runs only on failure, matching on_errors regexes.

  4. What replaced retryable_errors/retry_max_attempts, and why? The structured errors block with retry/ignore sub-blocks. It allows multiple rules with per-rule max_attempts/sleep_interval_sec, plus ignoring expected errors and signalling external systems — capabilities the flat attributes lacked. The legacy attributes are deprecated, with removal planned for 1.0.

  5. What does if_exists = "overwrite_terragrunt" mean, and how does it differ from overwrite? overwrite_terragrunt only manages/overwrites files Terragrunt itself generated (it will not clobber a hand-written file); overwrite clobbers any file. The other choices are skip (never touch an existing file) and error (fail if it exists).

  6. You disable a generate block. What stops the previously-generated file breaking the next run? Setting if_disabled = "remove_terragrunt" (alongside disable = true) so Terragrunt removes the stale file it generated, rather than leaving an orphan in the cache.

  7. skip_outputs vs mock_outputs on a dependency? skip_outputs = true means never call output on the target (always use mocks / no data) while keeping the ordering edge. mock_outputs provides fallback values used only when real outputs are unavailable. They answer different questions; do not expect skip_outputs to mean “mock only if absent.”

  8. What does mock_outputs_merge_strategy_with_state solve, and what’s the sensible value? When you add a new output to an already-applied dependency, no_merge (the default) makes every downstream plan fail until you re-apply the producer. shallow lets the plan use the mock for just the new key while using real values for the rest. deep_map_only recurses into map outputs.

  9. How do you read a parent’s locals from a child unit? include the parent with expose = true and reference include.<name>.locals.*. Without expose, you inherit the parent’s merged config but cannot read its values.

  10. Difference between Terragrunt Stacks and Terraform Stacks? Terragrunt Stacks (terragrunt.stack.hcl, unit/stack blocks) are a Gruntwork feature that generates Terragrunt units from a declaration. Terraform Stacks (*.tfstack.hcl/*.tfdeploy.hcl, components/deployments) are a native HashiCorp construct for multi-deployment orchestration. Different vendors and different layers; they are not interchangeable.

  11. What is extra_arguments for, and how do you apply a -var-file to only the var-taking commands? It conditionally appends CLI flags/var-files to specific commands. Set commands = get_terraform_commands_that_need_vars() and list the file under required_var_files (must exist) or optional_var_files (only if present).

  12. Why is remote_state without a generate attribute still valid? Without generate, Terragrunt passes the backend settings to terraform init as -backend-config=... flags instead of writing a backend.tf. The generate form is only needed when you want the backend block as an on-disk file.
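
Questions 6 and 11 are easier to answer with a concrete shape in mind. The fragment below is a sketch only: the custom_provider flag and the two .tfvars filenames are hypothetical, and the provider body is a stand-in.

```hcl
# Hypothetical flag controlling whether the provider file is generated.
feature "custom_provider" {
  default = true
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"

  # When the flag is false the block is disabled, and the file Terragrunt
  # generated on an earlier run is removed rather than left as an orphan.
  disable     = !feature.custom_provider.value
  if_disabled = "remove_terragrunt"

  contents = <<-EOF
    provider "null" {}
  EOF
}

terraform {
  extra_arguments "shared_vars" {
    # Apply the var files only to commands that accept -var / -var-file.
    commands = get_terraform_commands_that_need_vars()

    # Must exist, otherwise the run fails.
    required_var_files = ["${get_terragrunt_dir()}/common.tfvars"]

    # Used only if present.
    optional_var_files = ["${get_terragrunt_dir()}/local.tfvars"]
  }
}
```

Running with --feature custom_provider=false should then drop provider.tf from the cache on the next run, which is the behaviour question 6 asks about.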

Quick check

  1. Where do generate/remote_state write their files — your unit directory or somewhere else?
  2. Which include attribute lets a child read the parent’s locals?
  3. Name the modern block that replaced the deprecated retryable_errors attribute.
  4. Which dependency attribute (and value) lets a newly-added output not break every downstream plan?
  5. Which function returns the unit’s path relative to the included parent, used to build the state key?

Answers

  1. The cache: .terragrunt-cache/<hash>/<hash>/, the working dir Terragrunt runs the engine in (not the unit dir). .gitignore it.
  2. expose = true — then reference include.<name>.locals.*.
  3. The errors block (with retry/ignore sub-blocks).
  4. mock_outputs_merge_strategy_with_state = "shallow" (the default no_merge is what breaks them).
  5. path_relative_to_include() — used as key = "${path_relative_to_include()}/terraform.tfstate".

Exercise

Take the lab’s dev/app unit and exercise the rest of the config surface:

  1. Add a second unit, net, and wire app to it with a dependency that has mock_outputs, mock_outputs_allowed_terraform_commands = ["validate","plan"], and mock_outputs_merge_strategy_with_state = "shallow". Prove with terragrunt run --all plan that app plans on the mock before net is applied, then on the real output after.
  2. Add an extra_arguments block to app’s terraform block that appends a -lock-timeout=10m to every command that takes a state lock (use get_terraform_commands_that_need_locking()), and confirm it in TF_LOG=debug output.
  3. Add an exclude block to net with actions = ["all_except_output"] gated on a feature flag, and show that with the flag on, run --all apply skips net’s apply but app still reads its real outputs.
  4. Convert the root’s single retry into two retry blocks (one for throttling, one for eventual-consistency) seeded with get_default_retryable_errors(), and a generate "provider" whose disable is driven by a feature flag with if_disabled = "remove_terragrunt"; toggle the flag and confirm the generated file appears and disappears in the cache.

Success looks like: a working dependency with merge-aware mocks, a -lock-timeout injected via extra_arguments, an exclude that freezes a unit while keeping its outputs readable, two targeted retry rules, and a feature-flag-driven generate block that cleans up after itself.
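
As a starting point for item 3, a feature-gated exclude block might look like the sketch below. The freeze_net flag name is hypothetical, and this is one plausible shape rather than the only valid answer.

```hcl
# live/dev/net/terragrunt.hcl (fragment; freeze_net is a hypothetical flag name)
feature "freeze_net" {
  default = false
}

exclude {
  # Exclusion applies only while the flag is on.
  if = feature.freeze_net.value

  # Skip every action except reading outputs, so dependents can still
  # resolve dependency.net.outputs.* from the existing state.
  actions = ["all_except_output"]
}
```

With this in place, passing --feature freeze_net=true to run --all apply is the toggle the exercise asks you to demonstrate.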

Certification mapping

This lesson supports the HashiCorp Certified: Terraform Associate (003) objectives — with the caveat that Terragrunt is a third-party tool and the exam tests Terraform itself; Terragrunt is the production wrapper that exercises those concepts:

For the exam, be crisp on the Terraform primitives Terragrunt wraps: backends and locking, the terraform_remote_state data source (the manual alternative to dependency), -backend-config/-var-file flags (what extra_arguments injects), and module source pinning.

Glossary

Next steps

You now have the complete terragrunt.hcl reference — every block and its attributes, every built-in function, hooks end to end, the errors retry/ignore model, and the generate-vs-init mechanics. To put the fundamentals in narrative context (the live/modules split, the DRY rationale, the on-ramp), see Terragrunt Fundamentals: DRY Configurations, Remote State & Dependencies. For the orchestration side — how run --all/run --graph walk the DAG, change-aware selective execution, parallelism, and CI at 200 units — read Scaling Terragrunt Monorepos with Dependency Graphs and run-all. Then apply all of it end to end in Multi-Environment 3-Tier Infrastructure with Terragrunt & CI/CD Approval Gates, where this config language drives a real dev→uat→staging→prod promotion pipeline behind graduated approval gates.

Tags: Terragrunt, Terraform, OpenTofu, HCL, Hooks, DevOps