Terraform Lesson 17 of 57

Terraform Provisioners, In Depth: local-exec, remote-exec, connection, null_resource & terraform_data

Every Terraform codebase eventually reaches the moment where the declarative model runs out and someone wants to do a thing: run a script after a VM boots, copy a config file onto a host, register a freshly created node with an external system, fire a webhook when a resource is destroyed. Terraform’s answer to “but I need to run an imperative step” is the provisioner — a block you attach to a resource that executes a command locally or on the remote machine as part of the create or destroy. Provisioners are the most tempting feature in the language and the one HashiCorp most loudly tells you to avoid. The official documentation opens the provisioners page with the words “Provisioners are a Last Resort,” and they mean it. They are an escape hatch out of the very properties — planning, idempotency, a graph of known resources — that make Terraform worth using.

This lesson covers provisioners exhaustively, because you will read code that uses them and you will occasionally, legitimately, need them. We will go through every provisioner Terraform ships (local-exec, remote-exec, and the file provisioner), every argument each one accepts, the full connection block that remote-exec and file depend on (SSH and WinRM, bastion hosts, host-key checking), the difference between creation-time and destroy-time (when = destroy) provisioners, and the on_failure setting that decides what happens when a provisioner errors. Then we cover the two resources that exist mainly to host provisioners and triggers: the legacy null_resource and its modern, provider-less replacement terraform_data (Terraform 1.4+). Throughout — and this is the point — we keep coming back to why each of these breaks the model, and what you should reach for instead: cloud-init / user_data, Packer baked images, configuration management (Ansible/Chef/Puppet/Salt), and, most often, a native provider resource that already does the thing you were about to shell out for.

Everything here applies equally to OpenTofu, the open-source fork: the provisioner syntax, the connection block, null_resource, and terraform_data are identical. I will assume Terraform 1.4 or later (so terraform_data is available) and note where 1.4 matters.

Learning objectives

After working through this lesson you will be able to:

Prerequisites

You should be comfortable with the Terraform basics (the resource block, the init → plan → apply → destroy workflow, and what state is) at the level of the course’s Terraform Fundamentals lesson. You should understand resource meta-arguments (count, for_each, depends_on, lifecycle), because null_resource/terraform_data and the replace_triggered_by pattern build directly on them; the Resources & Meta-Arguments lesson covers those in depth and is the natural companion to this one. A little familiarity with SSH (key pairs, known-hosts) helps for the remote-exec/file/connection sections. The lab itself runs entirely locally and needs only Terraform 1.4 or later; a free cloud account or a local Docker/VM is useful only if you want to try the remote-exec examples yourself. This lesson sits in the HCL module of the Terraform Zero-to-Hero ladder, right after the functions and dynamic-blocks lessons and before the deep dive into state.

Why provisioners are a last resort (read this first)

It is genuinely important to understand the objection before you learn the syntax, because the syntax is the easy part and the judgement is the whole job. Provisioners break Terraform in four specific, concrete ways.

| Problem | What actually happens | Why it hurts |
|---|---|---|
| Invisible to plan | A provisioner’s commands are opaque strings. terraform plan cannot look inside command = "..." or inline = [...] and tell you what they will do. The plan shows that the resource (or null_resource) will be created, never the effect of the script. | You lose Terraform’s single best safety feature: knowing the consequences of apply before you run it. |
| Not idempotent | A provisioner runs once, at create time (or once at destroy time). Terraform’s state records only that the resource exists, not “the system is now in state X.” If the script half-fails, or the world drifts, re-running apply will not re-run the provisioner. | The whole premise of Terraform (converge to a desired state, safe to re-run) is gone for whatever the provisioner touched. |
| Tainting on failure | If a creation-time provisioner fails (and on_failure is the default fail), Terraform marks the resource tainted: it is left created in the real world but flagged for destroy-and-recreate on the next apply. | A flaky post-create script can force the destruction of a perfectly good (possibly stateful) resource. |
| Hidden, out-of-graph side effects | Provisioners reach outside the dependency graph: they SSH to hosts, call external APIs, write local files. Terraform doesn’t know what they touched, so it can’t order, refresh, or clean up those effects. | Drift Terraform can never see; ordering bugs; destroy-time logic that silently doesn’t run (see below). |

HashiCorp’s own guidance is blunt: use provisioners only when there is no other option, and for most needs there is another option. The decision table below is the one to internalise — it maps every common “I’ll just use a provisioner” instinct to what you should do instead. We expand on each alternative in the final sections.

| You want to… | Don’t reach for a provisioner; use… |
|---|---|
| Run a setup script when a VM first boots (install packages, write config, start a service) | cloud-init / user_data (aws_instance.user_data, azurerm_linux_virtual_machine.custom_data, GCP metadata.startup-script) |
| Have software/config already present in the image (golden image) | Packer (or the cloud’s image builder): bake it once, boot it many times |
| Configure many hosts consistently and re-converge over time | Configuration management (Ansible, Chef, Puppet, Salt), run after Terraform creates the hosts |
| Create/modify a resource in a service Terraform supports | The native provider resource (there’s almost certainly one) |
| Pass data out of Terraform to a script or other tool | output values, or write a file with local_file and read it elsewhere |
| Run a one-off API call there’s no resource for | A provider’s *_api/http resource, the http data source (read-only), or a small custom provider, and only then local-exec |
| Wait for something to become ready | A provider’s native wait/health attributes, time_sleep, or a data source that polls; rarely a provisioner loop |
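
As a rough illustration of the first row, the nginx bootstrap that is often written as a remote-exec can usually be expressed as cloud-init user_data instead. This is a minimal sketch, assuming an Ubuntu AMI passed in through a hypothetical var.ami_id; the package list and page content are placeholders, not part of the lesson's own configuration.

```hcl
# Hypothetical input for this sketch: an Ubuntu AMI id supplied by the caller.
variable "ami_id" {
  type = string
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # cloud-init reads this on first boot. Nothing connects in from the
  # Terraform runner, so no inbound SSH rule or key is needed for setup.
  user_data = <<-EOT
    #cloud-config
    package_update: true
    packages:
      - nginx
    runcmd:
      - echo 'hello from cloud-init' > /var/www/html/index.html
      - systemctl enable --now nginx
  EOT

  # Replace the instance when user_data changes, so the change shows in plan.
  user_data_replace_on_change = true
}
```

Because user_data is an ordinary resource argument, an edit to it appears in terraform plan, which a provisioner's command string never does.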

Keep that table in mind as we go. Now the mechanics.

The two kinds of provisioner and where they attach

Provisioners come in two broad families:

  1. local-exec — runs a command on the machine running Terraform (your laptop, the CI runner). It never touches the remote resource directly; it just runs a local process (which might itself call out, e.g. aws, kubectl, curl).
  2. Remote provisioners — remote-exec and file — act on the resource being created, over a network connection you describe in a connection block. remote-exec runs commands on the remote host; file copies files to it.

A provisioner block is nested inside a resource block. It can attach to any resource, but two resources exist specifically to carry provisioners when you have nothing else to hang them on: null_resource and terraform_data (covered later). Every provisioner — regardless of type — accepts the two meta-arguments when (creation-time vs destroy-time) and on_failure (fail vs continue), which we treat in their own sections.

resource "aws_instance" "web" {
  ami           = "ami-0abcd1234"
  instance_type = "t3.micro"

  # A provisioner attached to this resource:
  provisioner "local-exec" {
    command = "echo 'instance ${self.id} created at ${self.public_ip}'"
  }
}

Note self: inside a provisioner or connection block, self refers to the resource the block is attached to. A resource cannot refer to itself by name from inside its own block (that would create a dependency cycle), so self.attribute is how you read the parent resource’s attributes. It matters most for destroy-time provisioners, which cannot reference other objects at all (more on that later).

local-exec: run a command on the Terraform machine

local-exec invokes a local executable after the resource is created (or destroyed). It is the least objectionable provisioner because it doesn’t need network access to the resource, but it is still imperative, still invisible to plan, and still non-idempotent. Here is the complete argument surface.

| Argument | Type | Required? | What it does / default |
|---|---|---|---|
| command | string | Yes | The command line to execute. Terraform appends it as the final argument to the interpreter. Multi-line via heredoc is allowed. |
| working_dir | string | No | Directory to run the command in. Default: the working directory of the Terraform process (where terraform was invoked), which is not necessarily the module directory. Relative paths resolve against that directory, so use path.module when the script lives inside a module. |
| environment | map(string) | No | Extra environment variables for the child process, merged onto the inherited environment. Values are strings; non-string values must be converted, for example with tostring(). |
| interpreter | list(string) | No | The executable plus leading arguments used to run command; command is appended as the last argument. Default is platform-dependent: ["/bin/sh", "-c"] on Unix, ["cmd", "/C"] on Windows. Set it to use python, pwsh, or bash with strict flags, e.g. ["/bin/bash", "-eu", "-o", "pipefail", "-c"]. A shell needs its -c flag here, otherwise it treats command as a script file name. |
| when | destroy | No | If destroy, the command runs on destroy instead of create. (Meta-argument; see below.) |
| on_failure | continue/fail | No | What to do if the command exits non-zero. Default fail. (Meta-argument; see below.) |
| quiet | bool | No (local-exec only) | If true, suppresses echoing the command text to the log/UI (the output still streams). Default false. Use it to avoid leaking sensitive command lines. |

A fully featured example:

resource "terraform_data" "seed_db" {
  triggers_replace = [var.schema_version]   # re-run when the schema version changes

  provisioner "local-exec" {
    command     = "./scripts/seed.sh"
    working_dir = "${path.module}/db"
    interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"]   # fail fast on errors/unset vars; -c runs command as a command string

    environment = {
      DB_HOST     = var.db_endpoint
      DB_PASSWORD = var.db_password   # sensitive — see security notes
      PGSSLMODE   = "require"
    }

    quiet      = true        # don't echo the command (it may contain secrets in argv)
    on_failure = fail        # default; abort the apply if seeding fails
  }
}
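
The heredoc form of command and a custom interpreter, both mentioned in the table above, can be combined as in the following sketch. It is only an illustration: var.release and the out/release.txt path are hypothetical names, and it assumes bash is installed on the machine running Terraform.

```hcl
# Hypothetical input: a release tag. Changing it replaces the resource
# and therefore re-runs the provisioner.
variable "release" {
  type    = string
  default = "v1"
}

resource "terraform_data" "release_note" {
  triggers_replace = [var.release]

  provisioner "local-exec" {
    # Terraform appends `command` as the last argument to this list,
    # so bash needs "-c" to treat it as a command string.
    interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"]
    working_dir = path.module

    environment = {
      RELEASE = var.release
    }

    # $RELEASE is expanded by bash, not by Terraform: only "${...}" is
    # Terraform interpolation, a bare "$NAME" passes through untouched.
    command = <<-EOT
      mkdir -p out
      echo "release $RELEASE" > out/release.txt
    EOT
  }
}
```

Passing the value through environment rather than interpolating it into command keeps the command text constant and avoids shell-quoting problems when the value contains spaces or quotes.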

Key points and gotchas for local-exec:

remote-exec: run commands on the remote host

remote-exec connects to the resource (via the connection block) and runs commands on it. It is used to bootstrap a freshly created server. It requires a working connection, which means the host must be reachable and accept your credentials at apply time from wherever Terraform runs — already a fragile coupling. It accepts exactly one of three mutually exclusive arguments describing what to run:

| Argument | Type | What it does |
|---|---|---|
| inline | list(string) | A list of command strings executed in order on the remote host. Simplest for a few commands. |
| script | string | A local path to a single script file. Terraform copies it to the remote host and executes it. |
| scripts | list(string) | A list of local script paths, copied and executed in order. |

You may set only one of inline, script, or scripts. Like all provisioners, remote-exec also takes when and on_failure. It does not take local-exec’s command/interpreter/environment/working_dir — to set environment variables remotely you export them inside inline/the script; to choose an interpreter you use a shebang in the script.

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  key_name      = aws_key_pair.deploy.key_name
  # ... vpc, subnet, security group allowing inbound SSH from the Terraform runner ...

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  provisioner "remote-exec" {
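    inline = [
      # inline commands are joined into one script run by /bin/sh, which on
      # Ubuntu is dash and may reject "-o pipefail"; stick to POSIX flags here.
      "set -eu",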
      "sudo apt-get update -y",
      "sudo apt-get install -y nginx",
      "echo 'hello from ${self.id}' | sudo tee /var/www/html/index.html",
      "sudo systemctl enable --now nginx",
    ]
    on_failure = fail
  }
}
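
The example above uses inline. For comparison, here is a sketch of the script form, where Terraform uploads one local file and executes it. The variable names and the scripts/bootstrap.sh path are hypothetical, and the sketch assumes the instance is reachable over SSH from the Terraform runner.

```hcl
# Hypothetical inputs for this sketch.
variable "ami_id" { type = string }
variable "key_name" { type = string }
variable "private_key_path" { type = string }

resource "aws_instance" "web_scripted" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  key_name      = var.key_name

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file(var.private_key_path)
  }

  provisioner "remote-exec" {
    # Terraform copies this local file to the host and runs it there.
    # The script's own shebang line (for example #!/usr/bin/env bash)
    # selects the shell, so bash-only options belong inside the script.
    script = "${path.module}/scripts/bootstrap.sh"
  }
}
```

Keeping the logic in a script file also lets you lint and test it outside Terraform, although it remains invisible to plan like any other provisioner.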

Critical things to know about remote-exec:

The file provisioner: copy files to the remote host

The file provisioner copies a file or directory from the Terraform machine (or inline content) to the remote resource, again over the connection. Its arguments:

| Argument | Type | What it does |
|---|---|---|
| source | string | Local path to a file or directory to upload. Mutually exclusive with content. A trailing slash on a directory source controls whether the dir itself or its contents are copied (rsync-style). |
| content | string | Literal content to write to destination (use instead of source for small, generated files, e.g. from templatefile()). Creates a file; cannot create directories. |
| destination | string | Required. The absolute path on the remote host to write to. The remote user must have permission; intermediate dirs are not created for content. |

provisioner "file" {
  content     = templatefile("${path.module}/app.conf.tftpl", { port = var.port })
  destination = "/etc/app/app.conf"
  connection { /* ... same connection block ... */ }
}

Gotchas: the destination directory must already exist (for content, and typically for single-file source); permissions are the remote user’s, so you often pair file (copy as the login user to /tmp) with remote-exec (sudo mv into place); and over WinRM the path uses backslashes. As with the others — invisible to plan, runs once, no idempotency. If you find yourself copying config files this way, that’s exactly what cloud-init write_files or a golden image is for.
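
The pairing described above (upload as the login user to /tmp, then move into place with sudo) might look like the following sketch. The variable names are hypothetical, app.conf.tftpl is the template from the example above, and it assumes the login user has passwordless sudo.

```hcl
# Hypothetical inputs for this sketch.
variable "ami_id" { type = string }
variable "key_name" { type = string }
variable "private_key_path" { type = string }
variable "port" {
  type    = number
  default = 8080
}

resource "aws_instance" "app_host" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  key_name      = var.key_name

  # Declared at resource level, so both provisioners share it.
  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file(var.private_key_path)
  }

  # Step 1: write the rendered config somewhere the login user can write.
  provisioner "file" {
    content     = templatefile("${path.module}/app.conf.tftpl", { port = var.port })
    destination = "/tmp/app.conf"
  }

  # Step 2: install it into place as root. `install -D` creates /etc/app
  # if it does not exist yet.
  provisioner "remote-exec" {
    inline = [
      "sudo install -D -m 0644 -o root -g root /tmp/app.conf /etc/app/app.conf",
      "rm -f /tmp/app.conf",
    ]
  }
}
```

Provisioners on a resource run in the order they are declared, which is what makes the two-step sequence safe.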

The connection block: SSH, WinRM, bastions, and host keys

Both remote-exec and the file provisioner need to reach the host, and they do it through a connection block. You can put connection inside a single provisioner (scoped to it) or directly in the resource block (shared by all provisioners on that resource). The block’s first decision is type: ssh (default, Linux/Unix) or winrm (Windows).

Common and SSH fields

| Field | Applies to | Default | What it does |
|---|---|---|---|
| type | both | ssh | Connection type: ssh or winrm. |
| host | both | none (required) | Address to connect to. Usually self.public_ip, self.private_ip, or a DNS name. |
| port | both | 22 (ssh) / 5985 or 5986 (winrm) | TCP port. |
| user | both | root (ssh) / Administrator (winrm) | Login user. |
| password | both | | Login password (prefer keys for SSH). |
| timeout | both | 5m | How long to wait for the connection to become available (e.g. while the box boots). Format like "10m". |
| script_path | both | platform temp path | Remote path where copied scripts are staged before execution. Override if the default temp dir is noexec or read-only. |
| private_key | ssh | | PEM-encoded private key contents (use file(...), not a path). Preferred over password. |
| certificate | ssh | | A signed certificate (PEM) to use with private_key, for SSH CA setups. |
| agent | ssh | true if SSH_AUTH_SOCK set | Use the local SSH agent for auth. |
| agent_identity | ssh | | Preferred identity (public key) when the agent holds several. |
| host_key | ssh | | The expected host public key for verification. If unset, Terraform does not verify the host key at all (not even trust-on-first-use), which is a real MITM exposure (see security notes). |
| target_platform | ssh | unix | unix or windows (SSH onto Windows). Affects path handling. |

WinRM-specific fields

| Field | Default | What it does |
|---|---|---|
| https | false | Use HTTPS (port 5986) instead of HTTP (5985). Strongly recommended. |
| insecure | false | Skip TLS certificate verification (don’t, in production). |
| use_ntlm | false | Use NTLM authentication. |
| cacert | | CA certificate (PEM) to validate the WinRM server’s TLS cert. |
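
Put together, a WinRM connection over HTTPS could be sketched as follows. Everything named here is an assumption for illustration: the variables, the certs/winrm-ca.pem path, and above all that the Windows image already has a WinRM HTTPS listener and the given Administrator password configured.

```hcl
# Hypothetical inputs for this sketch.
variable "windows_ami_id" { type = string }
variable "admin_password" {
  type      = string
  sensitive = true
}

resource "aws_instance" "win" {
  ami           = var.windows_ami_id
  instance_type = "t3.medium"

  connection {
    type     = "winrm"
    host     = self.public_ip
    user     = "Administrator"
    password = var.admin_password
    https    = true      # TLS instead of plain HTTP
    port     = 5986      # the HTTPS WinRM port
    cacert   = file("${path.module}/certs/winrm-ca.pem") # validates the server cert
    timeout  = "10m"     # Windows instances can take a while to boot
  }

  provisioner "remote-exec" {
    inline = ["hostname"]
  }
}
```

Supplying cacert keeps insecure at its default of false, so the server certificate is actually verified.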

Bastion / jump-host fields (SSH)

When the target has no public IP, Terraform can hop through a bastion. These fields describe the jump host; they default to their non-bastion_ counterparts where sensible.

| Field | Default | What it does |
|---|---|---|
| bastion_host | | Address of the bastion/jump host. Setting this enables bastion mode. |
| bastion_host_key | | Expected host key of the bastion (verify it!). |
| bastion_port | value of port | Bastion SSH port. |
| bastion_user | value of user | Bastion login user. |
| bastion_password | value of password | Bastion password. |
| bastion_private_key | value of private_key | Bastion private key (PEM contents). |
| bastion_certificate | value of certificate | Signed cert for the bastion. |

A connection through a bastion to a private instance:

connection {
  type                = "ssh"
  user                = "ubuntu"
  host                = self.private_ip          # target has no public IP
  private_key         = file("~/.ssh/app_key")
  host_key            = var.target_host_key      # verify the target

  bastion_host        = aws_instance.bastion.public_ip
  bastion_user        = "ec2-user"
  bastion_private_key = file("~/.ssh/bastion_key")
  bastion_host_key    = var.bastion_host_key     # verify the bastion
  timeout             = "10m"                      # allow time for boot
}

The connection block is where most remote-exec/file pain lives: timeouts because the box isn’t ready, auth failures because the key/user is wrong, and silent insecurity because host_key/bastion_host_key were left unset. All of which, again, you avoid entirely by not opening an inbound connection at all and using user_data instead.

Creation-time vs destroy-time provisioners (when)

By default a provisioner is a creation-time provisioner: it runs after the resource is created, during apply. Setting when = destroy makes it a destroy-time provisioner that runs before the resource is destroyed.

provisioner "local-exec" {
  when    = destroy
  command = "curl -X DELETE https://registry.example.com/nodes/${self.id}"
}

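Destroy-time provisioners are useful for deregistration (pulling a node out of a load balancer or external inventory before the VM disappears), but they come with strict, easily violated rules. The provisioner and its connection block may reference only self, count.index, and each.key, never var.*, local.*, or other resources. The resource block must still be in the configuration when the destroy happens: if you delete the block and then apply, the provisioner is removed along with it and never runs. And a failing destroy-time provisioner with the default on_failure = fail halts the destroy, leaving the resource in place.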

The classic safe pattern for destroy-time deregistration captures the needed identifiers in the resource’s own state, via input on terraform_data (or triggers on null_resource), so self can see them:

resource "terraform_data" "lb_registration" {
  input = {
    node_id = aws_instance.app.id
    lb_name = var.lb_name           # captured into state so destroy-time self can read it
  }
  triggers_replace = [aws_instance.app.id]

  provisioner "local-exec" {                       # register on create
    command = "lb register --node ${self.input.node_id} --lb ${self.input.lb_name}"
  }
  provisioner "local-exec" {                       # deregister on destroy
    when    = destroy
    command = "lb deregister --node ${self.input.node_id} --lb ${self.input.lb_name}"
  }
}
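
In older code the same pattern is written with null_resource, stashing the values in triggers and reading them back through self.triggers. The sketch below mirrors the example above; the lb command is the same hypothetical CLI, and the two variables are stand-ins so the fragment is self-contained.

```hcl
# Hypothetical inputs standing in for aws_instance.app.id and the LB name.
variable "node_id" { type = string }
variable "lb_name" { type = string }

resource "null_resource" "lb_registration_legacy" {
  # triggers is a map(string) kept in state, so the destroy-time
  # provisioner can still read these values through self.
  triggers = {
    node_id = var.node_id
    lb_name = var.lb_name
  }

  provisioner "local-exec" { # register on create
    command = "lb register --node ${self.triggers.node_id} --lb ${self.triggers.lb_name}"
  }

  provisioner "local-exec" { # deregister on destroy
    when    = destroy
    command = "lb deregister --node ${self.triggers.node_id} --lb ${self.triggers.lb_name}"
  }
}
```

Referencing var.node_id directly inside the destroy-time command would be rejected at validation; going through self.triggers is what makes it legal.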

A note on old code: you will sometimes see when = create written explicitly. It is valid but redundant, because create is the default; only destroy changes behaviour.

on_failure: what happens when a provisioner errors

Each provisioner takes on_failure, which controls Terraform’s reaction to a non-zero exit / error from that provisioner:

| Value | Behaviour | Effect on the resource |
|---|---|---|
| fail (default) | Abort the apply with an error. | For a creation-time provisioner, Terraform marks the resource tainted: it stays created in the cloud but is scheduled for destroy-and-recreate on the next apply. For a destroy-time provisioner, the failure halts the destroy (the resource is not destroyed). |
| continue | Log the error but carry on as if the provisioner succeeded. | The resource is not tainted; the apply/destroy proceeds. Use for best-effort steps whose failure shouldn’t block (e.g. a non-critical notification). |

provisioner "local-exec" {
  command    = "send-slack-notification.sh ${self.id}"
  on_failure = continue          # a failed notification must not taint the instance
}

The tainting behaviour is the dangerous one and the reason to be deliberate. Imagine a remote-exec that installs software on a database VM and occasionally times out on a slow apt mirror. With the default on_failure = fail, that transient failure taints the VM — and the next apply will destroy and recreate your database. This is exactly the class of foot-gun that motivates “provisioners are a last resort”: a non-infra side effect is allowed to trigger destruction of infrastructure. If a step is genuinely best-effort, set continue; if it’s critical, ask whether user_data/Packer/config-management — where a failure is observable and re-runnable without destroying the box — is the better home for it.

null_resource: the original provisioner host

Sometimes you want to run a provisioner that isn’t naturally tied to any one real resource — run a script after several resources exist, or re-run a command whenever some input changes. The historical tool for this is null_resource, from the hashicorp/null provider. It does nothing on its own; it exists to carry provisioners and to expose a triggers map that controls when it is replaced (and therefore when its provisioners re-run).

resource "null_resource" "cluster_bootstrap" {
  # Replace (and re-run provisioners) whenever any of these change:
  triggers = {
    cluster_instance_ids = join(",", aws_instance.cluster[*].id)
    bootstrap_script_hash = filemd5("${path.module}/bootstrap.sh")
  }

  connection {
    type        = "ssh"
    host        = aws_instance.cluster[0].public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  provisioner "remote-exec" {
    inline = ["sudo /opt/bootstrap.sh"]
  }
}

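How triggers works (this is the whole point of null_resource): triggers is a map of strings stored in state. When any value in the map changes, Terraform plans the null_resource for replacement, and the replacement re-runs its creation-time provisioners. If no value changes, nothing runs.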
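
A minimal local sketch of that mechanism, assuming a bootstrap.sh file sits next to the configuration (the resource name is made up for illustration):

```hcl
resource "null_resource" "bootstrap_local" {
  # The hash of the script is the trigger. Editing bootstrap.sh changes
  # the hash, which replaces this resource and re-runs the provisioner.
  triggers = {
    script_hash = filemd5("${path.module}/bootstrap.sh")
  }

  provisioner "local-exec" {
    command = "sh ${path.module}/bootstrap.sh"
  }
}
```

After an edit to the script, terraform plan reports that null_resource.bootstrap_local must be replaced because triggers changed. The plan shows the replacement, but still not what the script will do.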

null_resource works fine and you will see it everywhere, but it has a real cost: it pulls in the null provider as a dependency (one more plugin to download, pin in the lock file, and keep updated). That is exactly the friction terraform_data removes.

terraform_data: the modern, provider-less replacement

Since Terraform 1.4, the built-in terraform_data resource replaces null_resource for almost every use. It is a managed resource built into Terraform itself — no provider, nothing to install or pin — and it does two jobs null_resource couldn’t do as cleanly:

  1. It can store and pass through arbitrary data via its input/output attributes (not just a string map).
  2. It triggers replacement via triggers_replace, which accepts any value type (not just a map(string)).

| Attribute | Direction | What it is |
|---|---|---|
| input | argument (in) | Any value you want this resource to hold/pass through. Stored in state. |
| output | attribute (out) | Echoes input back as a computed value; handy as an explicit dependency anchor or to “freeze” a value. |
| triggers_replace | argument (in) | Any value (string, list, object, …). When it changes, terraform_data is replaced, re-running attached provisioners, exactly like null_resource.triggers but type-flexible. |
| id | attribute (out) | A generated unique string id. |

Three idiomatic uses:

(a) As a provisioner host (the null_resource replacement):

resource "terraform_data" "bootstrap" {
  triggers_replace = [
    aws_instance.cluster[*].id,                 # a list works as-is; no join-to-string needed
    filemd5("${path.module}/bootstrap.sh"),
  ]

  provisioner "local-exec" {
    command = "ansible-playbook -i '${join(",", aws_instance.cluster[*].public_ip)},' bootstrap.yml"
  }
}

(b) To pass a value through and force replacement when it changes (no provisioner at all):

resource "terraform_data" "version" {
  input = var.revision           # stored and echoed via .output
}

resource "aws_instance" "app" {
  # Re-create the instance whenever the revision changes, using lifecycle:
  lifecycle {
    replace_triggered_by = [terraform_data.version]
  }
  # ...
}

This terraform_data + replace_triggered_by combination (replace_triggered_by needs Terraform 1.2+, terraform_data needs 1.4+) is the modern, plan-visible way to say “recreate X when Y changes.” It is strictly better than a null_resource provisioner for that job because the replacement shows up in the plan. Prefer it whenever your goal is “re-create this resource on a trigger” rather than “run an imperative script.”

(c) To store data not tracked by any real resource (so you can detect changes), e.g. caching a value Terraform should react to.
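
A small sketch of this third use, with a hypothetical var.image_tag as the value being stored:

```hcl
# Hypothetical input: a value that no real resource tracks.
variable "image_tag" {
  type    = string
  default = "1.0.0"
}

# Stores the value in state. When var.image_tag changes, plan shows an
# in-place update of `input`, so the change is visible before apply.
resource "terraform_data" "image_tag" {
  input = var.image_tag
}

# `output` echoes `input`; other resources can reference it (or list the
# resource in replace_triggered_by) to react to the change.
output "tracked_image_tag" {
  value = terraform_data.image_tag.output
}
```

No provisioner is involved here: terraform_data is acting purely as a state-backed holder for the value.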

| | null_resource | terraform_data |
|---|---|---|
| Source | hashicorp/null provider (must be declared/installed) | Built into Terraform (1.4+), no provider |
| Trigger field | triggers: map(string) only | triggers_replace: any type |
| Data pass-through | none | input → output (any type) |
| Hosts provisioners? | yes | yes |
| Recommendation | legacy; fine but superseded | preferred for new code |

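The practical migration: replace resource "null_resource" "x" with resource "terraform_data" "x", change triggers = {...} to triggers_replace = [...] (you no longer need to coerce everything to strings or join), update any self.triggers references to self.triggers_replace or self.input, and drop the null provider from required_providers if nothing else uses it. The behaviour is the same with one fewer dependency, but note that Terraform sees the type change as destroying the old resource and creating a new one, so creation-time provisioners run once more on the first apply after the migration. (OpenTofu supports terraform_data as well.)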

Embedded diagram

Terraform provisioners decision and execution model: local-exec vs remote-exec/file, connection, when/on_failure, null_resource vs terraform_data, and the alternatives that usually replace them

The diagram traces the decision you should actually make: start from “do I really need an imperative step?”, route the common cases to cloud-init/user_data, Packer, config management, or a native resource, and only fall through to provisioners as the last resort — then show, for the genuine cases, how local-exec (local machine) and remote-exec/file (remote host via connection) execute, how when/on_failure gate them, and where null_resource/terraform_data + triggers sit as the host.

Hands-on lab

This lab is free and entirely local — no cloud account, no SSH targets, no inbound ports. We use local-exec, null_resource, and terraform_data so you can feel the trigger/idempotency behaviour without spending a rupee or opening a firewall. (We deliberately don’t do a live remote-exec lab, because the right lesson is that you rarely should — but the syntax above is complete for when you must.)

1. Set up. Create a working directory and a main.tf:

terraform {
  required_version = ">= 1.4"
  # terraform_data and local-exec are built in and need no provider.
  # null_resource (used below for comparison) needs hashicorp/null,
  # which terraform init installs automatically.
}

variable "revision" {
  type        = string
  default     = "v1"
  description = "Bump this to see triggers_replace re-run the provisioner."
}

# (a) terraform_data hosting a local-exec, gated by a trigger:
resource "terraform_data" "build" {
  triggers_replace = [var.revision]

  provisioner "local-exec" {
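    # Double quotes (escaped for HCL) so the shell expands $(date); single quotes would print it literally.
    command = "echo \"BUILD ran for revision ${self.triggers_replace[0]} at $(date)\" >> build.log"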
  }
}

# (b) The legacy equivalent, for comparison:
resource "null_resource" "build_legacy" {
  triggers = { revision = var.revision }

  provisioner "local-exec" {
    command = "echo 'LEGACY ran for revision ${self.triggers.revision}' >> build.log"
  }
}

# (c) A destroy-time provisioner (note: only self is allowed):
resource "terraform_data" "cleanup_demo" {
  input = "demo-node-42"
  provisioner "local-exec" {
    when    = destroy
    command = "echo 'DEREGISTER ${self.input}' >> build.log"
  }
}

output "build_id" {
  value = terraform_data.build.output   # demonstrates input->output pass-through is null here; see note
}

Note: because build sets triggers_replace but not input, terraform_data.build.output is null. That is expected: output only echoes input. The cleanup_demo resource sets input so its destroy-time self.input works.

2. Init and apply.

terraform init        # downloads only the null provider (for build_legacy); terraform_data needs none
terraform apply -auto-approve
cat build.log

Expected: build.log now contains a BUILD ran for revision v1 ... line and a LEGACY ran for revision v1 line. The DEREGISTER line is not there yet (it’s a destroy-time provisioner).

3. Prove idempotency is gone for provisioner effects. Run apply again with no changes:

terraform apply -auto-approve
cat build.log

Expected: no new lines. The provisioners did not re-run, because nothing in triggers_replace/triggers changed. This is the key lesson — provisioners run on create/replace, not on every apply.

4. Trigger a re-run by changing the trigger.

terraform apply -auto-approve -var 'revision=v2'
cat build.log

Expected: two new lines (BUILD ran for revision v2, LEGACY ran for revision v2) because the trigger changed, replacing both resources and re-running their create-time provisioners.

5. Watch a destroy-time provisioner fire.

terraform destroy -auto-approve
cat build.log

Expected: a new DEREGISTER demo-node-42 line appears — the when = destroy provisioner ran during teardown.

6. Validation checklist.

Cleanup. terraform destroy -auto-approve (already done in step 5) removes the state-tracked resources; then delete the working directory and build.log. There are no cloud resources to remove.

Cost note. ₹0 / $0. Everything is local — terraform_data and local-exec create nothing in any cloud. The null provider is a local plugin. This is the cheapest possible Terraform lab.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Edited a remote-exec/local-exec block but apply does nothing | Editing a provisioner does not re-trigger it; the resource already exists | Move the provisioner to a null_resource/terraform_data with a triggers/triggers_replace that captures what changed; or force it with terraform apply -replace=ADDRESS |
| remote-exec hangs then fails with “timeout - last error: dial tcp … i/o timeout” | Host not reachable from the Terraform runner: no public IP/bastion, security group/NSG blocks the runner’s IP, or SSH not up yet | Allow inbound from the runner, add a bastion_host, raise connection { timeout }, or, better, switch to user_data so no inbound is needed |
| A flaky post-create script destroyed my VM on the next apply | Creation-time provisioner failed with default on_failure = fail, which tainted the resource | Set on_failure = continue for best-effort steps; move critical setup to user_data/Packer where failure doesn’t taint |
| Destroy-time provisioner never ran | The resource/module block was deleted from code before destroying, so the provisioner went with it; or it referenced var.* and errored | Destroy while the block exists; only reference self, count.index, each.key in destroy-time provisioners (stash needed values in triggers/input) |
| “Invalid reference from destroy provisioner” | Used var.*, local.*, or another resource inside a when = destroy provisioner | Capture those values in triggers/input and read them via self.triggers[...] / self.input |
| local-exec script “works locally, fails in CI” | Wrong working_dir, missing interpreter (/bin/sh ≠ bash), or a tool not installed on the runner | Set working_dir with path.module, set interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"] (the -c is required so bash treats command as a command string), install/declare the tool |
| Secrets visible in plan/apply logs or state | Provisioner command/environment/triggers carry secrets; Terraform logs the command and stores triggers in plaintext state | Use quiet = true (local-exec), pass secrets via environment not command argv, avoid putting secrets in triggers; treat state as sensitive |
| host_key/bastion_host_key warnings ignored; worried about MITM | connection does no host-key verification when those are unset | Set host_key/bastion_host_key to the expected keys; never disable verification in production |
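
The secrets row can be made concrete with a sketch like the one below. The variable names and the /hooks/deploy path are hypothetical; the point is only where the secret travels.

```hcl
# Hypothetical inputs for this sketch.
variable "release" {
  type    = string
  default = "v1"
}
variable "api_token" {
  type      = string
  sensitive = true
}

resource "terraform_data" "notify" {
  # Only the non-secret release tag is a trigger; the token stays out of
  # triggers_replace so it is not written to state by this resource.
  triggers_replace = [var.release]

  provisioner "local-exec" {
    quiet = true # do not echo the command text

    # The token reaches the child process through its environment,
    # not through the command string Terraform would otherwise log.
    environment = {
      API_TOKEN = var.api_token
    }

    # "$API_TOKEN" is expanded by the shell at run time, not by Terraform.
    command = "curl -fsS -H \"Authorization: Bearer $API_TOKEN\" https://registry.example.com/hooks/deploy"
  }
}
```

This keeps the token out of Terraform's logged command, but the shell still expands it into curl's arguments on the runner, so the machine running Terraform has to be trusted either way.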

Best practices

Security notes

The alternatives, in full (use these instead)

This section is the positive half of “provisioners are a last resort.” For each common need, here is the tool that does it without breaking the model.

The honest summary: in modern Terraform, a well-architected configuration uses user_data/Packer for machine setup, native resources for everything the provider supports, configuration management for fleet convergence, and reaches for a provisioner — usually a small local-exec on a terraform_data — only for the rare imperative glue that has no declarative home.
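
As one concrete case of “native resources for everything the provider supports”: uploading a file to S3 is sometimes done with a local-exec that shells out to the AWS CLI, but the AWS provider has a resource for it. A sketch, assuming a hypothetical var.bucket_name and an app.conf file next to the configuration:

```hcl
# Hypothetical input: the name of an existing bucket.
variable "bucket_name" {
  type = string
}

# Declarative replacement for a local-exec running `aws s3 cp`.
resource "aws_s3_object" "app_config" {
  bucket = var.bucket_name
  key    = "config/app.conf"
  source = "${path.module}/app.conf"

  # Re-upload when the file content changes; the update appears in plan.
  etag = filemd5("${path.module}/app.conf")
}
```

Unlike the provisioner version, the object is tracked in state: plan shows when it will change, and destroy removes it.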

Interview & exam questions

  1. Why does HashiCorp call provisioners a “last resort”? Because they break Terraform’s core guarantees: they’re invisible to plan, they’re not idempotent (run once on create/replace, not re-applied to converge), a failed create-time provisioner taints the resource (forcing destroy/recreate), and they cause side effects outside the dependency graph that Terraform can’t track, refresh, or clean up.

  2. What’s the difference between local-exec and remote-exec? local-exec runs a command on the machine running Terraform (laptop/CI runner); remote-exec runs commands on the resource being created, over a connection block (SSH/WinRM). remote-exec additionally requires network reachability and credentials to the host.

  3. Name the three mutually exclusive ways remote-exec specifies what to run. inline (a list of commands), script (one local script file, copied and run), and scripts (a list of local scripts, copied and run in order). Exactly one may be set.

  4. How do you make a provisioner run again after you’ve changed it? Editing a provisioner does not re-trigger it. Either terraform apply -replace=ADDRESS to force the resource’s recreation, or host the provisioner on a null_resource/terraform_data and change its triggers/triggers_replace (e.g. include filemd5() of the script so editing the script re-runs it).

  5. What does on_failure do, and what’s the danger of the default? It controls the reaction to a provisioner error: fail (default) aborts and taints a create-time resource (scheduling destroy/recreate); continue logs and proceeds. The danger: a transient post-create script failure can taint and thereby destroy/recreate otherwise-healthy (possibly stateful) infrastructure.

  6. What are the rules for a when = destroy provisioner? It runs before the resource is destroyed; it may reference only self, count.index, and each.key (not each.value, var, local, or other resources); and it must still be present in the code at destroy time, because deleting the block means it never runs. Stash external values in triggers/input to read via self.

  7. What is self and where is it valid? Inside a provisioner (or connection) block, self refers to the resource the provisioner is attached to (self.id, self.public_ip). A resource cannot reference itself by name there without creating a dependency cycle, so self is how a provisioner reads its own resource’s attributes, and it is the only resource reference a destroy-time provisioner may use.

  8. null_resource vs terraform_data — compare them. Both do nothing themselves and exist to host provisioners/triggers. null_resource comes from the hashicorp/null provider and its triggers is a map(string). terraform_data is built into Terraform (1.4+) — no provider — its triggers_replace accepts any type, and it adds input/output pass-through. Prefer terraform_data in new code.

  9. How does the triggers map control behaviour? Terraform stores triggers in state; on each plan it compares the new map to the stored one, and if any value changed it replaces the resource (destroy+recreate), which re-runs its provisioners. Unchanged triggers mean the provisioners don’t re-run. { always = timestamp() } forces every-apply runs.

  10. You need to run a setup script when a VM boots. What should you use, and why not remote-exec? Use cloud-init / user_data (custom_data on Azure, startup-script on GCP): the instance runs it itself on first boot, needing no inbound SSH, no host-key verification, no tainting, and it’s far more robust in CI. remote-exec requires opening inbound access from the (possibly dynamic) runner IP and is fragile and insecure by comparison.

  11. What’s the modern, plan-visible way to recreate a resource when some value changes, without a provisioner? lifecycle { replace_triggered_by = [terraform_data.x] } where terraform_data.x holds the value (often via input/triggers_replace). The replacement of the dependent resource appears in terraform plan, whereas whatever a provisioner’s command does never shows up there.

  12. How do secrets leak through provisioners, and how do you mitigate? Through the command line in logs (mitigate with quiet = true and by passing via environment, not argv), through triggers/input stored in plaintext state (don’t put secrets there; encrypt/restrict the backend), and onto remote disks via file/remote-exec. Also enable host-key verification so credentials aren’t exposed to a MITM.
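
Several of the answers above (4, 6, 9 and 11) can be seen together in one small configuration. This is a sketch under stated assumptions: the variables node_name, config_version and ami_id, the resource names, and the two scripts under scripts/ are all hypothetical, and both scripts are assumed to read NODE_NAME from the environment.

```hcl
variable "node_name" {
  type = string
}

variable "config_version" {
  type = string
}

variable "ami_id" {
  type = string
}

resource "terraform_data" "bootstrap" {
  # Editing the script changes its hash, which replaces this resource
  # and therefore re-runs the create-time provisioner.
  triggers_replace = [
    filesha256("${path.module}/scripts/bootstrap.sh"),
  ]

  # Stash the value the destroy-time provisioner will need; it can only
  # read it back through self, not through var.node_name.
  input = {
    node_name = var.node_name
  }

  provisioner "local-exec" {
    working_dir = path.module
    command     = "./scripts/bootstrap.sh"
    environment = {
      NODE_NAME = self.input.node_name
    }
  }

  provisioner "local-exec" {
    when        = destroy
    working_dir = path.module
    command     = "./scripts/deregister.sh"
    environment = {
      NODE_NAME = self.input.node_name
    }
  }
}

# Holds a value whose change should force a replacement elsewhere.
resource "terraform_data" "config_version" {
  input = var.config_version
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    # A change to terraform_data.config_version plans a replacement of
    # this instance, with no provisioner involved.
    replace_triggered_by = [terraform_data.config_version]
  }
}
```

Changing var.config_version produces a plan that lists aws_instance.app for replacement, while changing bootstrap.sh produces a plan that replaces only terraform_data.bootstrap.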

Quick check

  1. True or false: editing the inline of a remote-exec provisioner will re-run it on the next apply.
  2. Which provisioner needs a connection block: local-exec or remote-exec?
  3. What is the default value of on_failure, and what does it do to a create-time resource on error?
  4. Which built-in resource replaces null_resource, and which provider does it require?
  5. Inside a when = destroy provisioner, which references are you allowed to use?

Answers

  1. False. Editing a provisioner does not re-trigger it; the resource already exists. Use -replace or a triggers/triggers_replace change.
  2. remote-exec (and the file provisioner). local-exec runs on the Terraform machine and needs no connection.
  3. fail (the default): it aborts the apply and marks the resource tainted (scheduled for destroy-and-recreate).
  4. terraform_data (Terraform 1.4+) — it requires no provider (it’s built in), unlike null_resource which needs hashicorp/null.
  5. Only self, count.index, and each.key. References to var.*, local.*, each.value, or other resources are rejected.
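
For reference when reading legacy code, the pattern that answers 2 and 3 describe might look like the sketch below. It is shown to make the moving parts recognisable, not as a recommendation. The variables, the resource name legacy, and the ubuntu login user are assumptions, and expected_host_key presumes you know the host's public key in advance (for example because it is baked into the image).

```hcl
variable "ami_id" {
  type = string
}

variable "key_name" {
  type = string
}

variable "ssh_private_key" {
  type      = string
  sensitive = true
}

variable "expected_host_key" {
  type = string
}

resource "aws_instance" "legacy" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  key_name      = var.key_name

  # remote-exec (and file) need a connection; local-exec does not.
  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = var.ssh_private_key

    # Without host_key, Terraform does not verify the host it connects to.
    host_key = var.expected_host_key
  }

  provisioner "remote-exec" {
    # Exactly one of inline, script, or scripts may be set.
    inline = [
      "sudo apt-get update -y",
      "sudo apt-get install -y nginx",
    ]

    # The default is fail, which taints the instance on error. With
    # continue, the error is logged and the instance is left untainted.
    on_failure = continue
  }
}
```

Choosing on_failure = continue trades one risk for another: the instance survives a transient script failure, but it may also be left half-configured with nothing in state to say so.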

Exercise

Take a configuration that currently uses a null_resource with a remote-exec to install and start nginx on an EC2/VM via SSH (a very common legacy pattern). Refactor it twice and write up the trade-offs:

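  1. Modernise the host: convert the null_resource to terraform_data, change triggers to triggers_replace (include filesha256() of the bootstrap script so editing it re-runs), and remove the null provider from required_providers. Run init and plan and note the result: a plain type change is planned as a destroy of the null_resource and a create of the terraform_data (so the provisioner runs once more), unless you add a moved block, which Terraform 1.9 and later support for this migration.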
  2. Eliminate the provisioner entirely: move the nginx install/config into cloud-init user_data (use templatefile() to render a cloud-config with packages: [nginx], a write_files for the index page, and runcmd to enable the service). Remove the connection, the inbound SSH rule, and the remote-exec altogether.
  3. Write three to four sentences comparing the two end states on: plan visibility, idempotency/re-run behaviour, security (inbound ports, host-key verification), and CI robustness. State which you’d ship and why. (Expected conclusion: the user_data version wins on every axis — this is the lesson.)

Certification mapping

This lesson maps to the HashiCorp Certified: Terraform Associate (003) exam. It directly serves the objective “Read, generate, and modify configuration” — specifically describe when to use provisioners and the provisioner types (local-exec, remote-exec, the file provisioner), understand that provisioners are a last resort, and the behaviour of creation-time vs when = destroy provisioners and on_failure. It also touches “Understand Terraform basics” (provider vs provisioner — a frequent exam distractor: null_resource needs the null provider, while terraform_data is built in) and the meta-argument knowledge from the resources lesson (triggers, replace_triggered_by). Expect at least one exam item that asks which alternative to provisioners is appropriate for first-boot configuration (answer: cloud-init/user_data) and one on what on_failure = continue versus the default does. The later Terraform Associate Prep Kit drills these as practice questions.

Glossary

Next steps

You now know provisioners cold — every type, every option, when (rarely) to use them, and the better tool for each job. The natural next move is to go deep on the thing provisioners most often abuse and leak through: state. Continue with Terraform State, In Depth: the State File, the state Commands, Locking & Sensitive Data, which explains the file that records “this provisioner ran,” how triggers/input land in it as plaintext, the terraform state commands, locking, and how to keep sensitive data out of harm’s way. For the meta-arguments referenced throughout (triggers, replace_triggered_by, create_before_destroy), revisit Terraform Resources & Meta-Arguments, In Depth: count, for_each, depends_on & lifecycle.
