Terraform Provisioners, In Depth: local-exec, remote-exec, connection, null_resource & terraform_data

Every Terraform codebase eventually reaches the moment where the declarative model runs out and someone wants to do a thing: run a script after a VM boots, copy a config file onto a host, register a freshly created node with an external system, fire a webhook when a resource is destroyed. Terraform’s answer to “but I need to run an imperative step” is the provisioner — a block you attach to a resource that executes a command locally or on the remote machine as part of the create or destroy. Provisioners are the most tempting feature in the language and the one HashiCorp most loudly tells you to avoid. The official documentation opens the provisioners page with the words “Provisioners are a Last Resort,” and they mean it. They are an escape hatch out of the very properties — planning, idempotency, a graph of known resources — that make Terraform worth using.

This lesson covers provisioners exhaustively, because you will read code that uses them and you will occasionally, legitimately, need them. We will go through every provisioner Terraform ships (local-exec, remote-exec, and the file provisioner), every argument each one accepts, the full connection block that remote-exec and file depend on (SSH and WinRM, bastion hosts, host-key checking), the difference between creation-time and destroy-time (when = destroy) provisioners, and the on_failure setting that decides what happens when a provisioner errors. Then we cover the two resources that exist mainly to host provisioners and triggers: the legacy null_resource and its modern, provider-less replacement terraform_data (Terraform 1.4+). Throughout — and this is the point — we keep coming back to why each of these breaks the model, and what you should reach for instead: cloud-init / user_data, Packer baked images, configuration management (Ansible/Chef/Puppet/Salt), and, most often, a native provider resource that already does the thing you were about to shell out for.

Everything here applies equally to OpenTofu, the open-source fork: the provisioner syntax, the connection block, null_resource, and terraform_data are identical. I will assume Terraform 1.4 or later (so terraform_data is available) and note where 1.4 matters.

Learning objectives

After working through this lesson you will be able to:

Explain why provisioners are a last resort — exactly how they break Terraform’s plan, idempotency, and dependency model — and articulate the better alternative for each common use case.
Use every provisioner: local-exec (with command, working_dir, environment, interpreter, when, on_failure), remote-exec (inline, script, scripts), and the file provisioner (source/content → destination).
Configure the connection block in full for both SSH and WinRM, including bastion/jump-host fields, host-key verification, agents, and timeouts.
Distinguish creation-time from destroy-time provisioners (when = destroy), understand the strict rules and gotchas of destroy-time provisioners, and control error handling with on_failure = continue | fail.
Use null_resource with triggers to re-run provisioners exactly when inputs of your choosing change, and replace it with terraform_data (input, output, triggers_replace) where appropriate.
Choose the right tool — cloud-init/user_data, Packer, configuration management, or a native resource — so that you almost never need a provisioner at all.

Prerequisites

You should be comfortable with the Terraform basics (the resource block, the init → plan → apply → destroy workflow, and what state is) at the level of the course’s Terraform Fundamentals lesson. You should understand resource meta-arguments (count, for_each, depends_on, lifecycle), because null_resource/terraform_data and the replace_triggered_by pattern build directly on them; the Resources & Meta-Arguments lesson covers those in depth and is the natural companion to this one. A little familiarity with SSH (key pairs, known-hosts) helps for the remote-exec/file/connection sections. The lab is entirely local, so you need only the Terraform CLI and network access for terraform init; no cloud account is required. This lesson sits in the HCL module of the Terraform Zero-to-Hero ladder, right after the functions and dynamic-blocks lessons and before the deep dive into state.

Why provisioners are a last resort (read this first)

It is genuinely important to understand the objection before you learn the syntax, because the syntax is the easy part and the judgement is the whole job. Provisioners break Terraform in four specific, concrete ways.

Problem	What actually happens	Why it hurts
Invisible to `plan`	A provisioner’s commands are opaque strings. `terraform plan` cannot look inside `command = "..."` or `inline = [...]` and tell you what they will do. The plan shows that the resource (or `null_resource`) will be created — never the effect of the script.	You lose Terraform’s single best safety feature: knowing the consequences of `apply` before you run it.
Not idempotent	A provisioner runs once, at create time (or once at destroy time). Terraform records only that the resource exists, not “the system is now in state X.” If the script exits successfully after doing only part of its work, or the world drifts afterwards, re-running `apply` will not re-run the provisioner.	The whole premise of Terraform (converge to a desired state, safe to re-run) is gone for whatever the provisioner touched.
Tainting on failure	If a creation-time provisioner fails (and `on_failure` is the default `fail`), Terraform marks the resource tainted — it is left created in the real world but flagged for destroy-and-recreate on the next apply.	A flaky post-create script can force the destruction of a perfectly good (possibly stateful) resource.
Hidden, out-of-graph side effects	Provisioners reach outside the dependency graph — they SSH to hosts, call external APIs, write local files. Terraform doesn’t know what they touched, so it can’t order, refresh, or clean up those effects.	Drift Terraform can never see; ordering bugs; destroy-time logic that silently doesn’t run (see below).

HashiCorp’s own guidance is blunt: use provisioners only when there is no other option, and for most needs there is another option. The decision table below is the one to internalise — it maps every common “I’ll just use a provisioner” instinct to what you should do instead. We expand on each alternative in the final sections.

You want to…	Don’t reach for a provisioner — use…
Run a setup script when a VM first boots (install packages, write config, start a service)	cloud-init / `user_data` (`aws_instance.user_data`, `azurerm_linux_virtual_machine.custom_data`, GCP `metadata.startup-script`)
Have software/config already present in the image (golden image)	Packer (or the cloud’s image builder) — bake it once, boot it many times
Configure many hosts consistently and re-converge over time	Configuration management — Ansible, Chef, Puppet, Salt — run after Terraform creates the hosts
Create/modify a resource in a service Terraform supports	The native provider resource (there’s almost certainly one)
Pass data out of Terraform to a script or other tool	`output` values, or write a file with `local_file` and read it elsewhere
Run a one-off API call there’s no resource for	A provider’s `_api`/`http` resource, the `http` data source (read-only), or a small custom provider, and only then `local-exec`
Wait for something to become ready	A provider’s native `wait`/health attributes, `time_sleep`, or a data source that polls — rarely a provisioner loop
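
To make the first row of that table concrete, here is a minimal sketch of bootstrapping a web server with cloud-init through user_data instead of remote-exec. It assumes an Ubuntu image that runs cloud-init at first boot; var.ami_id and the page content are placeholders chosen for illustration.

```hcl
# Hypothetical inputs: var.ami_id and the page content are illustrative only.
variable "ami_id" {
  type        = string
  description = "An Ubuntu AMI that runs cloud-init at first boot (assumed)."
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # cloud-init reads this document on first boot, so Terraform never has to
  # open an SSH connection to the instance.
  user_data = <<-EOT
    #cloud-config
    package_update: true
    packages:
      - nginx
    write_files:
      - path: /var/www/html/index.html
        content: |
          hello from cloud-init
    runcmd:
      - systemctl enable --now nginx
  EOT

  # A changed user_data shows up in the plan as a replacement of the instance.
  user_data_replace_on_change = true
}
```

The bootstrap now travels with the instance definition: editing the script produces a visible plan diff, and no inbound port from the Terraform runner is needed.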

Keep that table in mind as we go. Now the mechanics.

The two kinds of provisioner and where they attach

Provisioners come in two broad families:

local-exec — runs a command on the machine running Terraform (your laptop, the CI runner). It never touches the remote resource directly; it just runs a local process (which might itself call out, e.g. aws, kubectl, curl).
Remote provisioners — remote-exec and file — act on the resource being created, over a network connection you describe in a connection block. remote-exec runs commands on the remote host; file copies files to it.

A provisioner block is nested inside a resource block. It can attach to any resource, but two resources exist specifically to carry provisioners when you have nothing else to hang them on: null_resource and terraform_data (covered later). Every provisioner — regardless of type — accepts the two meta-arguments when (creation-time vs destroy-time) and on_failure (fail vs continue), which we treat in their own sections.

resource "aws_instance" "web" {
  ami           = "ami-0abcd1234"
  instance_type = "t3.micro"

  # A provisioner attached to this resource:
  provisioner "local-exec" {
    command = "echo 'instance ${self.id} created at ${self.public_ip}'"
  }
}

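Note self: inside a provisioner, self refers to the resource the provisioner is attached to. It exists because a block cannot refer to its own resource by name (that would create a dependency cycle), and it is valid only in a few places: provisioner blocks, connection blocks, and resource postconditions. Use self.attribute to read the parent resource’s attributes. Destroy-time provisioners lean on self even more heavily, because they may reference almost nothing else (more on that later).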

`local-exec`: run a command on the Terraform machine

local-exec invokes a local executable after the resource is created (or destroyed). It is the least objectionable provisioner because it doesn’t need network access to the resource, but it is still imperative, still invisible to plan, and still non-idempotent. Here is the complete argument surface.

Argument	Type	Required?	What it does / default
`command`	string	Yes	The command line to execute. This is the script source: Terraform passes it to the interpreter as the final argument. Multi-line via heredoc is allowed.
`working_dir`	string	No	Directory to run the command in. Default: the directory Terraform is running in (normally the root module directory). Relative paths resolve from there; use `path.module` to anchor to the declaring module.
`environment`	map(string)	No	Extra environment variables for the child process, merged onto the inherited environment. Values are strings; non-string must be `tostring()`-ed.
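`interpreter`	list(string)	No	The interpreter + leading args used to run `command`. Default is platform-dependent: `["/bin/sh", "-c"]` on Unix, `["cmd", "/C"]` on Windows. Set it to use `python`, `pwsh`, `bash -eu -o pipefail -c`, etc. Terraform appends `command` as the last argument, so a shell that should evaluate a command string needs its own `-c` (or equivalent) at the end of the list.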
`when`	`destroy`	No	If `destroy`, the command runs on destroy instead of create. (Meta-argument; see below.)
`on_failure`	`continue`/`fail`	No	What to do if the command exits non-zero. Default `fail`. (Meta-argument; see below.)
`quiet`	bool	No	(local-exec only) If `true`, suppresses echoing the command text to the log/UI (the output still streams). Default `false`. Use it to avoid leaking sensitive command lines.

A fully featured example:

resource "terraform_data" "seed_db" {
  triggers_replace = [var.schema_version]   # re-run when the schema version changes

  provisioner "local-exec" {
    command     = "./scripts/seed.sh"
    working_dir = "${path.module}/db"
    interpreter = ["/bin/bash", "-eu", "-o", "pipefail"]   # fail fast on errors/unset vars

    environment = {
      DB_HOST     = var.db_endpoint
      DB_PASSWORD = var.db_password   # sensitive — see security notes
      PGSSLMODE   = "require"
    }

    quiet      = true        # don't echo the command (it may contain secrets in argv)
    on_failure = fail        # default; abort the apply if seeding fails
  }
}

Key points and gotchas for local-exec:

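interpreter matters more than people think. The default /bin/sh -c is not bash; arrays, pipefail, and [[ ]] won’t work. Terraform appends command as the final argument after the interpreter list, so for a command string set interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"] so a failing command in the middle of a pipeline actually fails the provisioner. Leave off "-c" only when command is a script path, as in the example above: bash then reads that file with the given options.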
command runs through a shell, so quoting and interpolation are a security and correctness minefield. Interpolating untrusted or attacker-influenced values into command is a shell-injection risk. Prefer passing data via environment (which is not re-parsed by the shell) over splicing it into the command string.
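Working directory defaults to the directory Terraform runs in (normally the root module), not the directory of the module that declares the provisioner. Inside a child module, a bare ./scripts/x.sh therefore resolves against the root, not the module. Use path.module, path.root, or path.cwd deliberately. Relative script paths are a common source of “works on my laptop, fails in CI.”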
local-exec is platform-coupled. A command of rm -rf or cmd /C del ties your config to an OS. CI runners and laptops differ; this is a portability tax provisioners quietly add.
The output is not a value. local-exec cannot return data into Terraform. If you need the script’s output as a Terraform value, you want the external data source or to write a file and read it — not a provisioner. (And if you’re doing that, reconsider whether Terraform should own this at all.)
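
The advice about passing data through environment rather than splicing it into command can be sketched as follows. This is an illustration under assumptions: var.release_note and notes.log are hypothetical names, and bash is assumed to exist at /bin/bash on the machine running Terraform.

```hcl
# Hypothetical input that contains shell metacharacters on purpose.
variable "release_note" {
  type    = string
  default = "it's \"quoted\"; $(not run)"
}

resource "terraform_data" "announce" {
  triggers_replace = [var.release_note]

  provisioner "local-exec" {
    # The trailing "-c" makes bash evaluate the command string; pipefail
    # makes a failure anywhere in a pipeline fail the provisioner.
    interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"]

    # The command text is constant. The shell expands $RELEASE_NOTE as data
    # at run time, so the value is never parsed as shell syntax.
    command = "printf '%s\\n' \"$RELEASE_NOTE\" >> notes.log"

    environment = {
      RELEASE_NOTE = var.release_note
    }
  }
}
```

Had the value been interpolated directly into command with ${var.release_note}, the quote characters and the $(...) sequence would have been interpreted by the shell.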

`remote-exec`: run commands on the remote host

remote-exec connects to the resource (via the connection block) and runs commands on it. It is used to bootstrap a freshly created server. It requires a working connection, which means the host must be reachable and accept your credentials at apply time from wherever Terraform runs — already a fragile coupling. It accepts exactly one of three mutually exclusive arguments describing what to run:

Argument	Type	What it does
`inline`	list(string)	A list of command strings executed in order on the remote host. Simplest for a few commands.
`script`	string	A local path to a single script file. Terraform copies it to the remote host and executes it.
`scripts`	list(string)	A list of local script paths, copied and executed in order.

You may set only one of inline, script, or scripts. Like all provisioners, remote-exec also takes when and on_failure. It does not take local-exec’s command/interpreter/environment/working_dir — to set environment variables remotely you export them inside inline/the script; to choose an interpreter you use a shebang in the script.

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  key_name      = aws_key_pair.deploy.key_name
  # ... vpc, subnet, security group allowing inbound SSH from the Terraform runner ...

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  provisioner "remote-exec" {
    inline = [
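      "#!/bin/bash",            # entries become one script; without a shebang it runs under /bin/sh, which may lack pipefail
      "set -euo pipefail",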
      "sudo apt-get update -y",
      "sudo apt-get install -y nginx",
      "echo 'hello from ${self.id}' | sudo tee /var/www/html/index.html",
      "sudo systemctl enable --now nginx",
    ]
    on_failure = fail
  }
}

Critical things to know about remote-exec:

It needs the host to be reachable from the Terraform runner. That means a public IP or a bastion, a security group/NSG/firewall that allows the connection from the runner’s source IP, and the SSH/WinRM service already running. In CI, the runner’s egress IP may be dynamic — a real operational headache that user_data simply does not have.
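inline commands run as one script, and that script does not stop on errors by default. Terraform joins the entries into a single script and runs it with /bin/sh unless the first entry is a shebang, so cd and variables persist between entries. Without set -e (or chaining with &&), a failing entry is ignored and only the last command’s exit status decides whether the provisioner succeeded. pipefail is a bash feature, so put "#!/bin/bash" in the first entry before relying on set -euo pipefail.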
script/scripts copy files first, so they implicitly use the file provisioner mechanism under the hood and need the same working connection and a writable temp dir on the host.
There is no environment argument. Export variables inside the commands ("export FOO=bar"). Because the inline entries are joined into one script, an export in an earlier entry is visible to later entries of the same provisioner, but nothing carries over between separate provisioner blocks.
It still isn’t idempotent. Run the apply, the box gets nginx. Re-run apply — nothing happens (the resource already exists). Change the inline block — still nothing happens, because editing a provisioner does not force the resource to re-provision. To re-run, you must taint/replace the resource (terraform apply -replace=...) or move the logic onto a null_resource/terraform_data with triggers. This is the single most common provisioner surprise.
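
A minimal sketch of the second option, moving the remote step onto a terraform_data resource so that it re-runs when the script changes. It assumes the aws_instance.web resource from the example above and a hypothetical scripts/configure.sh in the module directory.

```hcl
resource "terraform_data" "web_config" {
  # Replacing this resource re-runs the provisioner. It is replaced when the
  # instance is recreated or when the script content changes.
  triggers_replace = {
    instance_id = aws_instance.web.id
    script_hash = filesha256("${path.module}/scripts/configure.sh") # hypothetical path
  }

  connection {
    type        = "ssh"
    host        = aws_instance.web.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  provisioner "remote-exec" {
    # The script is uploaded and executed; its shebang selects the interpreter.
    script = "${path.module}/scripts/configure.sh"
  }
}
```

With this layout, editing configure.sh produces a plan that replaces only terraform_data.web_config, leaving the instance itself untouched. The script still has to be safe to run more than once on the same host.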

The `file` provisioner: copy files to the remote host

The file provisioner copies a file or directory from the Terraform machine (or inline content) to the remote resource, again over the connection. Its arguments:

Argument	Type	What it does
`source`	string	Local path to a file or directory to upload. Mutually exclusive with `content`. A trailing slash on a directory source controls whether the dir itself or its contents are copied (rsync-style).
`content`	string	Literal content to write to `destination` (use instead of `source` for small, generated files — e.g. from `templatefile()`). Creates a file; cannot create directories.
`destination`	string	Required. The absolute path on the remote host to write to. The remote user must have permission; intermediate dirs are not created for `content`.

provisioner "file" {
  content     = templatefile("${path.module}/app.conf.tftpl", { port = var.port })
  destination = "/etc/app/app.conf"
  connection { /* ... same connection block ... */ }
}

Gotchas: the destination directory must already exist (for content, and typically for single-file source); permissions are the remote user’s, so you often pair file (copy as the login user to /tmp) with remote-exec (sudo mv into place); and over WinRM the path uses backslashes. As with the others — invisible to plan, runs once, no idempotency. If you find yourself copying config files this way, that’s exactly what cloud-init write_files or a golden image is for.
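
The file-then-remote-exec pairing described above might look like the sketch below. The names var.app_host and the deploy key path are assumptions for illustration, and the login user is assumed to have passwordless sudo on a host with GNU install available.

```hcl
variable "app_host" {
  type        = string
  description = "Hypothetical: address of an existing host reachable over SSH."
}

variable "port" {
  type    = number
  default = 8080
}

locals {
  app_conf = templatefile("${path.module}/app.conf.tftpl", { port = var.port })
}

resource "terraform_data" "app_conf" {
  # Re-upload whenever the rendered file changes.
  triggers_replace = [sha256(local.app_conf)]

  connection {
    type        = "ssh"
    host        = var.app_host
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  # Step 1: upload as the login user to a directory it can write to.
  provisioner "file" {
    content     = local.app_conf
    destination = "/tmp/app.conf"
  }

  # Step 2: move into place as root; install -D also creates /etc/app.
  provisioner "remote-exec" {
    inline = [
      "set -eu",
      "sudo install -D -m 0644 -o root -g root /tmp/app.conf /etc/app/app.conf",
      "rm -f /tmp/app.conf",
    ]
  }
}
```

Provisioners on one resource run in the order they are declared, which is what lets the second step rely on the first.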

The `connection` block: SSH, WinRM, bastions, and host keys

Both remote-exec and the file provisioner need to reach the host, and they do it through a connection block. You can put connection inside a single provisioner (scoped to it) or directly in the resource block (shared by all provisioners on that resource). The block’s first decision is type: ssh (default, Linux/Unix) or winrm (Windows).

Common and SSH fields

Field	Applies to	Default	What it does
`type`	both	`ssh`	Connection type: `ssh` or `winrm`.
`host`	both	— (required)	Address to connect to. Usually `self.public_ip`, `self.private_ip`, or a DNS name.
`port`	both	`22` (ssh) / `5985` or `5986` (winrm)	TCP port.
`user`	both	`root` (ssh) / `Administrator` (winrm)	Login user.
`password`	both	—	Login password (prefer keys for SSH).
`timeout`	both	`5m`	How long to wait for the connection to become available (e.g. while the box boots). Format like `"10m"`.
`script_path`	both	platform temp path	Remote path where copied scripts are staged before execution. Override if the default temp dir is noexec or read-only.
`private_key`	ssh	—	PEM-encoded private key contents (use `file(...)`, not a path). Preferred over `password`.
`certificate`	ssh	—	A signed certificate (PEM) to use with `private_key`, for SSH CA setups.
`agent`	ssh	`true` if `SSH_AUTH_SOCK` set	Use the local SSH agent for auth.
`agent_identity`	ssh	—	Preferred identity (public key) when the agent holds several.
`host_key`	ssh	—	The expected host public key for verification. If unset, Terraform performs no host-key verification at all (not even trust-on-first-use), which is a real MITM exposure (see security notes).
`target_platform`	ssh	`unix`	`unix` or `windows` (SSH onto Windows). Affects path handling.

WinRM-specific fields

Field	Default	What it does
`https`	`false`	Use HTTPS (port 5986) instead of HTTP (5985). Strongly recommended.
`insecure`	`false`	Skip TLS certificate verification (don’t, in production).
`use_ntlm`	`false`	Use NTLM authentication.
`cacert`	—	CA certificate (PEM) to validate the WinRM server’s TLS cert.

Bastion / jump-host fields (SSH)

When the target has no public IP, Terraform can hop through a bastion. These fields describe the jump host; they default to their non-bastion_ counterparts where sensible.

Field	Default	What it does
`bastion_host`	—	Address of the bastion/jump host. Setting this enables bastion mode.
`bastion_host_key`	—	Expected host key of the bastion (verify it!).
`bastion_port`	value of `port`	Bastion SSH port.
`bastion_user`	value of `user`	Bastion login user.
`bastion_password`	value of `password`	Bastion password.
`bastion_private_key`	value of `private_key`	Bastion private key (PEM contents).
`bastion_certificate`	value of `certificate`	Signed cert for the bastion.

A connection through a bastion to a private instance:

connection {
  type                = "ssh"
  user                = "ubuntu"
  host                = self.private_ip          # target has no public IP
  private_key         = file("~/.ssh/app_key")
  host_key            = var.target_host_key      # verify the target

  bastion_host        = aws_instance.bastion.public_ip
  bastion_user        = "ec2-user"
  bastion_private_key = file("~/.ssh/bastion_key")
  bastion_host_key    = var.bastion_host_key     # verify the bastion
  timeout             = "10m"                      # allow time for boot
}

The connection block is where most remote-exec/file pain lives: timeouts because the box isn’t ready, auth failures because the key/user is wrong, and silent insecurity because host_key/bastion_host_key were left unset. All of which, again, you avoid entirely by not opening an inbound connection at all and using user_data instead.
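
The tables above cover WinRM, but the lesson shows only SSH examples, so here is a hedged sketch of a WinRM connection over HTTPS. Everything named here is an assumption for illustration: var.windows_host, var.admin_password, and the CA file path are hypothetical, and the host is assumed to already have an HTTPS WinRM listener configured.

```hcl
variable "windows_host" {
  type        = string
  description = "Hypothetical: address of a Windows host with WinRM over HTTPS enabled."
}

variable "admin_password" {
  type      = string
  sensitive = true
}

resource "terraform_data" "windows_check" {
  triggers_replace = [var.windows_host]

  connection {
    type     = "winrm"
    host     = var.windows_host
    user     = "Administrator"
    password = var.admin_password
    https    = true # use the TLS listener
    port     = 5986 # the HTTPS WinRM port
    insecure = false # keep certificate verification on
    cacert   = file("${path.module}/certs/winrm-ca.pem") # hypothetical CA bundle
    timeout  = "15m" # Windows hosts can take a while to boot
  }

  provisioner "remote-exec" {
    inline = ["powershell.exe -NoProfile -Command \"Get-Service WinRM\""]
  }
}
```

The password still ends up in state, which is one more reason to prefer the platform's own first-boot mechanism where one exists.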

Creation-time vs destroy-time provisioners (`when`)

By default a provisioner is a creation-time provisioner: it runs after the resource is created, during apply. Setting when = destroy makes it a destroy-time provisioner that runs before the resource is destroyed.

provisioner "local-exec" {
  when    = destroy
  command = "curl -X DELETE https://registry.example.com/nodes/${self.id}"
}

Destroy-time provisioners are useful for deregistration — pulling a node out of a load balancer or external inventory before the VM disappears — but they come with strict, easily-violated rules:

The provisioner config must still exist at destroy time. If you delete the resource block (or the whole module) from your code, the destroy-time provisioner goes with it and never runs. To retire a resource that has a destroy-time provisioner, you typically must terraform destroy (or -target destroy) it while the block is still present, then remove the code.
Destroy-time provisioners cannot reference anything but self, count.index, and each.key. Terraform 0.12 warned about other references and 0.13 made them an error: you cannot use var.*, other resources, or local.* inside a destroy-time provisioner (or its connection block), because those may not exist (or may differ) at destroy. If you need an external value at destroy time, the trick is to stash it in state on the resource itself, either in triggers on a null_resource (read back as self.triggers.key) or in input on a terraform_data (read back as self.input).
A resource can have both creation-time and destroy-time provisioners; they’re distinguished solely by when.
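create_before_destroy disables destroy-time provisioners. The lifecycle documentation notes that a resource’s destroy provisioners do not run when create_before_destroy = true is set on it, so do not rely on deregistration logic for such a resource. Its replacement is created (and its create-time provisioners run) before the old object is removed, and the old object’s destroy-time step is skipped.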

The classic safe pattern for destroy-time deregistration captures the needed identifiers in the resource’s own state (input on a terraform_data, or triggers on a null_resource) so self can see them:

resource "terraform_data" "lb_registration" {
  input = {
    node_id = aws_instance.app.id
    lb_name = var.lb_name           # captured into state so destroy-time self can read it
  }
  triggers_replace = [aws_instance.app.id]

  provisioner "local-exec" {                       # register on create
    command = "lb register --node ${self.input.node_id} --lb ${self.input.lb_name}"
  }
  provisioner "local-exec" {                       # deregister on destroy
    when    = destroy
    command = "lb deregister --node ${self.input.node_id} --lb ${self.input.lb_name}"
  }
}
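
For comparison, a sketch of the same pattern on the legacy null_resource, where the captured values live in triggers and are read back through self.triggers. The lb command is the same hypothetical CLI used above, and aws_instance.app and var.lb_name are assumed to be defined elsewhere in the configuration.

```hcl
resource "null_resource" "lb_registration_legacy" {
  # triggers is stored in state, so these values are still readable at destroy
  # time. Changing either one replaces the resource: deregister, then register.
  triggers = {
    node_id = aws_instance.app.id
    lb_name = var.lb_name
  }

  provisioner "local-exec" { # register on create
    command = "lb register --node ${self.triggers.node_id} --lb ${self.triggers.lb_name}"
  }

  provisioner "local-exec" { # deregister on destroy; only self is referenced
    when    = destroy
    command = "lb deregister --node ${self.triggers.node_id} --lb ${self.triggers.lb_name}"
  }
}
```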

A note on old code: you will sometimes see when = create written explicitly. It is accepted, but create is the default, so writing it is unnecessary; only destroy changes behaviour.

`on_failure`: what happens when a provisioner errors

Each provisioner takes on_failure, which controls Terraform’s reaction to a non-zero exit / error from that provisioner:

Value	Behaviour	Effect on the resource
`fail` (default)	Abort the apply with an error.	For a creation-time provisioner, Terraform marks the resource tainted — it stays created in the cloud but is scheduled for destroy-and-recreate on the next apply. For a destroy-time provisioner, the failure halts the destroy (the resource is not destroyed).
`continue`	Log the error but carry on as if the provisioner succeeded.	The resource is not tainted; the apply/destroy proceeds. Use for best-effort steps whose failure shouldn’t block (e.g. a non-critical notification).

provisioner "local-exec" {
  command    = "send-slack-notification.sh ${self.id}"
  on_failure = continue          # a failed notification must not taint the instance
}

The tainting behaviour is the dangerous one and the reason to be deliberate. Imagine a remote-exec that installs software on a database VM and occasionally times out on a slow apt mirror. With the default on_failure = fail, that transient failure taints the VM — and the next apply will destroy and recreate your database. This is exactly the class of foot-gun that motivates “provisioners are a last resort”: a non-infra side effect is allowed to trigger destruction of infrastructure. If a step is genuinely best-effort, set continue; if it’s critical, ask whether user_data/Packer/config-management — where a failure is observable and re-runnable without destroying the box — is the better home for it.

`null_resource`: the original provisioner host

Sometimes you want to run a provisioner that isn’t naturally tied to any one real resource — run a script after several resources exist, or re-run a command whenever some input changes. The historical tool for this is null_resource, from the hashicorp/null provider. It does nothing on its own; it exists to carry provisioners and to expose a triggers map that controls when it is replaced (and therefore when its provisioners re-run).

resource "null_resource" "cluster_bootstrap" {
  # Replace (and re-run provisioners) whenever any of these change:
  triggers = {
    cluster_instance_ids = join(",", aws_instance.cluster[*].id)
    bootstrap_script_hash = filemd5("${path.module}/bootstrap.sh")
  }

  connection {
    type        = "ssh"
    host        = aws_instance.cluster[0].public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/deploy_key")
  }

  provisioner "remote-exec" {
    inline = ["sudo /opt/bootstrap.sh"]
  }
}

How triggers works — this is the whole point of null_resource:

triggers is a map of strings. Terraform stores it in state.
On each plan, Terraform compares the current triggers map to the stored one. If any value changed, the null_resource is replaced (destroyed and recreated), which re-runs its creation-time provisioners (and destroy-time ones during the destroy half).
If triggers is unchanged, nothing happens — the provisioners do not re-run.
A common idiom is triggers = { always_run = timestamp() } to force the provisioner to run on every apply (because timestamp() always differs) — useful but a code smell; it means you’ve made an imperative step that runs unconditionally.
Reference other resources inside triggers (like aws_instance.cluster[*].id) to wire the bootstrap to the lifecycle of the things it bootstraps.
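
As an alternative to always_run = timestamp(), a trigger can be derived from the content the provisioner depends on, so the step re-runs only when that content changes. The sketch below assumes a hypothetical scripts/ directory in the module containing bootstrap.sh.

```hcl
locals {
  # Hypothetical layout: every file under scripts/ feeds the bootstrap step.
  script_files = sort(tolist(fileset(path.module, "scripts/**")))

  # One digest over all per-file digests; it changes if any file changes.
  scripts_hash = sha256(join("", [
    for f in local.script_files : filesha256("${path.module}/${f}")
  ]))
}

resource "null_resource" "bootstrap_on_change" {
  triggers = {
    scripts_hash = local.scripts_hash
  }

  provisioner "local-exec" {
    command = "${path.module}/scripts/bootstrap.sh"
  }
}
```

The plan then shows a replacement of null_resource.bootstrap_on_change exactly when a script was edited, which is at least a visible signal even though the script's effect stays opaque.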

null_resource works fine and you will see it everywhere, but it has a real cost: it pulls in the null provider as a dependency (one more plugin to download, pin in the lock file, and keep updated). That is exactly the friction terraform_data removes.

`terraform_data`: the modern, provider-less replacement

Since Terraform 1.4, the built-in terraform_data resource replaces null_resource for almost every use. It is a managed resource built into Terraform itself — no provider, nothing to install or pin — and it does two jobs null_resource couldn’t do as cleanly:

It can store and pass through arbitrary data via its input/output attributes (not just a string map).
It triggers replacement via triggers_replace, which accepts any value type (not just a map(string)).

Attribute	Direction	What it is
`input`	argument (in)	Any value you want this resource to hold/pass through. Stored in state.
`output`	attribute (out)	Echoes `input` back as a computed value — handy as an explicit dependency anchor or to “freeze” a value.
`triggers_replace`	argument (in)	Any value (string, list, object, …). When it changes, `terraform_data` is replaced, re-running attached provisioners — exactly like `null_resource.triggers`, but type-flexible.
`id`	attribute (out)	A generated unique string id.

Three idiomatic uses:

(a) As a provisioner host (the null_resource replacement):

resource "terraform_data" "bootstrap" {
  triggers_replace = [
    join(",", aws_instance.cluster[*].id),
    filemd5("${path.module}/bootstrap.sh"),     # any list/object works — no join-to-string needed
  ]

  provisioner "local-exec" {
    command = "ansible-playbook -i '${join(",", aws_instance.cluster[*].public_ip)},' bootstrap.yml"
  }
}

(b) To pass a value through and force replacement when it changes (no provisioner at all):

resource "terraform_data" "version" {
  input = var.revision           # stored and echoed via .output
}

resource "aws_instance" "app" {
  # Re-create the instance whenever the revision changes, using lifecycle:
  lifecycle {
    replace_triggered_by = [terraform_data.version]
  }
  # ...
}

This terraform_data + replace_triggered_by combination (Terraform 1.2+) is the modern, plan-visible way to say “recreate X when Y changes” — and it’s strictly better than a null_resource provisioner for that job because the replacement shows up in the plan. Prefer it whenever your goal is “re-create this resource on a trigger” rather than “run an imperative script.”

(c) To store data not tracked by any real resource (so you can detect changes), e.g. caching a value Terraform should react to.

	`null_resource`	`terraform_data`
Source	`hashicorp/null` provider (must be declared/installed)	Built into Terraform (1.4+) — no provider
Trigger field	`triggers` — `map(string)` only	`triggers_replace` — any type
Data pass-through	none	`input` → `output` (any type)
Hosts provisioners?	yes	yes
Recommendation	legacy; fine but superseded	preferred for new code

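The practical migration: replace resource "null_resource" "x" with resource "terraform_data" "x", change triggers = {...} to triggers_replace = [...] (you no longer need to coerce everything to strings or join), update any self.triggers references inside provisioners to self.triggers_replace or self.input, and drop the null provider from required_providers if nothing else uses it. Because the resource type changes, a plain rename is planned as destroying the old resource and creating the new one, so the attached provisioners run once more during the migration. After that the behaviour is the same, with one fewer dependency. (OpenTofu supports terraform_data as well.)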
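
Newer Terraform releases can carry the existing state across the type change with a moved block instead of a destroy-and-create. The sketch below rests on an assumption worth verifying against the release notes for your version: to my knowledge, moving from null_resource to terraform_data is supported from Terraform 1.9. The resource name and var.revision are hypothetical.

```hcl
# Assumption: the Terraform version in use supports moving a null_resource
# to a terraform_data resource (check the release notes before relying on it).
moved {
  from = null_resource.build_legacy
  to   = terraform_data.build_legacy
}

resource "terraform_data" "build_legacy" {
  # Keeping the same map shape as the old triggers avoids a spurious change.
  triggers_replace = {
    revision = var.revision
  }

  provisioner "local-exec" {
    command = "echo 'ran for revision ${self.triggers_replace.revision}' >> build.log"
  }
}
```

Run terraform plan after the change: if the move is honoured, the plan reports the resource as moved rather than as one resource to destroy and one to add.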

Embedded diagram

Terraform provisioners decision and execution model: local-exec vs remote-exec/file, connection, when/on_failure, null_resource vs terraform_data, and the alternatives that usually replace them

The diagram traces the decision you should actually make: start from “do I really need an imperative step?”, route the common cases to cloud-init/user_data, Packer, config management, or a native resource, and only fall through to provisioners as the last resort — then show, for the genuine cases, how local-exec (local machine) and remote-exec/file (remote host via connection) execute, how when/on_failure gate them, and where null_resource/terraform_data + triggers sit as the host.

Hands-on lab

This lab is free and entirely local — no cloud account, no SSH targets, no inbound ports. We use local-exec, null_resource, and terraform_data so you can feel the trigger/idempotency behaviour without spending a rupee or opening a firewall. (We deliberately don’t do a live remote-exec lab, because the right lesson is that you rarely should — but the syntax above is complete for when you must.)

1. Set up. Create a working directory and a main.tf:

terraform {
  required_version = ">= 1.4"
  # terraform_data and local-exec need no provider;
  # the null provider is declared only for the legacy null_resource in (b).
  required_providers {
    null = {
      source = "hashicorp/null"
    }
  }
}

variable "revision" {
  type        = string
  default     = "v1"
  description = "Bump this to see triggers_replace re-run the provisioner."
}

# (a) terraform_data hosting a local-exec, gated by a trigger:
resource "terraform_data" "build" {
  triggers_replace = [var.revision]

  provisioner "local-exec" {
    # Double quotes (escaped) so the shell expands $(date); single quotes would log it literally.
    command = "echo \"BUILD ran for revision ${self.triggers_replace[0]} at $(date)\" >> build.log"
  }
}

# (b) The legacy equivalent, for comparison:
resource "null_resource" "build_legacy" {
  triggers = { revision = var.revision }

  provisioner "local-exec" {
    command = "echo 'LEGACY ran for revision ${self.triggers.revision}' >> build.log"
  }
}

# (c) A destroy-time provisioner (note: only self is allowed):
resource "terraform_data" "cleanup_demo" {
  input = "demo-node-42"
  provisioner "local-exec" {
    when    = destroy
    command = "echo 'DEREGISTER ${self.input}' >> build.log"
  }
}

output "build_id" {
  value = terraform_data.build.output   # null here, because build sets no input; see note
}

Note: build sets triggers_replace but not input, so terraform_data.build.output is null. That is expected: output only echoes input. The cleanup_demo resource sets input so its destroy-time self.input works.

2. Init and apply.

terraform init        # downloads only the null provider (for build_legacy); terraform_data needs none
terraform apply -auto-approve
cat build.log

Expected: build.log now contains a BUILD ran for revision v1 ... line and a LEGACY ran for revision v1 line. The DEREGISTER line is not there yet (it’s a destroy-time provisioner).

3. Prove that provisioners do not re-run on an unchanged apply. Run apply again with no changes:

terraform apply -auto-approve
cat build.log

Expected: no new lines. The provisioners did not re-run, because nothing in triggers_replace/triggers changed. This is the key lesson — provisioners run on create/replace, not on every apply.

4. Trigger a re-run by changing the trigger.

terraform apply -auto-approve -var 'revision=v2'
cat build.log

Expected: two new lines (BUILD ran for revision v2, LEGACY ran for revision v2) because the trigger changed, replacing both resources and re-running their create-time provisioners.

5. Watch a destroy-time provisioner fire.

terraform destroy -auto-approve
cat build.log

Expected: a new DEREGISTER demo-node-42 line appears — the when = destroy provisioner ran during teardown.

6. Validation checklist.

After step 2: build.log has exactly 2 lines.
After step 3: still 2 lines (no trigger change, so no re-run).
After step 4: 4 lines.
After step 5: 5 lines, the last being DEREGISTER demo-node-42.

Cleanup. terraform destroy -auto-approve (already done in step 5) removes the state-tracked resources; then delete the working directory and build.log. There are no cloud resources to remove.

Cost note. ₹0 / $0. Everything is local — terraform_data and local-exec create nothing in any cloud. The null provider is a local plugin. This is the cheapest possible Terraform lab.

Common mistakes & troubleshooting

Symptom	Likely cause	Fix
Edited a `remote-exec`/`local-exec` block but `apply` does nothing	Editing a provisioner does not re-trigger it; the resource already exists	Move the provisioner to a `null_resource`/`terraform_data` with a `triggers`/`triggers_replace` that captures what changed; or force it with `terraform apply -replace=ADDRESS`
`remote-exec` hangs then fails with “timeout - last error: dial tcp … i/o timeout”	Host not reachable from the Terraform runner: no public IP/bastion, security group/NSG blocks the runner’s IP, or SSH not up yet	Allow inbound from the runner, add a `bastion_host`, raise `connection { timeout }`, or — better — switch to `user_data` so no inbound is needed
A flaky post-create script destroyed my VM on the next apply	Creation-time provisioner failed with default `on_failure = fail`, which tainted the resource	Set `on_failure = continue` for best-effort steps; move critical setup to `user_data`/Packer where failure doesn’t taint
Destroy-time provisioner never ran	The resource/module block was deleted from code before destroying, so the provisioner went with it; or it referenced `var.*` and errored	Destroy while the block exists; only reference `self`, `count.index`, and `each.key` in destroy-time provisioners (stash needed values in `triggers`/`input`)
“Invalid reference: … cannot be used in a destroy-time provisioner”	Used `var.`, `local.`, or another resource inside a `when = destroy` provisioner	Capture those values in `triggers`/`input` and read them via `self.triggers[...]` / `self.input`
`local-exec` script “works locally, fails in CI”	Wrong `working_dir`, missing interpreter (`/bin/sh` ≠ bash), or a tool not installed on the runner	Set `working_dir` with `path.module`, set `interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"]` (the trailing `-c` is required because Terraform appends the command as the last argument), install/declare the tool
Secrets visible in plan/apply logs or state	Provisioner `command`/`environment`/`triggers` carry secrets; Terraform logs the command and stores triggers in plaintext state	Use `quiet = true` (local-exec), pass secrets via `environment` not `command` argv, avoid putting secrets in `triggers`; treat state as sensitive
`host_key`/`bastion_host_key` warnings ignored; worried about MITM	`connection` does no host-key verification when those are unset	Set `host_key`/`bastion_host_key` to the expected keys; never disable verification in production
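The “Invalid reference” fix in the table can be sketched as follows. This is a minimal illustration, assuming a hypothetical registry URL and node id; the resource and variable names are invented for the example. Everything the destroy step needs is copied into input at create time, then read back through self.

```hcl
variable "registry_url" {
  type    = string
  default = "https://registry.example.invalid" # hypothetical endpoint
}

resource "terraform_data" "node_registration" {
  # Everything the destroy step needs is stored in state here.
  input = {
    node_id      = "demo-node-42"
    registry_url = var.registry_url
  }

  provisioner "local-exec" {
    when = destroy
    # var.registry_url would be rejected in this block; self.input is allowed.
    command = "echo \"DEREGISTER $NODE_ID from $REGISTRY_URL\""
    environment = {
      NODE_ID      = self.input.node_id
      REGISTRY_URL = self.input.registry_url
    }
  }
}
```

The stashed values are written to state in plaintext, so this pattern suits identifiers and URLs, not secrets.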

Best practices

Default to not using a provisioner. Walk the decision table at the top. For “configure a box,” use cloud-init/user_data; for “ship software in the image,” use Packer; for “manage many hosts over time,” use Ansible/Chef/Puppet/Salt after Terraform; for “do a thing in a supported service,” use the native resource. Provisioners are the answer only when none of those fit.
Prefer terraform_data over null_resource in new code — same capability, no extra provider, type-flexible triggers_replace, plus input/output pass-through.
Prefer replace_triggered_by over a trigger-only null_resource when your goal is “recreate X when Y changes” — it’s visible in the plan.
Make triggers explicit and meaningful. Use filemd5()/filesha256() of the script so changing the script re-runs it; capture the resource ids you depend on. Avoid timestamp()/uuid() “always run” triggers except as a conscious, documented choice.
Use local-exec over remote-exec when you can. A local aws/kubectl/curl against an API is far more robust than opening SSH to a booting box. local-exec needs no inbound network and no host-key dance.
Harden local-exec scripts: set a strict interpreter (bash -eu -o pipefail -c; the trailing -c is needed because Terraform appends the command as the final argument), set working_dir, pass data via environment (not interpolated into command), and use quiet = true if the command line could contain secrets.
Be deliberate about on_failure. Use continue for best-effort, fail (default) only where a failure genuinely should stop everything — and remember fail taints create-time resources.
Keep destroy-time provisioners minimal and self-only. Use them for deregistration; stash any external identifiers in triggers/input; never delete the block before you’ve destroyed the resource.
Document why. A provisioner in a PR should carry a comment explaining why none of the alternatives worked. Future-you (and your reviewers) need it.
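Several of the practices above fit in one block. The sketch below is illustrative only: the script path scripts/bootstrap.sh and the api_token variable are hypothetical, and the script is assumed to exist and be executable (filesha256 fails at plan time if the file is missing).

```hcl
variable "api_token" {
  type      = string
  sensitive = true
  default   = "" # hypothetical; supply via TF_VAR_api_token in real use
}

resource "terraform_data" "bootstrap" {
  # Re-run only when the script's content changes.
  triggers_replace = [filesha256("${path.module}/scripts/bootstrap.sh")]

  provisioner "local-exec" {
    # Terraform appends the command after "-c" as the final argument.
    interpreter = ["/bin/bash", "-eu", "-o", "pipefail", "-c"]
    working_dir = path.module
    command     = "./scripts/bootstrap.sh"
    quiet       = true # do not echo the command line to the log

    # Data goes in through the environment, not string interpolation.
    environment = {
      API_TOKEN = var.api_token
    }

    # fail is the default; written out here to make the choice explicit.
    on_failure = fail
  }
}
```

Because the trigger is a content hash, editing the script replaces the resource and re-runs the provisioner, while an unchanged script leaves it alone.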

Security notes

Provisioners run with the Terraform runner’s privileges and reach outside the graph. A local-exec command runs whatever shell you give it on the CI runner or your laptop — treat the command string as code, and never interpolate untrusted input into it (shell injection).
Secrets leak in three places. (1) The command line is echoed to logs unless you set quiet = true (local-exec) — and even then argv may be visible to other processes; prefer environment. (2) triggers/input are written to state in plaintext — don’t put secrets there. (3) Anything you pass to a remote host via file/remote-exec transits the connection and lands on disk there. Encrypt your state backend, restrict who can read it, and treat plan/apply logs as sensitive.
connection host-key verification is off by default. With host_key/bastion_host_key unset, Terraform will connect to any host answering at that address — a man-in-the-middle can intercept your credentials and commands. Always set the expected host keys for remote-exec/file over SSH; use https = true and a cacert (never insecure = true) for WinRM.
Prefer keys and short-lived credentials over passwords in connection; use the SSH agent rather than embedding private_key where possible; scope inbound firewall rules to the runner’s IP and remove them after bootstrap (or avoid inbound entirely with user_data).
Every inbound port you open for remote-exec is attack surface. user_data/cloud-init needs no inbound SSH/WinRM at all — that alone is a strong security reason to prefer it.
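For the cases where remote-exec over SSH is unavoidable, a connection block with host-key verification and agent authentication might look like the sketch below. The variable names, the ubuntu user, and the key file paths are assumptions made for illustration; the public key files are expected to hold the hosts' known keys.

```hcl
variable "host" { type = string }         # target address (hypothetical)
variable "bastion_host" { type = string } # jump host address (hypothetical)

resource "terraform_data" "remote_setup" {
  connection {
    type  = "ssh"
    host  = var.host
    user  = "ubuntu"
    agent = true # use the local SSH agent instead of embedding private_key

    # Expected public keys, so Terraform does not trust whichever host answers.
    host_key         = file("${path.module}/keys/host_key.pub")
    bastion_host     = var.bastion_host
    bastion_user     = "ubuntu"
    bastion_host_key = file("${path.module}/keys/bastion_host_key.pub")

    timeout = "2m"
  }

  provisioner "remote-exec" {
    inline = ["uname -a"]
  }
}
```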

The alternatives, in full (use these instead)

This section is the positive half of “provisioners are a last resort.” For each common need, here is the tool that does it without breaking the model.

cloud-init / user_data / custom_data / startup scripts. Every major cloud lets you pass a script or cloud-init config that the VM runs on first boot, executed by the instance itself — no inbound connection, no host-key dance, no tainting. In Terraform that’s aws_instance.user_data (or user_data_base64), azurerm_linux_virtual_machine.custom_data, and GCP’s metadata = { startup-script = ... }. cloud-init’s declarative write_files, packages, and runcmd cover the vast majority of “install X, write config Y, start service Z.” This replaces almost every remote-exec/file you’ll ever be tempted to write. Pair it with templatefile() to render the script from Terraform values.
Packer (golden / immutable images). If the same packages and config are needed on every boot, bake them into the image once with Packer (or the cloud’s image builder) and have Terraform boot that image. Boots are then fast and identical, there’s nothing to run at create time, and you get immutable infrastructure (replace, don’t mutate). Use cloud-init only for the small last-mile, instance-specific bits.
Configuration management (Ansible, Chef, Puppet, Salt). For ongoing convergence across a fleet — not just first boot — let Terraform create the hosts and a dedicated CM tool configure them. Ansible in particular pairs well: Terraform outputs the inventory (or you use a dynamic inventory), Ansible runs afterwards and re-converges idempotently. This keeps “what exists” (Terraform) separate from “how it’s configured” (CM), which is the clean separation provisioners blur.
Native provider resources. Before shelling out, search the provider: there is very likely a resource for what you want (DNS record, IAM policy, LB target-group attachment, database, Kubernetes object via the kubernetes provider, a GitHub repo, a Datadog monitor…). A native resource is planned, idempotent, refreshable, and destroyable — everything a provisioner is not.
http data source / restapi/http providers / external data source. For a genuine one-off API interaction with no resource, the read-only http data source can GET data; community providers like Mastercard/restapi or apparentlymart/http can manage API-backed resources declaratively; and the external data source can run a program that returns JSON as a value (read-only, at plan/refresh time) — often what people actually wanted from local-exec.
time_sleep / native readiness. To wait for eventual consistency, prefer a provider’s native health/wait_for attributes or the time_sleep resource over a remote-exec polling loop.
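As a concrete instance of the first alternative in the list above, here is a minimal cloud-init sketch on AWS. It assumes a configured aws provider, a hypothetical ami_id variable pointing at a Debian/Ubuntu-style image that runs cloud-init, and the default nginx web root of that image family.

```hcl
variable "ami_id" {
  type        = string
  description = "Hypothetical: an AMI whose image runs cloud-init."
}

variable "site_title" {
  type    = string
  default = "hello from cloud-init"
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # The instance runs this itself on first boot: no connection block, no inbound SSH.
  user_data = <<-EOT
    #cloud-config
    packages:
      - nginx
    write_files:
      - path: /var/www/html/index.html
        content: |
          <h1>${var.site_title}</h1>
    runcmd:
      - systemctl enable --now nginx
  EOT
}
```

For anything longer than a few lines, moving the cloud-config into a template file rendered with templatefile() keeps the resource block readable.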

The honest summary: in modern Terraform, a well-architected configuration uses user_data/Packer for machine setup, native resources for everything the provider supports, configuration management for fleet convergence, and reaches for a provisioner — usually a small local-exec on a terraform_data — only for the rare imperative glue that has no declarative home.
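When the “imperative glue” only needs to read a value, the external data source mentioned in the list above is often enough. A small sketch, assuming a POSIX sh and uname on the machine running Terraform; the names host_info and runner_os are made up for the example.

```hcl
terraform {
  required_providers {
    external = {
      source = "hashicorp/external"
    }
  }
}

# Runs at plan/refresh time and must print one JSON object of string values.
data "external" "host_info" {
  program = ["sh", "-c", "printf '{\"os\":\"%s\"}' \"$(uname -s)\""]
}

output "runner_os" {
  value = data.external.host_info.result.os
}
```

Unlike a local-exec, the result is an ordinary value that other resources can reference and that shows up in the plan.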

Interview & exam questions

Why does HashiCorp call provisioners a “last resort”? Because they break Terraform’s core guarantees: they’re invisible to plan, they’re not idempotent (run once on create/replace, not re-applied to converge), a failed create-time provisioner taints the resource (forcing destroy/recreate), and they cause side effects outside the dependency graph that Terraform can’t track, refresh, or clean up.
What’s the difference between local-exec and remote-exec? local-exec runs a command on the machine running Terraform (laptop/CI runner); remote-exec runs commands on the resource being created, over a connection block (SSH/WinRM). remote-exec additionally requires network reachability and credentials to the host.
Name the three mutually exclusive ways remote-exec specifies what to run. inline (a list of commands), script (one local script file, copied and run), and scripts (a list of local scripts, copied and run in order). Exactly one may be set.
How do you make a provisioner run again after you’ve changed it? Editing a provisioner does not re-trigger it. Either terraform apply -replace=ADDRESS to force the resource’s recreation, or host the provisioner on a null_resource/terraform_data and change its triggers/triggers_replace (e.g. include filemd5() of the script so editing the script re-runs it).
What does on_failure do, and what’s the danger of the default? It controls the reaction to a provisioner error: fail (default) aborts and taints a create-time resource (scheduling destroy/recreate); continue logs and proceeds. The danger: a transient post-create script failure can taint and thereby destroy/recreate otherwise-healthy (possibly stateful) infrastructure.
What are the rules for a when = destroy provisioner? It runs before the resource is destroyed; it may reference only self, count.index, and each.key (not var, local, each.value, or other resources); and it must still be present in the code at destroy time, because deleting the block means it never runs. Stash external values in triggers/input to read via self.
What is self and where is it valid? Inside a provisioner (or connection) block, self refers to the resource the provisioner is attached to (self.id, self.public_ip). A resource cannot refer to itself by name inside its own block (that would create a dependency cycle), so self is how its provisioners read its own attributes, and it is the main reference a destroy-time provisioner is allowed to use.
null_resource vs terraform_data — compare them. Both do nothing themselves and exist to host provisioners/triggers. null_resource comes from the hashicorp/null provider and its triggers is a map(string). terraform_data is built into Terraform (1.4+) — no provider — its triggers_replace accepts any type, and it adds input/output pass-through. Prefer terraform_data in new code.
How does the triggers map control behaviour? Terraform stores triggers in state; on each plan it compares the new map to the stored one, and if any value changed it replaces the resource (destroy+recreate), which re-runs its provisioners. Unchanged triggers mean the provisioners don’t re-run. { always = timestamp() } forces every-apply runs.
You need to run a setup script when a VM boots. What should you use, and why not remote-exec? Use cloud-init / user_data (custom_data on Azure, startup-script on GCP): the instance runs it itself on first boot, needing no inbound SSH, no host-key verification, no tainting, and it’s far more robust in CI. remote-exec requires opening inbound access from the (possibly dynamic) runner IP and is fragile and insecure by comparison.
What’s the modern, plan-visible way to recreate a resource when some value changes, without a provisioner? lifecycle { replace_triggered_by = [terraform_data.x] } where terraform_data.x holds the value (often via input/triggers_replace). Unlike a null_resource provisioner, the resulting replacement appears in terraform plan.
How do secrets leak through provisioners, and how do you mitigate? Through the command line in logs (mitigate with quiet = true and by passing via environment, not argv), through triggers/input stored in plaintext state (don’t put secrets there; encrypt/restrict the backend), and onto remote disks via file/remote-exec. Also enable host-key verification so credentials aren’t exposed to a MITM.
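The replace_triggered_by answer above can be sketched in a few lines. This is a minimal, provider-free illustration: config_version is a hypothetical variable, and the second terraform_data stands in for whatever real resource you want replaced.

```hcl
variable "config_version" {
  type    = string
  default = "1"
}

# Holds the value; changing input updates this resource in place.
resource "terraform_data" "config_version" {
  input = var.config_version
}

# Stand-in for a real resource (an instance, a database, and so on).
resource "terraform_data" "app" {
  input = "app"

  lifecycle {
    # A planned change to terraform_data.config_version replaces this resource,
    # and that replacement is shown in terraform plan.
    replace_triggered_by = [terraform_data.config_version]
  }
}
```

Applying with -var 'config_version=2' should plan an in-place update of the holder and a replacement of app.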

Quick check

True or false: editing the inline of a remote-exec provisioner will re-run it on the next apply.
Which provisioner needs a connection block: local-exec or remote-exec?
What is the default value of on_failure, and what does it do to a create-time resource on error?
Which built-in resource replaces null_resource, and which provider does it require?
Inside a when = destroy provisioner, which references are you allowed to use?

Answers

False. Editing a provisioner does not re-trigger it; the resource already exists. Use -replace or a triggers/triggers_replace change.
remote-exec (and the file provisioner). local-exec runs on the Terraform machine and needs no connection.
fail (the default): it aborts the apply and marks the resource tainted (scheduled for destroy-and-recreate).
terraform_data (Terraform 1.4+) — it requires no provider (it’s built in), unlike null_resource which needs hashicorp/null.
Only self, count.index, and each.key. Not var.*, local.*, each.value, or other resources.

Exercise

Take a configuration that currently uses a null_resource with a remote-exec to install and start nginx on an EC2/VM via SSH (a very common legacy pattern). Refactor it twice and write up the trade-offs:

Modernise the host: convert the null_resource to terraform_data, change triggers to triggers_replace (include filesha256() of the bootstrap script so editing it re-runs), and remove the null provider from required_providers. Confirm init/plan behave identically.
Eliminate the provisioner entirely: move the nginx install/config into cloud-init user_data (use templatefile() to render a cloud-config with packages: [nginx], a write_files for the index page, and runcmd to enable the service). Remove the connection, the inbound SSH rule, and the remote-exec altogether.
Write three to four sentences comparing the two end states on: plan visibility, idempotency/re-run behaviour, security (inbound ports, host-key verification), and CI robustness. State which you’d ship and why. (Expected conclusion: the user_data version wins on every axis — this is the lesson.)

Certification mapping

This lesson maps to the HashiCorp Certified: Terraform Associate (003) exam. It directly serves the objective “Read, generate, and modify configuration”: when to use provisioners, the provisioner types (local-exec, remote-exec, the file provisioner), why provisioners are a last resort, and how creation-time provisioners, when = destroy provisioners, and on_failure behave. It also touches “Understand Terraform basics” (provider vs provisioner is a frequent exam distractor: null_resource needs the null provider, while terraform_data is built in) and the meta-argument knowledge from the resources lesson (triggers, replace_triggered_by). Expect at least one exam item that asks which alternative to provisioners is appropriate for first-boot configuration (answer: cloud-init/user_data) and one on what on_failure = continue versus the default does. The later Terraform Associate Prep Kit drills these as practice questions.

Glossary

Provisioner — A block nested in a resource that runs an imperative action (a command or file copy) at create or destroy time. The “last resort” escape hatch out of Terraform’s declarative model.
local-exec — A provisioner that runs a command on the machine running Terraform (not on the resource).
remote-exec — A provisioner that runs commands on the remote resource via a connection (SSH/WinRM); takes inline, script, or scripts.
file provisioner — Copies a local file/directory or inline content to a destination on the remote resource over the connection.
connection block — Describes how to reach the remote host (type ssh/winrm, host, user, private_key/password, host_key, bastion fields, timeout). Used by remote-exec and file.
self — Inside a provisioner/connection, a reference to the resource the provisioner is attached to (e.g. self.id). Destroy-time provisioners may reference only self, count.index, and each.key.
Creation-time provisioner — Runs after the resource is created (the default; when unset or create).
Destroy-time provisioner — when = destroy; runs before the resource is destroyed. Strict reference rules; must still exist in code at destroy time.
on_failure — Reaction to a provisioner error: fail (default; aborts and taints a create-time resource) or continue (logs and proceeds).
Taint — A flag marking a resource for destroy-and-recreate on the next apply; set automatically when a create-time provisioner fails with on_failure = fail.
null_resource — A do-nothing resource from the hashicorp/null provider used to host provisioners; replaced (re-running provisioners) when its triggers (map(string)) change.
triggers — The map(string) on null_resource whose change forces replacement and re-runs the provisioners.
terraform_data — A built-in (Terraform 1.4+) provider-less resource replacing null_resource; has input/output pass-through and triggers_replace (any type).
triggers_replace — The any-typed value on terraform_data whose change forces replacement.
replace_triggered_by — A lifecycle meta-argument that forces a resource’s replacement when a referenced object (often a terraform_data) changes — the plan-visible alternative to a trigger-only null_resource.
cloud-init / user_data — A first-boot script/config the VM runs itself; the preferred alternative to remote-exec/file for machine setup (user_data/custom_data/startup-script).
Packer — A tool to build golden/immutable machine images so configuration is baked in rather than run at create time.

Next steps

You now know provisioners cold — every type, every option, when (rarely) to use them, and the better tool for each job. The natural next move is to go deep on the thing provisioners most often abuse and leak through: state. Continue with Terraform State, In Depth: the State File, the state Commands, Locking & Sensitive Data, which explains the file that records “this provisioner ran,” how triggers/input land in it as plaintext, the terraform state commands, locking, and how to keep sensitive data out of harm’s way. For the meta-arguments referenced throughout (triggers, replace_triggered_by, create_before_destroy), revisit Terraform Resources & Meta-Arguments, In Depth: count, for_each, depends_on & lifecycle.

Written by Vinod
