Terraform Module: Azure Policy (Definition & Assignment), codified guardrails you can ship per resource group

Quick take — A reusable hashicorp/azurerm ~> 4.0 module that bundles a custom azurerm_policy_definition with a resource-group-scoped assignment, system-assigned identity, and remediation-ready parameters for repeatable governance. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration: drop this in a .tf file and fill in the "..." placeholders (each required input is commented). Note that policy_rule takes an HCL object with if and then keys rather than a plain string.

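provider "azurerm" {
  features {}
  # azurerm 4.x requires a subscription ID: set subscription_id here
  # or export ARM_SUBSCRIPTION_ID before running Terraform.
}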

module "policy" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"

  name              = "..."  # Name of the custom policy definition (1-64 chars, alpha…
  display_name      = "..."  # Human-friendly display name for the definition.
  policy_rule       = "..."  # The `if`/`then` policyRule object (HCL, JSON-encoded in…
  assignment_name   = "..."  # Assignment name (1-24 chars to stay within the identity…
  resource_group_id = "..."  # Full resource ID of the target resource group.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Azure Policy is the engine that evaluates your resources against rules and reports — or enforces — compliance. A policy definition is the rule itself: a JSON document with a policyRule (an if/then condition tree), a set of typed parameters, and an effect such as Audit, Deny, Modify, DeployIfNotExists, or AuditIfNotExists. A policy assignment is what actually puts that definition to work by binding it to a scope (management group, subscription, resource group) and supplying concrete parameter values.

On their own those two halves are fiddly to wire up by hand. The definition needs valid embedded JSON, the assignment needs the policy_definition_id to line up exactly, and any Modify / DeployIfNotExists effect needs a managed identity with the right role assignments before remediation will ever succeed. Get one of those wrong and you get silent non-compliance or a 403 during remediation.

This module wraps azurerm_policy_definition plus azurerm_resource_group_policy_assignment into a single, var-driven unit. You pass the policy rule, parameters, the effect, and the target resource group; the module creates the definition, assigns it at the resource-group scope, provisions a system-assigned identity when the effect is Modify or DeployIfNotExists, and exposes the identity principal ID so a caller can grant the role the policy needs. The result is a guardrail you can drop into any resource group with one module call and five required inputs.

When to use it

You want custom governance rules that the built-in policy catalog does not cover (e.g. “every storage account must have a data_classification tag” or “deny public IPs in this app’s RG”).
You are managing policy at the resource-group blast radius — per-team, per-app, or per-environment — rather than fleet-wide at the management-group level.
You need a Modify or DeployIfNotExists effect that auto-remediates drift (tag inheritance, diagnostic settings, TLS enforcement) and therefore needs a managed identity wired correctly every time.
You want the same rule applied consistently across many RGs via a for_each, with per-scope parameter overrides and a per-scope enforcement toggle (Default vs DoNotEnforce).
You are standing up a landing zone where governance is codified next to the workload it protects, reviewed in the same PR, and versioned with ?ref= tags.

If you only need to assign an existing built-in or already-published definition, you do not need the definition half — but having both behind one interface keeps the common “custom rule + its assignment” case to a single module call.
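As a rough illustration of the first use case above, the call below sketches an Audit rule that flags storage accounts missing a data_classification tag. The module label, the definition and assignment names, and the rg-example resource group are hypothetical placeholders; the inputs themselves are the ones documented under Inputs.

```hcl
# Sketch only: labels, names, and the resource group are illustrative assumptions.
data "azurerm_resource_group" "target" {
  name = "rg-example" # hypothetical resource group
}

module "storage_classification_audit" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"

  name         = "audit-storage-data-class"
  display_name = "Audit storage accounts without a data_classification tag"
  category     = "Tags"
  mode         = "Indexed" # storage accounts support tags and location
  effect       = "Audit"   # Audit needs neither an identity nor a location

  assignment_name   = "audit-stg-data-class"
  resource_group_id = data.azurerm_resource_group.target.id

  policy_rule = {
    if = {
      allOf = [
        { field = "type", equals = "Microsoft.Storage/storageAccounts" },
        { field = "tags['data_classification']", exists = "false" },
      ]
    }
    then = {
      # Resolved from the "effect" parameter the module injects.
      effect = "[parameters('effect')]"
    }
  }
}
```

Because the effect is Audit, the module creates no identity and identity_principal_id resolves to null.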

Module structure

terraform-module-azure-policy/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # policy definition + RG assignment + identity
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # definition/assignment ids + identity principal

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

main.tf

locals {
  # Effects that perform a change require a managed identity + a location.
  remediation_effects = ["Modify", "DeployIfNotExists"]
  needs_identity      = contains(local.remediation_effects, var.effect)

  # Inject the operator-selected effect as the default for the standard
  # "effect" parameter so a single definition supports Audit/Deny/Modify.
  effect_parameter = {
    effect = {
      type = "String"
      metadata = {
        displayName = "Effect"
        description = "Enable or disable the execution of the policy."
      }
      allowedValues = [
        "Audit",
        "Deny",
        "Modify",
        "DeployIfNotExists",
        "AuditIfNotExists",
        "Disabled",
      ]
      defaultValue = var.effect
    }
  }

  merged_parameters = merge(local.effect_parameter, var.additional_parameters)
}

resource "azurerm_policy_definition" "this" {
  name         = var.name
  display_name = var.display_name
  description  = var.description
  policy_type  = "Custom"
  mode         = var.mode

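  # Note: azurerm 4.x removed the management_group_id argument from
  # azurerm_policy_definition, so the definition is created in the current
  # subscription and var.management_group_id is left unwired here. Hosting
  # the definition at a management group would require
  # azurerm_management_group_policy_definition instead.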

  metadata = jsonencode(merge(
    {
      category = var.category
      version  = var.policy_version
    },
    var.metadata
  ))

  parameters  = jsonencode(local.merged_parameters)
  policy_rule = jsonencode(var.policy_rule)
}

resource "azurerm_resource_group_policy_assignment" "this" {
  name                 = var.assignment_name
  display_name         = coalesce(var.assignment_display_name, var.display_name)
  description          = var.description
  resource_group_id    = var.resource_group_id
  policy_definition_id = azurerm_policy_definition.this.id

  # enforce = true maps to enforcement mode Default; false maps to DoNotEnforce,
  # which evaluates compliance without applying deny or modify effects.
  enforce    = var.enforce
  location   = local.needs_identity ? var.location : null
  not_scopes = var.not_scopes

  # Merge the chosen effect with any caller-supplied parameter values.
  parameters = jsonencode(merge(
    { effect = { value = var.effect } },
    var.assignment_parameter_values
  ))

  dynamic "identity" {
    for_each = local.needs_identity ? [1] : []
    content {
      type = "SystemAssigned"
    }
  }

  dynamic "non_compliance_message" {
    for_each = var.non_compliance_message == null ? [] : [var.non_compliance_message]
    content {
      content = non_compliance_message.value
    }
  }
}
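One gap in the configuration above: when effect is Modify or DeployIfNotExists and location is left null, nothing in the module flags it, and the assignment is likely to be rejected only at apply time because an assignment with a managed identity needs a location. The fragment below is a sketch of a precondition that could be added inside the assignment resource to surface this during plan. It is not part of the module as published, and it assumes Terraform 1.2 or later, which the >= 1.5.0 pin already satisfies.

```hcl
resource "azurerm_resource_group_policy_assignment" "this" {
  # ... existing arguments from main.tf, unchanged ...

  # Sketch: fail early when a remediation effect is chosen without a region.
  lifecycle {
    precondition {
      condition     = !local.needs_identity || var.location != null
      error_message = "location must be set when effect is Modify or DeployIfNotExists."
    }
  }
}
```

A precondition is used here rather than a variable validation block because the check spans two inputs (effect and location), and cross-variable validation is only available in newer Terraform releases.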

variables.tf

variable "name" {
  description = "Name of the custom policy definition (used as the resource identifier)."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z0-9_-]{1,64}$", var.name))
    error_message = "name must be 1-64 chars: letters, digits, hyphen or underscore."
  }
}

variable "display_name" {
  description = "Human-friendly display name for the policy definition."
  type        = string
}

variable "description" {
  description = "Description shown for both the definition and the assignment."
  type        = string
  default     = "Managed by Terraform via the kloudvin azure-policy module."
}

variable "mode" {
  description = "Policy mode: 'All', 'Indexed', or a resource-provider mode such as 'Microsoft.Kubernetes.Data'."
  type        = string
  default     = "All"

  validation {
    condition     = contains(["All", "Indexed"], var.mode) || startswith(var.mode, "Microsoft.")
    error_message = "mode must be 'All', 'Indexed', or a 'Microsoft.*' provider mode."
  }
}

variable "category" {
  description = "Policy category surfaced in the Azure Portal (e.g. 'Tags', 'Storage')."
  type        = string
  default     = "General"
}

variable "policy_version" {
  description = "Semantic version stamped into the definition metadata."
  type        = string
  default     = "1.0.0"
}

variable "metadata" {
  description = "Extra key/value pairs merged into the definition metadata block."
  type        = map(string)
  default     = {}
}

variable "policy_rule" {
  description = "The policyRule object (if/then). Passed as HCL and JSON-encoded internally."
  type        = any
}

variable "additional_parameters" {
  description = "Extra definition parameters (the standard 'effect' parameter is injected automatically)."
  type        = any
  default     = {}
}

variable "effect" {
  description = "Effect applied by the assignment."
  type        = string
  default     = "Audit"

  validation {
    condition = contains(
      ["Audit", "Deny", "Modify", "DeployIfNotExists", "AuditIfNotExists", "Disabled"],
      var.effect
    )
    error_message = "effect must be one of Audit, Deny, Modify, DeployIfNotExists, AuditIfNotExists, Disabled."
  }
}

variable "management_group_id" {
  description = "Optional management group to host the definition. Null = current subscription."
  type        = string
  default     = null
}

variable "assignment_name" {
  description = "Name of the resource-group policy assignment (the module caps it at 24 chars)."
  type        = string

  validation {
    condition     = length(var.assignment_name) >= 1 && length(var.assignment_name) <= 24
    error_message = "assignment_name must be 1-24 characters (module limit; Azure documents 64 at resource-group scope and 24 at management-group scope)."
  }
  }
}

variable "assignment_display_name" {
  description = "Display name for the assignment. Defaults to the definition display_name."
  type        = string
  default     = null
}

variable "resource_group_id" {
  description = "Full resource ID of the resource group to scope the assignment to."
  type        = string

  validation {
    condition     = can(regex("^/subscriptions/.+/resourceGroups/.+$", var.resource_group_id))
    error_message = "resource_group_id must be a full /subscriptions/.../resourceGroups/... resource ID."
  }
}

variable "location" {
  description = "Azure region for the assignment's managed identity (required for Modify/DeployIfNotExists)."
  type        = string
  default     = null
}

variable "enforce" {
  description = "true = enforce (Default); false = dry-run (DoNotEnforce) — audits without denying/changing."
  type        = bool
  default     = true
}

variable "not_scopes" {
  description = "List of resource IDs to exclude from the assignment scope."
  type        = list(string)
  default     = []
}

variable "assignment_parameter_values" {
  description = "Values for any additional_parameters, shaped as { paramName = { value = ... } }."
  type        = any
  default     = {}
}

variable "non_compliance_message" {
  description = "Custom message shown to users when a resource is non-compliant. Null = Azure default."
  type        = string
  default     = null
}
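The relationship between additional_parameters (declared on the definition) and assignment_parameter_values (supplied on the assignment) is easiest to see side by side. The sketch below declares a hypothetical tagName parameter and passes its value in the { value = ... } shape; the module label, names, and the tagName parameter are illustrative assumptions, and resource_group_id is left as a placeholder exactly as in the Quickstart.

```hcl
# Sketch only: the tagName parameter and all names are illustrative assumptions.
module "require_tag" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"

  name              = "audit-required-tag"
  display_name      = "Audit resources missing a required tag"
  mode              = "Indexed"
  assignment_name   = "audit-required-tag"
  resource_group_id = "..." # full resource group ID, as in the Quickstart

  # Declared on the definition, next to the auto-injected "effect" parameter.
  additional_parameters = {
    tagName = {
      type = "String"
      metadata = {
        displayName = "Tag name"
        description = "Name of the tag that must be present."
      }
    }
  }

  # Supplied on the assignment; keys must match the declared parameter names.
  assignment_parameter_values = {
    tagName = { value = "data_classification" }
  }

  policy_rule = {
    if = {
      field  = "[concat('tags[', parameters('tagName'), ']')]"
      exists = "false"
    }
    then = {
      effect = "[parameters('effect')]"
    }
  }
}
```

Keeping the tag name as a parameter means the same definition shape can be reused for other tags by changing only the assignment value.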

outputs.tf

output "definition_id" {
  description = "Resource ID of the custom policy definition."
  value       = azurerm_policy_definition.this.id
}

output "definition_name" {
  description = "Name of the custom policy definition."
  value       = azurerm_policy_definition.this.name
}

output "assignment_id" {
  description = "Resource ID of the resource-group policy assignment."
  value       = azurerm_resource_group_policy_assignment.this.id
}

output "assignment_name" {
  description = "Name of the resource-group policy assignment."
  value       = azurerm_resource_group_policy_assignment.this.name
}

output "identity_principal_id" {
  description = "Principal (object) ID of the assignment's system-assigned identity, or null if the effect needs no identity."
  value = try(
    azurerm_resource_group_policy_assignment.this.identity[0].principal_id,
    null
  )
}

output "identity_tenant_id" {
  description = "Tenant ID of the assignment's system-assigned identity, or null."
  value = try(
    azurerm_resource_group_policy_assignment.this.identity[0].tenant_id,
    null
  )
}

How to use it

This example deploys a Modify policy that adds a cost_center tag to the resource group in scope when the tag is missing, then grants the policy’s identity the Tag Contributor role so remediation can actually write the tag.

data "azurerm_resource_group" "app" {
  name = "rg-payments-prod"
}

module "azure_policy_definition_assignment_costcenter" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"

  name         = "require-costcenter-tag"
  display_name = "Append cost_center tag to resource groups"
  category     = "Tags"
  mode         = "All"
  effect       = "Modify"
  location     = "centralindia"

  assignment_name   = "rg-costcenter-tag"
  resource_group_id = data.azurerm_resource_group.app.id

  non_compliance_message = "Resource groups must carry a cost_center tag for chargeback."

  policy_rule = {
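    if = {
      # Limit the rule to resource groups; without the type condition it
      # would also match every resource inside the group.
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.Resources/subscriptions/resourceGroups"
        },
        {
          field  = "tags['cost_center']"
          exists = "false"
        },
      ]
    }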
    then = {
      effect = "[parameters('effect')]"
      details = {
        roleDefinitionIds = [
          "/providers/Microsoft.Authorization/roleDefinitions/4a9ae827-6dc8-4573-8ac7-8239d42aa03f"
        ]
        operations = [
          {
            operation = "add"
            field     = "tags['cost_center']"
            value     = "PAYMENTS-1042"
          }
        ]
      }
    }
  }
}

# Downstream: use the identity_principal_id output to grant the role
# the Modify effect needs so remediation tasks succeed.
resource "azurerm_role_assignment" "policy_tag_writer" {
  scope                = data.azurerm_resource_group.app.id
  role_definition_name = "Tag Contributor"
  principal_id         = module.azure_policy_definition_assignment_costcenter.identity_principal_id
}
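A Modify effect acts on resources as they are created or updated; resources that were already non-compliant when the policy was assigned are only changed once a remediation task runs. The sketch below shows how such a task might be declared for the example above, assuming the azurerm_resource_group_policy_remediation resource from the same provider. The resource label and the remediation name are hypothetical.

```hcl
# Sketch only: the label and remediation name are illustrative assumptions.
resource "azurerm_resource_group_policy_remediation" "costcenter" {
  name                 = "remediate-costcenter-tag"
  resource_group_id    = data.azurerm_resource_group.app.id
  policy_assignment_id = module.azure_policy_definition_assignment_costcenter.assignment_id

  # The identity must hold its role before the task tries to write tags.
  depends_on = [azurerm_role_assignment.policy_tag_writer]
}
```

The explicit depends_on is there because Terraform cannot infer from the arguments alone that the remediation relies on the role assignment.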

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm storage account/container + key per path...
  }
}

2. Module config — live/prod/policy/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"
}

inputs = {
  name              = "..."
  display_name      = "..."
  policy_rule       = "..."
  assignment_name   = "..."
  resource_group_id = "..."
}

3. Deploy this module on its own, or roll out every module in the environment together (each command is run from the repository root):

cd live/prod/policy && terragrunt apply       # this module only
cd live/prod && terragrunt run-all apply      # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
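The root config shown in step 1 covers only the backend. To keep the provider in one place as well, a generate block along the following lines could sit next to remote_state in live/terragrunt.hcl. This is a sketch, not part of the original configuration; it assumes the subscription ID reaches the provider through the ARM_SUBSCRIPTION_ID environment variable, which azurerm 4.x needs when subscription_id is not set in the block.

```hcl
# live/terragrunt.hcl (sketch): writes provider.tf into each module's working directory.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
  # subscription_id is expected from ARM_SUBSCRIPTION_ID (assumption).
}
EOF
}
```

With this in the root, the module itself stays free of provider configuration, so the same source can be consumed both from Terragrunt and from a plain Terraform root module.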

Inputs

Name	Type	Default	Required	Description
`name`	`string`	—	yes	Name of the custom policy definition (1-64 chars, alphanumeric/`-`/`_`).
`display_name`	`string`	—	yes	Human-friendly display name for the definition.
`description`	`string`	`"Managed by Terraform..."`	no	Description for both definition and assignment.
`mode`	`string`	`"All"`	no	`All`, `Indexed`, or a `Microsoft.*` provider mode.
`category`	`string`	`"General"`	no	Category shown in the Azure Portal.
`policy_version`	`string`	`"1.0.0"`	no	Semantic version stamped into definition metadata.
`metadata`	`map(string)`	`{}`	no	Extra key/value pairs merged into definition metadata.
`policy_rule`	`any`	—	yes	The `if`/`then` policyRule object (HCL, JSON-encoded internally).
`additional_parameters`	`any`	`{}`	no	Extra definition parameters beyond the auto-injected `effect`.
`effect`	`string`	`"Audit"`	no	One of `Audit`, `Deny`, `Modify`, `DeployIfNotExists`, `AuditIfNotExists`, `Disabled`.
`management_group_id`	`string`	`null`	no	Management group to host the definition; null = current subscription.
`assignment_name`	`string`	—	yes	Assignment name (1-24 chars, enforced by the module's validation).
`assignment_display_name`	`string`	`null`	no	Display name for the assignment; defaults to `display_name`.
`resource_group_id`	`string`	—	yes	Full resource ID of the target resource group.
`location`	`string`	`null`	no	Region for the identity; required for `Modify`/`DeployIfNotExists`.
`enforce`	`bool`	`true`	no	`true` = enforce; `false` = dry-run (`DoNotEnforce`).
`not_scopes`	`list(string)`	`[]`	no	Resource IDs to exclude from the assignment scope.
`assignment_parameter_values`	`any`	`{}`	no	Values for `additional_parameters`, as `{ name = { value = ... } }`.
`non_compliance_message`	`string`	`null`	no	Custom non-compliance message; null = Azure default.

Outputs

Name	Description
`definition_id`	Resource ID of the custom policy definition.
`definition_name`	Name of the custom policy definition.
`assignment_id`	Resource ID of the resource-group policy assignment.
`assignment_name`	Name of the resource-group policy assignment.
`identity_principal_id`	Principal (object) ID of the assignment’s system-assigned identity, or null.
`identity_tenant_id`	Tenant ID of the assignment’s system-assigned identity, or null.

Enterprise scenario

A retail platform team runs ~40 short-lived “preview environment” resource groups, one per feature branch, each created and destroyed by the CI pipeline. Compliance requires that every preview RG enforces TLS 1.2 on storage and inherits the parent subscription’s data_classification tag. The team uses for_each to instantiate this module over their environment map: each RG gets a Modify assignment with its own identity, and a single shared azurerm_role_assignment loop grants every identity, at the RG scope, the role its rule declares in roleDefinitionIds (Tag Contributor for the tag rule). Because the rule lives next to the workload in the same Terraform stack, governance is created and torn down atomically with the environment: no orphaned assignments, no manual portal clicks, and full audit history in the PR.
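A minimal sketch of the tag-inheritance half of that scenario follows. The preview_environments variable, module and resource labels, and naming scheme are hypothetical. One detail matters in practice: every module instance creates its own policy definition, and definition names must be unique within a subscription, so the sketch suffixes each name with the environment key.

```hcl
# Sketch only: the variable, labels, and naming scheme are illustrative assumptions.
variable "preview_environments" {
  description = "Map of short environment key to the ID of its resource group."
  type        = map(string)
}

module "preview_tag_policy" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"
  for_each = var.preview_environments

  # One definition per instance, so the name carries the environment key.
  name            = "inherit-data-class-${each.key}"
  display_name    = "Inherit data_classification tag (${each.key})"
  category        = "Tags"
  mode            = "Indexed"
  effect          = "Modify"
  location        = "centralindia"
  assignment_name = "dc-${each.key}" # keep keys short to stay within 24 chars

  resource_group_id = each.value

  policy_rule = {
    if = {
      field  = "tags['data_classification']"
      exists = "false"
    }
    then = {
      effect = "[parameters('effect')]"
      details = {
        # Tag Contributor, the same role ID used in the example under "How to use it".
        roleDefinitionIds = [
          "/providers/Microsoft.Authorization/roleDefinitions/4a9ae827-6dc8-4573-8ac7-8239d42aa03f"
        ]
        operations = [
          {
            operation = "add"
            field     = "tags['data_classification']"
            value     = "[subscription().tags['data_classification']]"
          }
        ]
      }
    }
  }
}

# One role assignment per environment, keyed the same way as the module.
resource "azurerm_role_assignment" "preview_tag_writer" {
  for_each = var.preview_environments

  scope                = each.value
  role_definition_name = "Tag Contributor"
  principal_id         = module.preview_tag_policy[each.key].identity_principal_id
}
```

Iterating both blocks over the same map keeps the module instances and their role assignments aligned, so removing an environment from the map removes its definition, assignment, and role grant together.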

Best practices

Match the identity to the effect. Only Modify and DeployIfNotExists need a managed identity; this module provisions one automatically for those effects and leaves Audit/Deny identity-free. Always pair a remediation effect with the role assignment its roleDefinitionIds imply, or remediation tasks fail with 403.
Ship as Audit, promote to Deny/Modify. Roll a new rule out in audit mode first, watch the compliance pane for false positives, then flip effect. The enforce = false (DoNotEnforce) toggle gives you a true dry-run for enforcing effects without denying or changing live resources.
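Keep assignment_name ≤ 24 characters. The module’s validation rejects longer names. Azure’s documented naming rules allow up to 64 characters for assignment names at resource-group scope and cap them at 24 only at management-group scope, so the module applies the stricter of the two limits; choose short, scope-prefixed names like rg-costcenter-tag up front.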
Use the narrowest mode. Indexed skips resource types that do not support tags/locations and avoids noisy “not applicable” evaluations; reserve All for policies that must see resource groups and subscriptions themselves (as the tag example does).
Version definitions deliberately. Bump policy_version and pin module consumers to a ?ref= tag so a rule change is a reviewable, releasable event — not a surprise that re-evaluates 40 RGs the moment someone runs apply.
Lean on not_scopes instead of disabling. Carve out break-glass resources (a logging RG, a vendor’s managed resources) with not_scopes rather than weakening the rule globally; exclusions stay explicit and auditable.
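Two of the practices above, the enforce = false dry run and not_scopes exclusions, can be combined in a single call. The sketch below rolls out a Deny rule for public IP addresses in DoNotEnforce mode while carving out one break-glass address. The module label, names, resource group, and excluded resource are hypothetical.

```hcl
# Sketch only: labels, names, and the excluded resource are illustrative assumptions.
data "azurerm_resource_group" "app" {
  name = "rg-example" # hypothetical resource group
}

module "deny_public_ip_dry_run" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-policy?ref=v1.0.0"

  name         = "deny-public-ip"
  display_name = "Deny public IP addresses"
  category     = "Network"
  mode         = "Indexed"
  effect       = "Deny"
  enforce      = false # DoNotEnforce: compliance is reported, deployments are not blocked

  assignment_name   = "deny-public-ip"
  resource_group_id = data.azurerm_resource_group.app.id

  # Explicit, reviewable exclusion instead of weakening the rule itself.
  not_scopes = [
    "${data.azurerm_resource_group.app.id}/providers/Microsoft.Network/publicIPAddresses/pip-breakglass",
  ]

  non_compliance_message = "Public IP addresses are not allowed in this resource group."

  policy_rule = {
    if = {
      field  = "type"
      equals = "Microsoft.Network/publicIPAddresses"
    }
    then = {
      effect = "[parameters('effect')]"
    }
  }
}
```

Once the compliance view shows no unexpected hits, switching enforce to true is the only change needed to start blocking deployments.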