
Building a Platform Layer with Azure Verified Modules and Terraform

Most teams that adopt Azure Verified Modules (AVM) stop at “I called a module and got a resource.” The real win is using AVM as the substrate for an opinionated platform layer your application teams consume without ever touching a raw azurerm_ resource. This guide builds that layer: composing AVM resource modules into your own pattern modules, pinning them sanely, testing them, and shipping them through a private registry.

Why AVM exists: the resource vs. pattern split

AVM is Microsoft’s effort to replace the sprawl of inconsistent community modules with a single, owned, specification-driven set. Two things make it worth building on:

There are two module classes you actually compose with:

| Class | Terraform registry prefix | Scope |
| --- | --- | --- |
| Resource module | Azure/avm-res-<service>-<resource>/azurerm | One logical resource + its directly dependent child resources |
| Pattern module | Azure/avm-ptn-<pattern>/azurerm | A multi-resource architecture (e.g. a hub-spoke, an AKS landing zone) |

The mental model: resource modules are LEGO bricks; pattern modules are pre-built assemblies. Your platform layer is a third tier — your own pattern modules, composed from AVM resource bricks, that encode your org’s non-negotiables. You generally do not fork AVM; you wrap it.

A bare resource-module call looks like this:

module "kv" {
  source  = "Azure/avm-res-keyvault-vault/azurerm"
  version = "0.9.1"

  name                = "kv-platform-eus-01"
  resource_group_name = azurerm_resource_group.platform.name
  location            = "eastus"
  tenant_id           = data.azurerm_client_config.current.tenant_id
}

Reading an AVM module’s interface

Before wrapping anything, read the interface — not the README prose, the actual variables. The AVM spec means resource modules share a recognizable set of optional inputs beyond the resource-specific ones:

That last one deserves a callout.

enable_telemetry: AVM modules deploy a tiny, empty ARM deployment whose name encodes the module and version. It sends no resource data to Microsoft; it only lets the AVM team measure module usage. It is harmless, but in locked-down subscriptions where Microsoft.Resources/deployments is policy-denied, Azure rejects the deployment and the apply fails (the plan itself will look clean). Decide your org default once (we set it false and bake that into our wrappers) rather than per-call.

Inspect the real inputs instead of guessing — pull the module and read its variables:

terraform init
terraform providers schema -json > /dev/null   # sanity-check provider wiring
# Read the module's own variables directly:
find .terraform/modules/kv -name 'variables.tf' -exec grep -E '^variable' {} +

Pinning and dependency strategy

AVM resource modules are pre-1.0. This breaks the intuition most people have about ~>.

# DANGEROUS for a 0.x module:
version = "~> 0.9"   # allows 0.9.x AND 0.10.0, 0.11.0, ...

The ~> operator lets only the rightmost segment you wrote increase, so ~> 0.9 is equivalent to >= 0.9.0, < 1.0.0. Because AVM treats the minor segment as the breaking-change segment while below 1.0, that constraint happily pulls in a breaking 0.10.0. The constraint that actually pins to a non-breaking range is the three-part form:

# Allows 0.9.1 .. 0.9.x, blocks 0.10.0:
version = "~> 0.9.1"
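To make the difference concrete, both pessimistic constraints can be written out as explicit ranges. This is only a sketch of the equivalence, reusing the illustrative Key Vault call and version numbers from earlier; it is not a recommendation to prefer the long form.

```hcl
# Sketch: the bare Key Vault call from earlier, with the pin spelled out.
# A module block takes a single `version` argument, so the alternatives
# are shown as comments.
module "kv" {
  source = "Azure/avm-res-keyvault-vault/azurerm"

  # "~> 0.9"   means ">= 0.9.0, < 1.0.0"   (0.10.0 is allowed)
  # "~> 0.9.1" means ">= 0.9.1, < 0.10.0"  (0.10.0 is blocked)
  version = ">= 0.9.1, < 0.10.0"

  name                = "kv-platform-eus-01"
  resource_group_name = azurerm_resource_group.platform.name
  location            = "eastus"
  tenant_id           = data.azurerm_client_config.current.tenant_id
}
```

The explicit range is noisier than "~> 0.9.1", but it can help in code review when a teammate is unsure what the operator permits.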

My rule across the platform repo:

Automate the bumps with Renovate so you review upgrades instead of chasing them. Renovate understands Terraform registry sources natively:

{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "terraform": { "enabled": true },
  "packageRules": [
    {
      "matchManagers": ["terraform"],
      "matchPackageNames": ["/^Azure/avm-/"],
      "groupName": "azure-verified-modules",
      "schedule": ["before 9am on monday"]
    }
  ]
}

Each Renovate PR becomes a single reviewable unit: the version bump plus the terraform plan your CI attaches as a comment.

Wrapping resource modules into pattern modules

Here is the core of the platform layer. We want app teams to ask for “a spoke” and get a VNet, a Key Vault, and a storage account — all with private endpoints, diagnostics, and tags already correct. They should not be able to opt out of those.

Directory layout:

platform-modules/
└── spoke-landing-zone/
    ├── main.tf          # composes AVM resource modules
    ├── variables.tf     # the narrow contract app teams see
    ├── outputs.tf
    ├── versions.tf      # required_providers + required_version
    └── tests/
        └── defaults.tftest.hcl
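The layout lists a versions.tf. A minimal sketch of that file might look like the following. The Terraform floor follows from the features this guide relies on; the azurerm range is an assumption and has to stay compatible with whatever the AVM modules you pin declare themselves.

```hcl
# versions.tf (sketch; adjust the ranges to your own pins)
terraform {
  # `terraform test` requires Terraform 1.6 or later.
  # `import` blocks require 1.5, and `moved` blocks require 1.1.
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Assumption: an azurerm 4.x range. Check the required_providers
      # block of each AVM module you wrap and narrow this to match.
      version = ">= 4.0.0, < 5.0.0"
    }
  }
}
```

The wrapper should not configure the provider itself (no provider "azurerm" block); leaving that to the calling root module keeps the wrapper reusable.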

The wrapper’s main.tf composes AVM bricks and injects org policy. Note enable_telemetry, diagnostic_settings, and private_endpoints are set by us, not passed through from the caller:

locals {
  base_tags = merge(var.tags, {
    managedBy = "platform-team"
    module    = "spoke-landing-zone"
  })
}

module "vnet" {
  source  = "Azure/avm-res-network-virtualnetwork/azurerm"
  version = "0.8.1"

  name                = "vnet-${var.workload}-${var.location_short}"
  resource_group_name = var.resource_group_name
  location            = var.location
  address_space       = var.address_space
  tags                = local.base_tags
  enable_telemetry    = false

  subnets = {
    pe = {
      name             = "snet-private-endpoints"
      address_prefixes = [var.pe_subnet_prefix]
    }
  }
}

module "kv" {
  source  = "Azure/avm-res-keyvault-vault/azurerm"
  version = "0.9.1"

  name                = "kv-${var.workload}-${var.location_short}"
  resource_group_name = var.resource_group_name
  location            = var.location
  tenant_id           = var.tenant_id
  tags                = local.base_tags
  enable_telemetry    = false

  # Org default: no public access, ever.
  public_network_access_enabled = false

  diagnostic_settings = {
    central = {
      name                  = "to-law"
      workspace_resource_id = var.log_analytics_workspace_id
    }
  }

  private_endpoints = {
    vault = {
      subnet_resource_id            = module.vnet.subnets["pe"].resource_id
      private_dns_zone_resource_ids = [var.kv_private_dns_zone_id]
    }
  }
}

module "sa" {
  source  = "Azure/avm-res-storage-storageaccount/azurerm"
  version = "0.6.4"

  name                = "st${var.workload}${var.location_short}"
  resource_group_name = var.resource_group_name
  location            = var.location
  tags                = local.base_tags
  enable_telemetry    = false

  public_network_access_enabled = false
  shared_access_key_enabled     = false   # force Entra auth

  diagnostic_settings = {
    central = {
      name                  = "to-law"
      workspace_resource_id = var.log_analytics_workspace_id
    }
  }
}

The version numbers above are illustrative pins from the time of writing. Resolve the current ones for your repo from the registry and pin them exactly — never copy version strings from a blog post into production. (Yes, including this one.)
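The layout also includes an outputs.tf, and the wrapper needs to surface a few identifiers for callers. A possible sketch follows. The output names on the left are hypothetical choices for this wrapper. On the right, resource_id is the output that AVM resource modules are expected to expose, while uri on the Key Vault module is an assumption that should be confirmed against the pinned module's outputs before use.

```hcl
# outputs.tf (sketch)
output "vnet_resource_id" {
  description = "Resource ID of the spoke virtual network."
  value       = module.vnet.resource_id
}

output "key_vault_resource_id" {
  description = "Resource ID of the spoke Key Vault."
  value       = module.kv.resource_id
}

output "key_vault_uri" {
  description = "Data-plane URI of the spoke Key Vault."
  # Assumption: the pinned Key Vault module exposes a `uri` output.
  value = module.kv.uri
}

output "storage_account_resource_id" {
  description = "Resource ID of the spoke storage account."
  value       = module.sa.resource_id
}
```

Keeping the output surface as narrow as the input surface means an internal AVM upgrade is less likely to leak into the wrapper's public contract.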

Enforcing org defaults as non-negotiable inputs

The discipline that makes a platform layer valuable is what the wrapper does not expose. Compare the AVM surface (dozens of inputs) to your variables.tf:

variable "workload" {
  type        = string
  description = "Short workload name, used in resource naming."
  validation {
    condition     = can(regex("^[a-z0-9]{2,12}$", var.workload))
    error_message = "workload must be 2-12 lowercase alphanumeric chars."
  }
}

variable "tags" {
  type        = map(string)
  description = "Caller tags; merged with mandatory platform tags."
  validation {
    condition     = contains(keys(var.tags), "costCenter") && contains(keys(var.tags), "owner")
    error_message = "tags must include costCenter and owner."
  }
}

variable "log_analytics_workspace_id" {
  type        = string
  description = "Central LAW resource ID for diagnostic settings."
}
# ... resource_group_name, location, location_short, tenant_id,
#     address_space, pe_subnet_prefix, kv_private_dns_zone_id

There is no public_network_access_enabled, no enable_telemetry, no way to skip diagnostics. App teams cannot ship a publicly exposed Key Vault through this module because the lever does not exist. That is the entire point — guardrails as types, validated at plan, not as a wiki page nobody reads. The validation blocks turn “please remember to tag things” into a hard failure.
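The same validation approach extends to the inputs that were elided above. The following is a sketch under assumptions: the length bound on location_short and the use of cidrhost as a CIDR syntax check are illustrative choices, not something AVM prescribes.

```hcl
variable "location_short" {
  type        = string
  description = "Short region code used in resource names, for example eus."
  validation {
    # Keeps "kv-<workload>-<location_short>" and "st<workload><location_short>"
    # within the 24-character limit of Key Vault and storage account names.
    condition     = can(regex("^[a-z0-9]{2,6}$", var.location_short))
    error_message = "location_short must be 2-6 lowercase alphanumeric chars."
  }
}

variable "address_space" {
  type        = list(string)
  description = "CIDR ranges for the spoke VNet."
  validation {
    # cidrhost() errors on malformed CIDR input; can() turns that into false.
    condition = length(var.address_space) > 0 && alltrue([
      for cidr in var.address_space : can(cidrhost(cidr, 0))
    ])
    error_message = "address_space must contain at least one valid CIDR range."
  }
}

variable "pe_subnet_prefix" {
  type        = string
  description = "CIDR range for the private-endpoint subnet."
  validation {
    condition     = can(cidrhost(var.pe_subnet_prefix, 0))
    error_message = "pe_subnet_prefix must be a valid CIDR range."
  }
}
```

These checks only validate syntax. Whether pe_subnet_prefix actually falls inside address_space is still verified by Azure at apply time.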

Testing modules: terraform test and Terratest

Two layers, two tools.

Native terraform test is fast, runs in-process, and is perfect for plan-level contract assertions — “does the wrapper produce the right shape?” No deployment needed:

# tests/defaults.tftest.hcl
run "defaults_are_locked_down" {
  command = plan

  variables {
    workload                   = "checkout"
    location                   = "eastus"
    location_short             = "eus"
    resource_group_name        = "rg-checkout"
    tenant_id                  = "00000000-0000-0000-0000-000000000000"
    address_space              = ["10.20.0.0/24"]
    pe_subnet_prefix           = "10.20.0.0/27"
    log_analytics_workspace_id = "/subscriptions/.../workspaces/law-central"
    kv_private_dns_zone_id     = "/subscriptions/.../privateDnsZones/privatelink.vaultcore.azure.net"
    tags                       = { costCenter = "1234", owner = "team@contoso.com" }
  }

  assert {
    condition     = module.kv.... == false  # assert the resolved public-access value
    error_message = "Key Vault must never allow public network access."
  }
}

Run it with:

terraform init
terraform test
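Contract tests can also assert that the guardrails reject bad input. The sketch below uses expect_failures so the run passes only when the tags validation fails. The file name is hypothetical, and the variable values are copied from the test above, including its elided resource IDs, which are placeholders rather than real IDs.

```hcl
# tests/tags.tftest.hcl (hypothetical file name)

# File-level variables are shared by every run block in this file.
variables {
  workload                   = "checkout"
  location                   = "eastus"
  location_short             = "eus"
  resource_group_name        = "rg-checkout"
  tenant_id                  = "00000000-0000-0000-0000-000000000000"
  address_space              = ["10.20.0.0/24"]
  pe_subnet_prefix           = "10.20.0.0/27"
  log_analytics_workspace_id = "/subscriptions/.../workspaces/law-central"
  kv_private_dns_zone_id     = "/subscriptions/.../privateDnsZones/privatelink.vaultcore.azure.net"
}

run "missing_cost_center_tag_is_rejected" {
  command = plan

  variables {
    # owner is present; costCenter is deliberately missing.
    tags = { owner = "team@contoso.com" }
  }

  # The run passes only if the validation block on var.tags fails.
  expect_failures = [var.tags]
}
```

A negative test like this guards against someone loosening the validation later: removing the costCenter check would turn this run red.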

Terratest (Go) is for real end-to-end validation against an ephemeral subscription — apply, assert against live Azure, destroy. Use it in CI nightly, not on every push:

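package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestSpokeLandingZone(t *testing.T) {
    opts := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../examples/default",
    })
    defer terraform.Destroy(t, opts) // runs even if an assertion below fails
    terraform.InitAndApply(t, opts)

    kvURI := terraform.Output(t, opts, "key_vault_uri")
    assert.Contains(t, kvURI, "vault.azure.net")
}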

In CI, authenticate with OIDC workload identity federation (no stored secrets), and target a disposable subscription so a failed destroy never pollutes a real environment:

az login --service-principal -u "$ARM_CLIENT_ID" \
  --tenant "$ARM_TENANT_ID" --federated-token "$IDTOKEN"
export ARM_USE_OIDC=true
export ARM_SUBSCRIPTION_ID="$EPHEMERAL_SUB_ID"
cd test && go test -timeout 45m ./...

Publishing to a private registry

Wrappers are useless if teams copy-paste them. Publish them and consume by source/version. Two common backends:

Terraform Cloud / Enterprise private registry. Modules must live in repos named terraform-<provider>-<name> and are published from git tags that are valid semver. Tag, push, and the registry ingests the version:

git tag v1.3.0
git push origin v1.3.0

Consumers then reference it through the registry hostname:

module "spoke" {
  source  = "app.terraform.io/contoso/spoke-landing-zone/azurerm"
  version = "~> 1.3.0"
  # ... only the narrow contract inputs
}
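Spelled out, a consumer call might look like the sketch below. The input names are the wrapper's contract from this guide; the resource group, data source, and var.* references on the right-hand side are hypothetical names that would live in the consuming repo.

```hcl
module "spoke" {
  source  = "app.terraform.io/contoso/spoke-landing-zone/azurerm"
  version = "~> 1.3.0"

  workload       = "checkout"
  location       = "eastus"
  location_short = "eus"

  # Hypothetical references defined elsewhere in the consumer repo.
  resource_group_name        = azurerm_resource_group.checkout.name
  tenant_id                  = data.azurerm_client_config.current.tenant_id
  log_analytics_workspace_id = var.central_law_id
  kv_private_dns_zone_id     = var.kv_private_dns_zone_id

  address_space    = ["10.20.0.0/24"]
  pe_subnet_prefix = "10.20.0.0/27"

  # costCenter and owner are mandatory; the wrapper adds its own tags on top.
  tags = {
    costCenter = "1234"
    owner      = "team@contoso.com"
  }
}
```

Nothing in this call mentions public access, telemetry, or diagnostics, which is the narrow contract doing its job.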

Azure DevOps. There is no native Terraform registry product, so the pragmatic pattern is consuming wrappers as versioned git sources (a tag ref) pointed at Azure Repos, fronted by a CI pipeline that runs validate/test on tag:

module "spoke" {
  source = "git::https://dev.azure.com/contoso/_git/platform-modules//spoke-landing-zone?ref=v1.3.0"
}

Consumption contract: semver is a promise. Bump patch for fixes, minor for additive inputs/outputs, major for anything that changes or removes an input or alters resource addresses. The moment you rename a wrapper variable, that’s a major — app teams pinned with ~> must opt in.
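As an illustration of a minor bump, the sketch below adds an optional input to the wrapper. The data_classification variable and its tag key are hypothetical. Because the input defaults to null and contributes nothing when unset, callers already pinned with ~> should see no change in their plan after the upgrade.

```hcl
# Added in a hypothetical v1.4.0 of the wrapper.
variable "data_classification" {
  type        = string
  default     = null
  description = "Optional data classification, surfaced as a tag when set."
}

locals {
  # Drops the entry entirely when the caller leaves the input unset.
  optional_tags = {
    for key, value in { dataClassification = var.data_classification } :
    key => value if value != null
  }

  base_tags = merge(var.tags, local.optional_tags, {
    managedBy = "platform-team"
    module    = "spoke-landing-zone"
  })
}
```

Making the same input required, or giving it a non-null default that changes existing tags, would alter every caller's plan and therefore belongs in a major version.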

Migration path: replacing hand-rolled modules without state churn

The objection that kills AVM adoption: “we have hundreds of resources in state; switching modules means destroy/recreate.” It does not — if you use moved blocks. When you swap your old module "storage" for the AVM wrapper, the resource address changes (e.g. module.storage.azurerm_storage_account.this becomes module.sa.azurerm_storage_account.this[0]). Tell Terraform it’s the same object:

moved {
  from = module.storage.azurerm_storage_account.this
  to   = module.sa.azurerm_storage_account.this[0]
}

moved blocks are declarative and version-controlled — they survive across the whole team, unlike a one-off terraform state mv. For resources that AVM creates as a child but you previously managed standalone (or that exist in Azure but not in state), use an import block instead:

import {
  to = module.sa.azurerm_storage_account.this[0]
  id = "/subscriptions/<sub>/resourceGroups/rg-checkout/providers/Microsoft.Storage/storageAccounts/stcheckouteus"
}

Migrate one module type at a time, behind a PR, and read the plan. A correct migration shows the resource moving with zero destroy/create lines — only in-place diffs for AVM’s added defaults (diagnostics, etc.).

Enterprise scenario

A retail platform team rolled out the spoke-landing-zone wrapper to 40+ app repos, pinning Azure/avm-res-storage-storageaccount/azurerm at an exact 0.6.x. A Renovate PR bumped a single minor — and the terraform plan in CI showed every storage account scheduled for destroy/create. The cause: that AVM release changed the storage account’s internal resource address (it moved the resource under a for_each map). Pinning exact versions had not saved them; the upgrade itself was the breaking event, and a naive merge would have nuked 40 production data planes.

The fix was to absorb the address change inside the wrapper with a moved block keyed to the new index, shipped in the same version bump so consumers inherited it transparently:

moved {
  from = module.sa.azurerm_storage_account.this
  to   = module.sa.azurerm_storage_account.this["default"]
}

They then made this class of failure impossible to miss: CI fails the build if a plan contains any destroy/create line that is not explicitly acknowledged in the PR.

terraform plan -no-color -out tfplan
# "// []" keeps jq from erroring when the plan JSON has no resource_changes key
terraform show -json tfplan \
  | jq -e '[(.resource_changes // [])[]
            | select(.change.actions == ["delete","create"]
                  or .change.actions == ["create","delete"])] | length == 0' \
  || { echo "::error::Unacknowledged replace in plan"; exit 1; }

The lesson: exact pins control when you take an upgrade, not whether it is safe. Every AVM bump in a shared wrapper is read as a potential state migration, with a moved-aware plan gate standing between Renovate and prod.

Verify

Confirm the platform layer behaves before anyone consumes it.

# 1. Wrapper plans clean against AVM-pinned versions
terraform init -upgrade && terraform validate

# 2. Contract tests pass (plan-level, no deploy)
terraform test

# 3. The migration is a no-op for existing resources
# Replacements are reported as "must be replaced", so match that wording too
terraform plan -no-color | grep -E 'will be (destroyed|created)|must be replaced' || echo "No destroy/create: safe"

# 4. The published version resolves from the registry
terraform init   # in a consumer repo pinning ~> 1.3.0

A healthy result: terraform test green, the grep in step 3 prints “No destroy/create: safe”, and step 4 pulls your wrapper plus the exact AVM versions you pinned inside it.

Platform-layer readiness checklist

Pitfalls

The payoff is an architecture where app teams ship spokes, vaults, and storage that are private, tagged, and observable by construction — and your platform team upgrades the whole estate by merging one pinned-version PR.
