Azure Landing Zone: Governance — Azure Policy Initiatives, Cost Guardrails, Compliance Frameworks & Tag Enforcement

Where this fits

The Azure Landing Zone (ALZ) conceptual architecture splits into eight design areas across two groups: environment design areas (Azure Billing and Microsoft Entra Tenant, Identity and Access Management, Network Topology and Connectivity, Resource Organization) and compliance design areas (Security, Management, Governance, Platform Automation and DevOps). Governance is part 7, and it is the design area that turns intent into enforced reality: it takes the management group hierarchy you built in Resource Organization and the controls you scoped in Security and Management, and it expresses them as policy-as-code guardrails, cost controls, and compliance baselines that apply automatically and inescapably. Microsoft’s own framing is blunt: Governance exists to replace change-advisory-board gatekeeping with automated guardrails and continuous compliance auditing so application teams can move fast inside boundaries they cannot remove. This article goes deep on the five engines that make that real: Azure Policy and initiatives, cost controls and budgets, the audit/deny/deployIfNotExists guardrail decision, compliance-framework mapping, and tag enforcement.

Figure: Azure Landing Zone Design Areas (animated overview)

Azure Policy and policy initiatives

What it is

Azure Policy is the rules engine of the landing zone. A policy definition is a single rule with an if condition (a logical test over resource properties — type, location, tags, kind, ARM field aliases) and a then effect that fires when the condition matches. A policy initiative (a.k.a. a policy set definition) is a named bundle of many definitions, exposing a consolidated set of parameters and rolling up into a single compliance percentage. You assign a definition or initiative to a scope — a management group, subscription, or resource group — and it flows down through inheritance to every descendant, where it evaluates existing resources and intercepts new deployments at the ARM control plane.

Azure Policy is complementary to RBAC, not a substitute: RBAC governs who can perform an action; Policy governs what the resulting resource is allowed to look like. A user with Owner can still be blocked from creating a public-IP NIC or an unencrypted disk by a Deny policy — that separation is the whole point.
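The if/then shape described above can be sketched as a plain Python dictionary that serializes to policy JSON. This is a simplified illustration rather than a copy of any built-in: the display name and the `allowedLocations` parameter name are choices made for this sketch, and `would_deny` is only a local stand-in for what the policy engine does at the control plane.

```python
import json

# Illustrative definition: one rule, an "if" condition and a "then" effect.
# "allowedLocations" is a parameter name chosen for this sketch.
ALLOWED_LOCATIONS_DEFINITION = {
    "properties": {
        "displayName": "Allowed locations (illustrative)",
        "mode": "Indexed",
        "parameters": {"allowedLocations": {"type": "Array"}},
        "policyRule": {
            "if": {
                "not": {
                    "field": "location",
                    "in": "[parameters('allowedLocations')]",
                }
            },
            "then": {"effect": "deny"},
        },
    }
}


def would_deny(resource, allowed_locations):
    """Local stand-in for the rule: True when the location is off the allow-list."""
    return resource.get("location") not in allowed_locations


if __name__ == "__main__":
    print(json.dumps(ALLOWED_LOCATIONS_DEFINITION, indent=2))
    vm = {"type": "Microsoft.Compute/virtualMachines", "location": "westeurope"}
    print(would_deny(vm, ["centralindia", "southindia"]))
```

Note that the caller's RBAC role never appears in the rule: the condition looks only at the resource, which is the separation the paragraph above describes.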

Why it matters

Policy is the only mechanism in Azure that is simultaneously preventive (it can refuse non-compliant deployments before they exist), detective (it continuously audits the live estate and produces a compliance score), and corrective (it can deploy or modify resources to bring them into line). Without it, governance degrades into manual reviews, tribal knowledge, and drift. With it, a control written once at the intermediate-root management group is enforced on the 4,000th subscription exactly as on the first — and cannot be weakened by a child scope, only made stricter.
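To make the inheritance claim concrete, here is a minimal simulation of how assignments made at management groups reach a descendant scope. The hierarchy, the assignment names, and the exclusion are all hypothetical; the real evaluation happens inside Azure, not in code like this.

```python
# Hypothetical hierarchy: child scope -> parent scope (None marks the top).
PARENT = {
    "intermediate-root": None,
    "landing-zones": "intermediate-root",
    "corp": "landing-zones",
    "online": "landing-zones",
    "sub-payments": "corp",
}

# Hypothetical assignments: scope -> list of (assignment name, excluded scopes).
ASSIGNMENTS = {
    "intermediate-root": [("security-guardrails", set())],
    "landing-zones": [("tagging", {"online"})],
    "corp": [("deny-public-ip", set())],
}


def ancestors(scope):
    """Return the scope followed by its ancestors, nearest first."""
    chain = []
    while scope is not None:
        chain.append(scope)
        scope = PARENT[scope]
    return chain


def effective_assignments(scope):
    """Collect every assignment inherited by `scope`, honouring exclusions."""
    chain = ancestors(scope)
    names = []
    for level in reversed(chain):  # walk from the top of the hierarchy down
        for name, excluded in ASSIGNMENTS.get(level, []):
            if excluded.isdisjoint(chain):
                names.append(name)
    return names


if __name__ == "__main__":
    print(effective_assignments("sub-payments"))
    print(effective_assignments("online"))
```

A subscription under `corp` picks up all three assignments without any of them being assigned to it directly, while `online` loses `tagging` only because the parent assignment explicitly excludes it.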

Why initiatives, not loose policies

You assign initiatives, not dozens of individual policies, for four reasons:

Reason	Without initiatives	With an initiative
Assignment sprawl	N separate assignments per scope to manage and exclude	One assignment, one set of exclusions
Compliance rollup	N separate compliance figures, no single number	One initiative compliance % you can report to auditors
Parameter consistency	Each policy parameterized independently	Shared initiative parameters set the same value once
Scope limits	You burn through the per-scope assignment ceiling fast	One assignment counts as one against the limit

Azure enforces limits here — there are caps on policy/initiative definitions and assignments per scope — which is exactly why the ALZ pattern is “few initiatives, assigned high, parameterized per scope” rather than “many policies, assigned everywhere.”
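The "shared initiative parameters" row can be illustrated with a small builder for an initiative (policy set definition) body. This is a sketch under assumptions: the management group name, the member definition names, and the parameter names are hypothetical, and only the fields needed to show the parameter pass-through are included.

```python
import json

# Hypothetical management group that holds the custom definitions.
MG = "/providers/Microsoft.Management/managementGroups/example-root"


def definition_id(name):
    """Resource ID of a custom definition stored at the management group."""
    return f"{MG}/providers/Microsoft.Authorization/policyDefinitions/{name}"


def build_initiative(display_name, members, shared_parameters):
    """members: list of (reference_id, definition_name, {member_param: initiative_param})."""
    return {
        "properties": {
            "displayName": display_name,
            "policyType": "Custom",
            "parameters": shared_parameters,
            "policyDefinitions": [
                {
                    "policyDefinitionReferenceId": ref,
                    "policyDefinitionId": definition_id(name),
                    # Each member parameter is wired to one initiative-level parameter.
                    "parameters": {
                        member: {"value": f"[parameters('{initiative}')]"}
                        for member, initiative in mapping.items()
                    },
                }
                for ref, name, mapping in members
            ],
        }
    }


if __name__ == "__main__":
    initiative = build_initiative(
        "Location guardrails (illustrative)",
        [
            ("resourceLocations", "allowed-locations", {"allowedLocations": "allowedLocations"}),
            ("rgLocations", "allowed-rg-locations", {"allowedLocations": "allowedLocations"}),
        ],
        {"allowedLocations": {"type": "Array"}},
    )
    print(json.dumps(initiative, indent=2))
```

Both members read the same `allowedLocations` value, so one assignment sets the region list once for the whole bundle.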

How to do it well

Define at the top, assign at the right altitude. Create custom definitions and initiatives at the top-level (intermediate-root) management group so they are available to be assigned at any inherited scope. Then assign at the highest appropriate level and use exclusions at lower levels only where genuinely needed. This is Microsoft’s explicit recommendation and it minimises the exclusion-management tax.
Prefer built-ins to cut operational overhead. Microsoft ships hundreds of maintained built-in definitions and initiatives. Use them first; write custom definitions only for gaps. The ALZ accelerators ship a curated set of custom ALZ policies and the default assignments (the alzDefaults Bicep module / the equivalent Terraform module) so you inherit a battle-tested baseline rather than authoring from zero.
Cap assignments at the root MG. Limit how many assignments you place at the very top to avoid drowning in inherited-scope exclusions; push workload-specific guardrails down to the archetype MGs (Corp, Online, Sandbox).
Treat policy as code. Definitions, initiatives, and assignments live in Git, deploy via pipeline (Bicep/Terraform), and ship with What-If/compliance-impact review before merge. To land a new Deny policy in report-only mode first, set the assignment’s enforcement mode to DoNotEnforce (the effect is not applied, but compliance is still evaluated), measure the impact, then switch the mode back to Default to enforce.
Group resource-provider registration and SKU/region allow-lists into Policy too — Azure Policy can control resource-provider registrations and restrict allowed locations, types, and SKUs at MG/subscription scope.
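The soak-then-enforce practice above can be sketched as two small helpers that build an assignment body and later flip it. The initiative ID, the excluded scope, and the parameter values are hypothetical; in practice the body would be deployed by your Bicep/Terraform pipeline rather than assembled by hand.

```python
import copy


def build_assignment(initiative_id, parameters, not_scopes=(), enforce=False):
    """Assignment body; lands in DoNotEnforce unless `enforce` is set."""
    return {
        "properties": {
            "policyDefinitionId": initiative_id,
            "enforcementMode": "Default" if enforce else "DoNotEnforce",
            "notScopes": list(not_scopes),
            "parameters": {name: {"value": value} for name, value in parameters.items()},
        }
    }


def flip_to_enforce(assignment):
    """Return a copy with enforcement switched on, leaving the original untouched."""
    updated = copy.deepcopy(assignment)
    updated["properties"]["enforcementMode"] = "Default"
    return updated


if __name__ == "__main__":
    # Hypothetical IDs used only for illustration.
    soak = build_assignment(
        "/providers/Microsoft.Management/managementGroups/example-root"
        "/providers/Microsoft.Authorization/policySetDefinitions/location-guardrails",
        {"allowedLocations": ["centralindia", "southindia"]},
        not_scopes=["/providers/Microsoft.Management/managementGroups/sandbox"],
    )
    enforced = flip_to_enforce(soak)
    print(soak["properties"]["enforcementMode"], "->", enforced["properties"]["enforcementMode"])
```

Keeping the flip as a one-field change makes the soak-to-enforce step a small, reviewable diff in the policy repo.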

Concrete artifacts, decisions, and tools

Artifacts: the custom initiative definitions (JSON/Bicep), the policy-assignment matrix (which initiative → which MG → which parameters/exclusions/enforcement mode), the policy-as-code repo and pipeline, and an exemption register.
Decisions: built-in vs custom per control; assignment altitude per initiative; which initiatives ship in DoNotEnforce first; per-scope parameter values; who holds Resource Policy Contributor for delegated app-level governance.
Tools/services: Azure Policy (definitions, initiatives, assignments, exemptions), the ALZ default policy assignments module, Azure Resource Graph for compliance queries at tenant scale, AzAdvertizer (to discover/track built-in policies, initiatives, and aliases), and Azure Governance Visualizer (to map and version-check your whole policy estate against the latest ALZ release).

Cost controls and budgets

What it is

Cost governance in the landing zone is delivered through Microsoft Cost Management + Billing: budgets (a cost or usage threshold evaluated against a scope — billing account, MCA invoice section / EA enrollment account, subscription, resource group, or a management-group cost view), alerts and action groups that fire at percentage thresholds of actual or forecast spend, and the commercial levers Azure exposes to bend the cost curve — Reservations, the Azure savings plan for compute, Azure Hybrid Benefit, Spot VMs, and dev/test subscriptions. Tags are the connective tissue: they make cost allocatable to a cost center, application, or business unit.

Why it matters

A landing zone that governs security but not spend produces a different kind of incident — the surprise invoice. Cost governance has to be architected in at the platform layer, not bolted on per app, because the people who can prevent overspend (the platform team, via budgets and policy) are not the people generating it (app teams). Done well, budgets give you forecast-based early warning (alert at 60% of projected month-end, not after the money is gone), commitment discounts cut the bill structurally, and tag-driven allocation turns one opaque invoice into accurate showback/chargeback per business unit.

The two halves: guardrails (preventive) and budgets (detective)

Cost control splits cleanly into prevent and detect:

Control	Mechanism	Effect
Restrict expensive SKUs/regions	Azure Policy `Deny` on `Microsoft.Compute/virtualMachines/sku.name`, allowed-locations, allowed resource types	Stops a wrong-SKU/wrong-region deploy before it costs anything
Auto-tier / expire data	Azure Storage lifecycle management rules	Moves blobs to cool/cold/archive or deletes at end of lifecycle
Right-size & shut down idle	Azure Advisor cost recommendations; autoscale; start/stop automation	Continuous structural savings
Budget alerting	Cost Management budgets + action groups at 60/80/100% actual & forecast	Detective early-warning, routed to owners
Commitment discounts	Reservations (up to ~72% vs PAYG), savings plan for compute (up to ~65%), Hybrid Benefit, Spot	Structural reduction of the run-rate

The crucial nuance: a Cost Management budget does not stop spend — it alerts (and can trigger an automation runbook via the action group, e.g. to deallocate dev VMs). To actually prevent cost you need Azure Policy allow-lists for SKUs/regions/types. Mature landing zones use both: Policy to cap what can be created, budgets to watch what it costs.
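The difference between actual and forecast alerting is easy to see with a little arithmetic. The sketch below uses a naive straight-line projection, which is an assumption made for illustration and not how Cost Management computes its forecast; the budget figure is also made up.

```python
THRESHOLDS = (60, 80, 100)  # percentages of the budget, as in the table above


def linear_forecast(spend_to_date, day_of_month, days_in_month):
    """Naive straight-line projection of month-end spend (illustrative only)."""
    return spend_to_date / day_of_month * days_in_month


def fired_alerts(budget, actual, forecast):
    """List the (type, threshold) pairs that would alert; nothing is blocked."""
    alerts = []
    for pct in THRESHOLDS:
        limit = budget * pct / 100
        if actual >= limit:
            alerts.append(("Actual", pct))
        if forecast >= limit:
            alerts.append(("Forecasted", pct))
    return alerts


if __name__ == "__main__":
    budget = 100_000
    actual = 30_000  # spend after 10 days of a 30-day month
    forecast = linear_forecast(actual, day_of_month=10, days_in_month=30)
    print(forecast)
    print(fired_alerts(budget, actual, forecast))
```

With only 30% of the budget spent, the forecast thresholds at 60% and 80% already fire while no actual threshold has, which is the early warning the budget provides. The function returns alerts and nothing else: stopping the spend still needs Policy or a runbook.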

How to do it well

Bake budgets into subscription vending. Every vended landing-zone subscription is born with a budget (often a per-environment default), an action group wired to the owning team + the FinOps function, and thresholds on actual and forecast spend. No subscription exists without a budget.
Choose the right budget scope. Subscription/RG budgets for app teams; management-group cost views for portfolio rollups; billing-account / invoice-section budgets for the enterprise envelope.
Allocate with tags, not guesswork. Enforce CostCenter, Application, and Environment tags (see tag enforcement below) so Cost Management can slice spend accurately; free-text or missing tags are the enemy of clean chargeback.
Layer the discount levers deliberately. Combine the savings plan for compute (flexible, covers VM/ACI/App Service/Functions Premium across regions and sizes) with Reservations (deeper discount for stable, specific resources), apply Hybrid Benefit to Windows Server/SQL with Software Assurance, and route interruption-tolerant batch to Spot.
Tie cost into the WAF Cost Optimization pillar so reviews are recurring, not one-off.
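Why missing tags hurt chargeback can be shown with a toy showback rollup. The cost rows and tag values below are invented for illustration; a real report would come from Cost Management exports rather than an in-memory list.

```python
from collections import defaultdict


def showback(cost_rows, tag_name="CostCenter"):
    """Sum cost per tag value; rows without the tag land in '(untagged)'."""
    totals = defaultdict(float)
    for row in cost_rows:
        key = row.get("tags", {}).get(tag_name) or "(untagged)"
        totals[key] += row["cost"]
    return dict(totals)


def untagged_share(totals):
    """Fraction of total spend that cannot be attributed to an owner."""
    total = sum(totals.values())
    return totals.get("(untagged)", 0.0) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical cost rows for illustration.
    rows = [
        {"cost": 1200.0, "tags": {"CostCenter": "CC-1001"}},
        {"cost": 300.0, "tags": {"CostCenter": "CC-1001"}},
        {"cost": 450.0, "tags": {}},
        {"cost": 50.0},
    ]
    totals = showback(rows)
    print(totals)
    print(untagged_share(totals))
```

Tracking the untagged share over time is one simple way to measure whether tag enforcement is actually improving allocation.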

Concrete artifacts, decisions, and tools

Artifacts: the budget + action-group definitions templated in the vending module; a tagging-to-cost-allocation map; a SKU/region allow-list policy initiative; a commitment-purchase plan (reservation/savings-plan coverage targets); monthly showback reports.
Decisions: default budget values per environment and currency (e.g. ₹-denominated); alert thresholds and recipients; which workloads are reservation/savings-plan candidates vs Spot; chargeback vs showback model.
Tools/services: Microsoft Cost Management + Billing (budgets, alerts, cost analysis, exports to a storage account / Power BI), Azure Policy (SKU/region/type allow-lists), Azure Advisor (cost recommendations), Azure Storage lifecycle management, and action groups (to notify or trigger remediation runbooks).

Guardrails: audit vs deny vs deployIfNotExists (and the rest)

What it is

The effect is the verb of a policy — what actually happens when a resource matches. Azure Policy supports these effects, and choosing the right one per control is the single most consequential decision in the whole Governance design area:

Effect	What it does	Preventive / Detective / Corrective	Remediatable?
Audit	Logs a non-compliant entry; allows the deployment	Detective	No
AuditIfNotExists	Audits when a related resource is missing/misconfigured (e.g. no diagnostic setting on a VM)	Detective	No
Deny	Blocks the create/update at the control plane	Preventive	No
DenyAction	Blocks an action — today only `DELETE` — to protect critical resources from deletion	Preventive	No
Modify	Adds/updates/removes properties or tags on create/update; can fix existing resources via a remediation task	Corrective	Yes
Append	Adds fields/properties (e.g. a default tag) to the request during create or update; cannot remediate existing resources	Corrective (request-time)	No
DeployIfNotExists (DINE)	Deploys an ARM template when a related resource is missing (e.g. auto-deploy a diagnostic setting, enable Defender plan, install an agent)	Corrective	Yes
Manual	Tracks attestation-based controls Azure can’t evaluate automatically (process/people controls); you attest compliance	Detective (attested)	N/A
Disabled	Turns the policy off for testing without deleting the assignment	—	—
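The table above can double as a lint rule for an effect-per-control register. The sketch below encodes the table as data and flags controls that are expected to fix existing resources but use an effect that cannot; the register rows and the `must_fix_existing` field are hypothetical.

```python
# effect -> (category, supports remediation tasks), transcribed from the table above.
EFFECTS = {
    "audit": ("detective", False),
    "auditIfNotExists": ("detective", False),
    "deny": ("preventive", False),
    "denyAction": ("preventive", False),
    "modify": ("corrective", True),
    "append": ("corrective", False),
    "deployIfNotExists": ("corrective", True),
    "manual": ("detective", False),
    "disabled": (None, False),
}


def remediatable_effects():
    """Effects that can bring existing resources into line via a remediation task."""
    return sorted(name for name, (_, can_fix) in EFFECTS.items() if can_fix)


def register_problems(register):
    """Flag unknown effects and controls whose effect cannot fix existing resources."""
    problems = []
    for row in register:
        effect = row["effect"]
        if effect not in EFFECTS:
            problems.append(f"{row['control']}: unknown effect {effect}")
        elif row["must_fix_existing"] and not EFFECTS[effect][1]:
            problems.append(f"{row['control']}: {effect} cannot remediate existing resources")
    return problems


if __name__ == "__main__":
    # Hypothetical register rows for illustration.
    register = [
        {"control": "tag-inheritance", "effect": "modify", "must_fix_existing": True},
        {"control": "default-tag", "effect": "append", "must_fix_existing": True},
        {"control": "no-public-ip", "effect": "deny", "must_fix_existing": False},
    ]
    print(remediatable_effects())
    print(register_problems(register))
```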

Why the choice matters

Pick the wrong effect and you either break the business (a premature Deny that blocks legitimate work) or achieve nothing (an Audit on a control that needed teeth). The art is matching effect to control intent and blast radius:

Audit / AuditIfNotExists: for visibility-first controls and anything you’re not yet ready to enforce. Always land a new restrictive guardrail in audit (or Deny with enforcement mode DoNotEnforce) first, measure the non-compliant count via the compliance dashboard, fix or exempt the backlog, then escalate to Deny.
Deny: for hard, non-negotiable invariants such as no public IP on Corp, no resources outside centralindia/southindia, no unencrypted storage, no classic resources. Deny is preventive but does not fix what already exists; it only stops new violations.
DenyAction: narrowly, to protect critical resources from deletion (locked-down hub firewall, central Log Analytics workspace, key vaults), since the only supported action is DELETE.
DeployIfNotExists: for “make it so” automation such as auto-deploying diagnostic settings to a central workspace, auto-enabling Defender for Cloud plans, auto-configuring backup, and deploying the Azure Monitor Agent (AMA), which replaces the retired Log Analytics agent. DINE is how the ALZ baseline keeps the whole estate observable and protected without app teams lifting a finger.
Modify: the preferred effect for tag management and property fixes, because (unlike Append) it supports multiple operation types and can remediate existing resources via a remediation task.

The operational reality of corrective effects

DeployIfNotExists and Modify are the only remediatable effects — they alone can fix the existing estate, the others only affect new deployments. Two facts shape how you operate them:

New or updated resources are handled automatically, but at different moments. Modify alters the request inline during create/update. DINE evaluates after a configurable evaluationDelay: by default it waits ~10 minutes, and you can extend the delay for resources that provision slowly, so the existence check runs after the dependency is ready.
Pre-existing non-compliant resources are not auto-fixed — you must create a remediation task (which runs the deployment/modification using a managed identity that needs the right RBAC role, e.g. Contributor or a targeted role on the scope). Ongoing remediation is typically driven from the pipeline or a scheduled job so newly-discovered drift gets swept up.

Effects like Audit, AuditIfNotExists, Deny, DenyAction, Disabled, and Manual have no remediation capability — non-compliance found through them is resolved by changing the resource yourself (or attesting, for Manual).
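A DeployIfNotExists rule carries the pieces discussed above: the related resource type to look for, the evaluation delay, the roles the managed identity needs, and the template to deploy. The builder below is a skeleton under assumptions: the `logAnalytics` parameter name is chosen for this sketch, the embedded ARM template is an empty stub, and the role ID is the built-in Contributor role used as the example in the text.

```python
# Built-in Contributor role definition ID (the example role named in the text).
CONTRIBUTOR = (
    "/providers/Microsoft.Authorization/roleDefinitions/"
    "b24988ac-6180-42a0-ab88-20f7382dd24c"
)

# Empty ARM template stub; a real rule would declare the resource to deploy here.
EMPTY_TEMPLATE = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "resources": [],
}


def build_dine_rule(target_type, related_type, existence_condition,
                    template=EMPTY_TEMPLATE, evaluation_delay="PT10M",
                    role_definition_ids=(CONTRIBUTOR,)):
    """Assemble a deployIfNotExists policy rule as a plain dictionary."""
    return {
        "if": {"field": "type", "equals": target_type},
        "then": {
            "effect": "deployIfNotExists",
            "details": {
                "type": related_type,
                "evaluationDelay": evaluation_delay,  # ISO 8601 duration
                "roleDefinitionIds": list(role_definition_ids),
                "existenceCondition": existence_condition,
                "deployment": {
                    "properties": {"mode": "incremental", "template": template}
                },
            },
        },
    }


if __name__ == "__main__":
    rule = build_dine_rule(
        "Microsoft.KeyVault/vaults",
        "Microsoft.Insights/diagnosticSettings",
        {
            "field": "Microsoft.Insights/diagnosticSettings/workspaceId",
            "equals": "[parameters('logAnalytics')]",  # hypothetical parameter name
        },
        evaluation_delay="PT30M",  # extended for a slow-provisioning resource
    )
    print(rule["then"]["details"]["evaluationDelay"])
```

The `roleDefinitionIds` list is what tells Azure which roles to grant the assignment's managed identity, which is why a remediation task fails if the list and the actual role assignment drift apart.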

Concrete artifacts, decisions, and tools

Artifacts: an effect-per-control register (control → chosen effect → rationale → enforcement mode → rollout stage); remediation-task definitions and the managed-identity role assignments they require; the evaluationDelay settings for slow resources.
Decisions: which guardrails start in Audit/DoNotEnforce vs go straight to Deny; what gets DenyAction deletion protection; which baseline controls use DINE auto-deploy; remediation cadence and ownership.
Tools/services: Azure Policy (all effects), Azure Policy remediation tasks + managed identities, Microsoft Defender for Cloud (whose plans/agents are frequently turned on via DINE), Azure Monitor / Log Analytics (the DINE target for diagnostics), and the compliance dashboard for measuring effect impact before escalation.

Compliance frameworks

What it is

A compliance framework in Azure terms is a regulatory-compliance initiative — a built-in policy set that maps an external standard’s controls to concrete Azure Policy definitions (mostly Audit/AuditIfNotExists, some DeployIfNotExists). Azure ships maintained initiatives for the big regimes, and the default ALZ baseline assigns the Microsoft cloud security benchmark (MCSB) — the successor to the Azure Security Benchmark — as the foundational guardrail set. On top of MCSB you layer the regimes your business is actually subject to.

Framework	Typical applicability	How it shows up in Azure
Microsoft cloud security benchmark (MCSB)	Everyone — the ALZ default baseline	Built-in initiative, the backbone of Defender for Cloud’s secure score
PCI-DSS	Card/payment data	Built-in regulatory-compliance initiative
HIPAA / HITRUST	US healthcare PHI	Built-in initiative
SOC 2 (Trust Services Criteria)	SaaS / service orgs	Built-in initiative
ISO/IEC 27001	Broad infosec certification	Built-in initiative
NIST SP 800-53 / CSF	US federal, regulated industries	Built-in initiative
CIS Microsoft Azure Foundations Benchmark	Hardening baseline	Built-in initiative
Local/sovereign (RBI, GDPR-aligned, sovereign-cloud sets)	Region-specific obligations	Built-in + custom initiatives, often pinned at geo MG tiers

Why it matters

Microsoft’s guidance is to map regulatory and compliance requirements to Azure Policy definitions and Azure role assignments and to assign the initiatives at the management-group level from day zero — because retrofitting compliance onto a populated estate is dramatically more expensive than inheriting it from the start. Assigning, say, the PCI-DSS and ISO 27001 initiatives at the right MG means every subscription that lands beneath it is born measured against those controls, and you get a continuous, auditor-ready compliance percentage instead of a once-a-year scramble.

How to do it well

Separate the baseline from the regime. Keep MCSB assigned broadly (it underpins Defender for Cloud’s secure score), and scope each regulatory initiative to where it applies — e.g. PCI-DSS only on the management group / subscriptions that actually process card data, not tenant-wide, to avoid noise and irrelevant findings.
Understand these are mostly audit initiatives. Regulatory-compliance built-ins largely report posture; they do not, by themselves, enforce every control. Pair them with your own Deny/DINE enforcement initiative for the controls you must guarantee (encryption, public-network blocks, diagnostic deployment).
Use Defender for Cloud’s Regulatory Compliance dashboard to track each framework’s control-by-control status, evidence, and trend — it surfaces the same initiatives with workflow, exemptions, and export.
Map controls to RBAC too. Some compliance controls are about who can do what; satisfy those with Azure RBAC custom roles and Entra ID Governance (access reviews, entitlement management, PIM), not Policy.
Track drift in the framework itself. Standards and the built-in initiatives that implement them get updated; use AzAdvertizer and Azure Governance Visualizer’s policy-version checker to stay current.

Concrete artifacts, decisions, and tools

Artifacts: a control-to-policy/RBAC mapping matrix per framework; the regulatory-initiative assignments at their scopes; an exemption + attestation register (with justification and expiry) for Manual controls and approved deviations; periodic compliance-export evidence packs.
Decisions: which frameworks apply and at which scope; baseline (MCSB, always) vs regime-specific placement; what you enforce (Deny/DINE) vs merely audit; exemption approval workflow and review cadence.
Tools/services: Azure Policy regulatory-compliance initiatives, Microsoft Defender for Cloud (Regulatory Compliance dashboard + secure score), Microsoft Purview (for data-governance/classification aspects of compliance), Azure RBAC + Microsoft Entra ID Governance, and AzAdvertizer / Azure Governance Visualizer for control discovery and version tracking.
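An exemption register with expiry dates only helps if someone checks it. The sketch below scans a register for lapsed entries; the entry names and dates are hypothetical, and the `exemptionCategory` and `expiresOn` fields are modelled on how Azure Policy exemptions are commonly represented, so verify the shape against your own exports.

```python
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def expired_exemptions(register, now):
    """Names of exemptions whose expiry is at or before `now`."""
    lapsed = []
    for entry in register:
        expires_on = entry["properties"].get("expiresOn")
        if expires_on and parse_utc(expires_on) <= now:
            lapsed.append(entry["name"])
    return lapsed


if __name__ == "__main__":
    # Hypothetical register entries for illustration.
    register = [
        {"name": "legacy-sql-public-endpoint",
         "properties": {"exemptionCategory": "Waiver",
                        "expiresOn": "2025-03-31T00:00:00Z"}},
        {"name": "partner-vnet-peering",
         "properties": {"exemptionCategory": "Mitigated",
                        "expiresOn": "2025-09-30T00:00:00Z"}},
        {"name": "no-expiry-set",
         "properties": {"exemptionCategory": "Waiver"}},
    ]
    now = datetime(2025, 6, 1, tzinfo=timezone.utc)
    print(expired_exemptions(register, now))
```

Entries with no expiry are skipped by this check, so a separate review rule for open-ended exemptions is worth considering.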

Tag enforcement

What it is

Tags are key/value metadata on resources, resource groups, and subscriptions that carry what the name can’t or shouldn’t — owner, cost center, environment, data classification, criticality. Tag enforcement is the use of Azure Policy to guarantee tags exist, hold allowed values, and inherit correctly, because — critically — tags do not inherit by default: a resource does not automatically pick up its resource group’s or subscription’s tags.

Why it matters

Tags are the join key for almost every downstream governance function: cost allocation/showback in Cost Management, operational routing (who to page), automation targeting (start/stop, backup selection, sandbox expiry), and compliance reporting (which resources hold regulated data). A landing zone with weak tagging produces half-empty cost reports and unattributable resources. Because these tag values are governance-critical and tags don’t propagate natively, enforcement is non-optional. Microsoft calls it out explicitly in the Governance design area, including using the Append effect to enforce required tags and Modify to manage them.

The enforcement pattern (the part that actually works)

Policy intent	Built-in policy	Effect	Why this effect
Block resources missing a required tag	Require a tag on resources	Deny	Stop non-compliant creation at source
Block RGs missing a required tag	Require a tag on resource groups	Deny	RGs are the inheritance source for resources
Make resources inherit a tag from their RG	Inherit a tag from the resource group	Modify	Tags don’t inherit natively; Modify can remediate existing
Make resources inherit a tag from the subscription	Inherit a tag from the subscription	Modify	Same, sourced from the subscription
Backfill a default tag value	Add or replace a tag on resources	Modify	Set org defaults; remediatable

The deliberate design: Deny on mandatory tags at the RG/subscription level (the authoritative source), Modify (inherit) so child resources automatically acquire CostCenter/Environment/etc. from their parent, and remediation tasks to backfill the existing estate, not just new resources. Use Modify over Append for tags because Modify supports more operations and remediates existing resources, whereas Append only acts on create/update requests and cannot remediate.
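The inherit-from-resource-group pattern can be sketched as a Modify rule plus a local simulation of its effect. The rule body follows the general shape of the tag-inheritance built-ins as far as I can reconstruct it, so treat the exact expressions as assumptions to check against the published definition; `apply_inherit` is purely a local illustration.

```python
# Built-in Contributor role definition ID, needed by the remediation identity.
CONTRIBUTOR = (
    "/providers/Microsoft.Authorization/roleDefinitions/"
    "b24988ac-6180-42a0-ab88-20f7382dd24c"
)


def inherit_tag_rule():
    """Modify rule: copy a tag from the resource group when the values differ."""
    tag_field = "[concat('tags[', parameters('tagName'), ']')]"
    rg_value = "[resourceGroup().tags[parameters('tagName')]]"
    return {
        "if": {
            "allOf": [
                {"field": tag_field, "notEquals": rg_value},
                {"value": rg_value, "notEquals": ""},
            ]
        },
        "then": {
            "effect": "modify",
            "details": {
                "roleDefinitionIds": [CONTRIBUTOR],
                "operations": [
                    {"operation": "addOrReplace", "field": tag_field, "value": rg_value}
                ],
            },
        },
    }


def apply_inherit(resource_tags, rg_tags, tag_name):
    """Local simulation: what the resource's tags look like after the rule runs."""
    rg_value = rg_tags.get(tag_name, "")
    if rg_value != "" and resource_tags.get(tag_name) != rg_value:
        return {**resource_tags, tag_name: rg_value}
    return dict(resource_tags)


if __name__ == "__main__":
    print(apply_inherit({"Owner": "team-a"}, {"CostCenter": "CC-1001"}, "CostCenter"))
    print(apply_inherit({"CostCenter": "CC-9999"}, {}, "CostCenter"))
```

The second condition matters: when the resource group has no value for the tag, the rule leaves the resource alone instead of blanking an existing tag.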

How to do it well

Split tags into mandatory (policy-enforced) and recommended. Enumerate allowed values for mandatory tags (Environment ∈ {Production, UAT, Dev, Sandbox}, CostCenter matches a pattern) — free-text tags wreck cost reports.
Enforce at the source, inherit downward. Deny missing tags on resource groups and subscriptions; Modify/inherit onto child resources. This gets you both coverage and low friction (app teams set a few tags on the RG, everything beneath inherits).
Bundle into one tagging initiative assigned at the intermediate-root MG, parameterized with your mandatory-tag list, and run remediation tasks to sweep the legacy estate.
Audit continuously with Azure Resource Graph. KQL across the whole tenant tells you tag compliance in seconds — far faster than the Policy dashboard for ad-hoc questions.
Couple tags to cost from the start so the tagging strategy is driven by the chargeback model, not invented in isolation.
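The mandatory-tag and allowed-value rules above are easy to prototype as a pre-deployment check, for example in a pipeline step before Policy ever sees the request. The Environment values come from the list above; the `CC-` plus four digits CostCenter pattern is a hypothetical convention invented for this sketch.

```python
import re

MANDATORY = ("Environment", "CostCenter")
ALLOWED_VALUES = {"Environment": {"Production", "UAT", "Dev", "Sandbox"}}
# Hypothetical cost-center convention: "CC-" followed by four digits.
PATTERNS = {"CostCenter": re.compile(r"CC-\d{4}")}


def tag_violations(tags):
    """Return a human-readable list of problems with a resource group's tags."""
    problems = []
    for name in MANDATORY:
        value = tags.get(name)
        if not value:
            problems.append(f"{name}: missing")
        elif name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            problems.append(f"{name}: '{value}' is not an allowed value")
        elif name in PATTERNS and not PATTERNS[name].fullmatch(value):
            problems.append(f"{name}: '{value}' does not match the expected pattern")
    return problems


if __name__ == "__main__":
    print(tag_violations({"Environment": "Production", "CostCenter": "CC-1001"}))
    print(tag_violations({"Environment": "prod", "CostCenter": "finance"}))
    print(tag_violations({}))
```

A check like this gives app teams fast feedback, while the Deny and Modify policies remain the authoritative enforcement point.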

Concrete artifacts, decisions, and tools

Artifacts: the mandatory-tag list + allowed-value sets; the tagging policy initiative (Deny + Modify/inherit + add-default) and its assignment; remediation tasks for the existing estate; Resource Graph compliance queries.
Decisions: which tags are mandatory vs recommended; allowed values per tag; inherit-from-RG vs inherit-from-subscription per tag; remediation cadence and the managed-identity role for remediation.
Tools/services: Azure Policy (built-in tag policies + custom initiative), Microsoft Cost Management (where tags pay off, via cost-by-tag views), Azure Resource Graph (tenant-wide tag audit in KQL), and the CAF resource-tagging guidance for the canonical strategy.

Real-world enterprise scenario

Northwind Logistics Cloud is a fictional pan-Asia freight and supply-chain platform: ~2,300 employees, a regulated-data profile (payments via a card-processing arm, plus India/Singapore data-residency obligations), and 80+ application teams running on the ALZ Bicep accelerator. Their Cloud Centre of Excellence (CCoE) works the Governance design area onto the management group hierarchy already deployed in Resource Organization (northwind intermediate root → Platform, Landing Zones {Corp, Online}, Sandbox, Decommissioned).

Azure Policy and initiatives. The CCoE deploys the ALZ default policy assignments module as its baseline, then authors three custom initiatives (defined at the northwind MG so they’re assignable anywhere): a Security Guardrails initiative, a Cost Guardrails initiative, and a Tagging initiative. Every new restrictive policy lands first as a Deny with enforcement mode DoNotEnforce for a two-week soak, during which they read the compliance dashboard, fix or exempt the backlog (each exemption carries a justification and a 90-day expiry), then flip to enforce. They hold assignments at the root MG to a deliberate minimum and push workload-specific rules down to Corp/Online. App-platform leads get Resource Policy Contributor scoped to their own subscription for app-level governance. Artifact: a policy-assignment matrix — MCSB + ALZ defaults + the three custom initiatives at northwind, residency Deny (allowed-locations centralindia, southindia, southeastasia) pinned at geo sub-tiers under Corp.

Cost controls and budgets. Every vended subscription is born with a ₹/S$ budget and an action group alerting the owning team + FinOps at 60/80/100% of actual and forecast spend; dev/test subscriptions additionally trigger an automation runbook at 100% forecast that deallocates idle VMs. The Cost Guardrails initiative denies VM SKUs outside an approved list and blocks expensive regions. Northwind commits to a savings plan for compute covering ~70% of steady-state compute, layers Reservations on the always-on SQL and the hub firewall, applies Hybrid Benefit to Windows/SQL, and routes nightly route-optimization batch to Spot VMs. Cost Management is sliced by CostCenter and Application for monthly showback. Outcome: a structural ~31% reduction in compute run-rate within two quarters and zero surprise invoices, because budgets give forecast-based early warning and Policy caps what can be created.

Guardrails (effect choices). Their effect-per-control register: Deny for the non-negotiables (no public IP on Corp, residency, encryption-at-rest, no classic resources); DenyAction/DELETE protecting the central Log Analytics workspace, the hub Azure Firewall, and the platform key vaults; DeployIfNotExists to auto-deploy diagnostic settings to the central workspace, auto-enable Defender for Cloud plans on every subscription, and configure backup — with evaluationDelay extended to 30 minutes on a slow-provisioning data service; Modify for tag inheritance. Remediation tasks (running under a managed identity with a scoped Contributor role) sweep the brownfield estate weekly. The audit-first discipline means not a single Deny rollout has blocked legitimate work.

Compliance frameworks. MCSB stays assigned tenant-wide (it drives the Defender for Cloud secure score, which the CISO tracks at 78% and climbing). The PCI-DSS regulatory initiative is scoped only to the card-processing management group/subscriptions, ISO/IEC 27001 at northwind for the certification audit, and a custom residency initiative at the India/Singapore geo tiers. The CCoE maps each framework’s controls to Policy and RBAC (access reviews + PIM via Entra ID Governance), tracks status in Defender for Cloud’s Regulatory Compliance dashboard, and exports quarterly evidence packs. A Manual-effect set with an attestation register covers the process controls Azure can’t evaluate.

Tag enforcement. Five mandatory tags — Environment, CostCenter, Owner, Application, DataClassification — with enumerated allowed values, enforced by the Tagging initiative: Deny missing tags on resource groups and subscriptions, Modify/inherit so child resources acquire CostCenter and Environment from their RG, add-default for ManagedBy=bicep. Remediation tasks tagged ~21,000 pre-existing resources; Azure Resource Graph KQL audits compliance live.

Measurable outcome after two quarters: Defender for Cloud secure score 62% → 78%; 100% mandatory-tag compliance on new resources, 97% across the legacy estate (audited via Resource Graph); first PCI-DSS and ISO 27001 audits passed on continuous evidence rather than a manual scramble; compute run-rate down ~31% via savings-plan/reservation/Spot layering; zero residency violations (allowed-locations pinned and inherited at the geo MG tiers); change-advisory-board reviews for cloud deployments eliminated, replaced by automated guardrails.

Deliverables & checklist

Common pitfalls

Going straight to Deny and breaking the business. A restrictive policy flipped to enforce on a populated estate blocks legitimate work and triggers a fire drill. Always land it as Audit (or Deny with enforcement mode DoNotEnforce) first, measure the non-compliant count, fix or exempt the backlog, then enforce.
Expecting Deny/Audit to fix existing resources. Only DeployIfNotExists and Modify are remediatable — and even they don’t auto-fix pre-existing resources without a remediation task (and a managed identity with the right RBAC). Plan remediation explicitly; don’t assume the live estate self-heals.
Confusing budgets with spending caps. A Cost Management budget alerts; it does not stop spend. To actually prevent cost you need Azure Policy SKU/region/type allow-lists. Use both — budgets to watch, Policy to cap — or you’ll get the alert after the money is gone.
Assigning regulatory initiatives everywhere. Most regulatory-compliance built-ins are audit-only and noisy if blanket-applied. Scope each regime (PCI-DSS, etc.) to where it actually applies, keep MCSB as the broad baseline, and pair audit initiatives with your own Deny/DINE enforcement for the controls you must guarantee.
Assuming tags inherit. Resources do not inherit their RG’s or subscription’s tags by default, so cost reports come out half-empty. Enforce mandatory tags with Deny at the RG/subscription source, propagate with Modify/inherit, and run remediation tasks on the legacy estate. Prefer Modify over Append for tags.
Hoarding assignments at the root management group. Piling policies onto the Tenant Root / intermediate root forces endless exclusions at inherited scopes and risks hitting assignment limits. Define high, assign at the right altitude, exclude low — push workload-specific guardrails down to the archetype MGs.

What’s next

With governance guardrails, cost controls, and compliance baselines enforced as code, part 8 of the Azure Landing Zone Design Areas turns to Platform Automation and DevOps — the GitOps pipelines, IaC modules, and subscription-vending machinery that deploy and continuously reconcile everything you’ve designed across the previous seven design areas.

Written by Vinod
