Where this fits
In the Azure Cloud Adoption Framework, Govern is the methodology that keeps a cloud estate inside the guardrails the business needs as it scales — it is the ongoing counterweight to the one-time build of Ready and the workload velocity of Migrate and Innovate. Where Ready stands up the landing zone, Govern is the continuous discipline of deciding which corporate policies matter, encoding them as enforceable controls (overwhelmingly through Azure Policy assigned across a management group hierarchy), measuring compliance, and tightening the guardrails as risk tolerance and the estate both evolve. Govern is deliberately not a project with an end date; CAF frames it as a cycle you bootstrap with a governance MVP and iterate forever alongside adoption.

The Govern methodology and the Governance Benchmark
What it is
CAF describes Govern as a five-step process that a small, accountable cloud governance team runs continuously. The first step is performed once to stand the function up; the remaining four repeat as a cycle:
- Build a cloud governance team — a small, cross-functional team (security, identity, finance/FinOps, platform, compliance) with a named executive sponsor (typically CIO/CTO), explicit authority to define policy and remediate non-compliance, and a defined scope expressed as a RACI matrix against the platform and workload teams.
- Assess cloud risks — identify and prioritise the risks unique to your organisation across every governance domain: regulatory compliance, security, identity, operations, cost, data, resource management, and increasingly AI.
- Document cloud governance policies — for each material risk, write a plain-language policy statement (a corporate policy) that is comprehensive, enforceable, and aligned to business need — before you reach for a tool.
- Enforce cloud governance policies — translate each statement into an automated control wherever feasible (Azure Policy, RBAC, Defender for Cloud, Entra ID Governance) and a manual control only where automation is impractical.
- Monitor cloud compliance — measure adherence, surface drift, alert on violations, and feed findings back into the next risk assessment.
The mental model underneath every step is the Policy Compliance Process inherited from CAF’s classic governance model: Business risk → Corporate policy → Policy statements → Design/enforcement → Monitor & enforce. You never start by writing an Azure Policy definition; you start by naming a risk the business actually cares about, and only then decide how much automation and enforcement that risk justifies.
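The risk → policy statement → control chain can be kept honest with a very small traceability check. The sketch below is illustrative only: the class names, the risk ID `R1`, and the control names are hypothetical, and a real register would more likely live in a spreadsheet or a repository than in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Risk:
    risk_id: str
    description: str


@dataclass
class PolicyStatement:
    statement_id: str  # e.g. "SC01"
    risk_id: str       # the risk this statement answers
    text: str


@dataclass
class Control:
    name: str
    statement_id: Optional[str]  # None: nobody recorded why the control exists
    effect: str


def untraceable_controls(risks, statements, controls):
    """Return the names of controls that cannot be traced back to a named risk."""
    known_risks = {r.risk_id for r in risks}
    justified = {s.statement_id for s in statements if s.risk_id in known_risks}
    return [c.name for c in controls if c.statement_id not in justified]


# Hypothetical register entries, for illustration only.
risks = [Risk("R1", "Data exposed through public endpoints")]
statements = [PolicyStatement("SC01", "R1", "Storage must not allow public network access")]
controls = [
    Control("deny-storage-public-access", "SC01", "Deny"),
    Control("audit-vm-sku", None, "Audit"),
]

orphans = untraceable_controls(risks, statements, controls)
```

Any control returned by `untraceable_controls` is a candidate for either a written justification or removal.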
The Governance Benchmark (the Cloud Governance assessment on the CAF site) is the diagnostic that bootstraps and re-grades this loop. It is a short questionnaire that scores your current versus desired state across the governance domains, exposes the gaps, and links each gap directly to the relevant CAF guidance and the cloud governance guides. Run it before your first MVP to find the highest-risk gaps, then re-run it each quarter to measure whether the loop is actually closing them.
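To make the "current versus desired state" idea concrete, here is a minimal sketch of ranking gaps by size. The 1 to 5 scale, the domain names, and the scores are assumptions made for illustration; the actual assessment has its own question set and output format.

```python
def rank_gaps(current, desired):
    """Return domains with a positive gap, largest gap first (ties alphabetical)."""
    gaps = {domain: desired[domain] - current.get(domain, 0) for domain in desired}
    return sorted(
        (domain for domain, gap in gaps.items() if gap > 0),
        key=lambda domain: (-gaps[domain], domain),
    )


# Hypothetical scores on an assumed 1-5 maturity scale.
current_state = {"security": 3, "identity": 1, "cost": 1, "resource consistency": 2}
desired_state = {"security": 4, "identity": 4, "cost": 3, "resource consistency": 3}

priority_order = rank_gaps(current_state, desired_state)
```

Re-running the same function on next quarter's scores and comparing the two orderings gives a rough view of whether the largest gaps are closing.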
Why it matters
The methodology exists to stop two opposite failure modes. The first is governance theatre — a 200-page policy binder no tool enforces, so the estate drifts freely. The second is tooling without intent — engineers assigning a wall of Azure Policy definitions nobody can trace back to a business risk, which blocks teams, generates noise, and gets disabled the first time it breaks a deployment. By forcing risk → policy statement → control in that order, the methodology guarantees every guardrail is justified, measurable, and removable when the risk changes. The Benchmark matters because it turns “are we well-governed?” — an argument — into a score with a trendline, which is what an executive sponsor and an auditor both want to see.
How to do it well
- Keep the team small and senior. Agility and decision speed beat headcount; a sprawling governance committee is where guardrails go to stall.
- Adopt a monitor-first posture. For any risk that isn’t critical, deploy the control in Audit/monitor mode first, learn the real-world impact, then escalate to Deny. Blocking what you don’t yet understand is how governance loses the trust of delivery teams.
- Delegate enforcement; centralise policy. The governance team owns the strategy and the policy; the platform team applies controls at the platform scope and workload teams enforce within their workload. The governance team should not be the ones clicking “assign”.
- Re-run the Benchmark on a cadence. Treat it like a recurring health check, not a one-time onboarding quiz — its value is the delta between runs.
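The monitor-first posture above can be written down as an explicit promotion rule so that moving a control from Audit to Deny is a recorded decision rather than a judgment call made under pressure. This is a sketch under assumptions: the 14-day minimum and the "unexplained findings" field are hypothetical thresholds, not CAF guidance.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    policy: str
    critical: bool      # critical risks may justify Deny from day one
    days_in_audit: int
    non_compliant: int  # resources flagged while in Audit
    unexplained: int    # flagged resources with no exclusion or fix planned


def next_effect(result, min_days=14):
    """Decide the effect for the next iteration of a control."""
    if result.critical:
        return "Deny"
    if result.days_in_audit < min_days:
        return "Audit"  # not enough real-world data yet
    if result.unexplained > 0:
        return "Audit"  # impact is not yet understood
    return "Deny"


ready = AuditResult("allowed-locations", False, 21, 3, 0)
too_early = AuditResult("required-tags", False, 5, 0, 0)
noisy = AuditResult("allowed-skus", False, 30, 12, 4)
urgent = AuditResult("deny-unencrypted-disks", True, 0, 0, 0)
```

With these inputs, `next_effect(ready)` and `next_effect(urgent)` return "Deny", while `too_early` and `noisy` stay in "Audit".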
Concrete artifacts, decisions, and Azure tools
| Step | Key decision | Primary artifact | Azure tooling |
|---|---|---|---|
| Build team | Sponsor, authority, scope | Charter + RACI matrix | — (org) |
| Assess risks | Which domains, what priority | Prioritised risk register | Cloud Governance assessment (Benchmark) |
| Document policies | Risk tolerance per statement | Corporate policy doc with IDs (e.g. SC01, CM01) | — (written policy) |
| Enforce | Audit vs Deny; scope | Policy assignments, RBAC model | Azure Policy, RBAC, Defender for Cloud, Entra ID Governance |
| Monitor | KPIs, alert thresholds | Compliance dashboard | Azure Policy compliance, Defender secure score, Resource Graph, Azure Monitor |
The five governance disciplines
What they are
The Five Disciplines of Cloud Governance are CAF’s durable taxonomy for what you govern. They predate the modernised five-step process and slot neatly inside it: each discipline is a lens for the assess risks and document policies steps, and each comes with its own Policy Compliance Process. Crucially, the disciplines are incremental, not sequential — you rarely need all five at full maturity on day one. You implement the slice of each that your current risk demands in the MVP, then deepen them as adoption grows.
The five disciplines are:
| Discipline | Governs (the risk it addresses) | Representative controls | Lead Azure services |
|---|---|---|---|
| Cost Management | Runaway/untracked spend, budget overrun, unallocatable cost | Budgets + alerts, SKU/region restrictions, mandatory cost tags, anomaly alerts | Microsoft Cost Management, Azure Budgets, Azure Advisor, Azure Policy (allowed SKUs), tag inheritance |
| Security Baseline | Misconfiguration, exposure, threats, regulatory non-compliance | Encryption at rest/in transit, deny public endpoints, MFA, threat detection, compliance initiatives | Microsoft Defender for Cloud, Microsoft Cloud Security Benchmark, Azure Policy regulatory initiatives, Microsoft Sentinel |
| Identity Baseline | Excess privilege, stale/standing access, identity-based attack | RBAC least privilege, Conditional Access, PIM/JIT, access reviews, MFA | Microsoft Entra ID, Conditional Access, Entra PIM, Entra ID Governance, Azure RBAC/ABAC |
| Resource Consistency | Drift, sprawl, untaggable/orphaned resources, inconsistent ops | Naming + tagging enforcement, allowed resource types/locations, diagnostic settings, backup policy | Azure Policy, management groups, Azure Resource Graph, Azure Monitor, Azure Backup |
| Deployment Acceleration | Manual error, snowflake environments, slow safe delivery | Policy-as-code, IaC + pipelines, deployIfNotExists remediation, template/landing-zone reuse | Bicep / Terraform / ARM, Azure Pipelines / GitHub Actions, Azure Policy (DINE/Modify), EPAC |
Why they matter
The disciplines matter because they convert the vague mandate “govern the cloud” into five concrete backlogs you can prioritise and assign owners to. They also map cleanly onto organisational reality: Cost Management is where Finance/FinOps lives, Security Baseline and Identity Baseline are where the CISO’s controls land, Resource Consistency is the platform team’s operational hygiene, and Deployment Acceleration is the bridge to the engineering org’s delivery practice. Without the taxonomy, governance conversations collapse into a single overloaded “security and cost” bucket and identity, consistency, and delivery acceleration get neglected — which is exactly where the most expensive incidents (privilege escalation, untaggable spend, drifted production) tend to originate.
How to do them well
Cost Management is monitor-first by nature: set Azure Budgets with action-group alerts at the right scope (management group for the platform team, resource group for workload teams), enforce cost-allocation tags so every rupee/dollar is attributable, and use Azure Policy to disallow cost-intensive resource types and oversized SKUs rather than trying to police them after the fact. Wire Azure Advisor cost recommendations into the monitor step. The discipline’s KPI is the percentage of spend that is correctly tagged and within budget, not raw spend.
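The KPI described above (share of spend that is both correctly tagged and within budget) is simple arithmetic once cost lines and budgets are exported. The sketch below assumes a hypothetical export shape and an illustrative set of mandatory tags; it does not call the Cost Management API.

```python
from dataclasses import dataclass

# Illustrative mandatory tags; substitute your own tagging standard.
REQUIRED_TAGS = ("costCenter", "env", "owner")


@dataclass
class CostLine:
    scope: str   # budget scope, e.g. a resource group name
    cost: float
    tags: dict


def cost_kpi(lines, budgets, required_tags=REQUIRED_TAGS):
    """Percentage of total spend that is fully tagged and in a scope within budget."""
    total = sum(line.cost for line in lines)
    if not total:
        return 0.0
    spend_by_scope = {}
    for line in lines:
        spend_by_scope[line.scope] = spend_by_scope.get(line.scope, 0.0) + line.cost
    within_budget = {
        scope for scope, spent in spend_by_scope.items()
        if scope in budgets and spent <= budgets[scope]
    }
    good = sum(
        line.cost for line in lines
        if line.scope in within_budget and all(line.tags.get(t) for t in required_tags)
    )
    return 100.0 * good / total


full = {"costCenter": "CC-1", "env": "prod", "owner": "team-a"}
lines = [
    CostLine("rg-portal", 600.0, full),
    CostLine("rg-portal", 200.0, {"costCenter": "CC-1", "env": "prod"}),  # owner missing
    CostLine("rg-batch", 200.0, full),                                    # scope over budget
]
budgets = {"rg-portal": 1000.0, "rg-batch": 150.0}

kpi = cost_kpi(lines, budgets)
```

In this example only the first line counts toward the KPI: the second is missing a tag and the third sits in a scope that has exceeded its budget.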
Security Baseline should lean on built-in rather than hand-rolled controls: enable Microsoft Defender for Cloud, adopt the Microsoft Cloud Security Benchmark initiative, and add the regulatory compliance initiatives you are actually bound by (PCI DSS v4, ISO 27001, NIST SP 800-53, CMMC, HIPAA/HITRUST, FedRAMP). The Defender secure score becomes your headline security KPI. Write custom policies only for risks no built-in covers.
Identity Baseline is governed mostly in Microsoft Entra, not Azure Policy: enforce MFA and ban weak passwords, drive least privilege through Azure RBAC at the correct scope, eliminate standing admin access with Entra Privileged Identity Management (PIM) just-in-time activation, gate access with Conditional Access, and schedule access reviews through Entra ID Governance. The KPI is standing privileged assignments trending toward zero.
Resource Consistency is where Azure Policy earns its keep: enforce a naming and tagging strategy (with Modify/Append effects and tag inheritance so missing tags are added, not just flagged), constrain allowed locations and allowed/disallowed resource types, and use deployIfNotExists to guarantee every resource gets diagnostic settings and backup. Query the estate’s real state with Azure Resource Graph.
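As an illustration of the Modify-based tag inheritance mentioned above, the sketch below builds a policy definition as a Python dictionary that can be serialised to JSON. It is modelled on the shape of the built-in "inherit a tag from the resource group if missing" policy as I understand it; treat the exact fields as an assumption and compare against the current built-in before deploying anything.

```python
import json

# Contributor role definition ID; Modify needs a role for its managed identity.
CONTRIBUTOR_ROLE = (
    "/providers/Microsoft.Authorization/roleDefinitions/"
    "b24988ac-6180-42a0-ab88-20f7382dd24c"
)


def inherit_tag_policy(display_name="Inherit a tag from the resource group if missing"):
    """Build a Modify policy definition that copies one tag down from the resource group."""
    tag_field = "[concat('tags[', parameters('tagName'), ']')]"
    rg_value = "[resourceGroup().tags[parameters('tagName')]]"
    return {
        "properties": {
            "displayName": display_name,
            "policyType": "Custom",
            "mode": "Indexed",
            "parameters": {
                "tagName": {"type": "String", "metadata": {"displayName": "Tag name"}}
            },
            "policyRule": {
                # Match resources missing the tag whose resource group has a value for it.
                "if": {
                    "allOf": [
                        {"field": tag_field, "exists": "false"},
                        {"value": rg_value, "notEquals": ""},
                    ]
                },
                "then": {
                    "effect": "modify",
                    "details": {
                        "roleDefinitionIds": [CONTRIBUTOR_ROLE],
                        "operations": [
                            {"operation": "add", "field": tag_field, "value": rg_value}
                        ],
                    },
                },
            },
        }
    }


definition = inherit_tag_policy()
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON is the kind of file a policy-as-code repository would hold, with the tag name supplied per assignment.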
Deployment Acceleration ties the other four together: everything — landing zone, policy, and workloads — is infrastructure-as-code (Bicep or Terraform) in source control and shipped through pipelines, and the policies themselves are managed as policy-as-code (Microsoft recommends Enterprise Azure Policy as Code, EPAC, to keep assignments aligned with the Azure landing zone recommended set). This is what makes guardrails reproducible and drift-resistant.
Azure Policy and management groups
What they are
Management groups are containers that sit above subscriptions and form a hierarchy up to the tenant root group. They are the scaffolding that makes governance scale: a subscription can have one parent management group, the hierarchy can be up to six levels deep (excluding the root and the subscription), and anything you assign at a node — Azure Policy or Azure RBAC — inherits downward to every child management group, subscription, resource group, and resource beneath it.
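Downward inheritance is easiest to see on a toy hierarchy. The sketch below models the tree as a child-to-parent map and collects every assignment on the path from the top node to a given scope. All node and assignment names are hypothetical, and it models inheritance only, not exclusions.

```python
# Child -> parent. Names are illustrative; "sub-payments" stands in for a subscription.
PARENT = {
    "intermediate-root": None,
    "platform": "intermediate-root",
    "landing-zones": "intermediate-root",
    "corp": "landing-zones",
    "online": "landing-zones",
    "sandbox": "intermediate-root",
    "sub-payments": "online",
}

# Assignments made directly at each scope.
ASSIGNMENTS = {
    "intermediate-root": ["security-benchmark"],
    "landing-zones": ["required-tags"],
    "online": ["pci-dss"],
}


def path_from_root(node, parent=PARENT):
    """Return the scopes from the top of the tree down to node, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def effective_assignments(node, parent=PARENT, assignments=ASSIGNMENTS):
    """Everything assigned at node or at any ancestor, in top-down order."""
    result = []
    for scope in path_from_root(node, parent):
        result.extend(assignments.get(scope, []))
    return result


payments = effective_assignments("sub-payments")
sandbox = effective_assignments("sandbox")
```

A new subscription added under `online` would pick up all three assignments with no extra work, which is the property the hierarchy exists to provide.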
Azure Policy is the enforcement engine. A policy definition describes a rule and an effect; you group definitions into an initiative (policy set) and then create an assignment that binds the definition/initiative to a scope (management group, subscription, or resource group), with optional exclusions. The effect determines behaviour:
| Effect | Behaviour | Typical use |
|---|---|---|
| Audit / AuditIfNotExists | Logs non-compliance, changes nothing | Monitor-first; discovering impact before blocking |
| Deny | Blocks the create/update request | Hard guardrails: disallowed regions, public IPs, banned SKUs |
| DeployIfNotExists (DINE) | Deploys a remediation resource when missing | Auto-attach diagnostic settings, backup, Defender agents |
| Modify | Adds/updates/removes properties or tags | Tag inheritance, enforcing TLS, adding required properties |
| Append | Adds fields to a resource at create time | Injecting default tags or settings |
| Disabled / DenyAction | Turns a rule off / blocks a specific action (e.g. delete) | Staging policies; protecting resources from deletion |
Azure ships a large library of built-in definitions and initiatives — most importantly the Microsoft Cloud Security Benchmark and the regulatory-compliance initiatives — alongside the ability to author custom definitions for anything they don’t cover.
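The definition/assignment split is worth seeing side by side. The sketch below builds a custom "allowed locations" style definition with a parameterised effect, then two assignment bodies for the same definition: one in Audit and one in Deny. It is a simplified illustration: the definition name, management group ID, and region list are hypothetical, the `scope`/`body` wrapper is this example's own convention rather than an Azure schema, and the built-in policy should be preferred where it fits.

```python
MG_PREFIX = "/providers/Microsoft.Management/managementGroups/"


def allowed_locations_definition():
    """A custom definition whose effect is chosen at assignment time."""
    return {
        "properties": {
            "displayName": "Allowed locations (sketch)",
            "policyType": "Custom",
            "mode": "Indexed",
            "parameters": {
                "listOfAllowedLocations": {
                    "type": "Array",
                    "metadata": {"strongType": "location"},
                },
                "effect": {
                    "type": "String",
                    "allowedValues": ["Audit", "Deny", "Disabled"],
                    "defaultValue": "Audit",
                },
            },
            "policyRule": {
                "if": {
                    "allOf": [
                        {"field": "location", "notIn": "[parameters('listOfAllowedLocations')]"},
                        {"field": "location", "notEquals": "global"},
                    ]
                },
                "then": {"effect": "[parameters('effect')]"},
            },
        }
    }


def build_assignment(name, mg_id, definition_name, parameters, not_scopes=()):
    """Bind a definition stored at a management group to that same scope."""
    scope = MG_PREFIX + mg_id
    definition_id = f"{scope}/providers/Microsoft.Authorization/policyDefinitions/{definition_name}"
    return {
        "scope": scope,  # wrapper key used by this sketch only
        "body": {
            "name": name,
            "properties": {
                "policyDefinitionId": definition_id,
                "notScopes": list(not_scopes),  # exclusions
                "parameters": {key: {"value": value} for key, value in parameters.items()},
            },
        },
    }


regions = ["centralindia", "uaenorth", "southeastasia"]
audit_first = build_assignment(
    "allowed-locations", "landing-zones", "allowed-locations-sketch",
    {"listOfAllowedLocations": regions, "effect": "Audit"},
)
enforced = build_assignment(
    "allowed-locations", "landing-zones", "allowed-locations-sketch",
    {"listOfAllowedLocations": regions, "effect": "Deny"},
)
```

Because the effect is a parameter, promoting the control from Audit to Deny is a one-value change to the assignment, with the definition left untouched.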
Why they matter
This pair is the entire reason governance can be applied once and hold across hundreds of subscriptions. Assign the security baseline at an intermediate management group and every current and future subscription placed under it inherits it automatically — no per-subscription onboarding, no drift when a new subscription appears. Get the hierarchy wrong (or skip it and assign at each subscription) and you inherit the single most painful remediation in cloud governance: re-parenting live production subscriptions and retrofitting policy onto running resources. The management group tree is the part that is genuinely expensive to change later, which is why CAF insists you design it in Ready and merely populate it in Govern.
How to do it well
- Align the tree with the CAF Azure landing zone hierarchy, not your org chart: an intermediate root, then Platform (Management, Connectivity, Identity), Landing zones (Corp, Online), Sandbox, and Decommissioned. Governance assigned at “Landing zones” reaches every workload subscription; “Sandbox” gets deliberately looser policy and “Decommissioned” gets a deny-everything posture.
- Assign broad policy high, specific policy low. Universal controls (encryption, allowed locations, tagging) go at the intermediate root or platform/landing-zone nodes; workload-specific controls go at the subscription or resource group.
- Prefer block lists over allow lists for resource types — a short list of banned services ages far better than an ever-growing list of permitted ones.
- Manage assignments as policy-as-code (EPAC) so the deployed policy set is reviewable, diffable, and re-deployable, and so you can detect when the Azure landing zone recommended policies have changed and update or migrate to built-ins.
- Remember Azure Blueprints is deprecated (retiring; superseded by Azure landing zones + Template Specs + deployment stacks + Azure Policy) — do not start new governance on it.
Concrete artifacts
The deliverables of this sub-component are a management group hierarchy diagram, a policy assignment matrix (which initiative is assigned at which scope, in which effect, with which exclusions), the custom policy/initiative definitions in source control, and the EPAC pipeline that deploys them. Compliance is read back through the Azure Policy compliance view and Azure Resource Graph queries.
Building a governance MVP and iterating
What it is
A governance MVP (minimum viable product) is the smallest set of guardrails that addresses your highest-priority risks today — deliberately incomplete, deployed fast, and designed to be extended. It is CAF’s antidote to the “boil the ocean” governance project. The MVP is the output of running the methodology once across only the risks that matter most: you build the team, run the Benchmark, pick the two or three disciplines with the sharpest risk, write a handful of policy statements, enforce them (mostly in Audit, a few critical ones in Deny), and stand up basic monitoring. Then you iterate — each subsequent cycle adds depth to a discipline or brings a new one online, guided by the re-run Benchmark and by what the adoption teams hit friction on.
Why it matters
Governance maturity and adoption velocity have to grow together. An MVP that ships in a sprint lets the first migration wave proceed safely now, while a “complete” governance design that takes two quarters either blocks the business or — more often — gets bypassed entirely by teams under deadline pressure, leaving you with shadow IT and no guardrails at all. The MVP also keeps governance honest: because each iteration is small and traceable to a risk, you avoid accreting controls nobody can justify, and you can remove a guardrail when its risk recedes. This is the same “start small and expand” philosophy that CAF applies to landing zones in Ready, applied to policy.
How to do it well
- Start from the landing zone’s recommended policy set, in monitor mode. The Azure landing zone accelerator ships a curated initiative set; assigning it in Audit first gives you an instant, low-risk baseline and a real compliance picture before you turn on any Deny.
- Pick MVP scope by risk, not by completeness. For most enterprises the first MVP is Security Baseline + Resource Consistency + Cost Management; Identity Baseline deepens next; Deployment Acceleration is the practice that carries all of it.
- Define exit/expand criteria per iteration. Each cycle should name what it adds (e.g. “promote the encryption and public-IP policies from Audit to Deny”, “add the PCI DSS initiative to the Corp management group”) and how you’ll know it worked (the relevant KPI moved).
- Instrument before you tighten. Stand up the compliance dashboard and budgets first; you cannot iterate on a number you aren’t watching.
Governance KPIs to drive the iteration
| Discipline | Headline KPI | Source |
|---|---|---|
| Cost Management | % spend tagged & within budget; # budget breaches | Cost Management + Budgets |
| Security Baseline | Defender secure score; % regulatory-initiative compliance | Defender for Cloud / Azure Policy |
| Identity Baseline | # standing privileged role assignments; % users with MFA | Entra PIM / Conditional Access |
| Resource Consistency | % resources policy-compliant; % correctly tagged | Azure Policy compliance / Resource Graph |
| Deployment Acceleration | % infra deployed via IaC; policy drift detected | Pipelines / EPAC |
Real-world enterprise scenario
Meridian Logistics, a mid-sized freight and supply-chain company (about 4,200 staff, headquartered in Pune with operations across India, the UAE, and Singapore), has completed Strategy, Plan, and Ready. Their platform team deployed an Azure landing zone with a management group tree (intermediate root mg-meridian → Platform, Landing zones with Corp and Online, Sandbox, Decommissioned) and ~30 subscriptions. Two migration waves are about to start. Leadership’s mandate: govern without becoming the bottleneck. They are PCI DSS-bound for a customer payment portal and ISO 27001-certified.
Methodology & Benchmark. They stand up a five-person governance team — a cloud security lead, an identity lead, a FinOps analyst, a platform engineer, and a compliance manager — reporting to the CTO, with a published RACI that makes the team accountable for policy and the platform/workload teams responsible for enforcement. They run the Cloud Governance assessment; it scores them low on identity and cost governance and medium on security, and flags zero tag enforcement. That sets the MVP priority order.
The five disciplines, decided.
- Security Baseline: assign the Microsoft Cloud Security Benchmark initiative at mg-meridian and the PCI DSS v4 initiative scoped to only the Online management group hosting the payment portal; enable Defender for Cloud (Defender plans on servers, storage, SQL, containers) tenant-wide. Target secure score ≥ 70% within two quarters.
- Identity Baseline: enforce MFA and Conditional Access for all admins, move every privileged role into Entra PIM with 8-hour JIT activation, and schedule quarterly access reviews via Entra ID Governance. Goal: standing privileged assignments from 47 to <5.
- Cost Management: Azure Budgets at each landing-zone subscription with alerts to the workload owners and a roll-up budget at Landing zones; an Azure Policy denying GPU and >32-vCPU SKUs outside an approved list; mandatory costCenter, env, and owner tags.
- Resource Consistency: Modify/tag-inheritance policies so the three mandatory tags are auto-applied; Allowed locations restricted to Central India, UAE North, and Southeast Asia; DeployIfNotExists to push diagnostic settings to a central Log Analytics workspace and enable Azure Backup on all VMs.
- Deployment Acceleration: all landing-zone and workload infra in Bicep in Azure Repos, shipped via Azure Pipelines; policy managed as code with EPAC.
Azure Policy & management groups. Everything universal (security benchmark, allowed locations, tagging, diagnostics) is assigned at mg-meridian or Landing zones and inherits down; PCI is scoped narrowly to Online; Sandbox gets a relaxed assignment with a hard budget and DenyAction on long-lived resources; Decommissioned denies all new deployments. Critical controls (public network access on storage, unencrypted disks, disallowed regions) go straight to Deny; everything else starts in Audit.
The MVP and iteration. Sprint-one MVP = Security Baseline + Resource Consistency + Cost Management, ~14 policy statements, the security benchmark assigned in Audit. Two weeks later they promote the encryption, public-endpoint, and allowed-location policies to Deny, once the audit data shows only 3 false-positive resource types (which are added as exclusions). Iteration two brings Identity Baseline (PIM rollout) online; iteration three adds the PCI initiative ahead of the portal go-live.
Measurable outcome (one quarter). Defender secure score 41% → 68%; tagged-and-attributable spend 22% → 94%; standing privileged assignments 47 → 4; policy-compliant resources 61% → 92%; PCI initiative at 100% on the Online scope ahead of the audit. Critically, no migration wave was blocked — the monitor-first rollout meant guardrails were tuned against real workloads before anything was denied.
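The quarter's figures can be checked mechanically against the targets the team set earlier in the scenario. The sketch below uses only numbers stated in this scenario; where no target was stated (tagged spend, policy compliance) the target is left as `None` rather than invented. The dictionary layout and function name are this example's own.

```python
# name: (baseline, current, target, higher_is_better)
KPIS = {
    "secure_score_pct": (41, 68, 70, True),             # target: >= 70% within two quarters
    "tagged_spend_pct": (22, 94, None, True),           # no explicit target stated
    "standing_privileged": (47, 4, 5, False),           # target: fewer than 5
    "policy_compliant_pct": (61, 92, None, True),       # no explicit target stated
}


def review(kpis):
    """Summarise movement and target status for each KPI."""
    report = {}
    for name, (baseline, current, target, higher_is_better) in kpis.items():
        delta = current - baseline
        improved = delta > 0 if higher_is_better else delta < 0
        if target is None:
            met = None
        elif higher_is_better:
            met = current >= target
        else:
            met = current < target
        report[name] = {"delta": delta, "improved": improved, "target_met": met}
    return report


quarter_one = review(KPIS)
```

Every KPI improved, the privileged-access goal is already met, and the secure score is 2 points short of a target that still has a quarter left to run.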
Deliverables & checklist
Common pitfalls
- Policy without a risk behind it. Assigning a wall of Azure Policy definitions nobody can trace to a business risk creates noise, blocks teams, and gets disabled. Avoid it by always running risk → policy statement → control, and deleting any control that can’t be justified.
- Deny-first enforcement. Starting controls in Deny before you understand their impact breaks deployments and burns the governance team’s credibility. Avoid it with the monitor-first pattern: Audit first, gather impact, add exclusions, then escalate to Deny.
- Skipping (or org-chart-shaping) the management group tree. A flat estate, or one whose hierarchy mirrors the org chart, forces per-subscription policy and painful later re-parenting. Avoid it by aligning to the CAF landing zone hierarchy and assigning broad policy high.
- Big-bang governance. A “complete” governance design that takes quarters blocks the business and gets bypassed. Avoid it by shipping a risk-prioritised MVP in a sprint and iterating.
- Governance team doing the enforcement. If the small central team is the one assigning controls everywhere, it becomes the bottleneck it was meant to prevent. Avoid it by delegating enforcement to platform/workload teams via inheritance and keeping the team on strategy and policy.
- Building on deprecated foundations. Starting new governance on Azure Blueprints (retiring) means rework. Avoid it by using Azure landing zones, Azure Policy, Template Specs/deployment stacks, and policy-as-code from the outset.
What’s next
With guardrails encoded and a governance loop running, Part 8 — Manage turns to operating the estate day-to-day: the management baseline, operational compliance, and business-aligned reliability through Azure Monitor and the broader management toolchain.