
Azure Cloud Adoption Framework: Govern — the Govern Methodology & Benchmark, the Five Disciplines, Azure Policy & Management Groups, and the Governance MVP

Where this fits

In the Azure Cloud Adoption Framework, Govern is the methodology that keeps a cloud estate inside the guardrails the business needs as it scales — it is the ongoing counterweight to the one-time build of Ready and the workload velocity of Migrate and Innovate. Where Ready stands up the landing zone, Govern is the continuous discipline of deciding which corporate policies matter, encoding them as enforceable controls (overwhelmingly through Azure Policy assigned across a management group hierarchy), measuring compliance, and tightening the guardrails as risk tolerance and the estate both evolve. Govern is deliberately not a project with an end date; CAF frames it as a cycle you bootstrap with a governance MVP and iterate forever alongside adoption.

[Figure: Azure Cloud Adoption Framework, animated overview]

The Govern methodology and the Governance Benchmark

What it is

CAF describes Govern as a five-step process that a small, accountable cloud governance team runs continuously. The first step is performed once to stand the function up; the remaining four repeat as a cycle:

  1. Build a cloud governance team — a small, cross-functional team (security, identity, finance/FinOps, platform, compliance) with a named executive sponsor (typically CIO/CTO), explicit authority to define policy and remediate non-compliance, and a defined scope expressed as a RACI matrix against the platform and workload teams.
  2. Assess cloud risks — identify and prioritise the risks unique to your organisation across every governance domain: regulatory compliance, security, identity, operations, cost, data, resource management, and increasingly AI.
  3. Document cloud governance policies — for each material risk, write a plain-language policy statement (a corporate policy) that is comprehensive, enforceable, and aligned to business need — before you reach for a tool.
  4. Enforce cloud governance policies — translate each statement into an automated control wherever feasible (Azure Policy, RBAC, Defender for Cloud, Entra ID Governance) and a manual control only where automation is impractical.
  5. Monitor cloud compliance — measure adherence, surface drift, alert on violations, and feed findings back into the next risk assessment.

The mental model underneath every step is the Policy Compliance Process inherited from CAF’s classic governance model: Business risk → Corporate policy → Policy statements → Design/enforcement → Monitor & enforce. You never start by writing an Azure Policy definition; you start by naming a risk the business actually cares about, and only then decide how much automation and enforcement that risk justifies.
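The risk-first ordering above can be made checkable by recording each link in the chain explicitly. The following is a minimal Python sketch, not CAF tooling: the class names, the `R1` and `SC01` identifiers, the control names, and the `untraceable_controls` helper are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyStatement:
    statement_id: str  # hypothetical ID in the SC01/CM01 style
    risk_id: str       # the business risk this statement answers
    text: str


@dataclass(frozen=True)
class Control:
    name: str
    statement_id: Optional[str]  # None means nobody recorded why the control exists


def untraceable_controls(risks, statements, controls):
    """Return the names of controls that cannot be traced back to a known risk."""
    known_risks = set(risks)
    justified = {s.statement_id for s in statements if s.risk_id in known_risks}
    return [c.name for c in controls if c.statement_id not in justified]


# Hypothetical register contents, for illustration only.
risks = {"R1": "Customer data exposed through public storage endpoints"}
statements = [PolicyStatement("SC01", "R1", "Storage must not allow public access.")]
controls = [
    Control("deny-storage-public-access", "SC01"),
    Control("deny-all-premium-skus", None),  # a control with no recorded intent
]

print(untraceable_controls(risks, statements, controls))
```

Any control this check returns is a candidate for removal or for a retroactive risk conversation, which is the traceability the process is meant to guarantee.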

The Governance Benchmark (the Cloud Governance assessment on the CAF site) is the diagnostic that bootstraps and re-grades this loop. It is a short questionnaire that scores your current versus desired state across the governance domains, exposes the gaps, and links each gap directly to the relevant CAF guidance and the cloud governance guides. Run it before your first MVP to find the highest-risk gaps, then re-run it each quarter to measure whether the loop is actually closing them.
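The current-versus-desired comparison reduces to simple gap arithmetic. The sketch below assumes a hypothetical 1 to 5 maturity scale and made-up scores; the real assessment's scoring scheme is not described here, so treat the numbers and the `rank_gaps` helper as illustration only.

```python
def rank_gaps(scores):
    """Sort governance domains by (desired - current), largest gap first."""
    gaps = {domain: desired - current for domain, (current, desired) in scores.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical maturity scores as (current, desired); not real assessment output.
scores = {
    "security": (3, 4),
    "identity": (1, 4),
    "cost": (2, 4),
}

print(rank_gaps(scores))
```

Keeping each quarter's scores and re-running the same ranking gives the trendline described above.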

Why it matters

The methodology exists to stop two opposite failure modes. The first is governance theatre — a 200-page policy binder no tool enforces, so the estate drifts freely. The second is tooling without intent — engineers assigning a wall of Azure Policy definitions nobody can trace back to a business risk, which blocks teams, generates noise, and gets disabled the first time it breaks a deployment. By forcing risk → policy statement → control in that order, the methodology guarantees every guardrail is justified, measurable, and removable when the risk changes. The Benchmark matters because it turns “are we well-governed?” — an argument — into a score with a trendline, which is what an executive sponsor and an auditor both want to see.

How to do it well

Concrete artifacts, decisions, and Azure tools

Step | Key decision | Primary artifact | Azure tooling
Build team | Sponsor, authority, scope | Charter + RACI matrix | None (organisational)
Assess risks | Which domains, what priority | Prioritised risk register | Cloud Governance assessment (Benchmark)
Document policies | Risk tolerance per statement | Corporate policy doc with IDs (e.g. SC01, CM01) | None (written policy)
Enforce | Audit vs Deny; scope | Policy assignments, RBAC model | Azure Policy, RBAC, Defender for Cloud, Entra ID Governance
Monitor | KPIs, alert thresholds | Compliance dashboard | Azure Policy compliance, Defender secure score, Resource Graph, Azure Monitor

The five governance disciplines

What they are

The Five Disciplines of Cloud Governance are CAF’s durable taxonomy for what you govern. They predate the modernised five-step process and slot neatly inside it: each discipline is a lens for the assess risks and document policies steps, and each comes with its own Policy Compliance Process. Crucially, the disciplines are incremental, not sequential — you rarely need all five at full maturity on day one. You implement the slice of each that your current risk demands in the MVP, then deepen them as adoption grows.

The five disciplines are:

Discipline | Governs (the risk it addresses) | Representative controls | Lead Azure services
Cost Management | Runaway/untracked spend, budget overrun, unallocatable cost | Budgets + alerts, SKU/region restrictions, mandatory cost tags, anomaly alerts | Microsoft Cost Management, Azure Budgets, Azure Advisor, Azure Policy (allowed SKUs), tag inheritance
Security Baseline | Misconfiguration, exposure, threats, regulatory non-compliance | Encryption at rest/in transit, deny public endpoints, MFA, threat detection, compliance initiatives | Microsoft Defender for Cloud, Microsoft Cloud Security Benchmark, Azure Policy regulatory initiatives, Microsoft Sentinel
Identity Baseline | Excess privilege, stale/standing access, identity-based attack | RBAC least privilege, Conditional Access, PIM/JIT, access reviews, MFA | Microsoft Entra ID, Conditional Access, Entra PIM, Entra ID Governance, Azure RBAC/ABAC
Resource Consistency | Drift, sprawl, untaggable/orphaned resources, inconsistent ops | Naming + tagging enforcement, allowed resource types/locations, diagnostic settings, backup policy | Azure Policy, management groups, Azure Resource Graph, Azure Monitor, Azure Backup
Deployment Acceleration | Manual error, snowflake environments, slow safe delivery | Policy-as-code, IaC + pipelines, deployIfNotExists remediation, template/landing-zone reuse | Bicep / Terraform / ARM, Azure Pipelines / GitHub Actions, Azure Policy (DINE/Modify), EPAC

Why they matter

The disciplines matter because they convert the vague mandate “govern the cloud” into five concrete backlogs you can prioritise and assign owners to. They also map cleanly onto organisational reality: Cost Management is where Finance/FinOps lives, Security Baseline and Identity Baseline are where the CISO’s controls land, Resource Consistency is the platform team’s operational hygiene, and Deployment Acceleration is the bridge to the engineering org’s delivery practice. Without the taxonomy, governance conversations collapse into a single overloaded “security and cost” bucket and identity, consistency, and delivery acceleration get neglected — which is exactly where the most expensive incidents (privilege escalation, untaggable spend, drifted production) tend to originate.

How to do them well

Cost Management is monitor-first by nature: set Azure Budgets with action-group alerts at the right scope (management group for the platform team, resource group for workload teams), enforce cost-allocation tags so every rupee/dollar is attributable, and use Azure Policy to disallow cost-intensive resource types and oversized SKUs rather than trying to police them after the fact. Wire Azure Advisor cost recommendations into the monitor step. The discipline’s KPI is the percentage of spend that is correctly tagged and within budget, not raw spend.
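The KPI named above can be computed from exported cost line items. The following is a rough Python sketch under stated assumptions: the line-item shape, the tag names (`costCenter`, `owner`), the scope names, and the budget figures are all hypothetical, and "within budget" is judged per scope rather than per resource.

```python
def cost_kpi(line_items, required_tags, budgets):
    """Percentage of spend that is fully tagged AND sits in a scope within budget."""
    total = sum(item["cost"] for item in line_items)
    if total == 0:
        return 0.0

    # Total spend per scope, so each scope can be compared with its budget.
    spend_by_scope = {}
    for item in line_items:
        spend_by_scope[item["scope"]] = spend_by_scope.get(item["scope"], 0.0) + item["cost"]

    good = 0.0
    for item in line_items:
        tagged = all(item["tags"].get(tag) for tag in required_tags)
        within_budget = spend_by_scope[item["scope"]] <= budgets.get(item["scope"], 0.0)
        if tagged and within_budget:
            good += item["cost"]
    return round(100.0 * good / total, 1)


# Hypothetical export: three line items across two resource groups.
line_items = [
    {"scope": "rg-portal", "cost": 600.0, "tags": {"costCenter": "CC-17", "owner": "payments"}},
    {"scope": "rg-portal", "cost": 200.0, "tags": {"owner": "payments"}},  # missing costCenter
    {"scope": "rg-batch", "cost": 200.0, "tags": {"costCenter": "CC-09", "owner": "ops"}},
]
budgets = {"rg-portal": 1000.0, "rg-batch": 150.0}  # rg-batch has overrun its budget

print(cost_kpi(line_items, ["costCenter", "owner"], budgets))
```

Only the first line item counts toward the KPI here: the second is untagged and the third sits in an over-budget scope.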

Security Baseline should lean on built-in rather than hand-rolled controls: enable Microsoft Defender for Cloud, adopt the Microsoft Cloud Security Benchmark initiative, and add the regulatory compliance initiatives you are actually bound by (PCI DSS v4, ISO 27001, NIST SP 800-53, CMMC, HIPAA/HITRUST, FedRAMP). The Defender secure score becomes your headline security KPI. Write custom policies only for risks no built-in covers.

Identity Baseline is governed mostly in Microsoft Entra, not Azure Policy: enforce MFA and ban weak passwords, drive least privilege through Azure RBAC at the correct scope, eliminate standing admin access with Entra Privileged Identity Management (PIM) just-in-time activation, gate access with Conditional Access, and schedule access reviews through Entra ID Governance. The KPI is standing privileged assignments trending toward zero.

Resource Consistency is where Azure Policy earns its keep: enforce a naming and tagging strategy (with Modify/Append effects and tag inheritance so missing tags are added, not just flagged), constrain allowed locations and allowed/disallowed resource types, and use deployIfNotExists to guarantee every resource gets diagnostic settings and backup. Query the estate’s real state with Azure Resource Graph.
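To make the tagging rule concrete, here is a sketch of a custom policy definition body, built as a Python dictionary and printed as JSON. It shows only the simplest audit-or-deny form; a Modify-based version that adds the tag automatically needs extra details (role definitions and operations) and is left out. The display name and the `require_tag_definition` helper are assumptions made for illustration.

```python
import json


def require_tag_definition(default_effect="Audit"):
    """Build a policy definition body that flags or blocks resources missing a tag."""
    return {
        "properties": {
            "displayName": "Require a tag on resources (sketch)",
            "policyType": "Custom",
            "mode": "Indexed",  # evaluate only resource types that support tags and location
            "parameters": {
                "tagName": {"type": "String"},
                "effect": {
                    "type": "String",
                    "allowedValues": ["Audit", "Deny", "Disabled"],
                    "defaultValue": default_effect,
                },
            },
            "policyRule": {
                # Matches any resource where the named tag does not exist.
                "if": {
                    "field": "[concat('tags[', parameters('tagName'), ']')]",
                    "exists": "false",
                },
                "then": {"effect": "[parameters('effect')]"},
            },
        }
    }


definition = require_tag_definition()
print(json.dumps(definition, indent=2))
```

Parameterising the effect lets the same definition start in Audit and be switched to Deny at assignment time, without editing the rule.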

Deployment Acceleration ties the other four together: everything — landing zone, policy, and workloads — is infrastructure-as-code (Bicep or Terraform) in source control and shipped through pipelines, and the policies themselves are managed as policy-as-code (Microsoft recommends Enterprise Azure Policy as Code, EPAC, to keep assignments aligned with the Azure landing zone recommended set). This is what makes guardrails reproducible and drift-resistant.
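The drift resistance that policy-as-code promises comes down to comparing what source control declares with what is actually deployed. The sketch below is not EPAC and does not call Azure; it assumes both sides have already been loaded into dictionaries, and every scope and assignment name in it is hypothetical.

```python
def assignment_drift(desired, deployed):
    """Compare assignments declared in source control with those found in Azure."""
    desired_keys, deployed_keys = set(desired), set(deployed)
    return {
        "missing": sorted(desired_keys - deployed_keys),    # declared but not deployed
        "unmanaged": sorted(deployed_keys - desired_keys),  # deployed but not declared
        "changed": sorted(
            key for key in desired_keys & deployed_keys if desired[key] != deployed[key]
        ),
    }


# Keys are (scope, assignment name); values are the effect in force.
desired = {
    ("mg-landing-zones", "allowed-locations"): "Deny",
    ("mg-landing-zones", "require-tags"): "Audit",
}
deployed = {
    ("mg-landing-zones", "allowed-locations"): "Audit",  # relaxed by hand in the portal
    ("mg-sandbox", "temp-exemption-test"): "Disabled",
}

print(assignment_drift(desired, deployed))
```

A pipeline would typically fail, or redeploy, whenever any of the three lists is non-empty.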

Azure Policy and management groups

What they are

Management groups are containers that sit above subscriptions and form a hierarchy up to the tenant root group. They are the scaffolding that makes governance scale: each subscription has exactly one parent management group, the hierarchy can be up to six levels deep (excluding the root and the subscription), and anything you assign at a node, whether Azure Policy or Azure RBAC, inherits downward to every child management group, subscription, resource group, and resource beneath it.
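Inheritance is easiest to see as a walk up the tree. The following sketch models a small hierarchy as a child-to-parent map and collects everything a subscription inherits. The group, subscription, and assignment names are hypothetical, and the code only models the behaviour described above; it makes no Azure calls.

```python
MAX_DEPTH = 6  # management group levels, excluding the root and the subscription

# child -> parent; "root" stands in for the tenant root group.
parents = {
    "mg-example": "root",
    "mg-landing-zones": "mg-example",
    "mg-corp": "mg-landing-zones",
    "sub-payments": "mg-corp",
}
assignments = {
    "mg-example": ["security-benchmark"],
    "mg-landing-zones": ["allowed-locations"],
    "mg-corp": ["deny-public-endpoints"],
}


def ancestors(node):
    """Return the chain of parents, nearest first, ending at the tenant root group."""
    chain = []
    while node in parents:
        node = parents[node]
        chain.append(node)
    return chain


def effective_assignments(subscription):
    """Everything assigned at any ancestor applies to the subscription."""
    inherited = []
    for scope in reversed(ancestors(subscription)):  # root first, nearest parent last
        inherited.extend(assignments.get(scope, []))
    return inherited


def hierarchy_depth(subscription):
    """Count management group levels above a subscription, excluding the root."""
    return len([scope for scope in ancestors(subscription) if scope != "root"])


print(effective_assignments("sub-payments"))
print(hierarchy_depth("sub-payments") <= MAX_DEPTH)
```

Adding a new subscription under `mg-corp` is a single new entry in `parents`, and it picks up all three assignments without any further change.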

Azure Policy is the enforcement engine. A policy definition describes a rule and an effect; you group definitions into an initiative (policy set) and then create an assignment that binds the definition/initiative to a scope (management group, subscription, or resource group), with optional exclusions. The effect determines behaviour:

Effect | Behaviour | Typical use
Audit / AuditIfNotExists | Logs non-compliance, changes nothing | Monitor-first; discovering impact before blocking
Deny | Blocks the create/update request | Hard guardrails: disallowed regions, public IPs, banned SKUs
DeployIfNotExists (DINE) | Deploys a remediation resource when missing | Auto-attach diagnostic settings, backup, Defender agents
Modify | Adds/updates/removes properties or tags | Tag inheritance, enforcing TLS, adding required properties
Append | Adds fields to a resource during creation or update | Injecting default tags or settings
Disabled / DenyAction | Turns a rule off / blocks a specific action (e.g. delete) | Staging policies; protecting resources from deletion
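The practical difference between Audit and Deny is whether the request goes through. The sketch below is a toy model of that difference for one allowed-locations rule. It is not how the Azure Policy engine is implemented, it covers only three of the effects in the table, and the region list and resource name are hypothetical.

```python
ALLOWED_LOCATIONS = {"centralindia", "uaenorth", "southeastasia"}  # hypothetical allow-list


def location_rule(resource):
    """Compliant when the resource is deployed to an allowed region."""
    return resource["location"] in ALLOWED_LOCATIONS


def evaluate(resource, rule, effect):
    """Return (request_allowed, compliance_note) for a create or update request."""
    if effect == "Disabled" or rule(resource):
        return True, None
    if effect == "Audit":
        return True, f"non-compliant: {resource['name']}"  # logged, not blocked
    if effect == "Deny":
        return False, f"denied: {resource['name']}"        # request is rejected
    raise ValueError(f"effect not modelled in this sketch: {effect}")


request = {"name": "vm-01", "location": "westus"}
for effect in ("Audit", "Deny", "Disabled"):
    print(effect, evaluate(request, location_rule, effect))
```

The same non-compliant request succeeds with a compliance note under Audit and fails under Deny, which is why a monitor-first rollout reveals the impact of a rule before it blocks anyone.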

Azure ships a large library of built-in definitions and initiatives — most importantly the Microsoft Cloud Security Benchmark and the regulatory-compliance initiatives — alongside the ability to author custom definitions for anything they don’t cover.

Why they matter

This pair is the entire reason governance can be applied once and hold across hundreds of subscriptions. Assign the security baseline at an intermediate management group and every current and future subscription placed under it inherits it automatically — no per-subscription onboarding, no drift when a new subscription appears. Get the hierarchy wrong (or skip it and assign at each subscription) and you inherit the single most painful remediation in cloud governance: re-parenting live production subscriptions and retrofitting policy onto running resources. The management group tree is the part that is genuinely expensive to change later, which is why CAF insists you design it in Ready and merely populate it in Govern.

How to do it well

Concrete artifacts

The deliverables of this sub-component are a management group hierarchy diagram, a policy assignment matrix (which initiative is assigned at which scope, in which effect, with which exclusions), the custom policy/initiative definitions in source control, and the EPAC pipeline that deploys them. Compliance is read back through the Azure Policy compliance view and Azure Resource Graph queries.
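As an illustration of reading compliance back, the sketch below pairs a Resource Graph query over policy states with a small helper that turns the summarised rows into a percentage. Running the query is out of scope here, so the row shape, the sample counts, and the `compliance_percentage` helper are assumptions. Note that the query counts policy state records (one per resource and policy pair), not distinct resources.

```python
# Kusto query for Azure Resource Graph; it is only defined here, not executed.
COMPLIANCE_QUERY = """
policyresources
| where type =~ 'microsoft.policyinsights/policystates'
| summarize records = count() by complianceState = tostring(properties.complianceState)
"""


def compliance_percentage(rows):
    """Share of evaluated records that are compliant; exempt records are ignored."""
    counts = {row["complianceState"]: row["records"] for row in rows}
    compliant = counts.get("Compliant", 0)
    evaluated = compliant + counts.get("NonCompliant", 0)
    return round(100.0 * compliant / evaluated, 1) if evaluated else None


# Hypothetical rows in the shape the query's summarize step would produce.
rows = [
    {"complianceState": "Compliant", "records": 460},
    {"complianceState": "NonCompliant", "records": 40},
    {"complianceState": "Exempt", "records": 12},
]

print(compliance_percentage(rows))
```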

Building a governance MVP and iterating

What it is

A governance MVP (minimum viable product) is the smallest set of guardrails that addresses your highest-priority risks today — deliberately incomplete, deployed fast, and designed to be extended. It is CAF’s antidote to the “boil the ocean” governance project. The MVP is the output of running the methodology once across only the risks that matter most: you build the team, run the Benchmark, pick the two or three disciplines with the sharpest risk, write a handful of policy statements, enforce them (mostly in Audit, a few critical ones in Deny), and stand up basic monitoring. Then you iterate — each subsequent cycle adds depth to a discipline or brings a new one online, guided by the re-run Benchmark and by what the adoption teams hit friction on.
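One iteration decision that recurs in every cycle is whether a policy running in Audit is ready to be promoted to Deny. A minimal sketch of that gate follows, assuming audit findings have been exported as a list of records. The field names, the zero-findings threshold, and the sample resources are hypothetical choices, not CAF guidance.

```python
def ready_for_deny(audit_findings, exclusions, max_open_findings=0):
    """Promote only when no unexplained findings remain after exclusions."""
    open_findings = [
        finding for finding in audit_findings
        if finding["resource_type"] not in exclusions
    ]
    return len(open_findings) <= max_open_findings, open_findings


# Hypothetical audit output for one policy after its first two weeks in Audit.
findings = [
    {"resource": "stlegacy01", "resource_type": "Microsoft.Storage/storageAccounts"},
    {"resource": "vm-batch-7", "resource_type": "Microsoft.Compute/virtualMachines"},
]

print(ready_for_deny(findings, exclusions=set()))
print(ready_for_deny(
    findings,
    exclusions={"Microsoft.Storage/storageAccounts", "Microsoft.Compute/virtualMachines"},
))
```

Each exclusion should itself trace back to a documented decision, otherwise the gate simply hides the findings it was meant to surface.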

Why it matters

Governance maturity and adoption velocity have to grow together. An MVP that ships in a sprint lets the first migration wave proceed safely now, while a “complete” governance design that takes two quarters either blocks the business or — more often — gets bypassed entirely by teams under deadline pressure, leaving you with shadow IT and no guardrails at all. The MVP also keeps governance honest: because each iteration is small and traceable to a risk, you avoid accreting controls nobody can justify, and you can remove a guardrail when its risk recedes. This is the same “start small and expand” philosophy that CAF applies to landing zones in Ready, applied to policy.

How to do it well

Governance KPIs to drive the iteration

Discipline | Headline KPI | Source
Cost Management | % spend tagged & within budget; # budget breaches | Cost Management + Budgets
Security Baseline | Defender secure score; % regulatory-initiative compliance | Defender for Cloud / Azure Policy
Identity Baseline | # standing privileged role assignments; % users with MFA | Entra PIM / Conditional Access
Resource Consistency | % resources policy-compliant; % correctly tagged | Azure Policy compliance / Resource Graph
Deployment Acceleration | % infra deployed via IaC; policy drift detected | Pipelines / EPAC

Real-world enterprise scenario

Meridian Logistics, a mid-sized freight and supply-chain company (about 4,200 staff, headquartered in Pune with operations across India, the UAE, and Singapore), has completed Strategy, Plan, and Ready. Their platform team deployed an Azure landing zone with a management group tree (intermediate root mg-meridian, with Platform, Landing zones containing Corp and Online, Sandbox, and Decommissioned beneath it) and ~30 subscriptions. Two migration waves are about to start. Leadership’s mandate: govern without becoming the bottleneck. They are PCI DSS-bound for a customer payment portal and ISO 27001-certified.

Methodology & Benchmark. They stand up a five-person governance team — a cloud security lead, an identity lead, a FinOps analyst, a platform engineer, and a compliance manager — reporting to the CTO, with a published RACI that makes the team accountable for policy and the platform/workload teams responsible for enforcement. They run the Cloud Governance assessment; it scores them low on identity and cost governance and medium on security, and flags zero tag enforcement. That sets the MVP priority order.

The five disciplines, decided.

Azure Policy & management groups. Everything universal (security benchmark, allowed locations, tagging, diagnostics) is assigned at mg-meridian or Landing zones and inherits down; PCI is scoped narrowly to Online; Sandbox gets a relaxed assignment with a hard budget and DenyAction on long-lived resources; Decommissioned denies all new deployments. Critical controls (public endpoints on storage, unencrypted disks, disallowed regions) are first in line for Deny, promoted as soon as audit data confirms their impact; everything else stays in Audit.

The MVP and iteration. The sprint-one MVP covers Security Baseline, Resource Consistency, and Cost Management: about 14 policy statements, with the security benchmark assigned in Audit. Two weeks later they promote the encryption, public-endpoint, and allowed-location policies to Deny, once the audit data shows only 3 false-positive resource types (added as exclusions). Iteration two brings Identity Baseline (PIM rollout) online; iteration three adds the PCI initiative ahead of the portal go-live.

Measurable outcome (one quarter). Defender secure score 41% → 68%; tagged-and-attributable spend 22% → 94%; standing privileged assignments 47 → 4; policy-compliant resources 61% → 92%; PCI initiative at 100% on the Online scope ahead of the audit. Critically, no migration wave was blocked — the monitor-first rollout meant guardrails were tuned against real workloads before anything was denied.
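The quarter's movement is easier to compare across KPIs as a percentage-point change, or as a relative reduction for the count-based KPI. The arithmetic below uses only the before and after figures quoted in the scenario; the helper names are illustrative.

```python
# (before, after) pairs taken from the scenario's quarterly outcome.
percentage_kpis = {
    "Defender secure score": (41, 68),
    "Tagged and attributable spend": (22, 94),
    "Policy-compliant resources": (61, 92),
}
standing_privileged_assignments = (47, 4)


def point_change(before, after):
    """Change in percentage points between two percentage readings."""
    return after - before


def relative_reduction(before, after):
    """Percentage by which a count has fallen from its starting value."""
    return round(100.0 * (before - after) / before, 1)


for name, (before, after) in percentage_kpis.items():
    print(f"{name}: {before}% -> {after}% ({point_change(before, after):+d} points)")

print("Standing privileged assignments reduced by",
      relative_reduction(*standing_privileged_assignments), "%")
```

The secure score rises 27 points, attributable spend 72 points, and policy compliance 31 points, while standing privileged assignments fall by about 91.5%.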

Deliverables & checklist

Common pitfalls

What’s next

With guardrails encoded and a governance loop running, Part 8 — Manage turns to operating the estate day-to-day: the management baseline, operational compliance, and business-aligned reliability through Azure Monitor and the broader management toolchain.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.


// part 7 of 9 · Azure Cloud Adoption Framework
