GCP Cloud Adoption Framework: Operating Model & Epics — Designing the Cloud Operating Model, the Epic Backlog as Your Execution Engine, and Wiring It Into the Landing Zone & Enterprise Foundations Blueprint

Where this fits

Parts 1–5 of this series built the diagnostic half of Google’s Cloud Adoption Framework: the four themes (Learn, Lead, Scale, Secure), the three maturity phases (Tactical, Strategic, Transformational), and an honest, scored assessment of where your organization sits on each. This article is part 6 — the synthesis-and-execution capstone that turns that scored picture into something an enterprise can actually run. Two artifacts do that work: the operating model (who owns what, how decisions are made, how product teams pull from a platform team) and the epic backlog (the sequenced, ownable program of work that moves the gating themes up a phase). The third thing this part nails down is the hand-off most programs fumble — how the operating model and the foundational epics translate into a concrete, deployed Google Cloud landing zone built on the enterprise foundations blueprint and the Cloud Foundation Toolkit. Get this part right and the framework stops being a slide deck and becomes a funded roadmap landing real resource hierarchy, Shared VPC, and Organization Policy into your organization resource; skip it and you have four theme scores and no machine to act on them.

[Figure: Google Cloud Adoption Framework, animated overview]

Designing the cloud operating model

What it is. The cloud operating model is the organizational machine that runs Google Cloud day to day: the set of teams, their decision rights, the funding and accountability flows, and — above all — the interaction pattern between the people who build the platform and the people who build products on it. Google’s CAF measures readiness across four themes, but those themes only become durable if there is a standing operating model that owns them. The operating model answers the questions a scorecard cannot: Who owns the resource hierarchy? Who can change an Organization Policy? When a product team needs a new project, do they file a ticket or push a button? Who pays for the shared VPC? Who is accountable when Security Command Center flags a misconfiguration? It is the connective tissue between the Lead theme (sponsorship and the cross-functional team) and the Scale/Secure themes (the platform and guardrails those teams operate).

Why it matters. Two organizations with identical theme scores can have wildly different outcomes, and the difference is almost always the operating model. The classic failure is the central-IT bottleneck: a well-meaning platform team that insists on hand-building every project and approving every firewall rule, which throttles adoption until product teams route around it into shadow IT. The opposite failure is the anarchy of full delegation: every team gets Owner on its own projects, invents its own networking, and the security posture fragments into a hundred snowflakes that no one can audit. Google’s CAF treats the mature answer as a platform-and-paved-road model — a small platform team (often the Cloud Center of Excellence) that ships self-service, opinionated, guardrailed building blocks, and product teams that consume them with autonomy inside those guardrails. The operating model is what makes “secure by default” and “fast by default” the same default rather than a trade-off.

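How to do it well. Pick an operating-model archetype deliberately, then encode it in the resource hierarchy and IAM rather than in a wiki page. The decisions that matter are team topology, decision rights (what the platform team mandates, recommends, or delegates), isolation strategy, network model, and FinOps.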

The operating-model archetypes compared:

| Archetype | How projects get created | Networking & guardrails | Speed | Risk | When it fits |
|---|---|---|---|---|---|
| Centralized / ticket-driven | Platform team hand-builds every project | Platform-owned, manual | Slow | Low blast radius, high bottleneck | Tiny estates, heavy regulation, very early |
| Fully decentralized | Each team self-serves with broad rights | Team-owned, inconsistent | Fast | High: fragmented, unauditable | Almost never at enterprise scale |
| Platform + paved road (recommended) | Self-service vending inside guardrails | Platform-owned guardrails, product-owned apps | Fast and safe | Low: uniform baseline, bounded autonomy | The CAF target for Strategic+ |
| Federated / hub-and-spoke | Central platform + per-BU sub-platforms | Org guardrails + BU-level autonomy | Fast | Medium | Large multi-BU or multi-region enterprises |

Artifacts, decisions, and Google Cloud tooling. The artifacts are an operating-model design document (the chosen archetype, team topology, and the platform/product contract), a RACI / decision-rights matrix mapping each decision class to mandate/recommend/delegate, the resource-hierarchy design (folder structure, project naming, IAM inheritance plan), and a FinOps model (billing-account structure, label taxonomy, showback/chargeback policy). The Google Cloud tooling that operationalizes the model is the resource hierarchy itself (organization, folders, projects), Cloud Identity / Google Workspace Groups for role assignment, Cloud IAM (custom roles, least privilege, IAM Conditions), Organization Policy Service for the mandated guardrails, Shared VPC for the platform-owned network, Cloud Billing with BigQuery billing export and Budgets & alerts for the FinOps flow, and Service Catalog / a project factory for the self-service vending that makes the paved-road archetype real.
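A decision-rights matrix is easier to enforce when it exists as data rather than only as a document. The following is a minimal sketch of the mandate/recommend/delegate mapping described above. The decision-class names and the helper functions are hypothetical, chosen for illustration; they are not part of any Google Cloud API.

```python
from enum import Enum


class Right(Enum):
    MANDATE = "mandate"      # platform team sets it; product teams cannot override
    RECOMMEND = "recommend"  # platform team publishes a default; teams may deviate
    DELEGATE = "delegate"    # product team decides inside its own project


# Hypothetical decision classes, for illustration only.
DECISION_RIGHTS = {
    "organization_policy_set": Right.MANDATE,
    "iam_baseline": Right.MANDATE,
    "network_perimeter": Right.MANDATE,
    "managed_service_choice": Right.RECOMMEND,
    "application_architecture": Right.DELEGATE,
}


def can_team_override(decision_class: str) -> bool:
    """Return True when a product team may deviate from the platform default."""
    return DECISION_RIGHTS[decision_class] is not Right.MANDATE


def mandated() -> list[str]:
    """Decision classes that should be enforced in configuration, not just documented."""
    return sorted(
        name for name, right in DECISION_RIGHTS.items() if right is Right.MANDATE
    )
```

Everything returned by `mandated()` is a candidate for an Organization Policy constraint or a folder-level IAM binding; anything else can stay as guidance.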

The epics as a work backlog

What it is. Epics are how Google’s CAF converts a scored assessment and a chosen operating model into work that can be funded, staffed, sequenced, and finished. An epic is a discrete, outcome-oriented body of work that advances one or more themes toward a higher maturity phase — the framework’s unit of execution. Where part 1 introduced epics as a concept, this part treats the backlog as a managed, living instrument: a prioritized, dependency-aware queue of epics that the steering committee funds iteration by iteration, each epic decomposing into the engineering-level user stories and workstreams that teams actually pick up in a sprint. The backlog is the bridge between “you are Tactical on Secure” (a finding) and “this quarter we deploy org-wide guardrails and reach Strategic on Secure” (a plan with an owner, a budget, and a definition of done).

Why it matters. The chasm where cloud programs die is between assessment and action, and the epic backlog is the only thing that spans it. A scorecard with no backlog is an opinion; a pile of recommendations with no sequencing is a recipe for doing the exciting epics (analytics, AI, self-service) before the foundational ones (resource hierarchy, identity, landing zone, guardrails) and discovering nine months in that nothing rests on solid ground. A well-run backlog imposes three disciplines the framework depends on: prioritization (fund the epics that move the gating theme — the one that is both low-scoring and blocking a priority outcome), sequencing by dependency (you cannot vend self-service projects before the project factory exists, which cannot exist before the resource hierarchy and IAM baseline exist), and traceability (every epic answers “why are we doing this?” with “to move theme X from phase A to phase B, which unblocks the outcome that executive Y owns”). The backlog is also what keeps the program funded: a steering committee will keep paying for work it can see advancing on a board with measurable definitions of done.
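The gating-theme rule (low-scoring and blocking a priority outcome) can be made mechanical. This is a rough sketch, assuming each theme carries a maturity phase and a flag set by the steering committee; the `ThemeScore` type and the sample scores are hypothetical.

```python
from dataclasses import dataclass

# Ordinal ranking of the three CAF maturity phases.
PHASES = {"Tactical": 1, "Strategic": 2, "Transformational": 3}


@dataclass(frozen=True)
class ThemeScore:
    theme: str
    phase: str
    blocks_priority_outcome: bool  # assumption: judged by the steering committee


def gating_themes(scores: list[ThemeScore]) -> list[str]:
    """Return the lowest-phase themes among those blocking a priority outcome."""
    blocking = [s for s in scores if s.blocks_priority_outcome]
    if not blocking:
        return []
    lowest = min(PHASES[s.phase] for s in blocking)
    return [s.theme for s in blocking if PHASES[s.phase] == lowest]


# Hypothetical scorecard, for illustration only.
SAMPLE_SCORES = [
    ThemeScore("Learn", "Tactical", False),
    ThemeScore("Lead", "Tactical", False),
    ThemeScore("Scale", "Strategic", False),
    ThemeScore("Secure", "Tactical", True),
]
```

With the sample scorecard, `gating_themes(SAMPLE_SCORES)` selects Secure: three themes are Tactical, but only Secure is also blocking an outcome.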

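How to do it well. Write and run the backlog with these rules: fund the epics that move the gating theme first, sequence by dependency rather than excitement, trace every epic to a theme and a phase change, and write each definition of done as a re-assessable maturity change with a named owner and sponsor.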

A representative epic backlog, in roughly the order dependency forces, with the operating-model and foundation linkage made explicit:

| # | Epic | Theme(s) | Target phase | Depends on | Google Cloud building blocks |
|---|---|---|---|---|---|
| 1 | Charter the operating model: CCoE, steering, decision rights | Lead, Learn | Tactical → Strategic | | CCoE charter, RACI, OKRs, executive sponsorship |
| 2 | Stand up resource hierarchy & central identity | Secure, Scale | Tactical → Strategic | 1 | Organization, folders, projects, Cloud IAM, Cloud Identity, Groups |
| 3 | Deploy the landing zone foundation | Scale, Secure | Tactical → Strategic | 2 | Enterprise foundations blueprint, Cloud Foundation Toolkit, Terraform, Config Controller |
| 4 | Implement org-wide guardrails | Secure | Tactical → Strategic | 2, 3 | Organization Policy Service, Security Command Center, VPC Service Controls, Cloud KMS |
| 5 | Build shared networking & hybrid connectivity | Scale, Secure | Tactical → Strategic | 3 | Shared VPC, Cloud Interconnect/HA VPN, Cloud NAT, Cloud DNS |
| 6 | Ship self-service environment vending | Scale, Lead | Strategic → Transformational | 3, 4, 5 | Project factory, Terraform modules, Cloud Build/CI-CD, Service Catalog |
| 7 | Run the first migration wave | Scale | Tactical → Strategic | 3, 4, 5 | Migration Center, Migrate to Virtual Machines, Database Migration Service |
| 8 | Establish FinOps & cost governance | Lead, Scale | Tactical → Strategic | 2, 3 | Cloud Billing, BigQuery billing export, Budgets & alerts, Active Assist Recommenders |
| 9 | Stand up the analytics & AI platform | Scale | Strategic → Transformational | 5, 6, 7 | BigQuery, Dataplex, Looker, Vertex AI |
| 10 | Run a continuous upskilling & certification program | Learn | Tactical → Strategic | 1 | Google Cloud Skills Boost, certification paths, partner enablement |

Artifacts, decisions, and tooling. The artifacts are the epic backlog (each epic carrying the five mandatory attributes), the dependency graph that fixes the sequence, the iteration/roadmap plan (which epics this quarter, gated by dependency and gating theme), and the definition-of-done register that ties each completion to a re-assessable maturity change. The decision the backlog drives is the entire sequenced program; the most consequential single decision is placing the foundational epics (1–5 above) ahead of the headline epics (analytics, AI, self-service) because every later epic stands on them. The tooling is the program-management system of record plus the maturity scorecard versioned over time, so each “done” epic can be evidenced as a phase movement rather than an activity.
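The dependency graph can be derived rather than drawn by hand. Here is a small sketch that feeds the "Depends on" column of the representative backlog into Python's standard-library `graphlib` and groups the epics into waves that could run in parallel. The `waves` helper is illustrative; a real program would also weigh funding and team capacity, which this ignores.

```python
from graphlib import TopologicalSorter

# Epic number -> set of epic numbers it depends on (from the backlog table).
DEPENDS_ON: dict[int, set[int]] = {
    1: set(),
    2: {1},
    3: {2},
    4: {2, 3},
    5: {3},
    6: {3, 4, 5},
    7: {3, 4, 5},
    8: {2, 3},
    9: {5, 6, 7},
    10: {1},
}


def waves(depends_on: dict[int, set[int]]) -> list[list[int]]:
    """Group epics into waves; every epic in a wave has all dependencies done."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the backlog has a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


print(waves(DEPENDS_ON))
# [[1], [2, 10], [3], [4, 5, 8], [6, 7], [9]]
```

The output makes the sequencing argument concrete: the analytics and AI platform (epic 9) cannot start until five earlier waves have landed, however well it demos.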

How the framework connects to the landing zone and foundation setup

What it is. This is the hand-off the rest of the framework exists to produce: the point where the operating model and the foundational epics become a deployed Google Cloud landing zone — a secure, scalable, well-governed environment built from Google’s enterprise foundations blueprint and instantiated with the Cloud Foundation Toolkit and Terraform. The CAF tells you whether you are ready and what to build first; the landing zone is what gets built. Concretely, the foundational epics (resource hierarchy, identity, guardrails, networking) are not abstract — each one maps onto a specific layer of Google’s reference foundation, so the epic “deploy the landing zone foundation” is literally “run the enterprise foundations blueprint, parameterized to our operating-model decisions.” This section is where Google’s CAF (the readiness framework) shakes hands with the Google Cloud Architecture Framework (the design framework) and the enterprise foundations blueprint (the reference implementation).

Why it matters. The most expensive mistake in a Google Cloud program is building the foundation by hand, inconsistently, before the operating model is decided — because the foundation is the one thing you cannot cheaply re-do later. Get the resource hierarchy or the network topology wrong and you are re-parenting projects and re-IP-ing VPCs a year in, with production on top. The CAF prevents this by forcing the operating-model decisions (team topology, decision rights, isolation strategy, network model, FinOps) to be made before the foundation is laid, so the landing zone encodes the right answers from day one. Equally, the enterprise foundations blueprint prevents the opposite failure — re-deriving from scratch what Google has already distilled into an opinionated, auditable, Terraform-delivered reference that bakes in the security and reliability the Architecture Framework demands. The connection runs both ways: the CAF assessment parameterizes the blueprint (your org structure, your perimeter, your folders), and the blueprint operationalizes the foundational epics (it is the deliverable for epics 2–5 above).

How to do it well. Treat the landing zone as the materialization of your operating model, layer by layer, delivered as code.

How the foundational epics map onto the enterprise foundations blueprint layers and the tooling:

| Foundational epic | Blueprint layer it deploys | Operating-model decision it encodes | CFT / GCP mechanism |
|---|---|---|---|
| Resource hierarchy & identity | Organization layer | Team topology, isolation strategy, IAM inheritance | Folders, projects, Cloud IAM, Cloud Identity Groups, custom roles |
| Org-wide guardrails | Organization layer (policies) | What is mandated vs delegated | Organization Policy Service, Security Command Center, Cloud KMS |
| Landing zone foundation | Bootstrap + environments layers | dev/non-prod/prod separation, foundation CI/CD ownership | Cloud Foundation Toolkit, Terraform modules, Config Controller |
| Shared networking & connectivity | Networking layer | Centralized vs delegated networking, hybrid model | Shared VPC, Cloud Interconnect/HA VPN, Cloud NAT, Cloud DNS, VPC Service Controls |
| Self-service vending | Project/app layer | Paved-road vs ticket-driven provisioning | Project factory, Service Catalog, Cloud Build pipelines |
| FinOps & cost governance | Cross-cutting | Cost centre vs chargeback, attribution model | Cloud Billing, BigQuery billing export, Budgets & alerts, labels |

Artifacts, decisions, and tooling. The artifacts are the landing-zone design document (the blueprint parameterized to your operating-model decisions, layer by layer), the foundation Terraform repository (bootstrap, org, environments, networking, projects — with CI/CD), the org-policy and IAM baseline as code, the network topology design (Shared VPC, address plan, connectivity), and the security/observability baseline (SCC, VPC-SC perimeters, logging-sink architecture, CMEK policy). The headline decision is the isolation and network model — single vs multiple host projects, the dev/non-prod/prod folder split, and the hybrid-connectivity topology — because it is the most expensive thing to change after the fact. The tooling is the enterprise foundations blueprint as the reference, the Cloud Foundation Toolkit / Terraform Google modules (or Fabric FAST, or Config Controller) as the delivery engine, and the Google Cloud Architecture Framework as the acceptance standard the deployed foundation must satisfy.

Real-world enterprise scenario

Northwind Pathology is a fictional but realistic clinical-diagnostics network: ~7,500 staff across 40 lab sites in the UK and Ireland, a regulated estate (UK GDPR, NHS DSP Toolkit, ISO 27001), one datacenter on a lease expiring in 16 months, and a board mandate to consolidate a sprawl of lab-information-system data into a single analytics platform that can power AI-assisted reporting. Parts 1–5 of their CAF journey left them with a clear, lumpy scorecard — Strategic on Scale (a strong engineering group already running GKE), but Tactical on Lead, Secure, and Learn — and a clear gating theme: Secure is blocking everything, because clinical data cannot move until guardrails and a perimeter exist. Their principal architect, Aoife, runs part 6 to turn that into an operating model, a backlog, and a deployed foundation.

Designing the operating model. Northwind chooses the platform + paved-road archetype with a federated twist: a central Platform team (8 engineers, the home of the CCoE) owns the landing zone, guardrails, Shared VPC, and FinOps; each of three clinical business units gets a stream-aligned squad that owns its workloads inside the guardrails. Decision rights are written down explicitly: the Platform team mandates the org-policy set, the IAM baseline, the network perimeter, and CMEK; recommends managed-service choices; and delegates application architecture inside a project. The model is encoded in the hierarchy — an Organization with folders for bootstrap, common (logging, security, network host projects), and dev / non-prod / prod, each with a per-BU sub-folder — and IAM is granted to Cloud Identity Groups at the folder level, never to individuals. FinOps is decided as showback first, chargeback by year two, attributed via a mandated cost-centre / bu / env label taxonomy and BigQuery billing export. The artifacts: an operating-model design doc, a decision-rights RACI, the resource-hierarchy design, and the FinOps model.
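Northwind's hierarchy and label taxonomy are regular enough to generate and check in code. The sketch below assumes the per-BU sub-folders sit under the three environment folders only, and uses placeholder business-unit names (`bu-a`, `bu-b`, `bu-c`) because the scenario does not name them. The helper functions are hypothetical.

```python
ENVIRONMENTS = ("dev", "non-prod", "prod")
BUSINESS_UNITS = ("bu-a", "bu-b", "bu-c")  # placeholder names, not from the scenario
REQUIRED_LABELS = ("cost-centre", "bu", "env")  # the mandated label taxonomy


def folder_paths() -> list[str]:
    """List the folder tree: bootstrap, common, then each env with per-BU sub-folders."""
    paths = ["bootstrap", "common"]
    for env in ENVIRONMENTS:
        paths.append(env)
        paths.extend(f"{env}/{bu}" for bu in BUSINESS_UNITS)
    return paths


def missing_labels(labels: dict[str, str]) -> list[str]:
    """Return the mandated label keys that are absent or empty on a project."""
    return [key for key in REQUIRED_LABELS if not labels.get(key)]
```

A project factory could call something like `missing_labels` before creating a project, so that an unlabelled project never reaches the billing export in the first place.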

The epic backlog. Aoife sequences the backlog by dependency, foundational epics first, with Secure (the gating theme) front-loaded:

| # | Epic | Theme(s) | Target phase | Definition of done (re-assessable) |
|---|---|---|---|---|
| 1 | Charter Platform team / CCoE + steering | Lead, Learn | Tactical → Strategic | Funded CCoE, quarterly steering chaired by COO, RACI signed |
| 2 | Resource hierarchy & central identity | Secure, Scale | Tactical → Strategic | Org, folders, Groups-based IAM live; no individual Owner grants |
| 3 | Landing zone foundation (blueprint) | Scale, Secure | Tactical → Strategic | Enterprise foundations blueprint deployed via CFT/Terraform, in CI/CD |
| 4 | Org-wide guardrails + data perimeter | Secure | Tactical → Strategic | Org Policies enforced, SCC Premium on, VPC-SC perimeter around clinical data |
| 5 | Shared networking & hybrid connectivity | Scale, Secure | Tactical → Strategic | Shared VPC + HA VPN/Interconnect to the datacenter live |
| 6 | FinOps & cost governance | Lead, Scale | Tactical → Strategic | Label taxonomy enforced, billing export + budgets, showback live |
| 7 | First migration wave + DC-exit plan | Scale | Tactical → Strategic | First LIS workloads migrated; dated datacenter-exit plan |
| 8 | Self-service environment vending | Scale, Lead | Strategic → Transformational | Project factory + Service Catalog; squads self-provision compliant projects |
| 9 | Clinical analytics & AI platform | Scale | Strategic → Transformational | BigQuery + Dataplex + Vertex AI platform live inside the perimeter |

Connecting to the landing zone and foundation. Epic 3 is delivered as the enterprise foundations blueprint parameterized to Northwind’s decisions and instantiated with the Cloud Foundation Toolkit Terraform modules: the bootstrap layer holds org-level state in a seed project with its own CI/CD; the organization layer applies the org-policy set (disable default network, restrict external IPs, restrict service-account key creation, enforce CMEK, restrict resource locations to europe-west2/europe-west1 for data residency) and the Groups-based IAM baseline; the environments layer lays down the dev/non-prod/prod folders with per-BU sub-folders; and the networking layer (epic 5) deploys Shared VPC host projects in common, HA VPN then Cloud Interconnect to the datacenter, Cloud NAT, Cloud DNS, and the VPC Service Controls perimeter (epic 4) that rings the clinical-data projects so BigQuery and Cloud Storage cannot be reached from outside the perimeter. Security Command Center Premium is enabled at the org, Cloud KMS CMEK is mandated for clinical data, and centralized Cloud Logging sinks land in a logging project in common. Every one of these is a parameter traceable back to a row in the decision-rights matrix. Once the foundation is live, epic 8 adds the project factory and Service Catalog on top, so the three BU squads provision compliant projects by button — the paved road closing the loop back to the operating model.
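The org-policy set above can be held as a baseline and compared against what is actually configured. This is a sketch only: the constraint names are the Organization Policy constraints these guardrails most plausibly map to and should be verified against the current constraints reference, and the shape of the `observed` dictionary is a hypothetical export format, not the output of any specific tool.

```python
# Assumed mapping from the guardrails in the text to constraint names; verify before use.
REQUIRED_CONSTRAINTS = {
    "constraints/compute.skipDefaultNetworkCreation",    # disable default network
    "constraints/compute.vmExternalIpAccess",            # restrict external IPs
    "constraints/iam.disableServiceAccountKeyCreation",  # restrict SA key creation
    "constraints/gcp.restrictNonCmekServices",           # enforce CMEK
    "constraints/gcp.resourceLocations",                 # restrict resource locations
}
ALLOWED_LOCATIONS = {"in:europe-west2-locations", "in:europe-west1-locations"}


def drift(observed: dict[str, dict]) -> list[str]:
    """Compare observed org policies (hypothetical export format) to the baseline."""
    findings = [
        f"missing: {name}"
        for name in sorted(REQUIRED_CONSTRAINTS)
        if name not in observed
    ]
    locations = observed.get("constraints/gcp.resourceLocations", {})
    extra = set(locations.get("allowed_values", [])) - ALLOWED_LOCATIONS
    findings.extend(f"unexpected location: {value}" for value in sorted(extra))
    return findings
```

An empty list from `drift` is the kind of evidence an auditor can check; a non-empty one shows exactly where a mandated guardrail is missing or has been loosened.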

The measurable outcome. Thirteen months later Northwind re-assesses. Secure reaches Strategic: org-wide guardrails enforced and evidenced in Security Command Center, a VPC-SC perimeter around all clinical data, CMEK on every regulated bucket, and a clean NHS DSP Toolkit submission citing the foundation as evidence. Lead reaches Strategic (funded Platform team/CCoE, COO-chaired steering, signed RACI). Scale, which was already Strategic on engineering strength, now holds that phase on a governed foundation: the landing zone is live, the first two LIS workloads are migrated, the datacenter exit is on track for the lease deadline, and 74% of new projects are provisioned through the project factory rather than by hand. The clinical analytics platform on BigQuery + Dataplex + Vertex AI goes live inside the perimeter and begins powering AI-assisted draft reporting, a Transformational capability on the Scale theme that the board is now funding as a product line. The scorecard that started spiky, with three themes at Tactical, is more balanced, with Secure and Lead a phase higher. The part that matters to an auditor is that every control is evidenced in a tool, not asserted in a slide.

Deliverables & checklist

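By the end of the Operating Model & Epics phase you should have produced the artifacts named in each section above: the operating-model design document, the RACI / decision-rights matrix, the resource-hierarchy design, and the FinOps model; the epic backlog, its dependency graph, the iteration/roadmap plan, and the definition-of-done register; and the landing-zone design document, the foundation Terraform repository, the org-policy and IAM baseline as code, the network topology design, and the security/observability baseline.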

Common pitfalls

  1. Laying the foundation before deciding the operating model. Hand-building the resource hierarchy and network “to get moving,” then discovering the team topology or isolation strategy demands a different shape — and re-parenting projects with production on top. Avoid it by making the operating-model decisions (topology, decision rights, isolation, network, FinOps) explicit before the foundation, and encoding each as a foundation parameter.
  2. The central-IT bottleneck. A platform team that insists on hand-building every project and approving every firewall rule throttles adoption until product teams route around it into shadow IT. Avoid it by shipping self-service paved-road assets (project factory, Service Catalog, golden Terraform) so the compliant path is the fast path, and measuring paved-road adoption %.
  3. Epics whose “done” is an activity, not a maturity change. “Ran a security workshop” is an activity; “moved Secure to Strategic, evidenced in Security Command Center” is an outcome. Avoid it by writing every epic’s definition of done as a re-assessable phase change tied to an owner and sponsor.
  4. Sequencing by excitement instead of dependency. Launching the analytics/AI or self-service epics before the resource hierarchy, identity, and guardrails exist builds a penthouse with no building under it. Avoid it by drawing the dependency graph and front-loading the foundational epics, even though they demo poorly.
  5. Re-deriving the foundation from scratch. Writing bespoke Terraform that re-invents what the enterprise foundations blueprint already distils — and missing the security and reliability defaults Google has baked in. Avoid it by starting from the blueprint and the Cloud Foundation Toolkit (or Fabric FAST), parameterized to your decisions, and using the Architecture Framework as the acceptance standard.
  6. An operating model that lives in a wiki, not in IAM and Org Policy. A decision-rights document nobody enforces is fiction; if “mandated” guardrails are not actually set as Organization Policy constraints, they are aspirations. Avoid it by encoding the model in the hierarchy, IAM (Groups at folders), and Organization Policy — so the model is the configuration, and drift is detectable.
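Pitfall 2 recommends measuring paved-road adoption as a percentage. A minimal sketch of that metric follows, assuming each project record carries a `provisioned_by` field that distinguishes factory-created projects from hand-built ones; both the field name and the `project-factory` value are hypothetical.

```python
def paved_road_adoption(projects: list[dict[str, str]]) -> float:
    """Percentage of projects created through the project factory, to one decimal."""
    if not projects:
        return 0.0
    via_factory = sum(
        1 for project in projects if project.get("provisioned_by") == "project-factory"
    )
    return round(100 * via_factory / len(projects), 1)
```

Tracked per quarter, a number like this shows whether the compliant path is actually the fast path, or whether teams are still going around the platform team.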

What’s next

With the operating model designed, the epic backlog sequenced, and the landing zone connected to the enterprise foundations blueprint, part 7 of the Google Cloud Adoption Framework series turns to running the program in steady state — governance, FinOps, and continuous re-assessment that keep every theme advancing after the foundation is live.


// part 6 of 6 · Google Cloud Adoption Framework
