A national health insurer’s platform team is drowning, and the ticket queue proves it. Every new project — a claims-fraud model, a member-portal refresh, a partner integration — starts the same way: a developer files a request for “a cloud account to build in,” and then waits. Eleven days on average, because the request bounces between a cloud engineer who hand-builds the subscription, a network team that wires the connectivity, a security reviewer who checks the guardrails, and a finance analyst who tags it for the right cost centre. Half the wait is queue time, not work time. Meanwhile the developers who got tired of waiting last year stood up their own accounts on a corporate credit card, and the security team is now finding regulated member data sitting in untagged, unmonitored AWS accounts nobody approved — exactly the HIPAA-scope nightmare a health insurer cannot have. The CIO’s mandate is blunt: “Make getting a compliant account a self-service request that takes minutes, not a fortnight, and make the shadow accounts pointless because the paved road is faster.”
This article is the reference architecture for that paved road: a self-service landing-zone vending machine where the request, the approvals, and the audit trail live in ServiceNow, and the actual infrastructure is built by Terraform — with policy guardrails and a change gate standing between “I want an account” and “the account exists.” It deliberately spans both Azure and AWS, because this insurer, like most enterprises of its size, is multi-cloud and refuses to build the same governance twice.
The pressures that shape this build
Four forces stack up, and naming them keeps the design honest.
Compliance. A HIPAA-regulated insurer must prove that every environment holding member data was provisioned with the right controls — encryption, logging, network isolation — and that a human with authority approved it. “Trust me, the script does it” does not survive an audit; you need an immutable record linking who asked, who approved, what was built, and with which guardrails.
Speed. The whole point is to collapse eleven days to under an hour of elapsed time for a standard request. If self-service is slower or more painful than the credit-card shortcut, shadow IT wins and the project fails.
Consistency. Every vended account must land in a known-good state — baseline network, identity wiring, logging, tags — so that account #400 is configured identically to account #4. Hand-building guarantees drift; Terraform modules guarantee sameness.
Cost accountability. Finance needs every account stamped at birth with a cost centre and an owner so spend is attributable from day one, not reconstructed painfully later. Untagged accounts are how a cloud bill becomes a mystery.
The pattern that satisfies all four is policy-as-a-service vending: a self-service front door (ServiceNow) that captures intent and approvals, a deterministic builder (Terraform) that turns intent into infrastructure, and a guardrail layer that refuses to build anything non-compliant. The human approves the request; the machine guarantees the configuration.
Why not the obvious shortcuts
Three tempting shortcuts each fail in a way someone on the project will have to learn the hard way.
“Just give developers a Terraform repo and let them self-serve via pull request.” This works for a platform team of ten, not an enterprise of thousands. There is no business approval, no cost-centre validation, no separation of duties between the requester and the approver, and no place a non-technical compliance officer can see or sign the request. Audit asks “who authorised this account?” and the answer is “whoever had merge rights,” which is not an answer.
“Let ServiceNow run the cloud SDK calls directly with its own scripts.” ServiceNow is superb at workflow and terrible at being a deployment engine. You would reimplement state management, drift detection, and idempotency that Terraform already does well, and you would scatter cloud credentials with broad provisioning rights inside the ITSM platform — a juicy target and a separation-of-duties violation.
“Build a custom internal portal that calls cloud APIs.” Now you own a bespoke web app, its identity, its audit logging, and its on-call — reinventing two products (ITSM + IaC) the enterprise already owns and staffs. The maintenance burden outlives the team that built it.
The durable pattern keeps each tool in its lane: ServiceNow owns the conversation and the governance; Terraform owns the infrastructure; a webhook is the seam between them. Neither holds the other’s responsibilities, and neither holds standing credentials it should not.
Architecture overview
The system has a clear front-to-back flow. A requester fills out a catalog item; ServiceNow Flow Designer validates and routes it through approvals; on approval, a webhook triggers a Terraform Cloud run that builds the landing zone in Azure or AWS; policy checks gate the apply; and the resulting account, its attributes, and its relationships are written back into the ServiceNow CMDB as the system of record. Identity is Okta-federated throughout, secrets come from HashiCorp Vault, and Wiz continuously verifies that what got built stays compliant.
Follow the control flow end to end:
- Request. A developer opens the ServiceNow Service Catalog and selects “New Cloud Landing Zone.” A record producer form captures the inputs that matter: target cloud (Azure or AWS), environment (dev/test/prod), application name, cost centre, data classification (this drives the guardrail tier, so a HIPAA-scoped account gets stricter controls), and the requesting team. Okta SSO means the requester’s identity is already trusted, and SCIM-provisioned group membership pre-fills their team and entitlements so they cannot request on behalf of a team they do not belong to.
- Workflow & validation. Flow Designer picks up the request. It validates the cost centre against the finance system, checks that the requested environment is one the team is entitled to, and computes the guardrail tier from the data-classification field. This is where business logic lives, not in Terraform.
- Approval gates. The flow routes for approval based on risk. A dev sandbox for non-sensitive data might need only the team lead. A production, HIPAA-scoped account triggers a multi-stage approval: team lead, then the cloud platform owner, then security. Each approval is a timestamped record. Critically, a Change Request is auto-created and linked, so the provisioning is a governed change with its own CAB-visible record, which is the separation of duties an auditor wants to see.
- Trigger. On final approval, Flow Designer calls Terraform Cloud through its API via an outbound REST step: a webhook carrying the validated parameters as a run payload. ServiceNow does not build anything; it asks Terraform to. The request’s sys_id travels with the payload so the two systems can correlate the run back to the ticket.
- Plan. Terraform Cloud kicks off a run against the right landing-zone module, a versioned, peer-reviewed module that encodes the organisation’s standard: for Azure, a subscription vended under the right management group with baseline VNet, Defender, diagnostic settings, and tags; for AWS, an account vended via AWS Organizations / Control Tower Account Factory with a baseline VPC, GuardDuty, CloudTrail, and tags. The module fetches the cloud provisioning credentials at run time from HashiCorp Vault (dynamic, short-lived) rather than holding a static key.
- Policy gate. Before any apply, the plan is evaluated by policy-as-code (Terraform Cloud Sentinel or OPA) enforcing hard rules: mandatory encryption, no public network exposure on the baseline, required tags present, naming conventions, region allow-lists. A plan that violates a hard policy is blocked, not warned. Wiz Code additionally scans the IaC for misconfigurations on the same gate. Only a clean plan proceeds.
- Apply & post-config. Terraform applies, creating the account/subscription and its baseline. Post-apply, automation wires the account into the estate: enrols it in centralised logging, attaches the SCP/Azure Policy set for its tier, registers it with CrowdStrike Falcon for runtime protection and Wiz for continuous posture, and (for AWS) maps Okta groups to the right IAM Identity Center permission sets so the team can actually log in.
- CMDB write-back. Terraform Cloud’s run completion fires a webhook back to ServiceNow. A Flow updates the original request to “Fulfilled” and writes the new account into the CMDB: a CI for the subscription/account, its cost centre, owner, environment, guardrail tier, and relationships to the application and the requesting team. The CMDB is now the authoritative inventory of every cloud account and why it exists.
- Closure. The request closes, the requester is notified with their access details, and the linked Change Request moves to closed-complete. Elapsed time for a standard request: minutes, gated only by how fast humans approve.
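The validation and approval-routing steps in this flow can be sketched in a few lines. This is a minimal illustration in Python rather than real Flow Designer logic; the classification values, tier names, and approver role names are assumptions made for the example, and the routing rule is one reading of "multi-stage for prod and regulated data".

```python
from typing import List

# Hypothetical data-classification values that place a request in the regulated tier.
REGULATED_CLASSES = {"phi", "hipaa"}


def guardrail_tier(data_class: str) -> str:
    """Derive the guardrail tier from the data-classification field on the form."""
    return "hipaa" if data_class.strip().lower() in REGULATED_CLASSES else "standard"


def approval_chain(env: str, tier: str) -> List[str]:
    """Return the ordered approver roles for a request, based on its risk."""
    # Assumption: anything production or regulated gets the full multi-stage chain.
    if env == "prod" or tier == "hipaa":
        return ["team_lead", "cloud_platform_owner", "security"]
    # Low-risk sandboxes need only the team lead.
    return ["team_lead"]
```

Keeping this logic in the workflow layer, as the article argues, means the Terraform module only ever receives an already-computed tier.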
Component breakdown
| Layer | Tool | Role here | Key configuration choices |
|---|---|---|---|
| Self-service front door | ServiceNow Service Catalog | Captures request intent via a record producer form | Variables for cloud, env, cost centre, data class; entitlement-driven visibility |
| Orchestration | ServiceNow Flow Designer | Validates, computes guardrail tier, routes approvals, calls Terraform | Outbound REST to TFC API; sub-flows per cloud; sys_id correlation |
| Approval & change | ServiceNow Change Management | Governed change record + multi-stage approvals per risk | Auto-create Change linked to request; CAB gate for prod/HIPAA |
| System of record | ServiceNow CMDB | Authoritative inventory of vended accounts and relationships | CI class per account; relationship to app, team, cost centre |
| IaC engine | Terraform Cloud | Plans and applies landing-zone modules; holds state | Per-environment workspaces; VCS-backed modules; remote state |
| Reusable build | Terraform landing-zone modules | Encodes the org standard for Azure subs / AWS accounts | Versioned, peer-reviewed; baseline net/logging/tags/policy |
| Policy gate | Sentinel / OPA + Wiz Code | Blocks non-compliant plans before apply | Hard-mandatory rules; IaC misconfig scan on PR and run |
| Secrets | HashiCorp Vault | Issues short-lived cloud provisioning creds to TFC runs | AWS/Azure secrets engines; dynamic leases; no static keys |
| Identity | Okta | SSO to ServiceNow; SCIM groups; AWS/Azure access post-vend | OIDC to ServiceNow; group-to-permission-set mapping |
| Posture & runtime | Wiz + CrowdStrike Falcon | Continuous compliance and runtime protection on vended accounts | Auto-onboard new account; drift alerts back to ServiceNow |
| Observability | Datadog | Provisioning-pipeline health, run durations, failure rates | TFC + ServiceNow event ingestion; SLO on time-to-vend |
A few choices carry the design and deserve the why.
Why ServiceNow owns approvals and the CMDB, not a custom app. The insurer already runs ServiceNow for ITSM, already trains staff on it, and already has CAB and audit processes wired into it. Reusing it means the change record, the approval chain, and the inventory live where auditors already look — and where compliance officers who will never touch Terraform can read and sign a request. The CMDB write-back is the part teams skip and regret: without it, you have automated provisioning but still cannot answer “list every prod account holding member data and who owns it” without a spreadsheet someone forgot to update.
Why Terraform Cloud, not a self-hosted runner doing terraform apply. Terraform Cloud gives you managed remote state with locking (no two requests corrupting the same workspace), a policy-check stage native to the run lifecycle, an API built for exactly this webhook-triggered pattern, and a run log that is itself an audit artifact. You can do this with self-hosted Terraform plus a CI runner; you then own state storage, locking, secret injection, and the policy harness yourself. For a regulated estate, the managed run lifecycle and the built-in Sentinel/OPA gate are worth it.
Why the cloud credentials come from Vault per run, not a stored key in Terraform Cloud. The provisioning identity is powerful — it can create accounts and subscriptions. A static long-lived key for that identity, sitting in a CI variable, is the single most dangerous secret in the whole system. Instead the Terraform run authenticates to Vault and Vault issues a short-lived, dynamically generated AWS/Azure credential scoped to provisioning, which expires minutes after the run. Steal the run log and you get nothing usable.
Implementation guidance
Start with the landing-zone modules, because everything downstream depends on them being right. Build and peer-review one Terraform module per cloud that encodes your standard. A trimmed Azure subscription-vending shape communicates the intent:
```hcl
module "landing_zone" {
  source = "git::https://internal/modules/azure-landing-zone.git?ref=v3.2.0"

  subscription_alias = "lz-${var.app_name}-${var.env}"
  management_group   = var.guardrail_tier == "hipaa" ? "mg-regulated" : "mg-standard"
  cost_center        = var.cost_center # tag, enforced by policy
  owner              = var.owner_email
  data_class         = var.data_class
  baseline_network   = true # hub-peered VNet, no public ingress by default
  enable_defender    = true
  diagnostic_to_law  = var.central_log_workspace_id

  required_tags = {
    CostCenter    = var.cost_center
    Environment   = var.env
    Owner         = var.owner_email
    DataClass     = var.data_class
    ProvisionedBy = "servicenow" # provenance: built by the paved road
  }
}
```
Pin the module by version tag (v3.2.0, never a floating branch) so a change to the standard is a deliberate, reviewed promotion — not a surprise that alters every future account silently. The AWS equivalent wraps Control Tower Account Factory with the same tag and guardrail contract.
Wire the ServiceNow-to-Terraform seam carefully. The outbound REST step in Flow Designer posts a run-trigger to the Terraform Cloud API with the validated variables. Carry the ServiceNow request sys_id as a run variable so both sides can correlate. Make the call idempotent: if an approval fires twice, you must not vend two accounts. Guard with a state flag on the request (“Provisioning In Progress”) that the flow checks before triggering, and have Terraform’s workspace keyed to the request so a duplicate run targets the same state rather than building a second account.
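A rough sketch of that seam, written in Python for readability rather than as a Flow Designer action. The payload shape follows the JSON:API style used by the Terraform Cloud runs endpoint as best understood here, so verify it against the current API reference before relying on it. The function names, the `lz-` workspace prefix, the state values, and the `servicenow_sys_id` variable key are all hypothetical.

```python
import json
from typing import Any, Dict, Optional


def workspace_name(sys_id: str) -> str:
    """Key the workspace to the request so a duplicate run reuses the same state."""
    return f"lz-{sys_id}"


def build_run_payload(workspace_id: str, sys_id: str,
                      variables: Dict[str, str]) -> Dict[str, Any]:
    """Build the request body for creating a run (shape assumed, not verified)."""
    run_vars = dict(variables, servicenow_sys_id=sys_id)  # correlation key
    return {
        "data": {
            "type": "runs",
            "attributes": {
                "message": f"ServiceNow request {sys_id}",
                # Each value is encoded as a quoted string literal.
                "variables": [
                    {"key": key, "value": json.dumps(value)}
                    for key, value in sorted(run_vars.items())
                ],
            },
            "relationships": {
                "workspace": {"data": {"type": "workspaces", "id": workspace_id}}
            },
        }
    }


def trigger_once(request: Dict[str, Any], workspace_id: str) -> Optional[Dict[str, Any]]:
    """Return a payload only for the first approval; later triggers are a no-op."""
    if request.get("state") != "approved":
        return None
    request["state"] = "provisioning_in_progress"  # the guard flag the flow checks
    return build_run_payload(workspace_id, request["sys_id"], request["variables"])
```

In a real flow the state check and the state update would need to be atomic on the ServiceNow record; the in-memory dictionary here only shows the order of operations.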
Make the policy gate fail closed. Configure Sentinel/OPA policies as hard-mandatory for the rules that matter — encryption on, no public baseline ingress, required tags present, region in the allow-list, naming convention. A soft-mandatory or advisory policy is a suggestion an under-pressure operator overrides; a hard-mandatory policy is a wall. Run Wiz Code on the same gate to catch IaC misconfigurations the policy set does not explicitly encode. The contract is simple and worth stating to stakeholders: a non-compliant landing zone cannot be built, even by an admin in a hurry.
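To make "fail closed" concrete, here is a small Python model of the gate's logic. It is not Sentinel or Rego, and the flattened plan dictionary is a simplification invented for the example (a real policy would read the Terraform plan representation). The region allow-list values are placeholders; the tag names come from the module call shown earlier.

```python
from typing import Any, Dict, List

ALLOWED_REGIONS = {"eastus2", "us-east-1"}  # hypothetical allow-list
REQUIRED_TAGS = {"CostCenter", "Environment", "Owner", "DataClass", "ProvisionedBy"}


def violations(plan: Dict[str, Any]) -> List[str]:
    """List every hard-rule violation; a missing field counts as a violation."""
    found = []
    if not plan.get("encryption_enabled", False):
        found.append("encryption must be enabled")
    if plan.get("public_ingress", True):  # absent means "assume exposed"
        found.append("public ingress is not allowed on the baseline")
    missing = REQUIRED_TAGS - set(plan.get("tags", {}))
    if missing:
        found.append("missing required tags: " + ", ".join(sorted(missing)))
    if plan.get("region") not in ALLOWED_REGIONS:
        found.append(f"region {plan.get('region')!r} is not in the allow-list")
    return found


def gate(plan: Dict[str, Any]) -> bool:
    """Allow the apply only when there are zero violations."""
    return not violations(plan)
```

The defaults are the point: an empty or malformed plan fails every check instead of slipping through, which is the behaviour a hard-mandatory policy is meant to give.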
Close the loop with write-back, or you have only automated half the problem. The Terraform Cloud run-completion webhook must hit a ServiceNow inbound endpoint that updates the request and creates/updates the CMDB CI. Include enough in the run output for the CMDB record to be complete: account/subscription ID, cost centre, owner, environment, guardrail tier, and the application it belongs to. Establish the relationships (account → application → team → cost centre) so the CMDB can answer real questions later.
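One way to picture the write-back handler is as a function that refuses incomplete run outputs and then emits the CI plus its relationships. This Python sketch assumes hypothetical output names and relationship labels; the actual CMDB classes and relationship types would come from the insurer's ServiceNow data model.

```python
from typing import Dict, List, Tuple

# Fields the article says the CMDB record needs; the key names are assumptions.
REQUIRED_OUTPUTS = (
    "account_id", "cost_center", "owner",
    "environment", "guardrail_tier", "application",
)


def build_ci(outputs: Dict[str, str]) -> Dict[str, str]:
    """Turn Terraform run outputs into a CI record, rejecting incomplete data."""
    missing = [key for key in REQUIRED_OUTPUTS if not outputs.get(key)]
    if missing:
        raise ValueError("run outputs incomplete, missing: " + ", ".join(missing))
    return {key: outputs[key] for key in REQUIRED_OUTPUTS}


def build_relationships(ci: Dict[str, str], team: str) -> List[Tuple[str, str, str]]:
    """Express account -> application -> team -> cost centre as (from, type, to)."""
    return [
        (ci["account_id"], "belongs_to", ci["application"]),
        (ci["application"], "owned_by", team),
        (team, "charged_to", ci["cost_center"]),
    ]
```

Rejecting a partial payload loudly is preferable to writing a half-filled CI, because a CI with no owner or cost centre recreates the inventory gap the write-back exists to close.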
Enterprise considerations
Security and separation of duties. The architecture is designed so that no single actor can both request and silently fulfil a privileged environment. The requester opens the catalog item; a different role approves; the machine builds within guardrails it cannot disable; and Wiz independently verifies the result. The provisioning credential is short-lived and Vault-issued, so even a compromised Terraform run cannot persist access. Okta governs who can reach the catalog and, post-vend, maps groups to least-privilege permission sets (AWS IAM Identity Center) or RBAC (Azure) so the team that requested the account gets exactly the access for it — and deprovisioning a leaver in Okta removes it estate-wide. CrowdStrike Falcon lands on the workloads in every vended account for runtime threat detection feeding the SOC. And because every account is born with logging, posture scanning, and a tagged owner, the shadow-IT accounts that started this project become strictly worse than the paved road — which is how you actually kill shadow IT.
Cost accountability. Every account is stamped at creation with cost centre and owner as policy-enforced tags, so spend is attributable from the first hour. Feed the account-to-cost-centre map from the CMDB into your FinOps showback, and the after-the-fact exercise of reconstructing who owns which spend disappears. The guardrail tier can also pin a region allow-list and even a baseline budget alert, so a runaway dev account pages its owner instead of surprising finance at month-end.
| Concern | Without the platform | With ServiceNow-gated vending |
|---|---|---|
| Time to a compliant account | ~11 days, queue-bound | Minutes, approval-bound |
| Audit trail | Scattered tickets and tribal memory | One linked Change + CMDB CI |
| Configuration consistency | Hand-built, drifts | Versioned module, identical |
| Cost attribution | Reconstructed later, often untagged | Tagged at birth from the form |
| Shadow IT incentive | Faster than the official path | Slower and unmonitored vs paved road |
Scaling. The bottleneck at scale is rarely Terraform — it is human approvals and the cloud provider’s account-creation limits. Auto-approve low-risk tiers (a non-sensitive dev sandbox) entirely so humans only review what genuinely needs judgement, reserving the multi-stage gate for prod and regulated data. Mind the platform ceilings: AWS Organizations has account-creation quotas and Control Tower throttles concurrent enrolments; Azure subscription vending has its own rate limits. Queue and serialise enrolment in Flow Designer so a burst of fifty requests does not slam into a provider throttle and fail half of them.
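The serialisation idea can be illustrated with a simple batching helper in Python. The concurrency cap is a made-up parameter: the real limit depends on the provider quotas in force for the organisation, and a production flow would also wait for each batch to finish before releasing the next.

```python
from typing import Iterator, List, Sequence


def enrolment_batches(request_ids: Sequence[str],
                      max_concurrent: int) -> Iterator[List[str]]:
    """Yield approved requests in order, never more than max_concurrent at once."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    for start in range(0, len(request_ids), max_concurrent):
        yield list(request_ids[start:start + max_concurrent])
```

With a hypothetical cap of five, a burst of fifty approvals becomes ten sequential batches rather than fifty simultaneous calls into a throttled API.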
Failure modes, and what each looks like. Name them before they page you.
- A duplicate approval double-vends an account. The fix is the idempotency guard above — a request-state flag plus a request-keyed workspace, so the second trigger is a no-op.
- Terraform apply fails halfway, leaving a partial account. Terraform Cloud’s state captures what was created; the flow must surface the failure on the request (not silently fail) and either auto-retry or route to the platform team with the run link attached. Never leave a half-built account with no ticket.
- The CMDB write-back webhook is lost, so an account exists but the inventory does not know. Mitigate with a reconciliation job that periodically diffs the cloud estate against the CMDB and flags orphans — the same job that catches any account created outside the paved road.
- The policy gate is misconfigured to advisory, letting a non-compliant plan through. Mitigate by treating the policy set itself as reviewed, version-pinned code, with Wiz as the independent backstop that catches what slipped.
- Vault is unavailable, so runs cannot fetch provisioning credentials and all vending stalls. This is a deliberate fail-closed: no account gets built without a properly issued short-lived credential. Make Vault HA (Raft) and monitor it as a tier-1 dependency of the provisioning path.
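The reconciliation job mentioned for the lost write-back case reduces to a two-way set difference. A minimal Python sketch, assuming the account identifiers have already been pulled from the cloud providers and from the CMDB by some other means (those collectors are out of scope here, and the result key names are illustrative):

```python
from typing import Dict, Set


def reconcile(cloud_accounts: Set[str], cmdb_accounts: Set[str]) -> Dict[str, Set[str]]:
    """Diff the live cloud estate against the CMDB inventory."""
    return {
        # Exists in the cloud but unknown to the CMDB: lost write-back or shadow IT.
        "missing_from_cmdb": cloud_accounts - cmdb_accounts,
        # Recorded in the CMDB but no longer in the cloud: stale CI to retire.
        "missing_from_cloud": cmdb_accounts - cloud_accounts,
    }
```

Either non-empty set is worth a ticket: the first flags accounts nobody can attribute, the second flags inventory that no longer reflects reality.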
Observability. Treat the provisioning pipeline as a product with an SLO. Pipe Terraform Cloud run events and ServiceNow flow events into Datadog to track time-to-vend (the metric the CIO cares about), run success rate, policy-gate rejection rate (a spike means the modules or policies drifted from what teams need), and approval dwell time (where the human delay actually sits). When mean time-to-vend creeps up, the dashboard tells you whether it is Terraform, a provider throttle, or an approver sitting on requests — three very different fixes.
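The split between approval dwell and build time is simple timestamp arithmetic. The Python sketch below assumes three hypothetical event names (`submitted`, `approved`, `fulfilled`) carrying ISO 8601 timestamps; the actual event schema would depend on what ServiceNow and Terraform Cloud emit into the monitoring pipeline.

```python
from datetime import datetime
from typing import Dict


def vend_metrics(events: Dict[str, str]) -> Dict[str, float]:
    """Split total time-to-vend into human approval dwell and machine build time."""
    ts = {name: datetime.fromisoformat(value) for name, value in events.items()}

    def minutes(start: str, end: str) -> float:
        return (ts[end] - ts[start]).total_seconds() / 60

    return {
        "approval_dwell_min": minutes("submitted", "approved"),
        "build_min": minutes("approved", "fulfilled"),
        "time_to_vend_min": minutes("submitted", "fulfilled"),
    }
```

Reporting the two components separately is what lets the dashboard distinguish a slow approver from a slow pipeline.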
Governance. Version-pin the landing-zone modules and promote changes through review, so the standard evolves deliberately. Keep the Sentinel/OPA policy set in version control alongside the modules. Log every request, approval, run, and write-back, which you largely get for free because requests and approvals live in ServiceNow and runs live in Terraform Cloud. The CMDB becomes the single place that answers governance questions: every account, its owner, its purpose, its guardrail tier, and the approved change that created it.
Explicit tradeoffs
Accept these or do not build it. This platform front-loads engineering: you must build and maintain the landing-zone modules, the Flow Designer logic, the policy set, the two webhooks, and the CMDB write-back. That is real work, and for a handful of accounts a year it is overkill; a small shop should let the platform team hand-build with a checklist and skip all of this. The payoff arrives with volume and regulation: the moment you are vending dozens of accounts and an auditor is asking who approved what, the investment pays for itself. The seam between ServiceNow and Terraform is also a coupling you must own. An API or schema change on either side can break the flow, so it needs the same testing discipline as production code. And the guardrails that make this safe will occasionally tell a developer “no” for a legitimate edge case the policy did not anticipate; you need a fast, governed exception path (itself a ServiceNow request) so the paved road bends without breaking.
The alternatives, and when they win. If you live in a single cloud and want the lightest path, that provider’s native vending, whether AWS Control Tower Account Factory for Terraform (AFT) or the Azure landing-zone accelerators, gets you most of this with less glue, at the cost of the ServiceNow-native approval and CMDB integration this insurer specifically needs. If your developers are platform-savvy and you value a developer-portal experience over an ITSM one, an internal developer platform on Backstage with Terraform-backed templates and Okta SSO is the more developer-loved front door, though it leans on the engineering org to operate rather than the ITSM org. And if you are genuinely small, the honest answer is a documented runbook and a careful human. Reach for this architecture when scale, multi-cloud, and audit pressure make the manual path the bottleneck, which, for this health insurer drowning in an eleven-day queue, they emphatically have.
The shape of the win
For the insurer, the payoff is not “a faster form.” It is that a developer requests a HIPAA-scoped production account at 9am, the right people approve it over coffee, and by mid-morning the account exists — encrypted, logged, network-isolated, tagged to a cost centre, registered with Wiz and CrowdStrike, accessible via their Okta groups, and recorded in the CMDB with the approved change that created it. The security team can list every regulated account and its owner from one query. Finance can attribute every dollar from day one. And the shadow accounts on someone’s credit card are now the slow, unsupported, more painful option — so they stop appearing. Everything in the build — the Flow Designer routing, the version-pinned modules, the fail-closed policy gate, the Vault-issued credentials, the CMDB write-back — exists to make a developer, a CISO, and a CFO each say yes to the same account at the same time. That alignment is the destination; start with one cloud and one tier if you must, but this is where governed self-service has to land.