AWS Enterprise Architecture: Multi-Account Landing Zone

A single AWS account feels great on day one. By month eighteen it is a shared blast radius: production and experiments live side by side, a hard-coded IAM key leaks into a public repo, the bill is one undecipherable line item, and your auditor wants to know who can touch the payments data. The multi-account Landing Zone is the answer the rest of the industry converged on — a pre-built, governed “ready to land workloads” foundation where every account is born compliant, isolated, and centrally observable. This article builds that foundation end to end using AWS Organizations, Control Tower, Service Control Policies, Transit Gateway, and IAM Identity Center.

The business scenario

Consider an organization at any of three stages — the pattern is the same, only the scale changes:

All three share the same root problems: weak isolation (one blast radius), inconsistent guardrails (security depends on whoever built the account), identity sprawl (long-lived IAM users and shared keys), network chaos (a mesh of peering connections that becomes unmanageable past a handful of VPCs), and no cost or audit attribution. The Landing Zone solves all five with one opinionated foundation: many small accounts as the unit of isolation, policy-as-guardrail applied top-down, federated short-lived access, a hub-and-spoke network, and consolidated billing plus immutable logs.

The design goal is “every new account is governed from the first second.” A team should be able to request an account, receive it fully baselined within the hour, deploy into a pre-attached network, and be physically incapable of violating the organization’s non-negotiables.

Architecture overview

The Landing Zone is a tree of AWS accounts governed from a single management (payer) account, wired together by a central network hub and a single sign-on plane.

Figure: AWS multi-account Landing Zone. Control Tower in the management account vends OUs and foundational accounts inside an Organizations tree; SCP guardrails inherit top-down; IAM Identity Center federates SSO into time-bound workload roles; and every workload VPC attaches to a central Transit Gateway with segmented route tables, centralized inspection/egress, on-premises Direct Connect, and an immutable Log Archive.

The governance spine (top-down). At the root sits the AWS Organizations management account: the billing payer and the only place the org tree is edited. It should run nothing else. AWS Control Tower is enabled here as the orchestration layer; it stands up the org structure, the audit and log-archive accounts, and a library of guardrails (preventive ones implemented as Service Control Policies, detective ones as AWS Config rules). Accounts are grouped into Organizational Units (OUs) such as Security, Infrastructure, Workloads/Prod, Workloads/Non-Prod, Sandbox, and Suspended, and SCPs attach to OUs so policy is inherited, not copy-pasted per account.
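As a rough illustration of the OU tree described above, the following Terraform sketch creates a nested Workloads/Prod OU under the organization root. It assumes the Terraform AWS provider running with management-account credentials; the resource labels are hypothetical, and OUs created outside Control Tower still have to be registered with it before its guardrails apply.

```hcl
# Look up the existing organization to find the root ID.
data "aws_organizations_organization" "org" {}

# Top-level OU directly under the organization root.
resource "aws_organizations_organizational_unit" "workloads" {
  name      = "Workloads"
  parent_id = data.aws_organizations_organization.org.roots[0].id
}

# Nested OU: policies attached to "Workloads" are inherited here.
resource "aws_organizations_organizational_unit" "prod" {
  name      = "Prod"
  parent_id = aws_organizations_organizational_unit.workloads.id
}

resource "aws_organizations_organizational_unit" "non_prod" {
  name      = "Non-Prod"
  parent_id = aws_organizations_organizational_unit.workloads.id
}
```

Keeping the tree this shallow means an SCP attached to Workloads reaches both child OUs without being duplicated.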

The shared-service accounts. Control Tower creates two foundational accounts in the Security OU: a Log Archive account (the write-once destination for every org CloudTrail event, Config snapshot, and VPC Flow Log, locked down so even admins cannot delete) and an Audit/Security account (cross-account read access for the security team, home of GuardDuty, Security Hub, and IAM Access Analyzer delegated administration). In the Infrastructure OU you add a Network account (owns the Transit Gateway, inspection VPC, and central egress) and a Shared Services account (private DNS, golden AMIs, CI/CD).

The request/identity path. A human authenticates once against IAM Identity Center (the successor to AWS SSO), backed by either the built-in directory or an external IdP (Entra ID, Okta) via SAML/SCIM. Identity Center maps the user’s groups to permission sets and grants time-bound, role-based access into specific accounts — there are no long-lived IAM users for people. The user lands in the AWS access portal, picks Payments-Prod / PowerUser, and assumes a session-scoped role. Every action they take is recorded by the org trail into Log Archive.

The data/network path. Workload VPCs live in their own accounts and carry no internet gateway and no individual NAT. Each VPC attaches to the Transit Gateway in the Network account. East-west traffic between, say, an app VPC and a shared-services VPC is routed through the TGW; north-south (internet-bound) traffic is forced through a centralized inspection/egress VPC where a firewall (AWS Network Firewall or a third-party appliance) and shared NAT gateways live. On-premises connectivity (Direct Connect or redundant Site-to-Site VPN) terminates once at the TGW and is reachable by every account through route-table propagation. TGW route tables segment the network: a “prod” route table with no routes to “non-prod,” and a “shared” table everyone can reach. The result is a star, not a mesh: N attachments instead of the N(N−1)/2 peering links a full mesh needs.
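The star-versus-mesh claim can be made concrete with a little arithmetic. VPC peering is not transitive, so any-to-any connectivity needs one peering link per pair of VPCs, which grows quadratically; a hub needs one attachment per VPC. The value N = 20 below is only an illustrative choice.

```latex
\[
\text{peering links (full mesh)} = \binom{N}{2} = \frac{N(N-1)}{2},
\qquad
\text{TGW attachments (hub)} = N
\]
\[
N = 20:\quad \frac{20 \cdot 19}{2} = 190 \ \text{peering connections}
\quad \text{versus} \quad 20 \ \text{attachments}
\]
```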

Put together, the request flows down the OU tree for governance and across the Transit Gateway for traffic, with all identity entering through one portal and all logs draining into one immutable account. The remaining sections unpack each piece.

Component breakdown

| Component | What it does | Why it’s here | Key configuration choices |
| --- | --- | --- | --- |
| Organizations (management account) | Billing payer; root of the account tree; consolidated billing; enables trusted access for org-wide services | Single point to create accounts and attach policy; one invoice | Run nothing else in it; enable all features; restrict access to 2–3 break-glass admins; SCPs do not apply to the management account, so lock it down with tightly scoped IAM instead |
| Control Tower | Orchestrates the Landing Zone: builds OUs, foundational accounts, baseline guardrails, and Account Factory for vending new accounts | Turns weeks of manual setup into a governed, repeatable baseline; keeps drift in check | Enable in your home region; choose the regions to govern; let it create Log Archive + Audit accounts; use Account Factory for Terraform (AFT) for IaC-driven vending |
| Organizational Units (OUs) | Logical grouping of accounts for policy inheritance | SCPs and config attach to OUs, so a new account in Workloads/Prod instantly inherits prod guardrails | Group by function and risk, not by team; keep the tree shallow (≤ 3 levels); a Sandbox OU with permissive SCPs and tight budgets; a Suspended OU that quarantines compromised accounts |
| Service Control Policies (SCPs) | Org-level guardrails that cap the maximum permissions any principal (even root) in a member account can have | Make non-negotiables impossible to violate, regardless of IAM | Default-deny mindset: deny leaving approved regions, deny disabling CloudTrail/GuardDuty/Config, deny root access-key creation, deny deleting the org log buckets, require IMDSv2; remember SCPs don’t grant, they only restrict |
| Log Archive account | Immutable sink for org CloudTrail, Config, VPC Flow Logs, ELB/Route 53 logs | Tamper-evident audit trail isolated from the accounts that produce it | S3 Object Lock (compliance mode) + bucket policy denying delete; no human write access; cross-region replication for the trail |
| Audit / Security account | Delegated admin for GuardDuty, Security Hub, Access Analyzer, Macie; cross-account read roles | Central security operations without logging into workload accounts | Delegate from management, auto-enroll new accounts, aggregate Security Hub findings org-wide |
| Network account | Owns Transit Gateway, inspection VPC, central egress, Direct Connect gateway; shares the TGW via RAM | One team owns connectivity; workload accounts consume, not configure, the network | Share TGW with AWS Resource Access Manager (RAM); segment with TGW route tables; no workloads run here |
| Transit Gateway (TGW) | Regional hub connecting all VPCs, VPNs, and Direct Connect; routes and segments traffic | Replaces the unmanageable O(N²) VPC-peering mesh; central inspection and on-prem reach | Separate route tables per segment (prod/non-prod/shared); appliance mode on the attachments of stateful inspection VPCs; TGW peering for multi-region |
| IAM Identity Center | Workforce SSO; maps IdP groups → permission sets → time-bound account roles | Eliminates human IAM users and shared keys; one place to grant/revoke; full audit | Federate to Entra/Okta via SAML + SCIM; permission sets as code; session duration 1–4 h; require MFA; use ABAC with SAML attributes for fine-grained scoping |
| Shared Services account | Central private DNS (Route 53 Resolver / shared hosted zones), golden AMI pipeline, artifact stores, CI/CD | Avoids each team re-inventing DNS, images, and pipelines | Resolver rules shared via RAM; EC2 Image Builder for golden AMIs; reachable over the shared TGW route table |

A note on the two foundational accounts most teams under-appreciate: Log Archive is the single most important account to get right because it is your evidence locker — if an attacker (or a panicked engineer) can delete CloudTrail, you have lost both forensics and compliance. Audit is what lets a small security team operate over hundreds of accounts without ever holding standing credentials in them.
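To make the "evidence locker" idea concrete, here is a minimal sketch of an S3 bucket with Object Lock in compliance mode, assuming the Terraform AWS provider (v4 or later). Control Tower provisions its own log bucket, so this applies to an additional audit bucket you manage yourself; the variable name and the 365-day retention are assumptions, not values from the article.

```hcl
# Hypothetical input: name of an extra audit bucket in the Log Archive account.
variable "audit_bucket_name" {
  type = string
}

resource "aws_s3_bucket" "audit" {
  bucket              = var.audit_bucket_name
  object_lock_enabled = true # Object Lock relies on versioning being enabled
}

resource "aws_s3_bucket_versioning" "audit" {
  bucket = aws_s3_bucket.audit.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_object_lock_configuration" "audit" {
  bucket     = aws_s3_bucket.audit.id
  depends_on = [aws_s3_bucket_versioning.audit]

  rule {
    default_retention {
      mode = "COMPLIANCE" # locked object versions cannot be deleted, even by root
      days = 365          # assumed retention period; set to your audit requirement
    }
  }
}
```

Compliance mode is deliberately unforgiving: the retention period cannot be shortened once an object version is locked, so it is worth rehearsing in a scratch bucket first.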

Implementation guidance

Bootstrapping order. Enable Organizations in a clean management account → enable Control Tower (it creates the core OUs, Log Archive, and Audit accounts) → layer your own OUs and SCPs → stand up the Network account and Transit Gateway → wire IAM Identity Center → vend the first workload accounts via Account Factory. Do not retrofit Control Tower onto a messy existing org until you’ve mapped existing accounts to target OUs; enrolling an account applies its baseline and can surface drift.

Infrastructure as Code. The reference choice is Account Factory for Terraform (AFT). AFT gives every account three customization layers: a global customization (applied to all accounts — e.g., a standard IAM password policy, default EBS encryption, baseline Config rules), an account-specific customization (e.g., a payments account gets PCI Config conformance packs), and account requests as Terraform code in a pipeline so vending is a pull request, not a console click. A minimal account request looks like:

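module "payments_prod" {
  source = "./modules/aft-account-request"

  # The OU is written as a path here for readability; AFT's documented syntax
  # for a nested OU is "Name (ou-id)", so confirm the value for your org.
  control_tower_parameters = {
    AccountEmail              = "aws+payments-prod@example.com"
    AccountName               = "payments-prod"
    ManagedOrganizationalUnit = "Workloads/Prod"
    SSOUserEmail              = "cloud-ops@example.com"
    SSOUserFirstName          = "Cloud"
    SSOUserLastName           = "Ops"
  }

  account_tags = {
    "BusinessUnit" = "payments"
    "Environment"  = "prod"
    "DataClass"    = "pci"
    "CostCenter"   = "CC-4407"
  }

  # Required by the AFT account-request module; the values are placeholders.
  change_management_parameters = {
    change_requested_by = "cloud-ops@example.com"
    change_reason       = "Vend payments-prod account"
  }

  account_customizations_name = "pci-baseline"
}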
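For the global customization layer mentioned above, a small sketch of what such a baseline might contain, assuming the Terraform AWS provider and that the code runs inside each vended account. The specific password-policy numbers are illustrative assumptions rather than values from the article.

```hcl
# Encrypt all new EBS volumes by default (this setting is per region).
resource "aws_ebs_encryption_by_default" "this" {
  enabled = true
}

# Standard IAM password policy for any IAM users that do exist.
resource "aws_iam_account_password_policy" "baseline" {
  minimum_password_length        = 14 # assumed value
  require_lowercase_characters   = true
  require_uppercase_characters   = true
  require_numbers                = true
  require_symbols                = true
  password_reuse_prevention      = 24 # assumed value
  max_password_age               = 90 # assumed value, in days
  allow_users_to_change_password = true
}
```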

SCPs themselves are JSON managed via Terraform’s aws_organizations_policy / aws_organizations_policy_attachment. A region-lockdown guardrail (deny everything outside approved regions while allow-listing global services):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyOutsideApprovedRegions",
    "Effect": "Deny",
    "NotAction": [
      "iam:*", "sts:*", "organizations:*", "route53:*",
      "cloudfront:*", "waf:*", "support:*", "budgets:*"
    ],
    "Resource": "*",
    "Condition": {
      "StringNotEquals": {
        "aws:RequestedRegion": ["eu-west-1", "eu-central-1"]
      }
    }
  }]
}

Pair it with a “protect the foundation” SCP that denies cloudtrail:StopLogging, cloudtrail:DeleteTrail, config:DeleteConfigurationRecorder, guardduty:DeleteDetector, and any s3:DeleteBucket/s3:PutBucketPolicy against the log-archive buckets, plus a deny on iam:CreateAccessKey when the caller is the account root user (a condition matching aws:PrincipalArn against arn:aws:iam::*:root). Attach foundation SCPs at the organization root so every OU and member account inherits them; attach environment-specific SCPs (e.g., prod can’t create IAM users at all) at the Workloads/Prod OU. SCPs never apply to the management account itself, so protect that account with tight IAM and by running nothing in it.
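The foundation guardrail described above could be sketched in Terraform roughly as follows, assuming the Terraform AWS provider with management-account credentials. The variable names and policy name are hypothetical. A production version would normally also exempt the platform's own automation roles (for example with an aws:PrincipalArn condition); that exemption is omitted here for brevity.

```hcl
variable "org_root_id" {
  type = string # the organization root ID (starts with "r-")
}

variable "log_archive_bucket_arn" {
  type = string # ARN of the central log bucket to protect
}

data "aws_iam_policy_document" "protect_foundation" {
  statement {
    sid    = "DenyDisablingDetection"
    effect = "Deny"
    actions = [
      "cloudtrail:StopLogging",
      "cloudtrail:DeleteTrail",
      "config:DeleteConfigurationRecorder",
      "guardduty:DeleteDetector",
    ]
    resources = ["*"]
  }

  statement {
    sid       = "DenyLogBucketTampering"
    effect    = "Deny"
    actions   = ["s3:DeleteBucket", "s3:PutBucketPolicy"]
    resources = [var.log_archive_bucket_arn]
  }

  statement {
    sid       = "DenyRootAccessKeys"
    effect    = "Deny"
    actions   = ["iam:CreateAccessKey"]
    resources = ["*"]

    # Matches only when the caller is an account's root user.
    condition {
      test     = "StringLike"
      variable = "aws:PrincipalArn"
      values   = ["arn:aws:iam::*:root"]
    }
  }
}

resource "aws_organizations_policy" "protect_foundation" {
  name    = "protect-foundation"
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.protect_foundation.json
}

# Attaching at the root means every OU and member account inherits the policy.
resource "aws_organizations_policy_attachment" "root" {
  policy_id = aws_organizations_policy.protect_foundation.id
  target_id = var.org_root_id
}
```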

Networking wiring. In the Network account, create the Transit Gateway (disable default route-table association/propagation so you control segmentation explicitly), then share it via RAM to the whole org. Each workload account attaches its VPC to the shared TGW and adds a default route (0.0.0.0/0) pointing at the TGW for egress. Create three TGW route tables: prod-rt, nonprod-rt, and shared-rt. Associate prod VPC attachments to prod-rt and propagate only the shared-services and on-prem routes into it; crucially, do not propagate non-prod routes into prod-rt, which leaves prod with no route to non-prod at all. Route all 0.0.0.0/0 through an inspection VPC running AWS Network Firewall, with appliance mode enabled on that VPC’s TGW attachment (so both directions of a flow use the same AZ and stateful inspection sees symmetric traffic), and place the NAT gateways and the single internet egress behind it. Terminate Direct Connect on a Direct Connect Gateway associated with the TGW; add a backup Site-to-Site VPN for resilience.
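A condensed sketch of the Network-account side of this wiring, assuming the Terraform AWS provider and that RAM sharing with AWS Organizations is already enabled. All variable names are hypothetical, and only the prod route table is shown; nonprod-rt and shared-rt would follow the same shape.

```hcl
variable "organization_arn" { type = string }              # arn:aws:organizations::...
variable "prod_attachment_id" { type = string }            # prod VPC attachment
variable "shared_services_attachment_id" { type = string } # shared-services attachment
variable "inspection_vpc_id" { type = string }
variable "inspection_subnet_ids" { type = list(string) }

resource "aws_ec2_transit_gateway" "hub" {
  description                     = "org hub"
  default_route_table_association = "disable" # segmentation is explicit
  default_route_table_propagation = "disable"
  auto_accept_shared_attachments  = "enable"
}

# Share the TGW with every account in the organization.
resource "aws_ram_resource_share" "tgw" {
  name                      = "org-tgw"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "tgw" {
  resource_arn       = aws_ec2_transit_gateway.hub.arn
  resource_share_arn = aws_ram_resource_share.tgw.arn
}

resource "aws_ram_principal_association" "org" {
  principal          = var.organization_arn
  resource_share_arn = aws_ram_resource_share.tgw.arn
}

# Inspection VPC attachment; appliance mode keeps each flow in one AZ.
resource "aws_ec2_transit_gateway_vpc_attachment" "inspection" {
  transit_gateway_id     = aws_ec2_transit_gateway.hub.id
  vpc_id                 = var.inspection_vpc_id
  subnet_ids             = var.inspection_subnet_ids
  appliance_mode_support = "enable"

  transit_gateway_default_route_table_association = false
  transit_gateway_default_route_table_propagation = false
}

resource "aws_ec2_transit_gateway_route_table" "prod" {
  transit_gateway_id = aws_ec2_transit_gateway.hub.id
}

# Prod attachments look up routes in prod-rt.
resource "aws_ec2_transit_gateway_route_table_association" "prod" {
  transit_gateway_attachment_id  = var.prod_attachment_id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.prod.id
}

# Only shared-services routes are propagated in; non-prod is never added.
resource "aws_ec2_transit_gateway_route_table_propagation" "shared_into_prod" {
  transit_gateway_attachment_id  = var.shared_services_attachment_id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.prod.id
}

# Everything else from prod goes to the inspection VPC.
resource "aws_ec2_transit_gateway_route" "prod_default" {
  destination_cidr_block         = "0.0.0.0/0"
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.inspection.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.prod.id
}
```

The isolation comes from what is absent: with no propagation or static route for non-prod CIDRs in prod-rt, non-prod destinations only match the default route and land on the firewall.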

Identity wiring. Connect IAM Identity Center to your IdP: SAML for authentication, SCIM for automatic user/group provisioning so joiners/leavers sync without manual steps. Define permission sets as code (managed-policy ARNs plus inline policies plus a session duration). Map IdP groups to (account, permission set) assignments — e.g., the payments-engineers group → payments-prod account with a PaymentsPowerUser permission set that excludes IAM and KMS-key-deletion. For least privilege at scale, use ABAC: pass attributes (department, project) from the IdP as session tags and write permission-set policies that scope resource access by aws:PrincipalTag. Reserve a tightly-watched break-glass path: two physical root credentials for the management account, stored offline, MFA-protected, alerting on any use.
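The group-to-permission-set mapping above might look roughly like this as code, assuming the Terraform AWS provider and that the payments-engineers group has already been synced into Identity Center via SCIM. The account-ID variable, the choice of the PowerUserAccess managed policy as a base, and the project tag key are assumptions made for illustration.

```hcl
variable "payments_prod_account_id" { type = string }

data "aws_ssoadmin_instances" "this" {}

locals {
  instance_arn      = tolist(data.aws_ssoadmin_instances.this.arns)[0]
  identity_store_id = tolist(data.aws_ssoadmin_instances.this.identity_store_ids)[0]
}

# Look up the SCIM-provisioned group by display name.
data "aws_identitystore_group" "payments_engineers" {
  identity_store_id = local.identity_store_id

  alternate_identifier {
    unique_attribute {
      attribute_path  = "DisplayName"
      attribute_value = "payments-engineers"
    }
  }
}

resource "aws_ssoadmin_permission_set" "payments_power_user" {
  name             = "PaymentsPowerUser"
  instance_arn     = local.instance_arn
  session_duration = "PT1H" # ISO 8601 duration: one-hour sessions
}

# PowerUserAccess already leaves out IAM management.
resource "aws_ssoadmin_managed_policy_attachment" "power_user" {
  instance_arn       = local.instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.payments_power_user.arn
  managed_policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess"
}

data "aws_iam_policy_document" "extra_denies" {
  statement {
    sid       = "DenyKmsKeyDeletion"
    effect    = "Deny"
    actions   = ["kms:ScheduleKeyDeletion"]
    resources = ["*"]
  }

  # ABAC: block start/stop on instances whose project tag differs from the
  # caller's session tag. Untagged instances are denied too, because a
  # negated operator also matches when the key is absent.
  statement {
    sid       = "DenyOtherProjectsInstances"
    effect    = "Deny"
    actions   = ["ec2:StartInstances", "ec2:StopInstances"]
    resources = ["*"]

    condition {
      test     = "StringNotEquals"
      variable = "aws:ResourceTag/project"
      values   = ["$${aws:PrincipalTag/project}"] # $$ escapes Terraform interpolation
    }
  }
}

resource "aws_ssoadmin_permission_set_inline_policy" "extra_denies" {
  instance_arn       = local.instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.payments_power_user.arn
  inline_policy      = data.aws_iam_policy_document.extra_denies.json
}

resource "aws_ssoadmin_account_assignment" "payments_prod" {
  instance_arn       = local.instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.payments_power_user.arn
  principal_id       = data.aws_identitystore_group.payments_engineers.group_id
  principal_type     = "GROUP"
  target_id          = var.payments_prod_account_id
  target_type        = "AWS_ACCOUNT"
}
```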

Detective layer. Control Tower turns on a baseline; extend it with org-wide GuardDuty (delegated to the Audit account, auto-enroll new accounts), Security Hub with the AWS Foundational Security Best Practices and CIS standards aggregated org-wide, AWS Config conformance packs per OU, and IAM Access Analyzer at the org level to catch any resource shared outside the org.
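As one example of the delegated-administrator pattern described above, here is a minimal GuardDuty sketch, assuming the Terraform AWS provider with two hypothetical CLI profiles (management and audit). GuardDuty is regional, so a real deployment would repeat this per governed region.

```hcl
provider "aws" {
  alias   = "management"
  region  = "eu-west-1"
  profile = "management" # hypothetical profile name
}

provider "aws" {
  alias   = "audit"
  region  = "eu-west-1"
  profile = "audit" # hypothetical profile name
}

variable "audit_account_id" { type = string }

# Detector in the Audit account, which becomes the org-wide administrator.
resource "aws_guardduty_detector" "audit" {
  provider = aws.audit
  enable   = true
}

# Run from the management account: delegate GuardDuty admin to Audit.
resource "aws_guardduty_organization_admin_account" "this" {
  provider         = aws.management
  admin_account_id = var.audit_account_id
  depends_on       = [aws_guardduty_detector.audit]
}

# Run from the Audit account: enroll existing and future member accounts.
resource "aws_guardduty_organization_configuration" "this" {
  provider                         = aws.audit
  detector_id                      = aws_guardduty_detector.audit.id
  auto_enable_organization_members = "ALL"
  depends_on                       = [aws_guardduty_organization_admin_account.this]
}
```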

Enterprise considerations

Security & Zero Trust. The account is the primary trust and blast-radius boundary — a compromise in marketing-nonprod cannot touch payments-prod because they share no IAM, no network route, and no data plane. SCPs enforce non-negotiables that survive even a compromised account admin. Human access is short-lived and federated (no standing keys), MFA-everywhere, and least-privilege via permission sets + ABAC. Network Zero Trust comes from default-deny segmentation at the TGW plus centralized inspection: nothing talks to the internet without passing the firewall, and prod is islanded from non-prod by route-table design rather than by hopeful security-group rules.

Cost optimization. Consolidated billing pools usage across all accounts so volume discounts, Savings Plans, and Reserved Instances apply org-wide and float to wherever they’re needed. Mandatory tags (CostCenter, BusinessUnit, Environment) enforced via SCP and tag policies make AWS Cost Explorer and Cost & Usage Reports attributable per team. Set AWS Budgets with alerts per account, paying particular attention to the Sandbox OU. Budgets alert rather than enforce a hard cap, so pair them with budget actions (for example, attaching a restrictive SCP or stopping EC2 and RDS instances at a threshold) where a real ceiling is needed. Centralizing NAT gateways and egress in the inspection VPC avoids paying for a NAT per workload VPC, a frequently overlooked five-figure annual saving at scale. Watch the trade-off: TGW data-processing and inter-AZ charges are real, so keep chatty services in the same AZ where possible.
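A per-account budget of the kind described above could be sketched like this, assuming the Terraform AWS provider running in the management (payer) account. The variable names are hypothetical, and the $300 limit simply mirrors the sandbox figure used in the reference example later in this article.

```hcl
variable "sandbox_account_id" { type = string }
variable "squad_email" { type = string }

resource "aws_budgets_budget" "sandbox" {
  name         = "sandbox-${var.sandbox_account_id}"
  budget_type  = "COST"
  limit_amount = "300"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Scope the budget to a single member account.
  cost_filter {
    name   = "LinkedAccount"
    values = [var.sandbox_account_id]
  }

  # Early warning at 80% of actual spend.
  notification {
    comparison_operator        = "GREATER_THAN"
    notification_type          = "ACTUAL"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    subscriber_email_addresses = [var.squad_email]
  }

  # Second alert when the forecast says the month will overshoot.
  notification {
    comparison_operator        = "GREATER_THAN"
    notification_type          = "FORECASTED"
    threshold                  = 100
    threshold_type             = "PERCENTAGE"
    subscriber_email_addresses = [var.squad_email]
  }
}
```

This only notifies; nothing here stops spending once the limit is crossed.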

Scalability. The model scales by adding accounts, which sidesteps the per-account service quotas (IAM roles, security groups, VPCs) that strangle a single-account design. Account Factory vends a fully-baselined account in well under an hour; a single TGW supports thousands of attachments (the documented quota is 5,000 per transit gateway); OUs and SCPs inherit automatically so governance scales with near-zero marginal effort per account.

Reliability & DR (RTO/RPO). The Landing Zone itself is resilient: the org structure and SCPs are global control-plane constructs, and the services the foundational accounts depend on (S3, CloudTrail, Config) are regional and run across multiple AZs. For workloads, the account-per-environment pattern makes a clean DR account or DR region straightforward: replicate via cross-region TGW peering and pre-attach the DR VPCs. Typical targets: mission-critical (payments) RTO ≤ 1 h / RPO ≤ 5 min via active-passive multi-region with continuous data replication; standard tier RTO ≤ 4 h / RPO ≤ 1 h via warm standby; dev/test RTO 24 h from IaC redeploy. The Log Archive trail is cross-region replicated so audit survives a regional event. Because everything is IaC (AFT + Terraform), the entire account baseline is reproducible, so your real DR plan for the foundation is “re-apply the code.”
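The cross-region TGW peering mentioned above might be wired roughly as follows, assuming the Terraform AWS provider, two transit gateways that already exist in the Network account, and eu-central-1 as the DR region. All variable names are hypothetical. Peering attachments do not support route propagation, so the route to the DR CIDR is static.

```hcl
provider "aws" {
  alias  = "primary"
  region = "eu-west-1"
}

provider "aws" {
  alias  = "dr"
  region = "eu-central-1"
}

variable "primary_tgw_id" { type = string }
variable "dr_tgw_id" { type = string }
variable "primary_prod_route_table_id" { type = string }
variable "dr_vpc_cidr" { type = string }

# Request the peering from the primary region.
resource "aws_ec2_transit_gateway_peering_attachment" "to_dr" {
  provider                = aws.primary
  transit_gateway_id      = var.primary_tgw_id
  peer_transit_gateway_id = var.dr_tgw_id
  peer_region             = "eu-central-1"
}

# Accept it in the DR region.
resource "aws_ec2_transit_gateway_peering_attachment_accepter" "dr" {
  provider                      = aws.dr
  transit_gateway_attachment_id = aws_ec2_transit_gateway_peering_attachment.to_dr.id
}

# Static route from the primary prod route table toward the DR VPC range.
resource "aws_ec2_transit_gateway_route" "prod_to_dr" {
  provider                       = aws.primary
  destination_cidr_block         = var.dr_vpc_cidr
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_peering_attachment.to_dr.id
  transit_gateway_route_table_id = var.primary_prod_route_table_id
  depends_on                     = [aws_ec2_transit_gateway_peering_attachment_accepter.dr]
}
```

A matching return route in the DR region's route table, plus route-table associations for the peering attachment on both sides, would be needed for traffic to flow in both directions.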

Observability. Three planes: (1) audit — the org CloudTrail draining to Log Archive gives a single immutable record of every API call across every account; (2) security posture — Security Hub aggregates GuardDuty/Config/Inspector findings into one Audit-account dashboard; (3) operations — centralize CloudWatch Logs and metrics via cross-account observability (a monitoring account with linked source accounts) so SREs see all workloads in one pane. VPC Flow Logs from every attachment land in Log Archive for network forensics.
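For the operations plane, CloudWatch cross-account observability is built from a sink in the monitoring account and a link in each source account. The sketch below assumes the Terraform AWS provider with two hypothetical CLI profiles and an organization-ID variable; which telemetry types to share is an illustrative choice.

```hcl
provider "aws" {
  alias   = "monitoring"
  region  = "eu-west-1"
  profile = "monitoring" # hypothetical profile name
}

provider "aws" {
  alias   = "workload"
  region  = "eu-west-1"
  profile = "payments-prod" # hypothetical profile name
}

variable "organization_id" { type = string } # the o-xxxx organization ID

# Sink in the central monitoring account.
resource "aws_oam_sink" "central" {
  provider = aws.monitoring
  name     = "org-observability"
}

# Allow any account in the organization to link metrics and log groups.
resource "aws_oam_sink_policy" "org" {
  provider        = aws.monitoring
  sink_identifier = aws_oam_sink.central.arn

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = ["oam:CreateLink", "oam:UpdateLink"]
      Resource  = "*"
      Condition = {
        StringEquals = { "aws:PrincipalOrgID" = var.organization_id }
        "ForAllValues:StringEquals" = {
          "oam:ResourceTypes" = ["AWS::CloudWatch::Metric", "AWS::Logs::LogGroup"]
        }
      }
    }]
  })
}

# Link created in each source (workload) account.
resource "aws_oam_link" "workload" {
  provider        = aws.workload
  label_template  = "$AccountName"
  resource_types  = ["AWS::CloudWatch::Metric", "AWS::Logs::LogGroup"]
  sink_identifier = aws_oam_sink.central.arn
  depends_on      = [aws_oam_sink_policy.org]
}
```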

Governance. Guardrails come in two flavors — preventive (SCPs that block the action) and detective (Config rules that flag drift and can auto-remediate). Control Tower’s dashboard shows compliance per OU and per guardrail. Tag policies enforce the taxonomy that powers both cost and security. The whole org definition lives in Git: account requests, SCPs, permission sets, and network config are reviewed via pull request, giving you change history and four-eyes approval on the things that matter most.
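A tag policy for the taxonomy above could be sketched as follows, assuming the Terraform AWS provider with management-account credentials and that the tag-policy type is enabled in the organization. The OU variable, the allowed Environment values, and the enforced resource type are assumptions for illustration.

```hcl
variable "workloads_ou_id" { type = string }

resource "aws_organizations_policy" "cost_tags" {
  name = "cost-allocation-tags"
  type = "TAG_POLICY"

  content = jsonencode({
    tags = {
      CostCenter = {
        # Standardizes the capitalization of the key.
        tag_key = { "@@assign" = "CostCenter" }
      }
      Environment = {
        tag_key      = { "@@assign" = "Environment" }
        tag_value    = { "@@assign" = ["prod", "staging", "sandbox"] }
        enforced_for = { "@@assign" = ["ec2:instance"] }
      }
    }
  })
}

resource "aws_organizations_policy_attachment" "workloads" {
  policy_id = aws_organizations_policy.cost_tags.id
  target_id = var.workloads_ou_id
}
```

Tag policies govern how a tag is spelled and which values it may take; they do not force a tag to be present, which is why the article pairs them with SCPs.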

Reference enterprise example

Northwind Pay is a fictional fintech: ~450 employees, processing card payments across the EU, subject to PCI-DSS, growing 60% year over year. They start on a single overloaded AWS account and migrate to a Landing Zone over a quarter.

Target structure. Management account (billing only, two offline root credentials). Security OU holds Log Archive and Audit. Infrastructure OU holds the Network account (TGW in eu-west-1, Direct Connect to their Frankfurt colo plus backup VPN, inspection VPC with AWS Network Firewall) and Shared Services (Route 53 Resolver, golden AMIs, GitLab runners). Workloads/Prod holds payments-prod, ledger-prod, web-prod. Workloads/Non-Prod holds the matching *-staging accounts. A Sandbox OU gives each of their 9 squads a personal account with a permissive SCP but a hard $300/month budget and auto-nuke of idle resources nightly.

Guardrails they set. Root-attached SCPs deny any region except eu-west-1 and eu-central-1 (data residency), deny disabling CloudTrail/GuardDuty/Config, and protect the log buckets. The Workloads/Prod OU adds an SCP forbidding IAM-user creation (humans only enter via Identity Center) and requiring IMDSv2. The payments-prod account gets a PCI Config conformance pack via AFT account-specific customization.
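The Workloads/Prod guardrail in this example might be expressed along these lines, assuming the Terraform AWS provider with management-account credentials. The OU variable and the policy name are hypothetical.

```hcl
variable "prod_ou_id" { type = string }

data "aws_iam_policy_document" "prod_guardrails" {
  # Humans enter through Identity Center, so no IAM users or user keys in prod.
  statement {
    sid       = "DenyIamUsers"
    effect    = "Deny"
    actions   = ["iam:CreateUser", "iam:CreateAccessKey"]
    resources = ["*"]
  }

  # Block instance launches that do not require IMDSv2 session tokens.
  statement {
    sid       = "RequireImdsV2"
    effect    = "Deny"
    actions   = ["ec2:RunInstances"]
    resources = ["arn:aws:ec2:*:*:instance/*"]

    condition {
      test     = "StringNotEquals"
      variable = "ec2:MetadataHttpTokens"
      values   = ["required"]
    }
  }
}

resource "aws_organizations_policy" "prod_guardrails" {
  name    = "prod-guardrails"
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.prod_guardrails.json
}

resource "aws_organizations_policy_attachment" "prod" {
  policy_id = aws_organizations_policy.prod_guardrails.id
  target_id = var.prod_ou_id
}
```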

Identity. They federate Entra ID into IAM Identity Center with SCIM. The payments-engineers group maps to payments-prod with a PaymentsPowerUser permission set (1-hour sessions, no IAM/KMS-delete, MFA required). Auditors get a read-only permission set across all prod accounts. When an engineer leaves, Entra removes them and SCIM revokes access within minutes — no orphaned IAM users anywhere.

Network. Each VPC has no IGW; all attach to the shared TGW. The prod-rt route table carries shared-services and on-prem routes but not non-prod routes, so staging literally cannot reach the production ledger. All internet egress flows through the inspection VPC’s firewall (a single audited choke point for PCI). One NAT-gateway cluster serves the whole org instead of nine.

Outcomes after one quarter.

| Metric | Before (single account) | After (Landing Zone) |
| --- | --- | --- |
| Time to provision a governed account | ~3 days, manual, inconsistent | < 45 min via Account Factory, fully baselined |
| Blast radius of a leaked key | Entire company | One account, one environment |
| Production ↔ non-prod isolation | Security-group hope | Structurally impossible (no route, no IAM) |
| Audit trail | Deletable by account admins | Immutable, cross-region, write-once |
| Cost attribution | One opaque invoice | Per squad / per environment via tags |
| NAT gateways | One set per VPC (sprawling) | One centralized egress |
| PCI scope | Whole account | Isolated to payments-prod |

The CISO’s summary: “We went from hoping nobody disabled logging to it being impossible, and our PCI auditor now reviews one isolated account instead of everything we own.”

When to use it

Use a multi-account Landing Zone when you have (or will soon have) multiple teams, more than one environment, compliance obligations (PCI, HIPAA, SOC 2, data residency), or any need to attribute cost and contain blast radius. Past roughly 3–4 accounts or the first compliance audit, this is the default — not an optional nicety. It is the AWS-blessed equivalent of an Azure Landing Zone or a GCP organization hierarchy.

Trade-offs and costs. A Landing Zone adds real overhead: a platform/cloud-foundations team to own it, a learning curve around SCPs and TGW segmentation, and baseline spend (Control Tower itself is free, but the TGW, Network Firewall, Direct Connect, and cross-region replication are not — budget a few thousand dollars a month even before workloads). The governance can feel heavy to a small team that just wants to ship.

Anti-patterns to avoid. Running workloads in the management account (it should be billing-only and locked down). Using SCPs as if they grant permissions — they only cap. Building a full VPC-peering mesh instead of a TGW hub (it doesn’t scale past a handful of VPCs). One giant account per team with everything inside (you lose environment isolation). Long-lived IAM users for humans (use Identity Center). Letting Account Factory drift by editing vended accounts in the console instead of through AFT. An OU tree organized by team instead of by risk/function (you end up duplicating SCPs everywhere).

Alternatives. For a genuinely small shop that will never exceed a couple of teams, plain Organizations + a handful of hand-built accounts + consolidated billing may be enough: you get isolation and one bill without the Control Tower machinery, at the price of doing the baseline yourself. If you need more opinionated, customizable orchestration than Control Tower offers (custom account pipelines, complex multi-account CI/CD), the older AWS Landing Zone solution (now in long-term support, with AWS steering new deployments toward Control Tower) or a fully bespoke Terraform foundation are options, but you then own all the lifecycle work Control Tower would have handled. For multi-cloud governance the equivalents are Azure Landing Zones and GCP’s Cloud Foundation Fabric; the account/subscription/project tree, policy-as-guardrail, hub-and-spoke, and federated SSO concepts map almost one-to-one.

The Landing Zone is not a product you buy once; it is a foundation you operate. Done well, it becomes invisible — teams ship into a network that already works, with guardrails they never have to think about, and a security team that sleeps better because the dangerous things are simply not possible.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
