Amazon VPC IPAM: Hierarchical CIDR Planning, Allocation, and BYOIP at Scale

The most expensive mistake in multi-account networking is not a misconfigured route table; it is two VPCs that both own 10.0.0.0/16. You cannot renumber a live VPC, you cannot route overlapping prefixes through a Transit Gateway, and by the time the overlap surfaces you have a spreadsheet, a Slack channel, and a quarterly meeting where humans manually hand out CIDR blocks. Amazon VPC IP Address Manager (IPAM) replaces all of that with a hierarchy of pools, automated allocation, and continuous overlap detection across every account in the organization. This is how to design the hierarchy, delegate it, automate allocation, monitor exhaustion, and bring your own public IP space — the way a platform team should, before anyone provisions a VPC.

Address allocation by spreadsheet fails for the same reason manual IAM fails: it does not scale and it cannot be enforced. A team picks 10.20.0.0/16 because it “looked free,” and six months later a Transit Gateway route table can reach only one of two identical prefixes, or a Direct Connect route is black-holed, because someone else picked the same block in another account. The longest-prefix-match router underneath TGW and VPC peering simply cannot represent two identical prefixes pointing at different destinations. IPAM is the single source of truth that issues address space rather than recording it after the fact, refuses to issue a conflict, and watches every CIDR it can see for overlap and exhaustion.

By the end of this article you will be able to design a top-down pool tree, delegate IPAM to a networking account, bolt allocation guardrails onto leaf pools, share them through AWS RAM so workload accounts self-serve, alarm on utilization before a prod VPC create fails, and bring your own public IPv4 space and ASN onto AWS without ever announcing the same prefix from two origins. Because IPAM is a system you operate, not a feature you toggle, the failure modes, limits, settings and the diagnostic playbook are all laid out as scannable tables — read the prose once, then keep the tables open when you are mid-migration.

What problem this solves

In a single account with one VPC, you can pick a CIDR by hand and never think about it again. The pain begins at organizational scale: dozens or hundreds of accounts, multiple Regions, mergers that drag in pre-existing 10.x estates, and a Transit Gateway or Direct Connect that demands every routable prefix be globally unique. The spreadsheet that “tracks” allocations is always slightly wrong, always behind, and can never prevent the next collision — it only records it after someone has already shipped a colliding VPC.

What breaks without address governance is concrete and expensive. A Transit Gateway route table cannot hold the same prefix twice, so the second VPC advertising 10.10.0.0/16 is silently unreachable across the shared backbone. A Direct Connect or Site-to-Site VPN route to on-prem 10.0.0.0/8 black-holes any AWS VPC that overlaps it. VPC peering between two 10.0.0.0/16 VPCs is simply rejected. And the cruelest part: you usually cannot renumber the offending VPC, because it is live, regulated, or load-bearing, so the fix is a multi-sprint migration rather than a config change.

Who hits this: any platform or network team running more than a handful of accounts, anyone standing up a hub-and-spoke Transit Gateway, anyone integrating an acquired company’s address space, and anyone who must advertise their own public IP range (for IP allow-lists, reputation, or a lift-and-shift that hard-codes IPs). IPAM’s first-day value is rarely the allocation — it is the discovery of the overlap nobody could previously see, then the guarantee that no new overlap can ever be created. To frame the whole field before the deep dive, here is every problem class IPAM addresses, what fails without it, and the IPAM mechanism that fixes it:

| Problem class | What breaks without IPAM | The IPAM mechanism | First place to look |
|---|---|---|---|
| Duplicate CIDRs across accounts | TGW routes conflict (one VPC unreachable); peering refused | Pools issue unique CIDRs within a scope | get-ipam-resource-cidrs overlap-status |
| On-prem route collision | Direct Connect / VPN route black-holes a VPC | Import on-prem space as a manual allocation | Overlap report in the private scope |
| “Looked free” manual picks | Sprawl, no audit trail | ipv4_ipam_pool_id forces a drawn CIDR | Pool allocation list |
| Prod/non-prod address bleed | Test space routed into prod | allocation_resource_tags hard filter | Pool allocation rules |
| Silent exhaustion | VPC create fails in prod with no warning | AWS/IPAM utilization metric + alarm | CloudWatch alarm history |
| Public IP advertisement | BYOIP done by ticket, error-prone | Public-scope pool + ROA + advertise | get-ipam-pool-cidrs state |
| No central visibility | Nobody knows who owns what | Org-wide monitored resource inventory | get-ipam-resource-cidrs |

Learning objectives

By the end of this article you can:

Prerequisites & where this fits

You should already understand VPC fundamentals — that a VPC owns one or more CIDR blocks, that subnets carve host space out of those blocks, and that a route table forwards by longest-prefix match so two identical prefixes cannot coexist in one table. Comfort with aws CLI in CloudShell, reading JSON output, and Terraform/CloudFormation basics is assumed; every operation below ships both an aws snippet and an IaC snippet. Familiarity with AWS Organizations, delegated administrator accounts, and AWS RAM (Resource Access Manager) sharing makes the delegation section land faster.

This sits at the foundation of multi-account networking, upstream of almost everything else you build. If VPC networking fundamentals and the VPC deep dive on subnets, routing, IGW, NAT and endpoints are the “what is a VPC” layer, IPAM is the “who is allowed to own which addresses” layer that must exist before you connect anything. It is the prerequisite for a clean Transit Gateway multi-account architecture and for resilient Direct Connect + Transit Gateway, both of which break the instant two prefixes collide. It depends on the account structure from Control Tower landing zones and is governed by Organizations SCPs and delegated admin.

A quick map of who owns what during an IPAM rollout, so you call the right person fast:

| Layer | What lives here | Who usually owns it | What it can block |
|---|---|---|---|
| Management account | Org, trusted access, delegation | Cloud platform / org admin | Delegation; org-wide monitoring |
| IPAM delegated (networking) account | IPAM, scopes, pool tree, alarms | Network / platform team | Allocation, guardrails, BYOIP |
| AWS RAM | Resource shares of leaf pools | Network team | Whether members can self-serve |
| Member / workload accounts | VPCs that draw CIDRs | App / product teams | Tag compliance, netmask bounds |
| RIR / RPKI (external) | ROA, ASN ownership | NetOps / external registry | BYOIP advertisement validation |
| CloudWatch / SNS | Utilization alarms, routing | Observability / NetOps | Early warning before exhaustion |

Core concepts

Five mental models make every later operation obvious.

A spreadsheet records; IPAM issues. This is the entire shift. A spreadsheet records allocations after the fact and can never refuse a conflict; IPAM issues them and refuses to issue a conflict within a scope. Every VPC stops declaring a CIDR and starts drawing one from a pool, so uniqueness is guaranteed at creation, ownership is recorded centrally, and overlap is continuously detected. The difference between “record after” and “issue and refuse” is the whole problem.

The hierarchy has four levels, and each level has a job. An IPAM is the top-level resource, pinned to a home Region but aware of multiple operating_regions. A scope is a routing domain — every IPAM ships a private default scope and a public default scope, and prefixes inside one private scope must not overlap. A pool is a collection of CIDRs that nests: a child pool’s source_ipam_pool_id points at its parent and provisions space out of it. An allocation is a CIDR handed out of a pool to a resource (a VPC) or reserved manually. Get these four nouns straight and the rest is detail.

Locale decides who a pool can serve. A locale is the AWS Region (or Local Zone) a pool is tied to. Only a pool whose locale matches a VPC’s Region can allocate to that VPC. The standard pattern is a locale-free top pool (so it can feed every Region) with locale-pinned Regional pools beneath it. A pool with the wrong locale — or a top pool accidentally pinned to one Region — is the most common reason an allocation silently fails.

Guardrails are hard filters, not suggestions. A pool can carry allocation_default_netmask_length (the size handed out if the caller omits one), allocation_min_netmask_length / allocation_max_netmask_length (inclusive bounds on the prefix length, not host count), and allocation_resource_tags (a map every allocation must match or the request is rejected). These let you make prod and non-prod address space provably separate and prevent a team from grabbing a /12 when they should take a /20.

Two tiers, and only one is for an organization. The Free Tier gives pools, allocation, and basic monitoring within a single account/Region. The Advanced Tier adds cross-account and cross-Region pools shared via RAM, organization-wide overlap and utilization monitoring, BYOIP/BYOASN management, and public IP insights. Advanced is billed per active IP it manages. Everything below assumes Advanced Tier.

Before the deep sections, pin the vocabulary down side by side:

| Term | One-line definition | Where it lives | Why it matters |
|---|---|---|---|
| IPAM | Top-level resource; home Region + operating Regions | Delegated networking account | The root of the whole tree |
| Scope | A routing domain; prefixes within must not overlap | Inside an IPAM (private + public default) | Overlap is detected per scope |
| Pool | A collection of CIDRs; nests via source_ipam_pool_id | Inside a scope | Where you carve and delegate |
| Allocation | A CIDR drawn out of a pool | Pool → VPC or manual reservation | The unit you create/release |
| Locale | The Region/Local Zone a pool serves | Pool attribute | Wrong locale → allocation fails |
| Provisioned CIDR | A block added into a pool | Pool ↔ parent or RIR space | The pool’s supply of addresses |
| Operating Region | A Region the IPAM monitors/operates in | IPAM attribute | Pools can only exist in these |
| Free / Advanced Tier | Capability + billing tier | IPAM attribute | Advanced = cross-account + BYOIP |
| BYOIP / BYOASN | Bring your own public prefix / ASN | Public scope | Advertise your space from AWS |
| ROA | Route Origin Authorization in RPKI | Your RIR | Required to advertise BYOIP |
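
To make the “issue and refuse” idea concrete, here is a minimal Python sketch of a pool that hands out CIDRs and rejects anything that overlaps, falls outside its supply, or comes from the wrong Region. The `Pool` class and its method names are hypothetical illustrations, not the IPAM API; real IPAM enforces these rules server-side.

```python
import ipaddress


class Pool:
    """Toy stand-in for an IPAM pool (hypothetical, not the AWS API)."""

    def __init__(self, cidr, locale=None):
        self.network = ipaddress.ip_network(cidr)
        self.locale = locale      # None models a locale-free top pool
        self.allocations = []     # networks already issued from this pool

    def allocate(self, cidr, region=None):
        """Issue cidr if it is legal for this pool, otherwise refuse."""
        net = ipaddress.ip_network(cidr)
        if self.locale is not None and region != self.locale:
            raise ValueError(f"pool locale {self.locale} cannot serve {region}")
        if not net.subnet_of(self.network):
            raise ValueError(f"{net} is outside pool {self.network}")
        for existing in self.allocations:
            if net.overlaps(existing):
                raise ValueError(f"{net} overlaps existing allocation {existing}")
        self.allocations.append(net)
        return net
```

A spreadsheet would happily record both 10.0.0.0/20 and 10.0.8.0/21; this sketch raises on the second request because the check happens at issue time rather than after the fact.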

Free Tier vs Advanced Tier — the capability split

The tier choice is not cosmetic; almost every organizational capability lives behind Advanced. Pick Free only for a single-account experiment. The exact split:

| Capability | Free Tier | Advanced Tier |
|---|---|---|
| Pools and allocation in one account/Region | Yes | Yes |
| Manual + auto allocation, basic util | Yes | Yes |
| Cross-account pools (RAM sharing) | No | Yes |
| Cross-Region pools | No | Yes |
| Org-wide resource monitoring | No | Yes |
| Overlap & compliance reporting org-wide | No | Yes |
| AWS/IPAM CloudWatch metrics | Limited | Full (per-pool utilization) |
| BYOIP (public IPv4/IPv6) management | No | Yes |
| BYOASN | No | Yes |
| Public IP Insights | No | Yes |
| Billing model | No per-IP charge | Per active managed IP / hour |

Why IPAM over the alternatives

Teams reach IPAM after a manual scheme has already hurt them. Seeing the alternatives side by side makes the trade explicit — what each approach can and cannot do:

| Capability | Spreadsheet / wiki | Naming-convention discipline | Custom DB + automation | VPC IPAM |
|---|---|---|---|---|
| Records who owns a CIDR | Yes (if updated) | Implicitly | Yes | Yes |
| Prevents a conflicting allocation | No | No | If you build it | Yes (refuses) |
| Detects existing overlap automatically | No | No | If you build it | Yes (per scope) |
| Enforces size/tag guardrails at create | No | No | If you build it | Yes (native) |
| Cross-account self-service | No | No | Heavy to build | Yes (via RAM) |
| Org-wide utilization monitoring | No | No | If you build it | Yes (AWS/IPAM) |
| BYOIP / advertisement control | No | No | No (manual tickets) | Yes |
| Ongoing maintenance burden | High (human) | High (human) | Very high (you own code) | Low (managed) |

The pattern is clear: every alternative either records without preventing, or forces you to build (and forever maintain) the prevention/monitoring that IPAM gives natively. The only scenario where a manual scheme wins is a single account with one or two VPCs that will never connect to anything.

Designing the pool hierarchy

IPAM lives in your networking/IPAM delegated account, not the management account. Reserve a dedicated account for it. A sound top-down design for an org is: one global top-level pool covering the whole supernet you control, a Regional pool per operating Region (locale-pinned), then per-environment or per-OU pools beneath each Region. The top pool stays locale-free so it can feed every Region; the layer below is pinned.

# In the IPAM delegated account
resource "aws_vpc_ipam" "org" {
  description = "Org-wide IPAM"
  tier        = "advanced"

  operating_regions { region_name = "eu-west-1" }
  operating_regions { region_name = "us-east-1" }
}

# Top-level pool: the whole RFC1918 supernet you control, locale-agnostic
resource "aws_vpc_ipam_pool" "top" {
  address_family = "ipv4"
  ipam_scope_id  = aws_vpc_ipam.org.private_default_scope_id
  description    = "Top-level - all private space"
}

resource "aws_vpc_ipam_pool_cidr" "top" {
  ipam_pool_id = aws_vpc_ipam_pool.top.id
  cidr         = "10.0.0.0/8"
}

# Regional pool, locale-pinned so VPCs in this Region draw from it
resource "aws_vpc_ipam_pool" "euw1" {
  address_family      = "ipv4"
  ipam_scope_id       = aws_vpc_ipam.org.private_default_scope_id
  locale              = "eu-west-1"
  source_ipam_pool_id = aws_vpc_ipam_pool.top.id
  description         = "eu-west-1 regional pool"
}

resource "aws_vpc_ipam_pool_cidr" "euw1" {
  ipam_pool_id   = aws_vpc_ipam_pool.euw1.id
  netmask_length = 12          # IPAM carves a free /12 from the /8
}

The CloudFormation equivalent uses AWS::EC2::IPAM, AWS::EC2::IPAMScope, AWS::EC2::IPAMPool and AWS::EC2::IPAMPoolCidr, so estates split across both tools share one allocation authority. The structural layers, what each does, and the rule for its locale:

| Layer | Resource | Locale | Provisioned with | Rule of thumb |
|---|---|---|---|---|
| IPAM | aws_vpc_ipam | Home Region + operating_regions | n/a | One per org, in the networking account |
| Scope | default private / public | n/a | n/a | Add scopes only for disconnected routing domains |
| Top pool | aws_vpc_ipam_pool | None (locale-free) | Explicit cidr (e.g. 10.0.0.0/8) | Feeds every Region; never locale-pin it |
| Regional pool | child of top | The Region (eu-west-1) | netmask_length from parent | One per operating Region |
| Env / OU pool | child of regional | Same Region | netmask_length from regional | Where guardrails + RAM shares attach |

A concrete sizing of a 10.0.0.0/8 tree shows how the layers divide and how much room each leaves — size generously so you rarely re-cut (pool size doesn’t drive cost, only active IPs do):

| Tier in the tree | Example block | Children it yields | Per-child capacity | Headroom |
|---|---|---|---|---|
| Top pool | 10.0.0.0/8 | up to 16× /12 | one /12 per Region | 16 Regions |
| Regional pool | 10.0.0.0/12 | up to 16× /16 | one /16 per environment | 16 envs/Region |
| Environment pool | 10.0.0.0/16 | up to 16× /20 | one /20 per VPC | 16 prod VPCs/env |
| Standard VPC | 10.0.0.0/20 | ~16× /24 subnets | one /24 per AZ/tier | ample subnetting |
| Small VPC | 10.0.0.0/22 | ~4× /24 subnets | one /24 per AZ | tight but workable |
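
The child counts in that sizing can be checked with Python's standard `ipaddress` module. This is a rough sketch of the arithmetic only; the `carve` helper is a hypothetical name, and it enumerates every possible child block rather than asking IPAM which one it would actually issue.

```python
import ipaddress


def carve(cidr, new_prefix):
    """List every child block of length new_prefix inside cidr."""
    return list(ipaddress.ip_network(cidr).subnets(new_prefix=new_prefix))


# Walk the tree from the sizing table: /8 -> /12 -> /16 -> /20 -> /24
regional = carve("10.0.0.0/8", 12)
envs = carve(str(regional[0]), 16)
vpcs = carve(str(envs[0]), 20)
subnets = carve(str(vpcs[0]), 24)

for name, blocks in [("regional", regional), ("env", envs),
                     ("vpc", vpcs), ("subnet", subnets)]:
    print(f"{name}: {len(blocks)} blocks, first {blocks[0]}, last {blocks[-1]}")
```

Each step adds four bits of prefix, which is why every tier yields 16 children; the small-VPC row adds only two bits (/22 to /24) and so yields 4.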

Pool attribute reference

Every meaningful pool attribute, its values, default, when to change it, and the gotcha:

| Attribute | Values | Default | When to set | Gotcha / limit |
|---|---|---|---|---|
| address_family | ipv4 / ipv6 | none (required) | Always | One family per pool; IPv6 pools differ in BYOIP rules |
| ipam_scope_id | a scope ID | private default | Always | Overlap is enforced only within this scope |
| source_ipam_pool_id | a parent pool ID | none (top pool) | Every non-top pool | Omitting it makes a top-level pool needing explicit CIDR |
| locale | a Region / Local Zone | none | Every pool that allocates to VPCs | No-locale pool cannot allocate to a regional VPC |
| auto_import | true / false | false | Discovery of existing CIDRs | true can auto-pull overlapping space; review first |
| publicly_advertisable | true / false | n/a (public pools) | Public-scope BYOIP pools | Only valid in the public scope |
| aws_service | ec2 | none | Public pools for EC2/EIP | Required for some public allocations |
| allocation_default_netmask_length | a prefix length | none | Leaf pools | Used only when caller omits a size |
| allocation_min_netmask_length | a prefix length | none | Leaf pools | Inclusive lower bound on prefix |
| allocation_max_netmask_length | a prefix length | none | Leaf pools | Inclusive upper bound on prefix |
| allocation_resource_tags | tag map | none | Leaf pools | Hard filter; missing tag → request rejected |

Scopes — when you need more than the defaults

Most organizations need exactly the two default scopes. You add a private scope only when you genuinely run disconnected routing domains — address space that will never route to the rest of the estate, where deliberate overlap is acceptable (think isolated lab or a fully air-gapped environment). Adding scopes to “organize” pools is a mistake: it disables the very overlap detection you came for, because overlap is computed within a scope, never across.

| Scope decision | Use a single private scope when… | Use additional private scopes when… |
|---|---|---|
| Routing | Everything may eventually route together (TGW/peering/VPN) | Domains are permanently isolated, no shared routing |
| Overlap detection | You want one collision-free address plan | You intend to reuse the same CIDRs in isolation |
| Operational cost | You want one place to reason about space | You can justify managing parallel plans |
| Typical count | 1 private + 1 public (the defaults) | Rare; only for true air-gaps or disjoint tenants |
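
The rule that overlap is computed within a scope and never across scopes can be sketched in a few lines of Python. This is an illustrative model, assuming a hypothetical input of `(scope, resource_id, cidr)` tuples; it is not how IPAM is implemented internally.

```python
import ipaddress
from itertools import combinations


def find_overlaps(resources):
    """Return (scope, id_a, id_b) for every colliding pair in the same scope.

    resources: iterable of (scope, resource_id, cidr) tuples (hypothetical shape).
    """
    by_scope = {}
    for scope, resource_id, cidr in resources:
        by_scope.setdefault(scope, []).append(
            (resource_id, ipaddress.ip_network(cidr))
        )

    collisions = []
    for scope, items in by_scope.items():
        # Only pairs inside one scope are compared; other scopes are invisible.
        for (id_a, net_a), (id_b, net_b) in combinations(items, 2):
            if net_a.overlaps(net_b):
                collisions.append((scope, id_a, id_b))
    return collisions


estate = [
    ("private", "vpc-a", "10.0.0.0/16"),
    ("private", "vpc-b", "10.0.128.0/17"),   # sits inside vpc-a's block
    ("lab", "vpc-c", "10.0.0.0/16"),         # same CIDR, different scope
]
print(find_overlaps(estate))
```

Note that `vpc-c` reuses 10.0.0.0/16 without being flagged, which is exactly the blind spot you create by adding scopes just to organize pools.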

Operating Regions and cross-Region behaviour

The IPAM has a home Region (where the IPAM resource lives) and a set of operating Regions (where it can create pools and monitor resources). Getting this wrong is a quiet trap: you cannot create a locale-pinned pool for a Region the IPAM doesn’t operate in, and resources in non-operating Regions simply aren’t monitored. The behaviours to know:

| Aspect | Behaviour | Implication |
|---|---|---|
| Home Region | Where the aws_vpc_ipam resource is created | Pick a stable primary Region; it anchors the IPAM |
| Operating Regions | Regions the IPAM operates/monitors in | Must include every Region you’ll allocate into |
| Adding an operating Region | Modify the IPAM’s operating_regions | Do it before creating that Region’s pools |
| Locale-pinned pool in a non-operating Region | Not allowed | Add the operating Region first |
| Resource in a non-operating Region | Not monitored | Overlap/util reports miss it (blind spot) |
| Cross-Region pool sharing | Advanced Tier only | Free Tier is single-Region |
| Removing an operating Region | Blocked while pools/resources exist there | Reclaim that Region’s space first |
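
A pre-flight check along these lines can catch the locale trap before a plan is applied. It is a small sketch that assumes you already have the IPAM's operating Regions and the planned pool locales as plain Python lists; the function name is hypothetical.

```python
def pool_locale_errors(operating_regions, pool_locales):
    """Return planned pool locales that the IPAM does not operate in.

    A locale of None models a locale-free top pool and is always accepted.
    """
    operating = set(operating_regions)
    return sorted(
        locale for locale in pool_locales
        if locale is not None and locale not in operating
    )


missing = pool_locale_errors(
    ["eu-west-1", "us-east-1"],
    [None, "eu-west-1", "ap-southeast-2"],
)
print(missing)
```

Anything the function returns is a Region to add to operating_regions before that Region's pools are created.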

Allocation guardrails on leaf pools

Beneath each Regional pool, create one pool per environment (or per OU). This is where you bake in allocation rules so the pool refuses non-compliant requests: a default netmask, inclusive bounds on prefix sizes, and required tags on every allocation.

resource "aws_vpc_ipam_pool" "euw1_prod" {
  address_family      = "ipv4"
  ipam_scope_id       = aws_vpc_ipam.org.private_default_scope_id
  locale              = "eu-west-1"
  source_ipam_pool_id = aws_vpc_ipam_pool.euw1.id
  description         = "eu-west-1 prod"

  # Allocation guardrails
  allocation_default_netmask_length = 20   # default VPC size if caller omits one
  allocation_min_netmask_length     = 16   # nobody may grab bigger than /16
  allocation_max_netmask_length     = 24   # nor smaller than /24

  # Every allocation MUST carry these tags or the request is rejected
  allocation_resource_tags = {
    Environment = "prod"
  }
}

resource "aws_vpc_ipam_pool_cidr" "euw1_prod" {
  ipam_pool_id   = aws_vpc_ipam_pool.euw1_prod.id
  netmask_length = 16
}

allocation_min_netmask_length and allocation_max_netmask_length are inclusive bounds on the prefix length, not the host count, so “min 16 / max 24” means allocations between a /16 and a /24. Because larger prefix length = smaller network, min is the biggest block someone may take and max is the smallest — the inversion trips everyone the first time. The allocation_resource_tags map is a hard filter: a VPC create that does not carry Environment=prod will not draw from this pool.
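
Because the min/max inversion is easy to get backwards, a tiny Python sketch of the check helps. It mirrors the example pool above (min 16, max 24, Environment=prod) but is only an illustration; the `check_request` function and its messages are hypothetical, and the real rejection happens inside IPAM.

```python
def check_request(netmask, tags, *, min_len=16, max_len=24, required_tags=None):
    """Return (ok, reason) for a requested prefix length and tag set."""
    required_tags = required_tags or {}
    if netmask < min_len:
        # Shorter prefix length means a bigger block than the pool allows.
        return False, f"/{netmask} is larger than the biggest allowed block /{min_len}"
    if netmask > max_len:
        # Longer prefix length means a smaller block than the pool allows.
        return False, f"/{netmask} is smaller than the smallest allowed block /{max_len}"
    for key, value in required_tags.items():
        if tags.get(key) != value:
            return False, f"missing or wrong tag {key}={value}"
    return True, "ok"


prod = {"Environment": "prod"}
print(check_request(20, prod, required_tags=prod))   # within bounds, tagged
print(check_request(15, prod, required_tags=prod))   # too big
print(check_request(25, prod, required_tags=prod))   # too small
print(check_request(20, {}, required_tags=prod))     # untagged
```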

The guardrail-to-effect mapping, with the exact failure when violated:

| Guardrail | Controls | Example | What a violation does | How it manifests |
|---|---|---|---|---|
| allocation_default_netmask_length | Size when caller omits one | 20 | n/a (it’s a default) | Caller gets a /20 silently |
| allocation_min_netmask_length | Largest block allowed | 16 | Request for /15 rejected | CreateVpc/allocate fails |
| allocation_max_netmask_length | Smallest block allowed | 24 | Request for /25 rejected | CreateVpc/allocate fails |
| allocation_resource_tags | Mandatory tags | Environment=prod | Untagged request rejected | CreateVpc fails with tag error |
| Pool free space | Available addresses | /16 supply | Request larger than free space | “insufficient space” error |
| Scope uniqueness | No overlap in scope | n/a | Manual CIDR that collides | Allocation refused / flagged |

A prefix-length cheat sheet so nobody guesses how big a block actually is:

| Prefix | Total addresses | Usable VPC hosts (minus 5 AWS-reserved/subnet) | Typical use |
|---|---|---|---|
| /16 | 65,536 | up to ~65,000 | A large region/env supernet |
| /18 | 16,384 | ~16,000 | A big environment pool |
| /20 | 4,096 | ~4,091 | A standard production VPC |
| /22 | 1,024 | ~1,019 | A medium VPC / EKS cluster |
| /24 | 256 | ~251 | A small VPC / single workload |
| /28 | 16 | ~11 | Smallest practical subnet block |
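
The numbers in the cheat sheet follow from one formula: an IPv4 block of prefix length n holds 2^(32 - n) addresses, and AWS reserves 5 addresses in each subnet. A minimal Python sketch, assuming the whole block is used as a single subnet (splitting it into more subnets costs 5 addresses per subnet):

```python
AWS_RESERVED_PER_SUBNET = 5  # addresses AWS holds back in every subnet


def block_size(prefix_len):
    """Return (total, usable) addresses for an IPv4 block used as one subnet."""
    total = 2 ** (32 - prefix_len)
    return total, total - AWS_RESERVED_PER_SUBNET


for prefix in (20, 22, 24, 28):
    total, usable = block_size(prefix)
    print(f"/{prefix}: {total} total, {usable} usable as a single subnet")
```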

Delegating administration and sharing pools via AWS RAM

Two things must be wired before member accounts can self-serve. First, delegate IPAM to your networking account so it (not the org management account) administers IPAM. Run this once from the management account:

aws ec2 enable-ipam-organization-admin-account \
  --delegated-admin-account-id 222233334444

This also enables the trusted access between IPAM and Organizations that powers org-wide monitoring. Second, share the leaf pools to the accounts or OUs that should allocate from them, using AWS RAM. You share the pool, not the IPAM, and RAM sharing with your organization must be enabled first.

resource "aws_ram_resource_share" "ipam_prod" {
  name                      = "ipam-euw1-prod"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "ipam_prod" {
  resource_arn       = aws_vpc_ipam_pool.euw1_prod.arn
  resource_share_arn = aws_ram_resource_share.ipam_prod.arn
}

# Share to an entire OU (principal = OU ARN) or to specific account IDs
resource "aws_ram_principal_association" "ipam_prod_ou" {
  principal          = "arn:aws:organizations::111122223333:ou/o-exampleorgid/ou-prod-abcd1234"
  resource_share_arn = aws_ram_resource_share.ipam_prod.arn
}

Once shared, a workload account can reference the pool ID directly in its own VPC definition. It cannot see other pools, cannot widen the netmask bounds, and every CIDR it pulls is registered centrally in the IPAM account. The sharing model, what each piece does, and the failure if you skip it:

| Step / setting | What it does | Owned by | Failure if skipped | Confirm with |
|---|---|---|---|---|
| enable-ipam-organization-admin-account | Delegates IPAM to net account | Management account | Net team can’t admin IPAM | aws organizations list-delegated-administrators --service-principal ipam.amazonaws.com |
| RAM “sharing with Organizations” | Allows org-internal shares | Management account | Shares to OUs/accounts fail | RAM settings page |
| aws_ram_resource_share | The share container | Net account | Nothing to attach pools to | get-resource-shares |
| resource_association (pool ARN) | Puts the pool in the share | Net account | Members can’t see the pool | list-resources on the share |
| principal_association (OU/acct) | Grants the principal access | Net account | Member account sees nothing | get-resource-share-associations |
| allow_external_principals=false | Blocks outside-org shares | Net account | Risk of sharing externally | Share config |

What a member account can and cannot do once a pool is shared to it — the trust boundary:

| Action in a member account | Allowed? | Why |
|---|---|---|
| Reference the shared pool ID in a VPC | Yes | That is the point of the share |
| Draw a CIDR within the netmask bounds | Yes | Within the pool’s guardrails |
| See other pools in the IPAM | No | Only shared pools are visible |
| Widen allocation_min/max_netmask_length | No | Guardrails are pool-owned |
| Skip the required allocation_resource_tags | No | Hard filter rejects it |
| Delete or modify the pool | No | Pool lives in the delegated account |
| View org-wide overlap reports | No | Monitoring is centralized |
View org-wide overlap reports No Monitoring is centralized

Automating VPC CIDR allocation

This is the payoff. A spoke VPC no longer hard-codes a block; it names a pool and a size, and IPAM hands back a free, non-overlapping CIDR. Run this from the member account that received the RAM share:

resource "aws_vpc" "spoke" {
  ipv4_ipam_pool_id   = "ipam-pool-0prodshared0euw1"  # the shared pool ID
  ipv4_netmask_length = 20                            # ask for a /20

  tags = {
    Environment = "prod"   # required by the pool's allocation_resource_tags
    Name        = "payments-prod"
  }
}

CloudFormation expresses the same contract through Ipv4IpamPoolId and Ipv4NetmaskLength on AWS::EC2::VPC. You can also reserve space outside of a VPC — for a future EKS secondary CIDR, an on-prem block you are reconciling, or a peer’s range — with an explicit allocation that carves the space so IPAM will never re-issue it:

aws ec2 allocate-ipam-pool-cidr \
  --ipam-pool-id ipam-pool-0prodshared0euw1 \
  --netmask-length 22 \
  --description "reserved for eks-prod secondary CIDR"

The CLI returns an IpamPoolAllocationId; hold onto it, because that is what you use to release the reservation later. The allocation methods, when to use each, and how IPAM treats it:

| Allocation method | How you invoke it | When to use | Released when… |
|---|---|---|---|
| Auto, by netmask | ipv4_netmask_length on the VPC | The normal case: let IPAM pick | The VPC is deleted |
| Auto, specific CIDR | --cidr on allocate-ipam-pool-cidr | You need a particular block | You release the allocation |
| Manual reservation | allocate-ipam-pool-cidr --netmask-length | Hold space for future use | release-ipam-pool-allocation by alloc ID |
| Import existing VPC | bring a live VPC under management | Adopt without renumbering | The VPC is deleted / re-imported |
| Secondary VPC CIDR | second --ipv4-netmask-length association | Grow a VPC (e.g. EKS) | The association is removed |
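
The allocate-by-netmask and release-by-ID contract can be modelled with a short first-fit allocator. This is only a sketch: the `LeafPool` class and the `alloc-N` ID format are hypothetical, and first-fit is an assumption, since the order in which IPAM picks free blocks is not stated here.

```python
import ipaddress
import itertools


class LeafPool:
    """Toy first-fit allocator (hypothetical; not the IPAM implementation)."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self.allocations = {}              # allocation id -> network
        self._ids = itertools.count(1)

    def allocate(self, netmask_length):
        """Return (allocation_id, network) for the first free block of that size."""
        for candidate in self.network.subnets(new_prefix=netmask_length):
            if not any(candidate.overlaps(n) for n in self.allocations.values()):
                alloc_id = f"alloc-{next(self._ids)}"   # made-up ID format
                self.allocations[alloc_id] = candidate
                return alloc_id, candidate
        raise RuntimeError("insufficient space in pool")

    def release(self, alloc_id):
        """Free a block by its allocation ID, as the CLI flow requires."""
        return self.allocations.pop(alloc_id)


pool = LeafPool("10.0.0.0/16")
vpc_id, vpc_cidr = pool.allocate(20)     # a /20 for a VPC
eks_id, eks_cidr = pool.allocate(22)     # a /22 reservation
print(vpc_cidr, eks_cidr)
```

Releasing needs the ID returned at allocation time, which is why the text above says to hold onto the IpamPoolAllocationId.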

The most useful allocation-state CLI calls, and what each one answers:

| Question | Command | Key fields |
|---|---|---|
| What did this pool hand out? | get-ipam-pool-allocations --ipam-pool-id … | Cidr, ResourceType, ResourceOwner |
| Which resources/CIDRs exist org-wide? | get-ipam-resource-cidrs --ipam-scope-id … | ResourceId, OverlapStatus, ComplianceStatus |
| What CIDRs are in a pool (supply)? | get-ipam-pool-cidrs --ipam-pool-id … | Cidr, State |
| What is the pool tree depth/source? | describe-ipam-pools | PoolDepth, SourceResource, Locale |
| Is delegation in place? | aws organizations list-delegated-administrators --service-principal ipam.amazonaws.com | DelegatedAdministrators[].Id |

Declarative provisioning end to end

The whole point is that network teams stop touching the console. The pool hierarchy, RAM shares, and alarms live in the IPAM account’s Terraform state; member-account modules reference shared pool IDs as variables. A reusable VPC module never needs a CIDR input again:

variable "ipam_pool_id" { type = string }
variable "vpc_netmask" {
  type    = number
  default = 22
}
variable "environment"  { type = string }

resource "aws_vpc" "this" {
  ipv4_ipam_pool_id   = var.ipam_pool_id
  ipv4_netmask_length = var.vpc_netmask
  tags                = { Environment = var.environment }
}

The division of state, who owns each file, and what not to put where:

| Lives in IPAM account state | Lives in member account / VPC module | Never hard-code anywhere |
|---|---|---|
| aws_vpc_ipam, scopes | aws_vpc with ipv4_ipam_pool_id | A literal CIDR in a VPC |
| Top / Regional / leaf pools | Subnets carved from the drawn CIDR | A pool ID copy-pasted (use a var/output) |
| allocation_* guardrails | Workload resources | The netmask bounds (pool owns them) |
| RAM shares + principal assoc | Required tags (Environment=…) | Manual reservations done by hand long-term |
| AWS/IPAM CloudWatch alarms | App-specific routing/SGs | |

Monitoring utilization, overlap, and exhaustion

IPAM continuously computes utilization for every pool and every monitored resource. Query it on demand to find pools that are filling up, and resources that overlap:

# Resource-level view across the whole IPAM: which VPCs/EIPs, and their util %
aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=management-state,Values=managed

# Overlapping + non-compliant resources, surfaced directly
aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=overlap-status,Values=overlapping \
            Name=compliance-status,Values=noncompliant

The richer signal is the metrics IPAM publishes. With Advanced Tier and state monitoring enabled, IPAM emits per-pool metrics — allocation counts, available address counts, and the all-important utilization ratio — to CloudWatch under the AWS/IPAM namespace. Alarm on the pool before it exhausts, not after a VPC create fails in prod:

resource "aws_cloudwatch_metric_alarm" "prod_pool_exhaustion" {
  alarm_name          = "ipam-euw1-prod-utilization-high"
  namespace           = "AWS/IPAM"
  metric_name         = "IPAMPoolAllocationUtilizationPercentage"
  dimensions = {
    IpamId       = aws_vpc_ipam.org.id
    IpamPoolId   = aws_vpc_ipam_pool.euw1_prod.id
    IpamScopeId  = aws_vpc_ipam.org.private_default_scope_id
  }
  statistic           = "Maximum"
  period              = 3600
  evaluation_periods  = 1
  threshold           = 80
  comparison_operator = "GreaterThanOrEqualToThreshold"
  alarm_actions       = [aws_sns_topic.netops.arn]
}

A pre-existing VPC that overlaps shows up in the overlap report the moment IPAM starts monitoring its account — which is exactly how you find the landmines the spreadsheet missed. The IPAM resource statuses, what each means, and the action it demands:

| Status field | Value | Meaning | Action |
|---|---|---|---|
| ManagementState | managed | IPAM tracks this CIDR | Normal; appears in reports |
| ManagementState | unmanaged | Seen but not under a pool | Import if it should be governed |
| ManagementState | ignored | Explicitly excluded | None (you chose to skip it) |
| OverlapStatus | nonoverlapping | Unique in scope | Healthy |
| OverlapStatus | overlapping | Collides with another CIDR | Plan a renumber / isolate |
| ComplianceStatus | compliant | Within the pool’s rules | Healthy |
| ComplianceStatus | noncompliant | Violates guardrails | Fix tags/size or move pool |
| ComplianceStatus | unmanaged | No governing pool | Import to govern |
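
The status-to-action mapping can be applied mechanically to saved CLI output. The sketch below assumes the JSON from get-ipam-resource-cidrs has a top-level `IpamResourceCidrs` list whose items carry `ResourceId`, `ManagementState`, `OverlapStatus` and `ComplianceStatus`; treat that shape as an assumption and check it against your own output. The sample record is made up.

```python
import json


def triage(cli_json):
    """Map each resource's statuses to the action from the status table."""
    actions = []
    for item in json.loads(cli_json).get("IpamResourceCidrs", []):
        resource_id = item.get("ResourceId")
        if item.get("OverlapStatus") == "overlapping":
            actions.append((resource_id, "plan a renumber or isolate"))
        if item.get("ComplianceStatus") == "noncompliant":
            actions.append((resource_id, "fix tags/size or move pool"))
        if item.get("ManagementState") == "unmanaged":
            actions.append((resource_id, "import if it should be governed"))
    return actions


# Hypothetical sample, shaped like the assumed CLI output
sample = json.dumps({"IpamResourceCidrs": [
    {"ResourceId": "vpc-0aaa", "ManagementState": "managed",
     "OverlapStatus": "nonoverlapping", "ComplianceStatus": "compliant"},
    {"ResourceId": "vpc-0bbb", "ManagementState": "unmanaged",
     "OverlapStatus": "overlapping", "ComplianceStatus": "unmanaged"},
]})
print(triage(sample))
```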

The AWS/IPAM metrics worth alarming on, and the threshold to start with:

| Metric (namespace AWS/IPAM) | What it tells you | Starting threshold | Why it’s leading |
|---|---|---|---|
| IPAMPoolAllocationUtilizationPercentage | How full a pool is | ≥ 80% | Warns before a prod VPC create fails |
| Available address count | Raw free space left | pool-specific floor | Absolute headroom, not just % |
| Allocation count | Number of CIDRs handed out | trend, not threshold | Sudden spikes = misuse or sprawl |
| Compliance/overlap (via resource report) | Bad/colliding CIDRs | any > 0 | Catches imports that collide |
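
To sanity-check an alarm threshold offline, pool utilization can be approximated as allocated addresses divided by provisioned addresses. That definition is an assumption made for this sketch, and the published metric may be computed differently; the function names are hypothetical.

```python
import ipaddress


def utilization_percent(pool_cidrs, allocated_cidrs):
    """Share of the pool's provisioned addresses that are already allocated."""
    supply = sum(ipaddress.ip_network(c).num_addresses for c in pool_cidrs)
    used = sum(ipaddress.ip_network(c).num_addresses for c in allocated_cidrs)
    return 100.0 * used / supply if supply else 0.0


def should_alarm(pool_cidrs, allocated_cidrs, threshold=80.0):
    """True once utilization reaches the threshold (80% matches the alarm above)."""
    return utilization_percent(pool_cidrs, allocated_cidrs) >= threshold


# A /16 prod pool holds sixteen /20 VPCs; the thirteenth crosses 80%.
vpc_blocks = [str(n) for n in ipaddress.ip_network("10.0.0.0/16").subnets(new_prefix=20)]
print(utilization_percent(["10.0.0.0/16"], vpc_blocks[:12]))
print(should_alarm(["10.0.0.0/16"], vpc_blocks[:13]))
```

With a /16 pool and /20 VPCs, an 80% threshold leaves room for only three more VPCs when it fires, which is a useful way to decide whether 80% is early enough for your provisioning rate.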

Bring your own public IP space (BYOIP) and BYOASN

Public space lives in the IPAM public scope. To advertise your own range from AWS, you provision it into a public-scope pool, prove ownership, then advertise it. Two ownership requirements matter: the IPv4 prefix must be a /24 or shorter (a /24 is the most specific block that is globally routable), and you need a valid ROA (Route Origin Authorization) in your RIR’s RPKI that names Amazon’s ASNs (16509 and 14618) as authorized origins so the advertisement passes RPKI validation.

# Provision your CIDR into a public-scope pool (publicly-advertisable)
aws ec2 provision-ipam-pool-cidr \
  --ipam-pool-id ipam-pool-0publicpoolexample \
  --cidr 203.0.113.0/24 \
  --cidr-authorization-context \
      Message="$MSG",Signature="$SIG"

The authorization context is a signed message proving you control the block; you sign a message string with the private key whose public half is published in your RDAP/whois record. Provisioning runs an asynchronous verification — poll get-ipam-pool-cidrs until state is provisioned. Only then advertise it:

aws ec2 advertise-byoip-cidr --cidr 203.0.113.0/24

# and to withdraw it (e.g., before migrating advertisement back on-prem)
aws ec2 withdraw-byoip-cidr --cidr 203.0.113.0/24

You can also bring your own ASN (BYOASN) so EC2 and Global Accelerator advertise your prefixes from your ASN rather than Amazon’s. Provision the ASN to the IPAM, then associate it with the BYOIP CIDR it should originate (associate-ipam-byoasn) before advertising:

# 64512 is only a placeholder from the private-use range; substitute the public ASN registered to you
aws ec2 provision-ipam-byoasn \
  --ipam-id ipam-0abc123example \
  --asn 64512 \
  --asn-authorization-context \
      Message="$MSG",Signature="$SIG"

Keep advertisement under your control: provision and verify with advertisement off, cut over DNS, then advertise-byoip-cidr; withdraw before you ever move the announcement back to on-prem so you never have the same prefix announced from two origins. The BYOIP requirements, why each exists, and the failure if you miss it:

| Requirement | Value / rule | Why | Failure if missing |
| --- | --- | --- | --- |
| Minimum IPv4 prefix | /24 | Smallest globally routable block | Provision/advertise rejected |
| Minimum IPv6 prefix (public adv.) | /48 | Smallest routable IPv6 advertisement | Advertise rejected |
| ROA in RPKI | Origin ASN 16509 (AWS) | Advertisement must pass RPKI validation | Advertise fails validation |
| Authorization context | Signed message + signature | Proves you control the prefix | Provision fails ownership check |
| Public-scope pool | publicly_advertisable=true | Public space is a separate scope | Wrong scope → cannot advertise |
| Verification complete | State = provisioned | Async ownership check must finish | Advertise too early fails |
| BYOASN ROA (if used) | ROA authorizes your ASN | So your ASN may originate the prefix | Advertisement from your ASN fails |
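
The two size rules in the table can be checked before you ever call provision-ipam-pool-cidr. A minimal sketch, assuming only the /24 and /48 limits stated above; byoip_prefix_ok is a hypothetical helper and it does not validate the address itself.

```shell
#!/usr/bin/env bash
# byoip_prefix_ok <ipv4|ipv6> <cidr>
# Returns 0 when the block is large enough to advertise publicly,
# 1 when it is too small, 2 when the input cannot be parsed.
byoip_prefix_ok() {
  local family="$1" cidr="$2"
  local len="${cidr##*/}"            # text after the last slash
  case "$len" in
    ''|*[!0-9]*) return 2 ;;         # no numeric prefix length found
  esac
  case "$family" in
    ipv4) [ "$len" -le 24 ] ;;       # /24 or a larger block
    ipv6) [ "$len" -le 48 ] ;;       # /48 or a larger block
    *)    return 2 ;;
  esac
}

byoip_prefix_ok ipv4 203.0.113.0/24   && echo "203.0.113.0/24: large enough"
byoip_prefix_ok ipv4 203.0.113.128/25 || echo "203.0.113.128/25: too small to advertise"
```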

The BYOIP lifecycle states and what each one means operationally:

| State (get-ipam-pool-cidrs) | Meaning | Can you advertise? | Next step |
| --- | --- | --- | --- |
| pending-provision | Ownership verification running | No | Wait; poll the state |
| provisioned | Verified and in the pool | Yes (after cutover) | advertise-byoip-cidr |
| failed-provision | Ownership check failed | No | Fix ROA / auth context, retry |
| advertised | Announced from AWS | (already) | Monitor reachability |
| withdrawing / deprovisioning | Being removed | No | Wait for completion |
| pending-deprovision | Removal in progress | No | Ensure no resources still use it |
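
The "advertise only from provisioned" rule is easy to enforce in a wrapper so that nobody announces a prefix that is still verifying. A hedged sketch: advertise_if_provisioned is a hypothetical function name, and it assumes the CIDR string you pass matches the Cidr field returned by get-ipam-pool-cidrs exactly.

```shell
#!/usr/bin/env bash
# advertise_if_provisioned <pool-id> <cidr>
# Looks up the CIDR's state in the pool and advertises only when verified.
advertise_if_provisioned() {
  local pool_id="$1" cidr="$2" state
  state=$(aws ec2 get-ipam-pool-cidrs --ipam-pool-id "$pool_id" \
            --query "IpamPoolCidrs[?Cidr=='${cidr}'].State | [0]" \
            --output text)
  if [ "$state" != "provisioned" ]; then
    echo "refusing to advertise $cidr: state is ${state:-unknown}" >&2
    return 1
  fi
  aws ec2 advertise-byoip-cidr --cidr "$cidr"
}

# Example call (pool ID is the placeholder used earlier in this section):
# advertise_if_provisioned ipam-pool-0publicpoolexample 203.0.113.0/24
```

The wrapper deliberately knows nothing about DNS or the on-prem announcement; those remain human-gated steps in the cutover.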

IPv4 vs IPv6 BYOIP differ in important ways — do not assume the IPv4 rules transfer:

| Aspect | IPv4 BYOIP | IPv6 BYOIP |
| --- | --- | --- |
| Smallest advertisable | /24 | /48 |
| Default visibility | Publicly advertisable | Can be private (not advertised) or public |
| Use as EIPs | Yes (from the pool) | N/A (IPv6 not EIP-based) |
| ROA origin ASN | 16509 (or your BYOASN) | 16509 (or your BYOASN) |
| Typical motivation | IP allow-lists, reputation, lift-and-shift | Bring an owned IPv6 block onto AWS |

Architecture at a glance

Follow the control path left to right. It begins in the management account, where you enable trusted access with AWS Organizations and run enable-ipam-organization-admin-account exactly once to delegate IPAM to a dedicated networking account — the management account never administers IPAM day to day. In the IPAM delegated account sits the IPAM itself (home Region plus operating_regions), the pool tree (a locale-free 10.0.0.0/8 top pool, locale-pinned /12 Regional pools, /16 per-environment leaf pools), and the allocation guardrails (min/max netmask, required tags) that make a pool refuse a bad request. From there a leaf pool is published through an AWS RAM resource share — the pool ARN, scoped to an OU and blocking external principals — so that member accounts can reference the pool ID and have a /20 drawn for each spoke VPC without ever picking a CIDR. Every drawn CIDR is registered back in the IPAM account, where utilization and overlap are computed and a CloudWatch alarm on AWS/IPAM fires at 80%. A separate branch into the public scope provisions BYOIP space (/24 minimum, a ROA naming ASN 16509) and advertises it only after DNS cutover.

The five numbered badges mark exactly where this path fails in practice: a locale mismatch stalls allocation, a guardrail rejects an untagged or wrong-sized request, a RAM share done wrong leaves the member account seeing nothing, an overlap or exhaustion condition surfaces in the monitoring branch, and a BYOIP advertisement is refused for a missing ROA or premature announcement. Read the diagram once to internalise the flow, then use the legend as the first triage table when an allocation or advertisement does not behave.

Figure: IPAM allocation control path showing Organizations and the management account delegating IPAM to a networking account, the locale-free top pool branching into locale-pinned Regional and per-environment leaf pools with min/max-netmask and required-tag guardrails, a RAM resource share publishing a leaf pool to an OU, member accounts drawing per-VPC CIDRs that register back for utilization and overlap monitoring with an AWS/IPAM 80% CloudWatch alarm, and a public-scope BYOIP pool with a /24 minimum and ASN 16509 ROA advertising only after cutover. Numbered failure badges mark locale mismatch, guardrail rejection, RAM share visibility, overlap/exhaustion, and BYOIP advertisement.

Real-world scenario

A fintech platform team I worked with ran 60+ accounts that had grown organically before any address governance existed. Three separate teams had independently landed VPCs on 10.10.0.0/16. The pain surfaced when they stood up a central Transit Gateway for shared services: the second and third attachments advertising 10.10.0.0/16 were silently unusable, because a TGW route table cannot hold the same prefix twice. Renumbering a live, regulated payments VPC mid-quarter was not on the table.

The constraint was hard: they could not renumber the offending VPCs quickly, but they had to know the full blast radius before the TGW migration, and prevent any new overlap from that day forward. They stood up Advanced-Tier IPAM in a dedicated networking account, defined a 10.0.0.0/8 top-level pool with Regional (/12) and per-environment (/16) children, and imported every existing VPC’s CIDR rather than recreating anything. The import immediately populated the overlap report:

aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=overlap-status,Values=overlapping \
  --query 'IpamResourceCidrs[].{Vpc:ResourceId,Cidr:ResourceCidr,Acct:ResourceOwnerId}'

That single command turned “we think three VPCs collide” into an exact list of resource IDs, CIDRs, and owning accounts — the precise scope they needed to plan migrations. From that point, every new VPC was forced through RAM-shared pools with allocation_resource_tags enforcing environment separation, so the overlap set could only shrink. They renumbered the two least-critical colliding VPCs over the following two sprints, left the regulated one isolated behind PrivateLink until its planned window, and never created a fresh overlap again.
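
When the overlap list is long, it helps to collapse it into collision groups. The sketch below is an assumption about how you might post-process the report, not part of IPAM: group_overlaps is a hypothetical helper that groups only identical CIDRs (it will not pair a /16 with a /20 nested inside it) and expects tab-separated CIDR, resource ID, and account columns on stdin.

```shell
#!/usr/bin/env bash
# group_overlaps: read "CIDR<TAB>resource-id<TAB>account" lines on stdin and
# print one line per CIDR that appears more than once, with its owners.
group_overlaps() {
  awk -F'\t' '
    { count[$1]++; members[$1] = members[$1] " " $2 "(" $3 ")" }
    END {
      for (cidr in count)
        if (count[cidr] > 1)
          printf "%s x%d:%s\n", cidr, count[cidr], members[cidr]
    }' | sort
}

# Feed it the same report, reshaped to three text columns:
# aws ec2 get-ipam-resource-cidrs \
#   --ipam-scope-id ipam-scope-0abc123 \
#   --filters Name=overlap-status,Values=overlapping \
#   --query 'IpamResourceCidrs[].[ResourceCidr,ResourceId,ResourceOwnerId]' \
#   --output text | group_overlaps
```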

The numbers tell the story of the migration arc, week by week:

| Week | State | What changed | Overlap count | Outcome |
| --- | --- | --- | --- | --- |
| 0 | Discovery | Stood up IPAM, imported all VPC CIDRs | 3 colliding VPCs | Exact blast radius known |
| 1 | Enforced | RAM-shared pools + tag guardrails live | 3 (frozen) | No new overlap possible |
| 2–3 | Renumber A | Migrated test-tier collider to a drawn /20 | 2 | One collision cleared |
| 4–5 | Renumber B | Migrated staging collider | 1 | Down to the regulated VPC |
| 6 | Isolate C | Regulated VPC behind PrivateLink | 1 (isolated) | TGW migration unblocked |
| 10 | Window | Renumbered the regulated VPC in its window | 0 | Estate fully collision-free |

The lesson: IPAM’s real first-day value was not allocation, it was discovery of the overlap nobody could previously see — and then making it structurally impossible to add another.

Advantages and disadvantages

IPAM trades a one-time design-and-delegate cost for permanent, enforced address governance. Weigh it honestly before you commit a team to operating it:

| Advantages | Disadvantages |
| --- | --- |
| Allocations are guaranteed unique within a scope, so overlap becomes structurally impossible | You must design the pool tree up front; a bad tree is painful to re-cut later |
| Overlap discovery on import surfaces landmines the spreadsheet never could | Importing existing estates can reveal a large, awkward backlog of collisions |
| Guardrails (min/max netmask, required tags) reject bad requests automatically | Mis-set bounds (the min/max inversion) silently block legitimate VPC creates |
| RAM sharing lets workload accounts self-serve without seeing other pools | Sharing requires Organizations + RAM correctly wired; easy to half-configure |
| AWS/IPAM metrics alarm on exhaustion before a prod create fails | State monitoring + alarms are something you must set up, not a default |
| BYOIP/BYOASN brings your public space onto AWS with controlled advertisement | BYOIP depends on external RIR/RPKI (ROA) you don’t fully control |
| One declarative source of truth in Terraform/CloudFormation | Advanced Tier bills per active managed IP, a real (if modest) line item |

IPAM is the right call for any organization with more than a handful of accounts, a hub-and-spoke Transit Gateway, hybrid connectivity to on-prem, or a need to advertise owned public space. It is overkill for a single account with one or two VPCs that will never peer or connect to anything — there, a documented manual CIDR is fine. The disadvantages are all manageable: the pool tree is the only decision you must get roughly right up front, and even that can be extended (new Regional/leaf pools) far more easily than it can be fundamentally re-shaped.

Hands-on lab

Stand up a minimal IPAM, build a two-level pool tree with guardrails, draw a VPC CIDR, and prove the guardrails reject a bad request — then tear it all down. This is single-account and Free-Tier-friendly for the basic pool/allocation steps (no cross-account RAM, no BYOIP). Run in CloudShell.

Step 1 — Variables.

REGION=eu-west-1
export AWS_DEFAULT_REGION=$REGION

Step 2 — Create the IPAM (Free Tier, one operating Region).

IPAM=$(aws ec2 create-ipam \
  --operating-regions RegionName=$REGION \
  --query 'Ipam.IpamId' --output text)
SCOPE=$(aws ec2 describe-ipams --ipam-ids $IPAM \
  --query 'Ipams[0].PrivateDefaultScopeId' --output text)
echo "IPAM=$IPAM SCOPE=$SCOPE"

Expected: an ipam-… ID and an ipam-scope-… ID.

Step 3 — Top pool, provision 10.0.0.0/16 into it.

TOP=$(aws ec2 create-ipam-pool --ipam-scope-id $SCOPE \
  --address-family ipv4 --query 'IpamPool.IpamPoolId' --output text)
aws ec2 provision-ipam-pool-cidr --ipam-pool-id $TOP --cidr 10.0.0.0/16

Step 4 — Leaf pool with guardrails (locale-pinned, required tag, netmask bounds).

LEAF=$(aws ec2 create-ipam-pool --ipam-scope-id $SCOPE \
  --address-family ipv4 --source-ipam-pool-id $TOP --locale $REGION \
  --allocation-default-netmask-length 24 \
  --allocation-min-netmask-length 20 \
  --allocation-max-netmask-length 26 \
  --allocation-resource-tags Key=Environment,Value=lab \
  --query 'IpamPool.IpamPoolId' --output text)
aws ec2 provision-ipam-pool-cidr --ipam-pool-id $LEAF --netmask-length 20

Wait until the leaf pool’s provisioned CIDR shows state=provisioned:

aws ec2 get-ipam-pool-cidrs --ipam-pool-id $LEAF \
  --query 'IpamPoolCidrs[].State' --output text
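
Rather than re-running that command by hand, the wait can be scripted. A minimal polling sketch: wait_for_pool_cidr is a hypothetical helper, the attempt count and delay defaults are arbitrary, and it only inspects the first CIDR in the pool, which is enough for this lab.

```shell
#!/usr/bin/env bash
# wait_for_pool_cidr <pool-id> [max-attempts] [sleep-seconds]
# Polls the pool's first CIDR until it is provisioned, fails, or times out.
wait_for_pool_cidr() {
  local pool_id="$1" attempts="${2:-30}" delay="${3:-10}" state="" i=1
  while [ "$i" -le "$attempts" ]; do
    state=$(aws ec2 get-ipam-pool-cidrs --ipam-pool-id "$pool_id" \
              --query 'IpamPoolCidrs[0].State' --output text)
    case "$state" in
      provisioned)      echo "provisioned after $i check(s)"; return 0 ;;
      failed-provision) echo "provisioning failed" >&2; return 1 ;;
    esac
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "timed out in state: $state" >&2
  return 1
}

# In the lab: wait_for_pool_cidr "$LEAF"
```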

Step 5 — Create a compliant VPC (drawn /24, correct tag). Expected: it succeeds.

aws ec2 create-vpc --ipv4-ipam-pool-id $LEAF --ipv4-netmask-length 24 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Environment,Value=lab}]' \
  --query 'Vpc.{Id:VpcId,Cidr:CidrBlock}'

Step 6 — Prove the guardrails (the whole point). Each of these MUST fail:

# (a) Missing the required Environment=lab tag -> rejected
aws ec2 create-vpc --ipv4-ipam-pool-id $LEAF --ipv4-netmask-length 24 || echo "REJECTED: tag rule"

# (b) Too big: /19 is below allocation-min-netmask-length 20 -> rejected
aws ec2 create-vpc --ipv4-ipam-pool-id $LEAF --ipv4-netmask-length 19 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Environment,Value=lab}]' || echo "REJECTED: min netmask"

# (c) Too small: /27 is above allocation-max-netmask-length 26 -> rejected
aws ec2 create-vpc --ipv4-ipam-pool-id $LEAF --ipv4-netmask-length 27 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Environment,Value=lab}]' || echo "REJECTED: max netmask"

If any of (a)–(c) succeeds, your guardrail is not doing what you think: a guardrail you have not seen reject something is a guardrail you do not have.

Step 7 — Confirm the allocation registered.

aws ec2 get-ipam-pool-allocations --ipam-pool-id $LEAF \
  --query 'IpamPoolAllocations[].{Cidr:Cidr,Type:ResourceType,Owner:ResourceOwner}'

Validation checklist — what each step proved:

| Step | What you did | What it proves |
| --- | --- | --- |
| 2–3 | IPAM + top pool + 10.0.0.0/16 | The hierarchy and supply exist |
| 4 | Leaf pool with bounds + required tag | Guardrails are configurable |
| 5 | Compliant VPC drew a /24 | Allocation works end to end |
| 6a | Untagged create rejected | The tag filter is enforced |
| 6b | /19 rejected | min = the biggest allowed block |
| 6c | /27 rejected | max = the smallest allowed block |
| 7 | Allocation listed | The draw registered centrally |
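
The min/max inversion proved in 6b and 6c can also be rehearsed offline before you set bounds on a real pool. This is a sketch of the comparison only; netmask_allowed is a hypothetical helper that mirrors the inclusive-bounds rule described in this article, not a call into IPAM.

```shell
#!/usr/bin/env bash
# netmask_allowed <min-len> <max-len> <requested-len>
# A lower prefix length is a BIGGER block, so "min" caps the largest block
# and "max" caps the smallest.
netmask_allowed() {
  local min="$1" max="$2" req="$3"
  if [ "$req" -lt "$min" ]; then
    echo "rejected: /$req is a bigger block than the /$min minimum length allows"
    return 1
  elif [ "$req" -gt "$max" ]; then
    echo "rejected: /$req is a smaller block than the /$max maximum length allows"
    return 1
  fi
  echo "allowed: /$req"
}

# The lab's leaf pool bounds (min 20, max 26) against the three requests
for len in 19 24 27; do
  netmask_allowed 20 26 "$len" || true
done
```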

Teardown (bottom-up — deletion protection blocks any other order):

# Delete the VPC(s) first (releases their auto allocations), then:
aws ec2 deprovision-ipam-pool-cidr --ipam-pool-id $LEAF --cidr <leaf-cidr>
aws ec2 delete-ipam-pool --ipam-pool-id $LEAF
aws ec2 deprovision-ipam-pool-cidr --ipam-pool-id $TOP --cidr 10.0.0.0/16
aws ec2 delete-ipam-pool --ipam-pool-id $TOP
aws ec2 delete-ipam --ipam-id $IPAM

Cost note. Free-Tier IPAM with a couple of pools and one VPC for an hour is effectively free; Advanced Tier would bill per active managed IP. Deleting the IPAM bottom-up stops everything.

Common mistakes & troubleshooting

This is the playbook — the part you bookmark. First a scannable table you can read mid-migration, then the full reasoning for the entries that bite hardest. The recurring theme: IPAM rarely errors loudly; it refuses quietly, and the skill is knowing which refusal you are looking at.

| # | Symptom | Root cause | Confirm (exact cmd) | Fix |
| --- | --- | --- | --- | --- |
| 1 | VPC create from pool fails, “no locale” / no allocation | Pool has no locale, or top pool is locale-pinned to the wrong Region | describe-ipam-pools --query 'IpamPools[].{Id:IpamPoolId,Locale:Locale}' | Pin the regional layer to the VPC’s Region; keep the top pool locale-free |
| 2 | CreateVpc rejected with a tag error | allocation_resource_tags not satisfied | describe-ipam-pools shows the required tags | Add the exact tag (e.g. Environment=prod) to the VPC create |
| 3 | Request for a /15 rejected | Below allocation_min_netmask_length | Check pool AllocationMinNetmaskLength | Ask within bounds; min = biggest block allowed |
| 4 | Request for a /27 rejected | Above allocation_max_netmask_length | Check pool AllocationMaxNetmaskLength | Ask within bounds; max = smallest block allowed |
| 5 | “insufficient space” on a create | Pool’s provisioned CIDRs are exhausted | get-ipam-pool-cidrs + get-ipam-pool-allocations | Provision more space into the pool from its parent |
| 6 | Member account “can’t see the pool” | RAM share missing/wrong principal, or RAM org-sharing off | get-resource-shares; get-resource-share-associations | Enable RAM with Organizations; associate pool ARN + correct OU/account |
| 7 | Imported VPC shows overlapping | Pre-existing duplicate CIDR | get-ipam-resource-cidrs --filters Name=overlap-status,Values=overlapping | Renumber the collider or isolate it (PrivateLink) until a window |
| 8 | Prod VPC create fails unexpectedly at peak | Pool silently exhausted, no alarm | get-ipam-pool-allocations; CloudWatch AWS/IPAM util | Alarm at 80%; provision more space ahead of time |
| 9 | advertise-byoip-cidr rejected | Block smaller than a /24, missing ROA, or still verifying | get-ipam-pool-cidrs --query 'IpamPoolCidrs[].State' | Provide a /24 or larger block with a valid ROA (ASN 16509); wait for provisioned |
| 10 | Same prefix announced from two origins | Advertised on AWS without withdrawing on-prem | Check BGP/looking-glass; describe-byoip-cidrs | Withdraw one origin; advertise only after the other is down |
| 11 | Can’t deprovision a CIDR / delete a pool | Allocations still exist under it (deletion protection) | get-ipam-pool-allocations (non-empty) | Reclaim bottom-up: delete VPCs / release reservations first |
| 12 | Manual reservation won’t release | Wrong/expired IpamPoolAllocationId | get-ipam-pool-allocations --query 'IpamPoolAllocations[].IpamPoolAllocationId' | Release with the exact allocation ID |
| 13 | Compliance shows unmanaged for live VPCs | Their CIDRs were never imported | get-ipam-resource-cidrs --filters Name=compliance-status,Values=unmanaged | Import the CIDRs (don’t recreate the VPCs) |
| 14 | Delegation commands fail from net account | IPAM not delegated; running from wrong account | aws organizations list-delegated-administrators --service-principal ipam.amazonaws.com (from the management account) | Run enable-ipam-organization-admin-account from the management account |

The expanded form for the entries that cost the most time:

1. VPC create from a pool fails with no allocation / a “locale” complaint. Root cause: The pool the VPC references has no locale, or the top-level pool was accidentally locale-pinned so it can’t feed the VPC’s Region. Confirm: aws ec2 describe-ipam-pools --query 'IpamPools[].{Id:IpamPoolId,Locale:Locale,Depth:PoolDepth}' — the leaf pool must carry the VPC’s Region, the top pool should show null. Fix: Pin the Regional layer to each Region; keep the top pool locale-free. Only a locale-matched pool can allocate to a VPC in that Region.

2. CreateVpc is rejected with a tag error. Root cause: The leaf pool’s allocation_resource_tags is a hard filter and the VPC create didn’t carry the required tag. Confirm: describe-ipam-pools shows AllocationResourceTags; compare to your --tag-specifications. Fix: Add the exact key/value (e.g. Environment=prod). This is the mechanism that keeps prod and non-prod space provably separate — it is working as designed.

3–4. A request for a /15 (or a /27) is rejected. Root cause: The size is outside the pool’s inclusive prefix bounds. min_netmask_length is the biggest block allowed (smallest prefix number), max_netmask_length the smallest block. Confirm: Check AllocationMinNetmaskLength / AllocationMaxNetmaskLength on the pool. Fix: Ask within bounds. If the bounds are genuinely wrong, change them on the pool (in the IPAM account) — a member account cannot.

6. A member account reports it “can’t see the pool.” Root cause: The RAM share is missing, points at the wrong principal/OU ARN, or RAM “sharing with Organizations” was never enabled. A common variant: someone shared the IPAM instead of the pool. Confirm: aws ram get-resource-shares --resource-owner SELF; aws ram get-resource-share-associations --association-type PRINCIPAL. Fix: Enable RAM sharing with Organizations (management account), then associate the pool ARN and the correct OU/account principal to the share.

7. An imported VPC shows overlapping. Root cause: A pre-existing duplicate CIDR that the spreadsheet missed — exactly what import is meant to surface. Confirm: get-ipam-resource-cidrs --filters Name=overlap-status,Values=overlapping returns the resource IDs, CIDRs and owning accounts. Fix: You cannot renumber a live VPC instantly; isolate the collider (PrivateLink) and renumber it in a planned window. The overlap set can only shrink once new VPCs are forced through guardrailed pools.

9–10. BYOIP won’t advertise, or a prefix ends up announced twice. Root cause (9): The prefix is smaller than /24, lacks a valid ROA naming ASN 16509, or is still in pending-provision. (10): You advertised on AWS without first withdrawing the on-prem announcement. Confirm: get-ipam-pool-cidrs --query 'IpamPoolCidrs[].State' (must be provisioned); for double-announcement, check a looking-glass / your BGP. Fix: Provide a /24+ with a correct ROA, wait for provisioned, advertise only after DNS cutover, and withdraw the other origin before announcing — never originate the same prefix from two ASNs at once.

11. You can’t deprovision a CIDR or delete a pool. Root cause: Deletion-protection semantics — a pool with live allocations under it can’t be emptied or deleted. Confirm: get-ipam-pool-allocations returns a non-empty list. Fix: Reclaim bottom-up: delete the VPCs (which releases their auto allocations) and release manual reservations by IpamPoolAllocationId first, then deprovision and delete the pool.
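
The bottom-up reclaim in entries 11 and 12 can be sketched as a small loop over a pool's manual reservations. Treat this as an illustration under assumptions: release_manual_allocations is a hypothetical helper, it assumes manual reservations are reported with ResourceType custom, and it leaves VPC-backed allocations alone because those are released by deleting the VPC.

```shell
#!/usr/bin/env bash
# release_manual_allocations <pool-id>
# Releases every manual (custom) reservation in the pool by its exact
# CIDR and allocation ID, printing each one as it goes.
release_manual_allocations() {
  local pool_id="$1" cidr alloc_id
  aws ec2 get-ipam-pool-allocations --ipam-pool-id "$pool_id" \
      --query "IpamPoolAllocations[?ResourceType=='custom'].[Cidr,IpamPoolAllocationId]" \
      --output text |
  while read -r cidr alloc_id; do
    [ -n "$alloc_id" ] || continue          # skip blank or malformed lines
    echo "releasing $cidr ($alloc_id)"
    aws ec2 release-ipam-pool-allocation --ipam-pool-id "$pool_id" \
        --cidr "$cidr" --ipam-pool-allocation-id "$alloc_id" \
        >/dev/null </dev/null
  done
}

# Run it before deprovisioning, e.g.: release_manual_allocations "$LEAF"
```

Once the allocation list is empty, the deprovision and delete steps from the teardown apply unchanged.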

Best practices

Security notes

The control-to-threat mapping for the IPAM control plane:

| Control | Mechanism | Protects against |
| --- | --- | --- |
| Scoped IAM for pool ops | Least-privilege policies on ec2:*Ipam* | Unauthorized pool/CIDR changes |
| SCP/RCP on address space | Org policies tied to tags/accounts | Non-prod principals using prod space |
| Delegated-admin restriction | Limit who assumes into the net account | Hijacking the org address plan |
| BYOIP key hygiene | Rotate/store the signing key securely | Forged ownership / prefix hijack attempts |
| RAM external-principals off | allow_external_principals=false | Address authority leaking outside the org |
| CloudTrail alerting | Alert on sensitive IPAM API calls | Stealthy advertisement or pool changes |

Cost & sizing

The bill driver for IPAM is simple: Advanced Tier charges per active IP it manages, billed hourly. The Free Tier (single account/Region, no cross-account sharing, no BYOIP) has no per-IP charge. So cost scales with the number of monitored, allocated addresses across the estate — not with the number of pools or VPCs. There is no per-pool or per-allocation fee; you are paying for the continuous monitoring/inventory of active IPs.

Right-sizing IPAM is therefore about not over-monitoring: import the address space you actually need governed, mark genuinely-irrelevant CIDRs as ignored rather than managed, and don’t stand up Advanced Tier in tiny single-account setups that the Free Tier covers. The cost is almost always trivial relative to the value — a single TGW outage from an undetected overlap, or one emergency renumber of a regulated VPC, dwarfs a year of IPAM’s per-IP charge. The cost/sizing levers:

| Driver | What you pay for | Rough scale | Lever to reduce | When it’s worth it |
| --- | --- | --- | --- | --- |
| Advanced Tier active IPs | Per managed IP / hour | Grows with monitored addresses | ignored for irrelevant CIDRs | Any multi-account / TGW estate |
| Free Tier | Nothing (per-IP) | Single account/Region only | Use it for tiny setups | One or two VPCs, no sharing |
| BYOIP advertised space | The IPs themselves (EIP rules apply) | Per address in use | Only advertise what you use | Owned-IP allow-lists, reputation |
| CloudWatch alarms | Standard CW metric/alarm pricing | A few alarms per pool | Alarm on key pools only | Always: exhaustion is expensive |
| SNS notifications | Standard SNS pricing | Negligible | n/a | Always |

A rough picture: for an org with a few thousand actively-managed addresses across 60 accounts, Advanced-Tier IPAM is a low-tens-of-dollars-per-month line item — comfortably inside a platform budget and immaterial next to the cost of a single address-collision incident. Size the pools generously (a /8 top pool, /12 regions) so you rarely re-cut the tree; pool size does not drive cost, only active IPs do.

Interview & exam questions

1. Why can’t two VPCs both use 10.0.0.0/16 if they need to connect? Routing is by longest-prefix match, and a Transit Gateway or VPC route table cannot hold two identical prefixes pointing at different destinations. The second attachment/peering is rejected or silently unreachable. IPAM prevents this by issuing only unique CIDRs within a scope and detecting any existing overlap.

2. Walk through the IPAM hierarchy. An IPAM (home Region + operating Regions) contains scopes (a private and public default; prefixes within a scope must not overlap), which contain pools that nest via source_ipam_pool_id, which hand out allocations (CIDRs) to VPCs or as manual reservations. Overlap is enforced per scope.

3. What does locale do, and what’s the standard pattern? A locale ties a pool to a Region (or Local Zone); only a locale-matched pool can allocate to a VPC in that Region. The pattern is a locale-free top pool feeding locale-pinned Regional pools, with per-environment leaf pools beneath. A wrong/missing locale is the classic reason an allocation silently fails.

4. Free Tier vs Advanced Tier — what needs Advanced? Cross-account pools (RAM sharing), cross-Region pools, org-wide overlap/utilization monitoring, BYOIP/BYOASN, and public IP insights all require Advanced. Free Tier is single-account/Region with basic monitoring. Advanced bills per active managed IP.

5. How do allocation_min_netmask_length and allocation_max_netmask_length work — and what’s the trap? They are inclusive bounds on the prefix length, not host count. Because a larger prefix number is a smaller network, min is the biggest block allowed and max is the smallest. “min 16 / max 24” permits /16 through /24. The inversion trips people constantly.

6. How do you let a workload account self-serve CIDRs without exposing the whole address plan? Share the leaf pool (not the IPAM) via AWS RAM to the account or OU, with allow_external_principals=false. The member references the pool ID, draws within the guardrails, and cannot see other pools, widen bounds, or skip required tags.

7. You import an existing estate and three VPCs show overlapping. What now? You generally can’t renumber a live VPC instantly. Isolate the collider (e.g. behind PrivateLink) and renumber it in a planned window, while forcing all new VPCs through guardrailed pools so the overlap set can only shrink. Import’s value here is discovery — turning a guess into an exact list of resource IDs and accounts.

8. What are the hard requirements to advertise a BYOIP IPv4 prefix from AWS? The prefix must be at least a /24, you need a valid ROA in your RIR’s RPKI naming ASN 16509 (or your BYOASN), and the provisioning ownership check (signed authorization context) must complete to provisioned. Advertise only after, ideally post-DNS-cutover.

9. How do you avoid announcing the same prefix from two origins during a migration? Provision and verify on AWS with advertisement off, cut over DNS, then advertise-byoip-cidr; and withdraw the on-prem (or other) announcement before AWS originates it. Never originate one prefix from two ASNs simultaneously.

10. How do you get warned before a pool exhausts? Enable Advanced-Tier state monitoring and alarm on the AWS/IPAM CloudWatch metric IPAMPoolAllocationUtilizationPercentage at, say, 80%, routed to a NetOps SNS topic — so you provision more space before a prod VPC create fails, rather than after.

11. Why can’t you immediately delete a pool, and how do you reclaim space? Deletion-protection semantics block deprovisioning a CIDR or deleting a pool while allocations exist under it. Reclaim bottom-up: delete VPCs (releasing their auto allocations) and release manual reservations by IpamPoolAllocationId, then deprovision and delete the pool.

12. How do you bring an existing VPC under IPAM without renumbering it? Import its current CIDR into the appropriate pool. This registers the space as a manual allocation, flips the resource to managed, and makes it appear in utilization and overlap reports — the VPC keeps its address while finally becoming visible. Do not recreate the VPC.

These map primarily to the AWS Certified Advanced Networking – Specialty (ANS-C01) domains (network design and management at scale, hybrid connectivity, IP addressing) and touch the Solutions Architect Professional (SAP-C02) multi-account networking domain. The cert-mapping for revision:

| Question theme | Primary cert | Domain |
| --- | --- | --- |
| Overlap, TGW routing, longest-prefix | ANS-C01 | Network design / connectivity |
| IPAM hierarchy, locale, tiers | ANS-C01 | IP address management |
| RAM sharing, delegation | SAP-C02 / ANS-C01 | Multi-account networking |
| Guardrails, tag-based separation | SAP-C02 | Governance at scale |
| BYOIP / BYOASN / ROA | ANS-C01 | Hybrid & public connectivity |
| Utilization monitoring / alarms | ANS-C01 | Network management & ops |

Quick check

  1. Two VPCs in different accounts both own 10.10.0.0/16 and you attach both to one Transit Gateway. What happens, and why?
  2. Your top-level pool was created with locale = "eu-west-1". A VPC in us-east-1 can’t draw from the tree. What’s wrong?
  3. A leaf pool has allocation_min_netmask_length = 16 and allocation_max_netmask_length = 24. Will a request for a /26 succeed? Will a /16? Will a /12?
  4. A member account says it “can’t see” the pool you shared. Name two things to check.
  5. You’re migrating a public prefix onto AWS. In what order do you provision, advertise, withdraw on-prem, and cut over DNS — and why?

Answers

  1. The TGW route table cannot hold the same prefix (10.10.0.0/16) twice, so only the first attachment is routable; the second is silently unreachable. Routing is longest-prefix-match and can’t represent two identical prefixes to different destinations. IPAM would have prevented the duplicate at allocation time.
  2. The top-level pool is locale-pinned to eu-west-1, so it can only feed that Region. Keep the top pool locale-free and pin the Regional layer; only a locale-matched pool can allocate to a VPC in a given Region.
  3. A /26 is rejected (smaller than the /24 max). A /16 succeeds (it’s the largest allowed). A /12 is rejected (bigger than the /16 min). Remember: min = biggest block allowed, max = smallest.
  4. Check (a) that the RAM resource share has the pool ARN associated (not the IPAM) and the correct OU/account principal, using get-resource-share-associations; and (b) that RAM sharing with Organizations is enabled in the management account. A frequent mistake is sharing the IPAM instead of the pool.
  5. Provision and verify on AWS with advertisement off → cut over DNS → withdraw the on-prem announcement → advertise-byoip-cidr on AWS. Withdrawing before AWS originates the route guarantees the prefix is never announced from two origins at once, and DNS already points at the new path by the time AWS announces it.

Glossary

Next steps

You can now design an IPAM hierarchy, delegate and share it, automate allocation with guardrails, monitor for exhaustion and overlap, and bring your own public space onto AWS. Build outward:
