Amazon VPC IPAM: Hierarchical CIDR Planning, Allocation, and BYOIP at Scale

The most expensive mistake in multi-account networking is not a misconfigured route table; it is two VPCs that both own 10.0.0.0/16. You cannot renumber a live VPC, you cannot route overlapping prefixes through a Transit Gateway, and by the time the overlap surfaces you have a spreadsheet, a Slack channel, and a quarterly meeting where humans manually hand out CIDR blocks. Amazon VPC IP Address Manager (IPAM) replaces all of that with a hierarchy of pools, automated allocation, and continuous overlap detection across every account in the organization. This is how to design the hierarchy, delegate it, automate allocation, monitor exhaustion, and bring your own public IP space, the way a platform team should before anyone provisions a VPC.

The CIDR overlap problem at scale

Address allocation by spreadsheet fails for the same reason manual IAM fails: it does not scale and it cannot be enforced. A team picks 10.20.0.0/16 because it “looked free,” and six months later a Transit Gateway route will not install or a Direct Connect route is black-holed because someone else picked the same block in another account. The longest-prefix-match routing underneath TGW and VPC peering simply cannot represent two identical prefixes pointing at different destinations.
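To make the failure concrete, here is a minimal sketch of the check a spreadsheet never runs: a pairwise overlap test over an inventory of VPC CIDRs. It uses only Python's standard `ipaddress` module; the account names and CIDRs are hypothetical, and this is a local illustration, not how IPAM itself is implemented.

```python
import ipaddress
from itertools import combinations

# Hypothetical inventory: (account, VPC CIDR) pairs as a spreadsheet might record them.
vpcs = [
    ("payments", "10.20.0.0/16"),
    ("analytics", "10.20.0.0/16"),    # same block, picked independently
    ("shared-svc", "10.20.64.0/18"),  # nested inside the /16 above
    ("sandbox", "10.30.0.0/16"),
]

def find_overlaps(inventory):
    """Return every pair of owners whose CIDRs share at least one address."""
    nets = [(owner, ipaddress.ip_network(cidr)) for owner, cidr in inventory]
    return [
        (a_owner, b_owner)
        for (a_owner, a_net), (b_owner, b_net) in combinations(nets, 2)
        if a_net.overlaps(b_net)
    ]

conflicts = find_overlaps(vpcs)
for a, b in conflicts:
    print(f"{a} overlaps {b}")
```

Note that a nested block (the /18 inside the /16) counts as a conflict too, not only an exact duplicate.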

IPAM solves this by being the single source of truth for address space. You define one top-level pool, carve it into a tree of sub-pools by region and environment, and force every VPC to draw a CIDR from a pool rather than declare one. IPAM guarantees the allocation is unique within the pool’s scope, records who owns it, and continuously monitors every CIDR in the accounts it watches for overlap and utilization.

Mental model: a spreadsheet records allocations after the fact. IPAM issues them and refuses to issue a conflict. The difference is the entire problem.

IPAM has two relevant tiers. The Free Tier gives you pools, allocation, and basic monitoring within a single account/region. The Advanced Tier is what an organization needs: cross-account and cross-region pools shared via AWS RAM, organization-wide overlap and utilization monitoring, BYOIP/BYOASN management, and public IP insights. Everything below assumes the Advanced Tier, which is billed per active IP address that IPAM manages.

Step 1 - Enable IPAM and design the pool hierarchy

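IPAM lives in a dedicated networking account that acts as the IPAM delegated administrator, not in the management account. The hierarchy has three structural layers you must understand before writing any code: the IPAM itself, the scopes inside it (one private and one public scope by default), and the pools inside each scope.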

A sound top-down design for an org looks like this: one global top-level pool, a Regional pool per operating Region (pools below the top can be locale-pinned), then per-environment or per-OU pools beneath each Region.

# In the IPAM delegated account
resource "aws_vpc_ipam" "org" {
  description = "Org-wide IPAM"
  tier        = "advanced"

  operating_regions { region_name = "eu-west-1" }
  operating_regions { region_name = "us-east-1" }
}

# Top-level pool: the whole RFC1918 supernet you control, locale-agnostic
resource "aws_vpc_ipam_pool" "top" {
  address_family = "ipv4"
  ipam_scope_id  = aws_vpc_ipam.org.private_default_scope_id
  description    = "Top-level - all private space"
}

resource "aws_vpc_ipam_pool_cidr" "top" {
  ipam_pool_id = aws_vpc_ipam_pool.top.id
  cidr         = "10.0.0.0/8"
}

# Regional pool, locale-pinned so VPCs in this Region draw from it
resource "aws_vpc_ipam_pool" "euw1" {
  address_family      = "ipv4"
  ipam_scope_id       = aws_vpc_ipam.org.private_default_scope_id
  locale              = "eu-west-1"
  source_ipam_pool_id = aws_vpc_ipam_pool.top.id
  description         = "eu-west-1 regional pool"
}

resource "aws_vpc_ipam_pool_cidr" "euw1" {
  ipam_pool_id   = aws_vpc_ipam_pool.euw1.id
  netmask_length = 12          # IPAM carves a free /12 from the /8
}

A locale is the AWS Region (or Local Zone) a pool is tied to. Only a pool with a locale that matches a VPC’s Region can allocate to that VPC. Keep the top-level pool locale-free so it can feed every Region; pin the Regional layer.
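It helps to check the address budget before committing to prefix lengths. The sketch below is plain arithmetic with Python's `ipaddress` module, not an IPAM call. It assumes the /8, /12, /16, and /20 sizes used in this walkthrough, and it assumes the first /12 goes to eu-west-1, whereas in practice IPAM chooses the actual block.

```python
import ipaddress

top = ipaddress.ip_network("10.0.0.0/8")

# A /8 splits into 2**(12-8) = 16 candidate /12 Regional blocks.
regional_candidates = list(top.subnets(new_prefix=12))

# Assumption for illustration: the first candidate becomes the eu-west-1 pool.
euw1 = regional_candidates[0]

# Each /12 holds 16 /16 environment pools; each /16 holds 16 /20 VPCs.
env_pools_per_region = 2 ** (16 - euw1.prefixlen)
vpcs_per_env_pool = 2 ** (20 - 16)

print(len(regional_candidates), euw1, env_pools_per_region, vpcs_per_env_pool)
```

With these sizes an organization gets 16 Regions, 16 environment pools per Region, and 16 default-sized VPCs per environment pool. If any of those ceilings looks tight, change the prefix lengths now rather than after allocation has started.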

Step 2 - Per-environment sub-pools with allocation rules

Beneath each Regional pool, create one pool per environment (or per OU). This is also where you bake in allocation rules so the pool refuses non-compliant requests: a default netmask the pool hands out when the caller does not specify one, bounds on the prefix sizes it allows, and required tags on every allocation.

resource "aws_vpc_ipam_pool" "euw1_prod" {
  address_family      = "ipv4"
  ipam_scope_id       = aws_vpc_ipam.org.private_default_scope_id
  locale              = "eu-west-1"
  source_ipam_pool_id = aws_vpc_ipam_pool.euw1.id
  description         = "eu-west-1 prod"

  # Allocation guardrails
  allocation_default_netmask_length = 20   # default VPC size if caller omits one
  allocation_min_netmask_length     = 16   # nobody may grab bigger than /16
  allocation_max_netmask_length     = 24   # nor smaller than /24

  # Every allocation MUST carry these tags or the request is rejected
  allocation_resource_tags = {
    Environment = "prod"
  }
}

resource "aws_vpc_ipam_pool_cidr" "euw1_prod" {
  ipam_pool_id   = aws_vpc_ipam_pool.euw1_prod.id
  netmask_length = 16
}

allocation_min_netmask_length and allocation_max_netmask_length are inclusive bounds on the prefix length, not the host count, so “min 16 / max 24” means allocations between a /16 and a /24. The allocation_resource_tags map is a hard filter: a VPC create that does not carry Environment=prod will not draw from this pool. Use that to keep prod and non-prod address space provably separate.
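The guardrail logic can be modelled locally to sanity-check a design before deploying it. This is a hedged sketch of the rules as described above, not IPAM's implementation; `PoolRules` and `evaluate` are hypothetical names, and the defaults mirror the eu-west-1 prod pool.

```python
from dataclasses import dataclass, field

@dataclass
class PoolRules:
    """Local stand-in for the pool's allocation rules (hypothetical type)."""
    min_netmask: int = 16
    max_netmask: int = 24
    default_netmask: int = 20
    required_tags: dict = field(default_factory=lambda: {"Environment": "prod"})

def evaluate(rules, tags, netmask=None):
    """Return (accepted, reason) for a request against the pool rules."""
    # Fall back to the pool default when the caller omits a size.
    netmask = rules.default_netmask if netmask is None else netmask
    # Bounds apply to the prefix length: a smaller number means a bigger block.
    if not rules.min_netmask <= netmask <= rules.max_netmask:
        return False, f"/{netmask} outside /{rules.min_netmask}-/{rules.max_netmask}"
    missing = {k: v for k, v in rules.required_tags.items() if tags.get(k) != v}
    if missing:
        return False, f"missing required tags: {missing}"
    return True, f"/{netmask} allowed"

rules = PoolRules()
print(evaluate(rules, {"Environment": "prod"}))      # default size, tagged
print(evaluate(rules, {"Environment": "prod"}, 15))  # too large a block
print(evaluate(rules, {"Environment": "dev"}, 20))   # wrong tag value
```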

Step 3 - Delegate administration and share pools via AWS RAM

Two things must be wired before member accounts can self-serve. First, delegate IPAM to your networking account so it (not the org management account) administers IPAM. Run this once from the management account:

aws ec2 enable-ipam-organization-admin-account \
  --delegated-admin-account-id 222233334444

Second, share the leaf pools to the accounts or OUs that should allocate from them, using AWS Resource Access Manager. You share the pool, not the IPAM. Sharing requires RAM sharing with your organization to be enabled.

resource "aws_ram_resource_share" "ipam_prod" {
  name                      = "ipam-euw1-prod"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "ipam_prod" {
  resource_arn       = aws_vpc_ipam_pool.euw1_prod.arn
  resource_share_arn = aws_ram_resource_share.ipam_prod.arn
}

# Share to an entire OU (principal = OU ARN) or to specific account IDs
resource "aws_ram_principal_association" "ipam_prod_ou" {
  principal          = "arn:aws:organizations::111122223333:ou/o-exampleorgid/ou-prod-abcd1234"
  resource_share_arn = aws_ram_resource_share.ipam_prod.arn
}

Once shared, a workload account can reference the pool ID directly in its own VPC definition. It cannot see other pools, cannot widen the netmask bounds, and every CIDR it pulls is registered centrally in the IPAM account.

Step 4 - Automate VPC CIDR allocation

This is the payoff. A spoke VPC no longer hard-codes a block; it names a pool and a size, and IPAM hands back a free, non-overlapping CIDR. Run this from the member account that received the RAM share:

resource "aws_vpc" "spoke" {
  ipv4_ipam_pool_id   = "ipam-pool-0prodshared0euw1"  # the shared pool ID
  ipv4_netmask_length = 20                            # ask for a /20

  tags = {
    Environment = "prod"   # required by the pool's allocation_resource_tags
    Name        = "payments-prod"
  }
}

You can also reserve space outside of a VPC - for a future EKS secondary CIDR, an on-prem block you are reconciling, or a peer’s range - with an explicit allocation. This carves the space so IPAM will never re-issue it:

aws ec2 allocate-ipam-pool-cidr \
  --ipam-pool-id ipam-pool-0prodshared0euw1 \
  --netmask-length 22 \
  --description "reserved for eks-prod secondary CIDR"

The CLI returns an IpamPoolAllocationId; hold onto it, because that is what you use to release the reservation later.
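To see why a reservation changes what the next requester receives, here is a small first-fit allocator. It is a sketch under an assumption: IPAM's real selection strategy is not described in this article, so lowest-address-first is only an illustration. `next_free_cidr` is a hypothetical helper and the CIDRs are made up.

```python
import ipaddress

def next_free_cidr(pool_cidr, allocated, netmask):
    """First-fit: lowest /netmask block in the pool that overlaps no allocation.

    Returns None when the pool has no free block of that size.
    """
    pool = ipaddress.ip_network(pool_cidr)
    taken = [ipaddress.ip_network(c) for c in allocated]
    for candidate in pool.subnets(new_prefix=netmask):
        if not any(candidate.overlaps(t) for t in taken):
            return candidate
    return None

# One VPC allocation plus one manual /22 reservation (hypothetical values).
allocated = ["10.0.0.0/20", "10.0.16.0/22"]

print(next_free_cidr("10.0.0.0/16", allocated, 20))
print(next_free_cidr("10.0.0.0/16", allocated, 22))
```

The /22 reservation blocks the whole second /20, so the next /20 request skips ahead. That is the fragmentation cost of scattering small manual reservations through a pool.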

Step 5 - Monitor utilization, overlap, and exhaustion

IPAM continuously computes utilization for every pool and every monitored resource. Query it on demand to find pools that are filling up:

aws ec2 get-ipam-pool-allocations \
  --ipam-pool-id ipam-pool-0prodshared0euw1

# Resource-level view across the whole IPAM: which VPCs/EIPs, and their util %
aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=management-state,Values=managed

The richer signal is the metrics IPAM publishes. With Advanced Tier and state monitoring enabled, IPAM emits per-pool metrics including allocation counts, available address counts, and the all-important utilization ratio to CloudWatch under the AWS/IPAM namespace. Alarm on the pool before it exhausts, not after a VPC create fails in prod:

resource "aws_cloudwatch_metric_alarm" "prod_pool_exhaustion" {
  alarm_name          = "ipam-euw1-prod-utilization-high"
  namespace           = "AWS/IPAM"
  metric_name         = "PoolAllocationUtilizationPercentage"
  dimensions = {
    IpamId       = aws_vpc_ipam.org.id
    IpamPoolId   = aws_vpc_ipam_pool.euw1_prod.id
    IpamScopeId  = aws_vpc_ipam.org.private_default_scope_id
  }
  statistic           = "Maximum"
  period              = 3600
  evaluation_periods  = 1
  threshold           = 80
  comparison_operator = "GreaterThanOrEqualToThreshold"
  alarm_actions       = [aws_sns_topic.netops.arn]
}
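For intuition about what an 80 percent threshold means in address terms, the sketch below computes allocated space as a share of provisioned space. It assumes utilization is simply allocated addresses divided by pool addresses and that allocations do not overlap; the exact definition behind the CloudWatch metric may differ, and the CIDRs are hypothetical.

```python
import ipaddress

def pool_utilization(pool_cidrs, allocation_cidrs):
    """Percentage of the pool's provisioned addresses covered by allocations."""
    total = sum(ipaddress.ip_network(c).num_addresses for c in pool_cidrs)
    used = sum(ipaddress.ip_network(c).num_addresses for c in allocation_cidrs)
    return 100.0 * used / total

# A /16 pool with three /18 allocations and one /20 allocation.
allocations = ["10.0.0.0/18", "10.0.64.0/18", "10.0.128.0/18", "10.0.192.0/20"]
pct = pool_utilization(["10.0.0.0/16"], allocations)

THRESHOLD = 80  # same value as the alarm threshold above
print(f"{pct:.2f}% used, alarm={'yes' if pct >= THRESHOLD else 'no'}")
```

A single /18 moves a /16 pool by 25 points, so a pool that hands out large blocks can go from comfortable to exhausted in one or two requests. Set the threshold with the largest permitted allocation in mind.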

For overlap, IPAM gives every monitored resource CIDR an overlap status and a compliance status within its scope. Surface the resources that are both overlapping and non-compliant directly:

aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=overlap-status,Values=overlapping \
            Name=compliance-status,Values=noncompliant

A pre-existing VPC that overlaps shows up here the moment IPAM starts monitoring its account, which is exactly how you find the landmines that the spreadsheet missed.

Step 6 - Bring your own public IP space (BYOIP) and BYOASN

Public space lives in the IPAM public scope. To advertise your own range from AWS, you provision it into a public-scope pool, prove ownership, then advertise it. Two ownership requirements matter: the IPv4 prefix can be no more specific than a /24 (the smallest block that is globally routable), and you need a valid ROA (Route Origin Authorization) in your RIR’s RPKI that authorizes Amazon’s ASNs (16509 and 14618) as origins so the advertisement passes RPKI validation.
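The prefix-size rule is easy to misread, so a pre-flight check before provisioning can save a failed request. This is a minimal sketch covering only the IPv4 size rule stated above. `byoip_prefix_ok` is a hypothetical helper, and it does not check ROA or RDAP state.

```python
import ipaddress

def byoip_prefix_ok(cidr):
    """True if cidr is a well-formed IPv4 network of size /24 or larger."""
    # ip_network raises ValueError if host bits are set (e.g. 203.0.113.5/24).
    net = ipaddress.ip_network(cidr)
    # "At least a /24" means a prefix length of 24 or less.
    return net.version == 4 and net.prefixlen <= 24

print(byoip_prefix_ok("203.0.113.0/24"))  # a /24 is the smallest accepted block
print(byoip_prefix_ok("203.0.113.0/25"))  # a /25 is too specific
```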

# Provision your CIDR into a public-scope pool (publicly-advertisable)
aws ec2 provision-ipam-pool-cidr \
  --ipam-pool-id ipam-pool-0publicpoolexample \
  --cidr 203.0.113.0/24 \
  --cidr-authorization-context \
      Message="$MSG",Signature="$SIG"

The authorization context is a signed message proving you control the block; you sign a message string with the private key whose public half is published in your RDAP/whois record. Provisioning runs an asynchronous verification - poll get-ipam-pool-cidrs until state is provisioned. Only then advertise it:

aws ec2 advertise-byoip-cidr --cidr 203.0.113.0/24

# and to withdraw it (e.g., before migrating advertisement back on-prem)
aws ec2 withdraw-byoip-cidr --cidr 203.0.113.0/24

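You can also bring your own ASN (BYOASN) so EC2 and Global Accelerator advertise your prefixes from your ASN rather than Amazon’s. The ASN must be a public one registered to you. Provision it to the IPAM, then associate it with the BYOIP CIDR it should originate:

# 64496 is a documentation-range placeholder; substitute your registered public ASN
aws ec2 provision-ipam-byoasn \
  --ipam-id ipam-0abc123example \
  --asn 64496 \
  --asn-authorization-context \
      Message="$MSG",Signature="$SIG"

# Once the ASN is provisioned, tie it to the BYOIP CIDR
aws ec2 associate-ipam-byoasn --asn 64496 --cidr 203.0.113.0/24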

Keep advertisement under your control. Provision and verify with advertisement off, cut over DNS, then advertise-byoip-cidr. Withdraw before you ever move the announcement back to on-prem so you never have the same prefix announced from two origins.

Step 7 - Declarative provisioning end to end

The whole point is that network teams stop touching the console. The pool hierarchy, RAM shares, and alarms live in the IPAM account’s Terraform state; member-account modules reference shared pool IDs as variables. A reusable VPC module never needs a CIDR input again:

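variable "ipam_pool_id" { type = string }
variable "environment"  { type = string }   # feeds the Environment tag the pool requires

# A single-line block may hold only one argument, so this one is multi-line
variable "vpc_netmask" {
  type    = number
  default = 22
}

resource "aws_vpc" "this" {
  ipv4_ipam_pool_id   = var.ipam_pool_id
  ipv4_netmask_length = var.vpc_netmask
  tags                = { Environment = var.environment }
}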

CloudFormation expresses the same contract through Ipv4IpamPoolId and Ipv4NetmaskLength on AWS::EC2::VPC, so estates split across both tools share one allocation authority.

Step 8 - Reclaiming addresses and migrating existing VPCs

Pools are protected against accidental deletion: you cannot deprovision a CIDR from a pool while allocations exist under it, and you cannot delete a non-empty pool. Reclaim bottom-up. Release a manual reservation with its allocation ID:

# --cidr must match the reserved block (10.0.16.0/22 is a placeholder)
aws ec2 release-ipam-pool-allocation \
  --ipam-pool-id ipam-pool-0prodshared0euw1 \
  --cidr 10.0.16.0/22 \
  --ipam-pool-allocation-id ipam-pool-alloc-0abcd1234example

Allocations created automatically when a VPC was provisioned from the pool are released when that VPC is deleted; you do not release those by hand. To bring existing VPCs under management without renumbering, do not recreate them. Import their current CIDRs into the right pool, which registers the space as an allocation and marks the resource as managed. After import, the VPC keeps its address but now shows up in utilization and overlap reports - and any overlap it has with the rest of the estate is finally visible.
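Deciding which pool is "the right pool" for an existing VPC comes down to a containment test: pick the most specific pool whose CIDR fully contains the VPC's CIDR. The sketch below assumes a hypothetical pool layout (the names and CIDRs are illustrative, not read from IPAM) and uses `ipaddress.subnet_of`, available in Python 3.7 and later.

```python
import ipaddress

# Hypothetical pool hierarchy: name -> provisioned CIDR.
pools = {
    "top": "10.0.0.0/8",
    "euw1": "10.0.0.0/12",
    "euw1-prod": "10.0.0.0/16",
    "euw1-dev": "10.1.0.0/16",
}

def target_pool(vpc_cidr, pools):
    """Name of the most specific pool that fully contains vpc_cidr, or None."""
    vpc = ipaddress.ip_network(vpc_cidr)
    containing = [
        (ipaddress.ip_network(cidr), name)
        for name, cidr in pools.items()
        if vpc.subnet_of(ipaddress.ip_network(cidr))
    ]
    if not containing:
        return None
    # The longest prefix is the deepest pool in the hierarchy.
    return max(containing, key=lambda item: item[0].prefixlen)[1]

for cidr in ["10.1.32.0/20", "10.9.0.0/16", "192.168.0.0/24"]:
    print(cidr, "->", target_pool(cidr, pools))
```

A VPC that resolves only to a Regional or top-level pool has no environment pool covering it yet, and one that resolves to nothing sits outside the address plan entirely. Both cases are worth knowing about before import.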

Verify

Confirm the system is actually enforcing what you designed, not just deployed:

# 1. Delegation is in place (run from the management account)
aws organizations list-delegated-administrators \
  --service-principal ipam.amazonaws.com

# 2. Pools exist with the hierarchy you expect (note PoolDepth/SourceResource)
aws ec2 describe-ipam-pools \
  --query 'IpamPools[].{Id:IpamPoolId,Depth:PoolDepth,State:State}'

# 3. A test VPC drew a CIDR from the pool (and only one)
aws ec2 get-ipam-pool-allocations \
  --ipam-pool-id ipam-pool-0prodshared0euw1 \
  --query 'IpamPoolAllocations[].{Cidr:Cidr,Type:ResourceType,Owner:ResourceOwner}'

# 4. Nothing overlaps: expect an empty result (repeat with Name=compliance-status,Values=noncompliant)
aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=overlap-status,Values=overlapping

Then deliberately try to break the guardrails: create a VPC from the prod pool without the Environment=prod tag and confirm the request is rejected, and request a /15 from a pool whose allocation_min_netmask_length is 16 and confirm it fails. A guardrail you have not seen reject something is a guardrail you do not have.

Enterprise scenario

A fintech platform team I worked with ran 60+ accounts that had grown organically before any address governance existed. Three separate teams had independently landed VPCs on 10.10.0.0/16. The pain surfaced when they stood up a central Transit Gateway for shared services: the second and third attachments advertising 10.10.0.0/16 were silently unusable, because a TGW route table cannot hold the same prefix twice. Renumbering a live, regulated payments VPC mid-quarter was not on the table.

The constraint was hard: they could not renumber the offending VPCs quickly, but they had to know the full blast radius before the TGW migration, and prevent any new overlap from that day forward. They stood up Advanced-Tier IPAM in a dedicated networking account, defined a 10.0.0.0/8 top-level pool with Regional and per-environment children, and imported every existing VPC’s CIDR rather than recreating anything. The import immediately populated the overlap report:

aws ec2 get-ipam-resource-cidrs \
  --ipam-scope-id ipam-scope-0abc123 \
  --filters Name=overlap-status,Values=overlapping \
  --query 'IpamResourceCidrs[].{Vpc:ResourceId,Cidr:ResourceCidr,Acct:ResourceOwnerId}'

That single command turned “we think three VPCs collide” into an exact list of resource IDs, CIDRs, and owning accounts - the precise scope they needed to plan migrations. From that point, every new VPC was forced through RAM-shared pools with allocation_resource_tags enforcing environment separation, so the overlap set could only shrink. They renumbered the two least-critical colliding VPCs over the following two sprints, left the regulated one isolated behind PrivateLink until its planned window, and never created a fresh overlap again. The lesson: IPAM’s real first-day value was not allocation, it was discovery of the overlap nobody could previously see.
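A flat list of overlapping resources is easier to act on when grouped by CIDR. The sketch below post-processes JSON in the shape produced by the `--query` expression above (keys `Vpc`, `Cidr`, `Acct`). The sample records are invented for illustration and are not data from the engagement described.

```python
import json
from collections import defaultdict

# Hypothetical sample in the shape the --query expression emits.
raw = """[
  {"Vpc": "vpc-0aaa", "Cidr": "10.10.0.0/16", "Acct": "111111111111"},
  {"Vpc": "vpc-0bbb", "Cidr": "10.10.0.0/16", "Acct": "222222222222"},
  {"Vpc": "vpc-0ccc", "Cidr": "10.10.0.0/16", "Acct": "333333333333"},
  {"Vpc": "vpc-0ddd", "Cidr": "10.40.0.0/20", "Acct": "111111111111"}
]"""

def group_by_cidr(records):
    """Map each CIDR to the (account, VPC) pairs that claim it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["Cidr"]].append((record["Acct"], record["Vpc"]))
    return dict(groups)

groups = group_by_cidr(json.loads(raw))

# Largest collision sets first: these set the migration order.
for cidr, owners in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(cidr, len(owners), owners)
```

Grouping on the exact CIDR string catches only identical blocks. An overlapping resource that appears alone in its group is colliding with a larger or smaller block, which needs a containment check rather than string equality.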
