Terraform Module: AWS NAT Gateway — managed egress for private subnets, per-AZ

Quick take: A reusable Terraform module for AWS NAT Gateway. It provisions per-AZ public or private NAT gateways with EIP allocation, route table wiring, and an optional CloudWatch log group for flow logs, targeting hashicorp/aws ~> 5.0. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal configuration: drop this in a .tf file, fill in the "..." placeholder, and replace the empty nat_gateways map with at least one entry (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "nat_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-nat-gateway?ref=v1.0.0"

  name         = "..."  # Name prefix for NAT gateways, EIPs, and the log group (…
  nat_gateways = {}     # Gateways to create, keyed by AZ. Each entry: `subnet_id…
}

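Then run terraform init && terraform apply. The nat_gateways map must contain at least one entry: the module's validation rejects an empty map, so the {} shown above is only a placeholder. Every other input has a sensible default; see Inputs below to override behaviour.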
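As a minimal sketch of what a filled-in Quickstart might look like, here is a single-gateway configuration. The name, subnet ID, and route table ID are hypothetical placeholders, not values from this module's documentation; swap in IDs from your own VPC.

```hcl
module "nat_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-nat-gateway?ref=v1.0.0"

  name = "demo" # hypothetical name prefix

  nat_gateways = {
    "us-east-1a" = {
      # Hypothetical public subnet in us-east-1a that hosts the gateway.
      subnet_id = "subnet-0123456789abcdef0"

      # Hypothetical private route table in the same AZ.
      private_route_table_ids = ["rtb-0123456789abcdef0"]
    }
  }
}
```

A single gateway is enough to try the module out, but it serves one AZ only; the per-AZ layout described below is the one to use for anything that needs to survive an AZ failure.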

What this module is

A NAT (Network Address Translation) Gateway is a managed AWS service that lets resources in private subnets initiate outbound connections to the internet (or to other VPCs/on-prem) while preventing the internet from initiating inbound connections back to them. It is the standard way to give private EC2 instances, Lambda-in-VPC, ECS tasks, and EKS nodes egress for pulling packages, calling external APIs, or reaching SaaS endpoints — without giving every workload a public IP.

A NAT gateway lives in one public subnet and serves traffic for one Availability Zone. There is no “multi-AZ NAT gateway” primitive: highly-available designs deploy one gateway per AZ and point each AZ’s private route tables at the gateway in the same AZ. Getting that right by hand is repetitive and error-prone (mismatched AZs silently add cross-AZ data charges, a single gateway becomes a SPOF, EIPs leak when destroys go wrong). Wrapping aws_nat_gateway in a module makes the per-AZ fan-out, EIP lifecycle, and route wiring declarative and consistent across every VPC and environment.
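One way to catch the mismatched-AZ mistake early is a postcondition on a subnet lookup. The sketch below is not part of the module; it assumes the map keys are AZ names, and the local name nat_subnets and the subnet IDs are hypothetical.

```hcl
locals {
  # Hypothetical AZ => public subnet ID map; replace with your own values.
  nat_subnets = {
    "eu-west-1a" = "subnet-0aaaaaaaaaaaaaaaa"
    "eu-west-1b" = "subnet-0bbbbbbbbbbbbbbbb"
  }
}

# Look up each subnet and fail the plan if it is not in the AZ it is keyed by.
data "aws_subnet" "nat" {
  for_each = local.nat_subnets
  id       = each.value

  lifecycle {
    postcondition {
      condition     = self.availability_zone == each.key
      error_message = "Subnet ${self.id} is in ${self.availability_zone}, but it is keyed as ${each.key}."
    }
  }
}
```

The same idea could be applied to the private route tables, although a route table has no AZ of its own, so that check would have to go through the subnets associated with it.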

This module supports both connectivity_type = "public" (the default: internet egress via an Elastic IP) and connectivity_type = "private" (NAT between VPCs without an EIP, e.g. via Transit Gateway). It provisions and tracks the EIPs, creates the 0.0.0.0/0 default routes in the private route tables you hand it, and optionally creates a CloudWatch log group to receive flow logs for auditing. As the listing below shows, main.tf defines only the log group; the flow log resource that writes to it has to be attached separately.
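For the private case, a call might look roughly like the sketch below. It assumes you want to route only a specific peer CIDR through the gateway, so it turns off the module's 0.0.0.0/0 routes and adds its own route. The module label, IDs, and CIDR are hypothetical.

```hcl
module "nat_gateway_private" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-nat-gateway?ref=v1.0.0"

  name              = "shared-private" # hypothetical name prefix
  connectivity_type = "private"        # no EIP is allocated

  # Assumption: private NAT traffic targets specific CIDRs rather than
  # 0.0.0.0/0, so routing is left to the caller in this sketch.
  create_default_routes = false

  nat_gateways = {
    "eu-west-1a" = {
      subnet_id = "subnet-0123456789abcdef0" # hypothetical subnet hosting the gateway
    }
  }
}

# Hypothetical caller-managed route: send traffic for a peer network
# through the private NAT gateway in the same AZ.
resource "aws_route" "to_peer" {
  route_table_id         = "rtb-0123456789abcdef0" # hypothetical route table
  destination_cidr_block = "10.100.0.0/16"         # hypothetical peer CIDR
  nat_gateway_id         = module.nat_gateway_private.nat_gateway_ids["eu-west-1a"]
}
```

Whether the default routes should stay on for private NAT depends on your network design; leaving create_default_routes at its default is equally valid if all egress from those route tables should pass through the gateway.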

When to use it

Module structure

terraform-module-aws-nat-gateway/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # aws_eip, aws_nat_gateway, aws_route, optional log group
├── variables.tf     # var-driven inputs with validations
└── outputs.tf       # ids, eips, route ids, az→id map
# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
# main.tf

locals {
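  # Elastic IPs are only allocated when the gateways are public.
  is_public = var.connectivity_type == "public"

  # var.nat_gateways holds one entry per gateway; each map key is a stable
  # logical name (typically the AZ, e.g. "us-east-1a").
  #
  # Build the route fan-out: every (gateway_key, route_table_id) pair becomes
  # one aws_route. Callers map each AZ's private route tables to that AZ's NAT.
  # The keys embed the route table ID, so those IDs must be known at plan time
  # (for_each cannot use keys that are only known after apply).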
  routes = merge([
    for key, cfg in var.nat_gateways : {
      for rt in cfg.private_route_table_ids :
      "${key}:${rt}" => {
        nat_key        = key
        route_table_id = rt
      }
    }
  ]...)
}

# Elastic IPs — only for public NAT gateways. Allocated in the VPC domain.
resource "aws_eip" "this" {
  for_each = local.is_public ? var.nat_gateways : {}

  domain               = "vpc"
  public_ipv4_pool     = var.public_ipv4_pool
  network_border_group = var.network_border_group

  tags = merge(
    var.tags,
    { Name = "${var.name}-nat-eip-${each.key}" },
  )

  # Reduce egress downtime by creating the replacement EIP before destroying
  # the old one whenever a change forces the EIP to be replaced.
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_nat_gateway" "this" {
  for_each = var.nat_gateways

  connectivity_type = var.connectivity_type
  subnet_id         = each.value.subnet_id

  # Public NAT requires an allocation_id; private NAT must not set one.
  allocation_id = local.is_public ? aws_eip.this[each.key].id : null

  # Optional fixed primary private IPv4 address (auto-assigned when null).
  private_ip = each.value.private_ip

  tags = merge(
    var.tags,
    {
      Name             = "${var.name}-nat-${each.key}"
      AvailabilityZone = each.key
    },
  )

  # The gateway depends on an Internet Gateway being attached to the VPC for
  # public connectivity; surface that ordering when the caller passes the IGW.
  depends_on = [var.internet_gateway_id]

  lifecycle {
    create_before_destroy = true
  }
}

# Default route (0.0.0.0/0) from each private route table to its AZ's NAT GW.
resource "aws_route" "default_ipv4" {
  for_each = var.create_default_routes ? local.routes : {}

  route_table_id         = each.value.route_table_id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this[each.value.nat_key].id

  timeouts {
    create = "5m"
  }
}

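# Optional CloudWatch Logs log group intended as the destination for flow logs
# covering NAT egress. Only the log group is created here; no aws_flow_log
# resource is defined in this file.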
resource "aws_cloudwatch_log_group" "nat_flow" {
  count = var.enable_flow_logs ? 1 : 0

  name              = "/aws/vpc/nat/${var.name}"
  retention_in_days = var.flow_logs_retention_days
  kms_key_id        = var.flow_logs_kms_key_arn

  tags = merge(var.tags, { Name = "${var.name}-nat-flow-logs" })
}
# variables.tf

variable "name" {
  description = "Name prefix applied to the NAT gateways, EIPs, and log group."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9-]{1,40}$", var.name))
    error_message = "name must be 1-40 chars, alphanumeric and hyphens only."
  }
}

variable "nat_gateways" {
  description = <<-EOT
    Map of NAT gateways to create, keyed by a stable logical name (use the AZ,
    e.g. "eu-west-1a"). For public NAT, subnet_id must be a PUBLIC subnet in
    that AZ. private_route_table_ids are the private route tables in the SAME
    AZ that should default-route through this gateway.
  EOT
  type = map(object({
    subnet_id               = string
    private_route_table_ids = optional(list(string), [])
    private_ip              = optional(string)
  }))

  validation {
    condition     = length(var.nat_gateways) > 0
    error_message = "Provide at least one NAT gateway entry."
  }

  validation {
    condition = alltrue([
      for k, v in var.nat_gateways : can(regex("^subnet-", v.subnet_id))
    ])
    error_message = "Every nat_gateways[*].subnet_id must be a subnet- ID."
  }
}

variable "connectivity_type" {
  description = "NAT connectivity: 'public' (internet egress via EIP) or 'private' (VPC-to-VPC, no EIP)."
  type        = string
  default     = "public"

  validation {
    condition     = contains(["public", "private"], var.connectivity_type)
    error_message = "connectivity_type must be 'public' or 'private'."
  }
}

variable "create_default_routes" {
  description = "Whether to create 0.0.0.0/0 routes from the supplied private route tables to each AZ's NAT gateway."
  type        = bool
  default     = true
}

variable "internet_gateway_id" {
  description = "ID of the VPC's Internet Gateway. Used as an explicit dependency so public NAT is created after the IGW is attached. Set null for private NAT."
  type        = string
  default     = null
}

variable "public_ipv4_pool" {
  description = "EC2 public IPv4 pool (e.g. a BYOIP pool ID) to allocate NAT EIPs from. Defaults to Amazon's pool when null."
  type        = string
  default     = null
}

variable "network_border_group" {
  description = "Network border group that limits the EIP to a Local Zone / Wavelength group. Null for standard Regional EIPs."
  type        = string
  default     = null
}

variable "enable_flow_logs" {
  description = "Create a CloudWatch Log Group for NAT egress flow logging."
  type        = bool
  default     = false
}

variable "flow_logs_retention_days" {
  description = "Retention in days for the NAT flow-log group."
  type        = number
  default     = 30

  validation {
    condition = contains(
      [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653],
      var.flow_logs_retention_days
    )
    error_message = "flow_logs_retention_days must be a value CloudWatch Logs accepts."
  }
}

variable "flow_logs_kms_key_arn" {
  description = "Optional KMS key ARN to encrypt the NAT flow-log group at rest."
  type        = string
  default     = null
}

variable "tags" {
  description = "Tags applied to all resources created by the module."
  type        = map(string)
  default     = {}
}
# outputs.tf

output "nat_gateway_ids" {
  description = "Map of logical key (AZ) => NAT gateway ID."
  value       = { for k, gw in aws_nat_gateway.this : k => gw.id }
}

output "nat_gateway_id_list" {
  description = "Flat list of all NAT gateway IDs."
  value       = [for gw in aws_nat_gateway.this : gw.id]
}

output "nat_gateway_public_ips" {
  description = "Map of logical key (AZ) => allocated public IP (empty for private NAT)."
  value       = { for k, gw in aws_nat_gateway.this : k => gw.public_ip }
}

output "nat_gateway_private_ips" {
  description = "Map of logical key (AZ) => assigned private IP of the gateway."
  value       = { for k, gw in aws_nat_gateway.this : k => gw.private_ip }
}

output "eip_allocation_ids" {
  description = "Map of logical key (AZ) => EIP allocation ID (empty map for private NAT)."
  value       = { for k, eip in aws_eip.this : k => eip.id }
}

output "default_route_ids" {
  description = "Map of 'natKey:routeTableId' => created default-route ID."
  value       = { for k, r in aws_route.default_ipv4 : k => r.id }
}

output "flow_log_group_name" {
  description = "Name of the NAT flow-log CloudWatch group, or null when disabled."
  value       = try(aws_cloudwatch_log_group.nat_flow[0].name, null)
}
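One caveat with the routes local in main.tf: its keys embed the route table IDs, and Terraform needs for_each keys to be known at plan time. If the route tables are created in the same apply as the module, the plan can fail with an "Invalid for_each argument" error. A possible variation, shown here only as a sketch and not as the module's published code, keys the map by list index instead (routes_by_index is a hypothetical name):

```hcl
locals {
  # Same fan-out as `routes`, but keyed by "<gateway key>:<list index>" so the
  # for_each keys stay known at plan time even when the route table IDs are not.
  routes_by_index = merge([
    for key, cfg in var.nat_gateways : {
      for idx, rt in cfg.private_route_table_ids :
      "${key}:${idx}" => {
        nat_key        = key
        route_table_id = rt
      }
    }
  ]...)
}
```

The trade-off is that index-based keys are sensitive to ordering: reordering or removing an element of private_route_table_ids shifts the keys and causes the affected routes to be destroyed and recreated. Adopting it would also change the key format of the default_route_ids output.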

How to use it

locals {
  azs = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
}

module "nat_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-nat-gateway?ref=v1.0.0"

  name              = "prod-shared"
  connectivity_type = "public"

  internet_gateway_id = aws_internet_gateway.this.id

  # One NAT gateway per AZ. Each public subnet hosts the gateway; the matching
  # private route tables in the SAME AZ default-route through it.
  nat_gateways = {
    "eu-west-1a" = {
      subnet_id               = aws_subnet.public["eu-west-1a"].id
      private_route_table_ids = [aws_route_table.private["eu-west-1a"].id]
    }
    "eu-west-1b" = {
      subnet_id               = aws_subnet.public["eu-west-1b"].id
      private_route_table_ids = [aws_route_table.private["eu-west-1b"].id]
    }
    "eu-west-1c" = {
      subnet_id               = aws_subnet.public["eu-west-1c"].id
      private_route_table_ids = [aws_route_table.private["eu-west-1c"].id]
    }
  }

  enable_flow_logs         = true
  flow_logs_retention_days = 90

  tags = {
    Environment = "prod"
    CostCenter  = "platform-network"
    ManagedBy   = "terraform"
  }
}

# Downstream reference: allowlist the NAT egress IPs at a partner's firewall by
# feeding the stable public IPs into a security automation / SSM parameter.
resource "aws_ssm_parameter" "nat_egress_ips" {
  name  = "/network/prod/nat-egress-ips"
  type  = "StringList"
  value = join(",", values(module.nat_gateway.nat_gateway_public_ips))

  tags = {
    Description = "Stable NAT egress IPs for partner firewall allowlists"
  }
}
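The example above declares local.azs but then writes the three entries out by hand. A for expression can generate the same map from that list. This sketch assumes aws_subnet.public and aws_route_table.private are both created with for_each keyed by AZ name, as the references above imply; the local name nat_gateways is hypothetical.

```hcl
locals {
  # One entry per AZ in local.azs; the gateway in each AZ serves the private
  # route table in the same AZ.
  nat_gateways = {
    for az in local.azs : az => {
      subnet_id               = aws_subnet.public[az].id
      private_route_table_ids = [aws_route_table.private[az].id]
    }
  }
}
```

The module call then takes nat_gateways = local.nat_gateways in place of the hand-written map, and adding an AZ becomes a one-line change to local.azs.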

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config, live/prod/nat_gateway/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-nat-gateway?ref=v1.0.0"
}

inputs = {
  name = "..."
  nat_gateways = {}
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/nat_gateway && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)       # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
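The inputs block in step 2 is left as placeholders. As a rough sketch, it could be filled from a sibling VPC unit through a Terragrunt dependency block. The ../vpc path and every output name below (internet_gateway_id, public_subnet_ids_by_az, private_route_table_ids_by_az) are assumptions about how such a unit might be laid out, not something this module defines.

```hcl
# live/prod/nat_gateway/terragrunt.hcl (dependency and inputs portion only)

dependency "vpc" {
  # Hypothetical sibling unit that manages the VPC, subnets, and route tables.
  config_path = "../vpc"
}

inputs = {
  name                = "prod-shared"
  internet_gateway_id = dependency.vpc.outputs.internet_gateway_id # hypothetical output

  # Assumes both outputs are maps keyed by AZ name.
  nat_gateways = {
    for az, subnet_id in dependency.vpc.outputs.public_subnet_ids_by_az : az => {
      subnet_id               = subnet_id
      private_route_table_ids = [dependency.vpc.outputs.private_route_table_ids_by_az[az]]
    }
  }
}
```

Because dependency outputs are resolved before Terraform runs, the subnet and route table IDs reach the module as plain values that are already known at plan time.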

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | | Yes | Name prefix for NAT gateways, EIPs, and the log group (1-40 chars, alphanumeric + hyphens). |
| nat_gateways | map(object) | | Yes | Gateways to create, keyed by AZ. Each entry: subnet_id (public subnet for public NAT), private_route_table_ids, optional private_ip. |
| connectivity_type | string | "public" | No | "public" (internet egress via EIP) or "private" (VPC-to-VPC, no EIP). |
| create_default_routes | bool | true | No | Create 0.0.0.0/0 routes from the supplied private route tables to each AZ’s gateway. |
| internet_gateway_id | string | null | No | VPC Internet Gateway ID, used as an explicit dependency for public NAT ordering. |
| public_ipv4_pool | string | null | No | EC2 public IPv4 / BYOIP pool to allocate EIPs from. Amazon pool when null. |
| network_border_group | string | null | No | Limits EIPs to a Local Zone / Wavelength border group. |
| enable_flow_logs | bool | false | No | Create a CloudWatch Log Group for NAT egress flow logging. |
| flow_logs_retention_days | number | 30 | No | Retention for the flow-log group (must be a CloudWatch-accepted value). |
| flow_logs_kms_key_arn | string | null | No | KMS key ARN to encrypt the flow-log group at rest. |
| tags | map(string) | {} | No | Tags applied to every resource the module creates. |

Outputs

| Name | Description |
|---|---|
| nat_gateway_ids | Map of logical key (AZ) => NAT gateway ID. |
| nat_gateway_id_list | Flat list of all NAT gateway IDs. |
| nat_gateway_public_ips | Map of AZ => allocated public IP (empty for private NAT). |
| nat_gateway_private_ips | Map of AZ => assigned private IP of the gateway. |
| eip_allocation_ids | Map of AZ => EIP allocation ID (empty for private NAT). |
| default_route_ids | Map of natKey:routeTableId => created default-route ID. |
| flow_log_group_name | Name of the NAT flow-log CloudWatch group, or null when disabled. |
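As one illustration of consuming these outputs, the nat_gateway_ids map can drive a per-gateway CloudWatch alarm. The sketch below watches the ErrorPortAllocation metric in the AWS/NATGateway namespace; the alarm naming scheme and threshold are assumptions, and no alarm actions are wired up.

```hcl
resource "aws_cloudwatch_metric_alarm" "nat_port_allocation_errors" {
  # One alarm per gateway; keys are the AZ names, values the gateway IDs.
  for_each = module.nat_gateway.nat_gateway_ids

  alarm_name          = "nat-${each.key}-port-allocation-errors" # hypothetical naming scheme
  namespace           = "AWS/NATGateway"
  metric_name         = "ErrorPortAllocation"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  treat_missing_data  = "notBreaching"

  dimensions = {
    NatGatewayId = each.value
  }
}
```

In practice you would probably add alarm_actions pointing at an SNS topic; that is left out here to keep the sketch self-contained.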

Enterprise scenario

A regulated fintech runs its core ledger services on EKS across three AZs in eu-west-1, with all node groups in private subnets. The security team requires that outbound calls to a payment processor leave from a fixed, allowlisted set of IPs, while inbound from the internet stays impossible. They consume this module once per cluster VPC with connectivity_type = "public" and one gateway per AZ, and publish the three EIPs via the nat_gateway_public_ips output into an SSM parameter that the processor’s onboarding pipeline reads. They also enable the flow-log group (enable_flow_logs = true, 90-day retention, KMS-encrypted) as the destination for the VPC Flow Logs that give auditors a record of egress flows. With per-AZ gateways, an AZ failure degrades only that zone’s egress rather than the whole platform.
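Since the main.tf listed earlier creates only the log group, a team in this position would still need to attach flow logs to it. A minimal sketch, assuming enable_flow_logs = true and an existing IAM role that the VPC Flow Logs service can assume, might look like this. The variable flow_logs_role_arn is hypothetical and is not one of the module's inputs.

```hcl
variable "flow_logs_role_arn" {
  description = "Hypothetical: ARN of an IAM role that VPC Flow Logs can assume to write to CloudWatch Logs."
  type        = string
}

# Resolve the log group the module created so its ARN can be the destination.
data "aws_cloudwatch_log_group" "nat_flow" {
  name = module.nat_gateway.flow_log_group_name
}

# Look up each gateway to find its network interface.
data "aws_nat_gateway" "this" {
  for_each = module.nat_gateway.nat_gateway_ids
  id       = each.value
}

# One flow log per NAT gateway network interface, delivered to CloudWatch Logs.
resource "aws_flow_log" "nat" {
  for_each = data.aws_nat_gateway.this

  eni_id          = each.value.network_interface_id
  traffic_type    = "ALL"
  iam_role_arn    = var.flow_logs_role_arn
  log_destination = data.aws_cloudwatch_log_group.nat_flow.arn
}
```

Scoping the flow log to the gateway's network interface keeps the volume down compared with logging the whole VPC; logging at the subnet or VPC level is an equally reasonable choice if broader coverage is required.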

Best practices
