AWS Networking

Amazon VPC, In Depth: Subnets, Route Tables, IGW, NAT, Endpoints & Every Component

Every workload you run in AWS that touches the network — an EC2 instance, an RDS database, a Lambda function reaching a private API, a load balancer — lives inside a Virtual Private Cloud (VPC): your own logically isolated, software-defined slice of the AWS network where you choose the IP address range, carve it into subnets across Availability Zones, and decide exactly what can reach the internet, what stays private, and how packets are routed. Get the design right and everything downstream just works. Get it wrong and you feel it for years: you run out of addresses mid-migration, you cannot peer two VPCs because their ranges overlap, traffic that should stay private egresses through a NAT gateway you are paying for by the gigabyte, or a “private” instance silently has a route to the internet.

This is the exhaustive lesson. We go component by component — the VPC CIDR and how to add IPv6 and secondary ranges, every field on a subnet and the five IP addresses AWS reserves in each one, route tables and the immovable local route, the Internet Gateway, the long-running argument of NAT Gateway versus a self-managed NAT instance, DHCP option sets, the two DNS attributes that break name resolution when they are off, the difference between gateway and interface (PrivateLink) endpoints, where peering ends and Transit Gateway begins, and VPC Flow Logs — until you can whiteboard a production VPC from memory and answer the follow-up questions a Solutions Architect interview or the SAA-C03 and ANS-C01 exams will throw at you. It is beginner-accessible — every term is defined as it appears — but complete: read it once and you know the service end to end.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You need an AWS account and the basics of regions, Availability Zones, and the CLI/console from the earlier Fundamentals lessons, plus a working idea of what an IP address and a subnet are. No deep networking background is assumed — CIDR, routing, and NAT are all explained from first principles. This is the opening Networking deep-dive of the AWS Zero-to-Hero course and the foundation that every later networking lesson builds on. The very next lesson, AWS Security Groups vs Network ACLs, In Depth, covers the filtering layer that sits on top of the routing layer you design here; this lesson deliberately stays on addressing, routing, and connectivity, and points you there for firewalls. When your address planning outgrows a spreadsheet, Amazon VPC IPAM: Hierarchical CIDR Planning, Allocation, and BYOIP at Scale automates it; when one VPC becomes dozens, Designing Multi-Account VPC Connectivity with Transit Gateway replaces the peering mesh.

Core concepts

A VPC (Virtual Private Cloud) is a regional resource — it spans every Availability Zone in one AWS Region but cannot cross Regions — that defines a private IPv4 address range (and optionally IPv6) which is yours alone. Inside it you build a network using a small set of primitives that fit together predictably. Anchor everything that follows on these mental models:

Key terms you will see throughout: CIDR (Classless Inter-Domain Routing — the /16, /24 notation that defines how many addresses a block holds and how the prefix is split between network and host), ENI (Elastic Network Interface — the virtual NIC that everything in a VPC actually attaches to), IGW (Internet Gateway — the VPC’s door to the public internet), NAT (Network Address Translation — letting many private addresses share one public address for outbound traffic), route table (the ordered set of rules that decides where a packet goes next), and endpoint (a private on-ramp to an AWS service that keeps traffic off the internet).

Default VPC vs a custom VPC

Every Region in a new account comes with a default VPC so that you can launch an instance immediately without designing a network first. Understanding what makes it “default” tells you what a custom VPC does not give you for free.

| Property | Default VPC | Custom VPC (one you create) |
|---|---|---|
| CIDR | 172.31.0.0/16, fixed | You choose |
| Subnets | One default subnet per AZ, all public | None until you create them |
| Internet Gateway | Created and attached | You attach it yourself |
| Route to 0.0.0.0/0 | Present in the main route table → IGW | You add it |
| Public IP on launch | Auto-assign public IPv4 = on in default subnets | Off by default |
| DNS hostnames | Enabled | Disabled by default |
| Good for | Quick demos, getting started | Everything real, with explicit control |

The convenience of the default VPC is also its danger: every default subnet is public and auto-assigns a public IP, so an instance launched there is internet-reachable the moment a permissive security group is attached. For anything beyond a throwaway test, build a custom VPC where nothing is public unless you deliberately route it that way. You can delete the default VPC, and recreate it later from the console if you ever need it back.

VPC CIDR: primary, secondary, and IPv6

When you create a VPC the one truly load-bearing decision is the primary IPv4 CIDR block. It defines the pool of private addresses every subnet will be carved from, and it cannot be changed or removed for the life of the VPC — you can only add secondary blocks.

| Setting | What it is | Choices / limits | Default | When to change / gotcha |
|---|---|---|---|---|
| Primary IPv4 CIDR | The main private address range | /16 (65,536 addresses) down to /28 (16 addresses); use RFC 1918 private ranges | None (required) | Permanent. Pick a /16 for production so subnets have room. Cannot overlap with any network you will peer or connect to on-prem. |
| Secondary IPv4 CIDRs | Extra ranges added later when you run out | Up to 4 by default (the quota of 5 IPv4 blocks per VPC counts the primary; it can be raised to 50); must not overlap existing blocks or reserved AWS ranges | None | Add when subnets fill up. Cannot fall inside an existing block; choose from the same private range family to keep routing sane. |
| IPv6 CIDR | An optional /56 block | Amazon-provided (you get a /56, subnets are /64) or your own (BYOIP) | Off | Enable for IPv6 workloads or to use egress-only internet gateways. Amazon-provided IPv6 addresses are public and globally routable; there is no “private IPv6” in the RFC 1918 sense. |

Two rules save most teams from pain. First, size for the whole estate, not today’s app — a /16 costs nothing extra over a /28 (you pay for traffic and resources, never for address space), and running out of contiguous space later forces ugly secondary-CIDR workarounds. Second, never reuse the same CIDR across VPCs you might connect. If vpc-a and vpc-b are both 10.0.0.0/16, you can never peer them or attach them to the same Transit Gateway — overlapping ranges have no unambiguous route. Allocate a unique block per VPC up front; when this becomes hard to track by hand, that is exactly the problem VPC IPAM solves.

# Create a custom VPC with a /16 primary block
aws ec2 create-vpc \
  --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=vpc-lab}]'

# Add a secondary IPv4 block later
aws ec2 associate-vpc-cidr-block --vpc-id vpc-0abc... --cidr-block 10.1.0.0/16

# Add an Amazon-provided IPv6 /56
aws ec2 associate-vpc-cidr-block --vpc-id vpc-0abc... --amazon-provided-ipv6-cidr-block
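
The "never reuse a CIDR" rule can be checked mechanically before a block is allocated. The following is a minimal bash sketch, not an AWS tool: the function names ip_to_int, cidr_range, and cidrs_overlap are made up for this illustration, and it assumes well-formed IPv4 CIDR input.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "first last" (as integers) for a CIDR block such as 10.0.0.0/16.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local size=$(( 1 << (32 - prefix) ))
  local base=$(( $(ip_to_int "$ip") & ~(size - 1) & 0xFFFFFFFF ))
  echo "$base $(( base + size - 1 ))"
}

# Exit 0 when the two CIDR blocks share at least one address.
cidrs_overlap() {
  local a_first a_last b_first b_last
  read -r a_first a_last <<< "$(cidr_range "$1")"
  read -r b_first b_last <<< "$(cidr_range "$2")"
  (( a_first <= b_last && b_first <= a_last ))
}

# A /17 carved from the same /16 overlaps; a neighbouring /16 does not.
if cidrs_overlap 10.0.0.0/16 10.0.128.0/17; then echo "overlap"; fi
if ! cidrs_overlap 10.0.0.0/16 10.1.0.0/16; then echo "no overlap"; fi
```

Running a check like this against every block already in use is the by-hand version of what an address-management tool does for you.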

Subnets: public vs private, AZ placement, sizing, and reserved IPs

A subnet is a sub-range of the VPC CIDR that lives in exactly one Availability Zone. Resources attach to subnets, and the subnet’s route table determines whether it is public or private.

| Setting | What it is | Choices | Default | When / trade-off / gotcha |
|---|---|---|---|---|
| VPC | The parent network | Any VPC in the Region | | The subnet’s CIDR must fall inside the VPC’s CIDR. |
| Availability Zone | Physical location of the subnet | Any AZ in the Region | AWS picks if unspecified | Pin it explicitly and spread workloads across ≥2 AZs for HA. A subnet cannot span or move AZs. |
| IPv4 CIDR block | The subnet’s address range | /16 to /28 within the VPC | Required | /24 (256 addresses) is a comfortable default. Smaller than /28 is not allowed because of reserved IPs. |
| IPv6 CIDR | Optional /64 from the VPC’s /56 | One /64 per subnet | None | Required if the subnet hosts IPv6 resources. |
| Auto-assign public IPv4 | Give launched instances a public IP automatically | On / Off | Off | Turning this on is what people mean by a “public subnet” in practice, but it only matters alongside a route to an IGW. |
| Auto-assign IPv6 | Auto-assign an IPv6 address on launch | On / Off | Off | Enable for IPv6 subnets. |

Public vs private is purely about routing. A subnet is public when its route table has a 0.0.0.0/0 route pointing at an Internet Gateway (and, in practice, auto-assign public IP is on or instances carry Elastic IPs). It is private when it has no such route — instances reach the internet only outbound via a NAT gateway, or not at all. A common third tier is an isolated subnet with no internet route in either direction (for databases), reachable only inside the VPC and via endpoints.
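
Since the tier is decided entirely by the default route, the classification can be written as a lookup on that route's target. This is an illustrative bash sketch; classify_subnet is a hypothetical helper, and it only looks at the ID prefix of the 0.0.0.0/0 target (an empty string means the table has no default route).

```shell
# Classify a subnet from the target of its route table's 0.0.0.0/0 route.
# Pass an empty string when the table has no default route at all.
classify_subnet() {
  case "$1" in
    igw-*) echo "public" ;;    # default route to an Internet Gateway
    nat-*) echo "private" ;;   # outbound-only via a NAT gateway
    "")    echo "isolated" ;;  # no internet route in either direction
    *)     echo "other" ;;     # e.g. a Transit Gateway; depends on what sits behind it
  esac
}

classify_subnet "igw-example"
classify_subnet "nat-example"
classify_subnet ""
```

Note that nothing in the function looks at the subnet itself, which mirrors the point above: there is no "public" flag on a subnet.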

The five reserved IP addresses

AWS reserves the first four and the last IP address in every subnet, so a /24 (256 addresses) gives you 251 usable, not 256. Memorise this — it is a classic exam question and it bites IP planning.

| Address (in 10.0.1.0/24) | Reserved for |
|---|---|
| 10.0.1.0 | Network address |
| 10.0.1.1 | VPC router (the implied default gateway) |
| 10.0.1.2 | Amazon-provided DNS (the “.2 resolver”; VPC base +2) |
| 10.0.1.3 | Reserved for future use |
| 10.0.1.255 | Network broadcast (broadcast is not supported, but the address is still reserved) |

Because five addresses always disappear, the smallest permitted subnet is a /28 (16 addresses → 11 usable). The .2 address in particular matters later: the Amazon-provided DNS resolver answers at the VPC’s base address plus two (10.0.0.2 in a 10.0.0.0/16 VPC), the matching +2 address is reserved in every subnet, and several DNS features depend on it.
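
The arithmetic is easy to script when you are sizing subnets. The sketch below is illustrative bash only; usable_ips and reserved_ips are hypothetical helper names, and reserved_ips assumes the CIDR is given with its network address (for example 10.0.1.0/24, not 10.0.1.7/24).

```shell
# Convert between dotted-quad IPv4 addresses and 32-bit integers.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Usable addresses for a subnet prefix length: block size minus the five reserved.
usable_ips() { echo $(( (1 << (32 - $1)) - 5 )); }

# List the five reserved addresses: the first four and the last.
reserved_ips() {
  local base size offset
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  for offset in 0 1 2 3 $(( size - 1 )); do
    int_to_ip $(( base + offset ))
  done
}

usable_ips 24               # 251
usable_ips 28               # 11
reserved_ips 10.0.1.0/24    # 10.0.1.0, .1, .2, .3 and 10.0.1.255
```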

# Two subnets in two different AZs (HA), with auto-assign public IP on for the first
aws ec2 create-subnet --vpc-id vpc-0abc... --cidr-block 10.0.1.0/24 \
  --availability-zone ap-south-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-1a}]'
aws ec2 modify-subnet-attribute --subnet-id subnet-0pub... --map-public-ip-on-launch

aws ec2 create-subnet --vpc-id vpc-0abc... --cidr-block 10.0.11.0/24 \
  --availability-zone ap-south-1b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-1b}]'

Route tables: main vs custom, the local route, and priority

A route table is an ordered set of rules — destination CIDR → target — that the VPC router consults for every packet leaving a network interface. Each subnet is associated with exactly one route table at a time; if you do not associate one explicitly, the subnet uses the VPC’s main route table.

| Concept | What it is | Detail / gotcha |
|---|---|---|
| Main route table | The default table every new subnet implicitly uses | One per VPC; you can edit it, but the safer pattern is to leave it minimal (private) and attach custom tables to public subnets. |
| Custom route table | A table you create and explicitly associate with subnets | The recommended way to define “public” vs “private”: one custom table per tier. |
| local route | An automatic route for the entire VPC CIDR with target local | Always present, cannot be deleted or edited. It is why every subnet can reach every other subnet with zero configuration. |
| 0.0.0.0/0 (IPv4) / ::/0 (IPv6) | The “default route”: everything not matched elsewhere | Point it at an IGW (public), a NAT gateway (private outbound), a Transit Gateway, a peering connection, or an egress-only IGW (IPv6). |
| Subnet associations | Which subnets use this table | A subnet has one table; a table can serve many subnets. |
| Route propagation | Auto-learn routes from a VPN/Direct Connect gateway via BGP | Toggle per route table; avoids hand-maintaining on-prem prefixes. |

Route priority is longest-prefix match. When several routes could apply, the VPC router picks the most specific (longest prefix) one. A packet to 10.0.5.7 matches 10.0.0.0/16 → local over 0.0.0.0/0 → nat because /16 is more specific than /0. The local route therefore always wins for in-VPC traffic, which is precisely why you cannot accidentally route internal traffic out to the internet. Static routes beat propagated (BGP-learned) routes of the same prefix.
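
Longest-prefix match is simple enough to model in a few lines. The following bash sketch is a toy model of the selection rule, not the VPC router's implementation: pick_route and the "cidr=target" route format are invented for this example, the target IDs are placeholders, and the static-versus-propagated tie-break is not modelled.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Exit 0 when address $1 falls inside CIDR block $2.
in_cidr() {
  local prefix=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$1") & mask) == ($(ip_to_int "${2%/*}") & mask) ))
}

# pick_route DEST ROUTE...   where each ROUTE is written "cidr=target".
# Prints the target of the matching route with the longest prefix.
pick_route() {
  local dest=$1 best_len=-1 best_target="" route cidr len
  shift
  for route in "$@"; do
    cidr=${route%%=*}
    len=${cidr#*/}
    if in_cidr "$dest" "$cidr" && (( len > best_len )); then
      best_len=$len
      best_target=${route#*=}
    fi
  done
  if [ -n "$best_target" ]; then echo "$best_target"; else return 1; fi
}

pick_route 10.0.5.7 "10.0.0.0/16=local" "0.0.0.0/0=nat-example"   # local
pick_route 8.8.8.8  "10.0.0.0/16=local" "0.0.0.0/0=nat-example"   # nat-example
```

The first call shows why in-VPC traffic never follows the default route: /16 beats /0 regardless of the order in which the routes are listed.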

# A public route table: default route to the Internet Gateway, associated to the public subnet
aws ec2 create-route-table --vpc-id vpc-0abc... \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=rtb-public}]'
aws ec2 create-route --route-table-id rtb-0pub... \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0xyz...
aws ec2 associate-route-table --route-table-id rtb-0pub... --subnet-id subnet-0pub...

Internet Gateway: the door to the public internet

An Internet Gateway (IGW) is a horizontally scaled, redundant, highly available VPC component that allows communication between instances in your VPC and the internet. It does two jobs: it provides a target in your route tables for internet-bound traffic, and it performs one-to-one NAT between an instance’s private IPv4 address and its public IPv4 address (or Elastic IP).

Three conditions must all be true for an instance to be reachable from the internet over IPv4 — miss any one and connectivity silently fails, which is the most common “why can’t I reach my instance” support question:

  1. An IGW is attached to the VPC.
  2. The subnet’s route table has 0.0.0.0/0 → the IGW.
  3. The instance has a public IPv4 address (auto-assigned, or an Elastic IP) and its security group / network ACL allow the traffic.

Key facts: a VPC can have only one IGW attached at a time; the IGW itself is free (you pay for data transfer and, since 2024, for public IPv4 addresses); and for IPv6 there is a separate egress-only internet gateway that allows outbound IPv6 only (the IPv6 equivalent of a NAT gateway, since IPv6 has no NAT).

aws ec2 create-internet-gateway \
  --tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=igw-lab}]'
aws ec2 attach-internet-gateway --internet-gateway-id igw-0xyz... --vpc-id vpc-0abc...

NAT Gateway vs NAT instance: outbound-only internet

Instances in a private subnet often still need outbound internet — to download patches, call a third-party API, or pull a container image — without being reachable inbound. That is Network Address Translation (NAT): many private addresses share one public address for outbound flows, and return traffic for those flows is allowed back, but nothing can initiate a connection to the private instances from outside.

You route the private subnet’s 0.0.0.0/0 to the NAT, and the NAT itself sits in a public subnet (it needs the IGW to reach the internet). There are two ways to provide NAT:

| Dimension | NAT Gateway (managed) | NAT instance (self-managed EC2) |
|---|---|---|
| What it is | A fully managed AWS service | An EC2 instance running NAT software |
| Availability | Highly available within one AZ; deploy one per AZ for zone resilience | Single instance = single point of failure; you build HA yourself |
| Throughput | Scales automatically 5 → 100 Gbps | Bounded by the instance type’s network/CPU |
| Management | Zero: no patching, no sizing | You patch, monitor, and size it |
| Source/dest check | N/A | Must disable source/destination check or it will not forward |
| Security groups | Cannot attach an SG (control via NACL / the private route) | Has a security group like any instance |
| Port forwarding / bastion | Not possible | Possible (it is a normal instance) |
| Cost | Hourly charge + per-GB data processing | Just the EC2 instance (often a small/free-tier type) |
| Use it when | Almost always; it is the default | Cost-sensitive dev/test, or you need features only an instance gives |

Default to the NAT Gateway: it is the managed, scalable, low-effort choice and the right answer in virtually every exam scenario. There are two things to know cold. First, it is zonal, so a truly resilient design places one NAT gateway in each AZ and points each AZ’s private subnets at the NAT gateway in their own AZ (this also avoids paying cross-AZ data transfer). Second, its bill has two parts, an hourly rate plus a per-GB data-processing charge, which is exactly why pulling large objects from S3 through a NAT gateway is wasteful when a free gateway endpoint would keep that traffic off the NAT entirely (see the VPC endpoints section below).
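
The two-part bill is easy to reason about as a formula: hours × hourly rate + GB processed × per-GB rate. The sketch below is illustrative only. nat_monthly_cost is a hypothetical helper, and the rates passed to it are round placeholder numbers, not quoted AWS prices; look up the current rates for your Region before relying on any figure.

```shell
# nat_monthly_cost HOURS GB_PROCESSED HOURLY_RATE PER_GB_RATE
# Prints the estimated charge for one NAT gateway, to two decimals.
nat_monthly_cost() {
  LC_ALL=C awk -v h="$1" -v gb="$2" -v hr="$3" -v pg="$4" \
    'BEGIN { printf "%.2f\n", h * hr + gb * pg }'
}

# Placeholder rates (assumptions for illustration, NOT real pricing).
HOURLY=0.05
PER_GB=0.05

nat_monthly_cost 730 1000 "$HOURLY" "$PER_GB"   # 1000 GB through the NAT
nat_monthly_cost 730 0    "$HOURLY" "$PER_GB"   # same month, traffic moved off the NAT
```

With these placeholder rates the processing term is larger than the hourly term at 1000 GB, which is the shape of the argument for moving S3 traffic to a gateway endpoint.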

A NAT instance is the legacy approach. The detail interviewers love is that, because the instance forwards traffic for other hosts, you must disable the source/destination check (aws ec2 modify-instance-attribute --no-source-dest-check) — by default an instance drops packets whose source or destination is not itself.

# Allocate an Elastic IP and create a NAT gateway in the PUBLIC subnet
aws ec2 allocate-address --domain vpc           # returns an AllocationId
aws ec2 create-nat-gateway --subnet-id subnet-0pub... --allocation-id eipalloc-0...

# Point the PRIVATE subnet's default route at the NAT gateway
aws ec2 create-route --route-table-id rtb-0priv... \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0...

DHCP option sets

When an instance boots, it gets its network configuration — DNS servers, domain name, NTP servers — via DHCP, and a DHCP option set is the VPC-level object that defines those values. Every VPC has one associated; the default one points at AmazonProvidedDNS and is fine for most cases.

| Option | What it controls | Default | When to change |
|---|---|---|---|
| domain-name-servers | Which DNS resolvers instances use | AmazonProvidedDNS (the .2 resolver) | Point at custom resolvers (e.g. on-prem AD DNS, or Route 53 Resolver inbound endpoints) for hybrid name resolution. |
| domain-name | The domain suffix applied to hostnames | Region-specific (e.g. ap-south-1.compute.internal) | Set a corporate suffix like corp.example.com. |
| ntp-servers | Time servers | Amazon Time Sync (169.254.169.123) | Override only if you have a specific NTP requirement. |
| netbios-name-servers / netbios-node-type | Legacy Windows NetBIOS | None | Rarely needed; set node-type=2 for Windows estates that use it. |

The important gotchas: you cannot edit an option set in place — you create a new one and associate it with the VPC. After re-associating, existing instances pick up the change only when their DHCP lease renews (or on reboot), so do not expect it to take effect instantly. Replacing AmazonProvidedDNS with custom servers is the usual reason to touch this, and it is how you wire VPC DNS into a hybrid Active Directory environment.
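
The create-then-associate flow looks roughly like this from the CLI. This is a sketch under stated assumptions: the run wrapper is a hypothetical dry-run helper that only prints commands unless DRY_RUN=0 is set, the resolver IPs, domain suffix, and resource IDs are placeholders, and in a real run you would take the option set ID from the output of the create call.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Placeholder values for illustration only.
VPC_ID="vpc-EXAMPLE"
DOPT_ID="dopt-EXAMPLE"   # in a real run, read this from the create call's output

# 1. Create a NEW option set (existing sets cannot be edited in place).
run aws ec2 create-dhcp-options \
  --dhcp-configurations \
    "Key=domain-name-servers,Values=10.0.0.10,10.0.0.11" \
    "Key=domain-name,Values=corp.example.com"

# 2. Associate it with the VPC; instances pick it up at lease renewal or reboot.
run aws ec2 associate-dhcp-options --dhcp-options-id "$DOPT_ID" --vpc-id "$VPC_ID"
```

Left in its default dry-run mode the script just prints the two commands, which is a safe way to review them before pointing them at a real VPC.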

DNS in the VPC: enableDnsSupport and enableDnsHostnames

Two VPC attributes control DNS, and confusing them is a perennial source of “my private endpoint resolves to a public IP” tickets. enableDnsSupport is on by default in every VPC; enableDnsHostnames is on in the default VPC but off in a custom VPC.

| Attribute | What it does | Default (custom VPC) | If turned off |
|---|---|---|---|
| enableDnsSupport | Whether the Amazon DNS resolver (the .2 address) answers queries in the VPC | On | Instances cannot resolve names via the AWS resolver; DNS-based features (including private DNS for endpoints) break. |
| enableDnsHostnames | Whether instances with a public IP get a public DNS hostname auto-assigned | Off | Instances get no public DNS name; private DNS names for interface endpoints will not resolve even if support is on. |

The rule to remember: enableDnsSupport must be on for any DNS to work at all, and enableDnsHostnames must also be on for interface (PrivateLink) endpoints’ private DNS names to resolve to the endpoint’s private IP. The default VPC has both on; a custom VPC starts with enableDnsSupport on and enableDnsHostnames off. So when you add an interface endpoint and find ssm.ap-south-1.amazonaws.com still resolving to a public IP, the fix is almost always to turn on enableDnsHostnames.

aws ec2 modify-vpc-attribute --vpc-id vpc-0abc... --enable-dns-support '{"Value":true}'
aws ec2 modify-vpc-attribute --vpc-id vpc-0abc... --enable-dns-hostnames '{"Value":true}'

VPC endpoints: gateway vs interface (PrivateLink)

A VPC endpoint lets resources in your VPC reach supported AWS services (and third-party / your-own services) privately, over the AWS network, without an Internet Gateway, NAT gateway, or public IPs. There are two fundamentally different kinds, and knowing which is which — and when each is even available — is core SAA/ANS material.

| Dimension | Gateway endpoint | Interface endpoint (PrivateLink) |
|---|---|---|
| Supported services | Only Amazon S3 and DynamoDB | Most AWS services (SSM, EC2 API, ECR, CloudWatch, Secrets Manager, SQS, KMS, …) and partner/your-own services |
| How it works | A route added to your route table targeting the endpoint (a prefix list) | An ENI with a private IP placed in your subnet(s) |
| What you point at it | Route tables | DNS: queries to the service name resolve to the ENI’s private IP (with private DNS enabled) |
| Cost | Free (no hourly or data charge) | Hourly per-endpoint, per-AZ charge + per-GB data processing |
| Cross-Region / on-prem reachable | No (stays in-Region, in-VPC) | Yes; reachable over peering, TGW, VPN, Direct Connect |
| Access control | Endpoint policy (a resource policy on the endpoint) | Endpoint policy + security group on the ENI |
| Use it for | Keeping S3/DynamoDB traffic off the NAT gateway (the classic cost win) | Private access to every other AWS service API |

The decision tree is simple. Is it S3 or DynamoDB? Use a gateway endpoint — it is free and removes that traffic from your NAT bill entirely (a private subnet that only talks to S3 may not need a NAT gateway at all). Anything else? Use an interface endpoint, accepting the hourly cost in exchange for keeping API traffic private. Interface endpoints are built on AWS PrivateLink, the same technology you use to expose your own service privately to other VPCs — covered in depth in AWS PrivateLink for Service Providers and Consumers. Two gotchas: gateway endpoints are Region-local and route-based, so they do not work for on-prem or cross-Region callers (use an interface endpoint there); and interface-endpoint private DNS only works when both enableDnsSupport and enableDnsHostnames are on (see the DNS section above).

# Gateway endpoint for S3 (free) — attach to the private route table
aws ec2 create-vpc-endpoint --vpc-id vpc-0abc... \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.ap-south-1.s3 \
  --route-table-ids rtb-0priv...

# Interface endpoint for SSM (PrivateLink) — ENIs in the private subnets, with private DNS
aws ec2 create-vpc-endpoint --vpc-id vpc-0abc... \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.ap-south-1.ssm \
  --subnet-ids subnet-0priv... \
  --security-group-ids sg-0... \
  --private-dns-enabled

Connecting VPCs: peering vs Transit Gateway (in brief)

A single VPC is rarely the whole story — you connect VPCs to each other and to on-premises. Two options dominate, and the line between them is a frequent interview question.

| Dimension | VPC peering | Transit Gateway (TGW) |
|---|---|---|
| Topology | One-to-one link between two VPCs | Hub-and-spoke; one TGW connects many VPCs (and VPN/Direct Connect) |
| Transitivity | Non-transitive: if A↔B and B↔C, A still cannot reach C | Transitive: all attached VPCs can route to each other |
| Scale | Connections explode as n(n-1)/2 (a full mesh of 10 VPCs = 45 peerings) | Linear; each VPC attaches once |
| Routing control | Per-VPC route tables | Central TGW route tables; segmentation via multiple route tables |
| Cost | No hourly fee; pay data transfer | Hourly per-attachment fee + per-GB; more, but far simpler at scale |
| Cross-Region | Inter-Region peering supported | TGW peering across Regions |
| Use it when | A handful of VPCs, simple any-to-any | Many VPCs / accounts, central egress, hybrid connectivity |

The headline rule: peering is non-transitive and does not scale — it is fine for two or three VPCs, but a growing estate becomes an unmanageable mesh, at which point you move to a Transit Gateway, which is transitive, centrally routed, and the standard for multi-account networking. Peering also requires non-overlapping CIDRs (you cannot peer two 10.0.0.0/16 VPCs) and does not support edge-to-edge routing (you cannot use a peer’s IGW or NAT). The full hub-and-spoke design, segmentation, and centralised egress are covered in Designing Multi-Account VPC Connectivity with Transit Gateway.
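
To see how quickly a full mesh grows, compare n(n-1)/2 peering connections with n Transit Gateway attachments. A small bash sketch (the function names are illustrative):

```shell
# Peering connections needed for a full mesh of N VPCs: N(N-1)/2.
peering_links() { echo $(( $1 * ($1 - 1) / 2 )); }

# Transit Gateway attachments needed for N VPCs: one per VPC.
tgw_attachments() { echo "$1"; }

for n in 3 10 50; do
  echo "$n VPCs: $(peering_links "$n") peerings vs $(tgw_attachments "$n") TGW attachments"
done
```

At 3 VPCs the two numbers are equal; at 10 the mesh already needs 45 connections, and at 50 it needs 1225.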

VPC Flow Logs: seeing the traffic

You cannot debug — or secure — a network you cannot see. VPC Flow Logs capture metadata about the IP traffic going to and from network interfaces: source and destination IP and port, protocol, packet and byte counts, the action (ACCEPT or REJECT), and more. They do not capture packet contents — this is NetFlow-style metadata, not a packet capture.

| Setting | What it is | Choices | Notes |
|---|---|---|---|
| Scope | What the logs cover | VPC, subnet, or a single ENI | VPC-level captures everything beneath it; start there. |
| Filter | Which traffic to record | All, Accepted, or Rejected | Rejected is great for spotting blocked traffic / misconfigured security groups. |
| Destination | Where logs go | CloudWatch Logs, S3, or Amazon Data Firehose (formerly Kinesis Data Firehose) | S3 is cheapest for archival/analytics (query with Athena); CloudWatch for alerting. |
| Format | Which fields | Default or custom | Add fields like vpc-id, subnet-id, pkt-srcaddr, tcp-flags for richer analysis. |
| Aggregation interval | How often records are emitted | 1 min or 10 min | 1-minute is more granular but higher volume. |

The single most important caveat for troubleshooting: flow logs show the result of security group and NACL evaluation, not the rules themselves. A REJECT tells you traffic was blocked but not by which layer — that you reason out from the stateful/stateless behaviour you will learn in the next lesson. Turn flow logs on for every production VPC; they are inexpensive (you pay for log storage/ingestion) and indispensable the day something breaks or a security review asks “what talked to what”.

aws ec2 create-flow-logs \
  --resource-type VPC --resource-ids vpc-0abc... \
  --traffic-type ALL \
  --log-destination-type s3 \
  --log-destination arn:aws:s3:::my-flow-logs-bucket/vpc/
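
Once records are flowing, the default format is a space-separated line per flow: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status. The bash sketch below pulls out the rejected flows. It assumes the default (version 2) field order, the helper name rejected_flows is made up, and the two sample records are illustrative, not taken from a real account.

```shell
# Print "src:port -> dst:port" for every REJECT record read from stdin.
# Assumes default-format (version 2) flow log records, one per line.
rejected_flows() {
  local version account eni src dst sport dport proto packets bytes start end action log_status
  while read -r version account eni src dst sport dport proto packets bytes start end action log_status; do
    if [ "$action" = "REJECT" ]; then
      echo "$src:$sport -> $dst:$dport proto=$proto"
    fi
  done
  return 0
}

# Illustrative sample: one accepted SSH flow, one rejected RDP flow.
sample='2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK'

rejected_flows <<< "$sample"
```

A custom format changes the field order, so the read line would need to match whatever fields you selected.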

The complete picture

The diagram below assembles every component into one production-shaped VPC: a /16 split into public and private subnets across two Availability Zones, an Internet Gateway on the public tier, a NAT Gateway per AZ for private-subnet egress, a free gateway endpoint pulling S3 traffic off the NAT, an interface endpoint for an AWS API, the route tables that tie it together, and flow logs watching it all.

[Figure: Amazon VPC anatomy]

Trace a packet through it and the whole lesson clicks: an instance in a private subnet hits the internet via its route table’s 0.0.0.0/0 → NAT gateway (in its AZ) → IGW; the same instance reaches S3 via the S3 prefix-list route to the gateway endpoint, which is more specific than 0.0.0.0/0, so no NAT is involved; and a packet to a peer subnet matches the local route and never leaves the VPC.

Hands-on lab

You will build a minimal but complete two-tier VPC — one public and one private subnet, an IGW, a NAT gateway, and an S3 gateway endpoint — entirely from the CLI, validate routing, then tear it all down. Everything here is AWS Free Tier eligible except the NAT gateway and the Elastic IP, so follow the Cost note and clean up promptly.

Prerequisites: the AWS CLI v2 configured (aws configure) with a region set (examples use ap-south-1).

Step 1 — Create the VPC and turn on DNS hostnames.

VPC=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --query Vpc.VpcId --output text)
aws ec2 modify-vpc-attribute --vpc-id $VPC --enable-dns-hostnames '{"Value":true}'
aws ec2 create-tags --resources $VPC --tags Key=Name,Value=lab-vpc
echo "VPC=$VPC"

Step 2 — Create one public and one private subnet in the same AZ (for simplicity).

PUB=$(aws ec2 create-subnet --vpc-id $VPC --cidr-block 10.0.1.0/24 \
  --availability-zone ap-south-1a --query Subnet.SubnetId --output text)
aws ec2 modify-subnet-attribute --subnet-id $PUB --map-public-ip-on-launch
PRIV=$(aws ec2 create-subnet --vpc-id $VPC --cidr-block 10.0.2.0/24 \
  --availability-zone ap-south-1a --query Subnet.SubnetId --output text)
echo "PUB=$PUB PRIV=$PRIV"

Step 3 — Attach an Internet Gateway and make the public subnet public.

IGW=$(aws ec2 create-internet-gateway --query InternetGateway.InternetGatewayId --output text)
aws ec2 attach-internet-gateway --internet-gateway-id $IGW --vpc-id $VPC
RTPUB=$(aws ec2 create-route-table --vpc-id $VPC --query RouteTable.RouteTableId --output text)
aws ec2 create-route --route-table-id $RTPUB --destination-cidr-block 0.0.0.0/0 --gateway-id $IGW
aws ec2 associate-route-table --route-table-id $RTPUB --subnet-id $PUB

Step 4 — Create a NAT gateway in the public subnet and route the private subnet through it.

EIP=$(aws ec2 allocate-address --domain vpc --query AllocationId --output text)
NAT=$(aws ec2 create-nat-gateway --subnet-id $PUB --allocation-id $EIP \
  --query NatGateway.NatGatewayId --output text)
# Wait until the NAT gateway is available (takes a couple of minutes)
aws ec2 wait nat-gateway-available --nat-gateway-ids $NAT
RTPRIV=$(aws ec2 create-route-table --vpc-id $VPC --query RouteTable.RouteTableId --output text)
aws ec2 create-route --route-table-id $RTPRIV --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT
aws ec2 associate-route-table --route-table-id $RTPRIV --subnet-id $PRIV

Step 5 — Add a free S3 gateway endpoint to the private route table.

aws ec2 create-vpc-endpoint --vpc-id $VPC --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.ap-south-1.s3 --route-table-ids $RTPRIV

Step 6 — Validate. Confirm the routing is exactly what you intended:

# Public route table should show 0.0.0.0/0 -> igw-...
aws ec2 describe-route-tables --route-table-ids $RTPUB \
  --query 'RouteTables[].Routes' --output table
# Private route table should show 0.0.0.0/0 -> nat-... AND an S3 prefix-list -> vpce-...
aws ec2 describe-route-tables --route-table-ids $RTPRIV \
  --query 'RouteTables[].Routes' --output table

Expected: the public table has a local route plus 0.0.0.0/0 → igw-…; the private table has local, 0.0.0.0/0 → nat-…, and a pl-… (S3) → vpce-… route. That last line is the gateway endpoint at work — S3 traffic now bypasses the NAT entirely.

Cleanup — delete in reverse dependency order (endpoints and NAT before the IGW and subnets, or the deletes will fail):

EPID=$(aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=$VPC \
  --query 'VpcEndpoints[0].VpcEndpointId' --output text)
aws ec2 delete-vpc-endpoints --vpc-endpoint-ids $EPID
aws ec2 delete-nat-gateway --nat-gateway-id $NAT
aws ec2 wait nat-gateway-deleted --nat-gateway-ids $NAT
aws ec2 release-address --allocation-id $EIP
aws ec2 detach-internet-gateway --internet-gateway-id $IGW --vpc-id $VPC
aws ec2 delete-internet-gateway --internet-gateway-id $IGW
aws ec2 delete-subnet --subnet-id $PUB
aws ec2 delete-subnet --subnet-id $PRIV
aws ec2 delete-route-table --route-table-id $RTPUB
aws ec2 delete-route-table --route-table-id $RTPRIV
aws ec2 delete-vpc --vpc-id $VPC

Cost note: the VPC, subnets, IGW, route tables, and the S3 gateway endpoint are free. The two charged items are the NAT gateway (an hourly rate plus per-GB processing) and the Elastic IP (like every public IPv4 address since 2024, it is billed by the hour whether or not it is attached). Running this lab for under an hour and cleaning up costs a few US cents at most, but do not leave the NAT gateway or the Elastic IP allocated, as their hourly charges accrue around the clock.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Instance in a “public” subnet has no internet | Missing one of the three conditions (no IGW, no 0.0.0.0/0 route, or no public IP) | Verify IGW attached, route table has 0.0.0.0/0 → igw, and the instance has a public/Elastic IP. |
| Private instance cannot reach the internet outbound | No NAT route, or NAT gateway is in a private subnet | Put the NAT gateway in a public subnet and point the private route table’s 0.0.0.0/0 at it. |
| Cannot create subnet: CIDR not within VPC | Subnet CIDR outside the VPC block, or overlaps another subnet | Choose a sub-range that fits inside the VPC CIDR and does not overlap. |
| Interface endpoint name still resolves to a public IP | enableDnsHostnames is off (or private DNS not enabled) | Turn on enableDnsHostnames and enableDnsSupport; enable private DNS on the endpoint. |
| Cannot peer two VPCs | Overlapping CIDR ranges | Re-IP one VPC, or use distinct ranges from the start; overlapping blocks cannot be peered. |
| Self-built NAT instance forwards nothing | Source/destination check still enabled | Run modify-instance-attribute --no-source-dest-check. |
| S3 traffic is inflating the NAT bill | No gateway endpoint; S3 traffic flows through NAT | Add a free S3 gateway endpoint to the private route table. |
| App “A” cannot reach app “C” through a middle VPC | Peering is non-transitive | Add a direct peering A↔C, or move to a Transit Gateway. |
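
Two of the failures above, a subnet CIDR that does not fit the VPC and two VPCs that cannot be peered, come down to plain range arithmetic, so they can be checked before calling AWS at all. The following is a minimal sketch in pure bash. The helper names (ip_to_int, cidr_range, cidr_overlaps, cidr_within) are hypothetical and are not part of the AWS CLI, and it handles IPv4 only.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for pre-flight CIDR checks (IPv4 only, no AWS calls).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "first last": the integer bounds of a CIDR block.
cidr_range() {
  local ip=${1%/*} len=${1#*/}
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  local first=$(( $(ip_to_int "$ip") & mask ))
  echo "$first $(( first | (~mask & 0xFFFFFFFF) ))"
}

# Succeed (exit 0) if the two blocks share at least one address.
cidr_overlaps() {
  local a1 a2 b1 b2
  read -r a1 a2 <<< "$(cidr_range "$1")"
  read -r b1 b2 <<< "$(cidr_range "$2")"
  (( a1 <= b2 && b1 <= a2 ))
}

# Succeed (exit 0) if the first block lies entirely inside the second.
cidr_within() {
  local a1 a2 b1 b2
  read -r a1 a2 <<< "$(cidr_range "$1")"
  read -r b1 b2 <<< "$(cidr_range "$2")"
  (( a1 >= b1 && a2 <= b2 ))
}

# Example: a candidate subnet against a VPC block, then two VPC blocks.
if cidr_within 10.0.1.0/24 10.0.0.0/16; then
  echo "10.0.1.0/24 fits inside 10.0.0.0/16"
fi
if cidr_overlaps 10.0.0.0/16 10.0.128.0/17; then
  echo "10.0.0.0/16 and 10.0.128.0/17 overlap, so they could not be peered"
fi
```

A check like this belongs in planning scripts or CI; AWS still performs its own validation when the subnet or peering connection is created.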

Best practices

Security notes

Interview & exam questions

  1. What makes a subnet “public”? Its route table has a 0.0.0.0/0 route pointing at an Internet Gateway. (In practice you also enable auto-assign public IP or attach Elastic IPs.) There is no “public” flag on the subnet itself — it is purely routing.

  2. How many usable IPs are in a /24 subnet, and why not 256? 251. AWS reserves the first four addresses (network, VPC router, Amazon DNS, future use) and the last (broadcast) in every subnet.

  3. NAT Gateway vs NAT instance — give three differences. NAT Gateway is managed, auto-scales to 100 Gbps, and is HA within an AZ; the NAT instance is a self-managed EC2 (single point of failure, fixed throughput, you patch it) but can act as a bastion/port-forwarder and needs source/destination check disabled.

  4. Why deploy one NAT gateway per Availability Zone? A NAT gateway is zonal; one per AZ removes the single-AZ dependency and avoids cross-AZ data-transfer charges by keeping each AZ’s egress local. If the AZ holding your only NAT gateway fails, all private-subnet egress fails.

  5. Gateway endpoint vs interface endpoint — when do you use each? Gateway endpoints serve only S3 and DynamoDB, are free, and work via a route-table entry. Interface endpoints (PrivateLink) serve most other services, place an ENI with a private IP in your subnet, are billed per hour plus per GB, and can be reached from peered VPCs (including other Regions) and from on-premises networks over VPN or Direct Connect.

  6. Can you change a VPC’s primary CIDR after creation? No. The primary IPv4 CIDR is permanent. You can only add up to four (default) secondary CIDR blocks that do not overlap existing ranges.

  7. An interface endpoint’s DNS name resolves to a public IP — what is wrong? enableDnsHostnames (and enableDnsSupport) must be on, and private DNS must be enabled on the endpoint, for the service name to resolve to the endpoint’s private IP.

  8. Is VPC peering transitive? No. If A↔B and B↔C are peered, A cannot reach C through B. You add a direct A↔C peering or move to a Transit Gateway, which is transitive.

  9. What is the local route and can you remove it? An automatic route for the entire VPC CIDR with target local that lets every subnet reach every other subnet. It cannot be deleted or modified and always wins for in-VPC traffic (longest-prefix match).

  10. You need a private instance to reach the internet outbound but never be reachable inbound — what do you build? A NAT gateway in a public subnet, with the private subnet’s 0.0.0.0/0 route pointing at it. For IPv6, an egress-only internet gateway instead.

  11. How do route tables decide between two matching routes? Longest-prefix match — the most specific route wins (e.g. /24 over /0); static routes beat propagated BGP routes of the same prefix.

  12. How do you cut S3 data-transfer costs through a NAT gateway? Add a free S3 gateway endpoint to the private subnet’s route table so S3 traffic bypasses the NAT entirely.
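
Questions 9 and 11 both rest on longest-prefix match, which is easier to remember once you have seen it run. The sketch below is a simplified model in pure bash, not how AWS implements routing: lookup_route is a hypothetical helper, the target names are placeholders, and it ignores the static-over-propagated tie-break for equal prefixes.

```bash
#!/usr/bin/env bash
# Simplified longest-prefix-match model (IPv4 only). Names are illustrative.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Usage: lookup_route DEST_IP "CIDR=target" ["CIDR=target" ...]
# Prints the target of the most specific matching route; fails if none match.
lookup_route() {
  local dest best_len=-1 best_target="" route cidr target net len mask
  dest=$(ip_to_int "$1")
  shift
  for route in "$@"; do
    cidr=${route%%=*}
    target=${route#*=}
    net=$(ip_to_int "${cidr%/*}")
    len=${cidr#*/}
    mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    # A route matches when destination and network agree on the masked bits;
    # among matches, keep the one with the longest prefix.
    if (( (dest & mask) == (net & mask) && len > best_len )); then
      best_len=$len
      best_target=$target
    fi
  done
  [[ -n $best_target ]] || return 1
  echo "$best_target"
}

# A private subnet's route table, with placeholder target names.
routes=("10.0.0.0/16=local" "10.0.0.0/8=tgw-example" "0.0.0.0/0=nat-example")

lookup_route 10.0.3.7 "${routes[@]}"   # prints: local
lookup_route 10.9.0.1 "${routes[@]}"   # prints: tgw-example
lookup_route 8.8.8.8  "${routes[@]}"   # prints: nat-example
```

The first lookup matches all three routes, and the /16 local route wins because it is the most specific; the default route only catches what nothing else claims.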

Quick check

  1. True or false: a subnet can span two Availability Zones.
  2. Which two AWS services are supported by gateway endpoints?
  3. Which VPC attribute must be on for interface-endpoint private DNS names to resolve correctly?
  4. Where must a NAT gateway be placed — a public or a private subnet — and why?
  5. What is the smallest subnet size AWS allows, and what limits it?

Answers

  1. False — a subnet lives in exactly one AZ; use multiple subnets across AZs for HA.
  2. Amazon S3 and DynamoDB (only these two).
  3. enableDnsHostnames (alongside enableDnsSupport, which must also be on).
  4. A public subnet — the NAT gateway needs a route to the Internet Gateway to reach the internet on behalf of private instances.
  5. /28 (16 addresses, 11 usable). AWS does not accept a subnet smaller than /28, and the five reserved IPs in every subnet account for the five unusable addresses.
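
The usable-address figures quoted in this lesson (251 for a /24, 11 for a /28) all follow from one formula: 2^(32 − prefix) minus the five reserved addresses. As a small sketch, with usable_ips being a hypothetical helper name and the /16 to /28 bounds being the subnet sizes AWS accepts for IPv4:

```bash
#!/usr/bin/env bash
# Usable IPv4 addresses in an AWS subnet of the given prefix length:
# 2^(32 - prefix) total, minus the 5 addresses AWS reserves in every subnet.
usable_ips() {
  local prefix=$1
  if (( prefix < 16 || prefix > 28 )); then
    echo "subnet prefix must be between /16 and /28" >&2
    return 1
  fi
  echo $(( (1 << (32 - prefix)) - 5 ))
}

for p in 24 26 28; do
  echo "/$p -> $(usable_ips "$p") usable"
done
# /24 -> 251 usable
# /26 -> 59 usable
# /28 -> 11 usable
```

Running the numbers this way during planning makes it obvious how quickly small subnets run out once load balancers, endpoints, and scaling headroom each take their share.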

Exercise

Design (on paper or in the console) a production-ready VPC for a three-tier web application in the ap-south-1 Region that must survive the loss of one Availability Zone:

Bonus: explain what you would change to add a second VPC and connect the two, and at what point you would replace peering with a Transit Gateway.

Certification mapping

| Exam | Objective area this supports |
| --- | --- |
| SAA-C03 (Solutions Architect – Associate) | Design secure and resilient architectures: VPC/subnet/AZ design, public vs private routing, NAT for egress, gateway vs interface endpoints, and peering vs Transit Gateway trade-offs. |
| ANS-C01 (Advanced Networking – Specialty) | Network design and connectivity: CIDR/IPv6 planning, route-table behaviour and priority, PrivateLink/endpoints, DHCP option sets and hybrid DNS, and flow-log-based troubleshooting. |
| DVA-C02 (Developer – Associate) | Deployment and security: placing application resources in the right subnet tier and reaching AWS services privately via endpoints. |
| SOA-C02 (SysOps – Associate) | Networking and monitoring: operating NAT gateways, route tables, and VPC Flow Logs for day-to-day troubleshooting. |

Glossary

Next steps

Continue the course with AWS Security Groups vs Network ACLs, In Depth — now that you control where traffic flows, learn the filtering layer that decides what is allowed, including the stateful-vs-stateless difference and the classic “return traffic blocked by a NACL” gotcha. Then deepen your networking with:

