Amazon Route 53, In Depth: Hosted Zones, Records, Routing Policies & Health Checks

Every connection on the internet begins with a question: what is the IP address for this name? The Domain Name System (DNS) answers it, and on AWS that answer is served by Amazon Route 53 — a highly available, globally distributed authoritative DNS service named after the port DNS runs on, port 53. Route 53 does three distinct jobs that people often conflate: it registers domain names (you can buy example.com through it), it hosts the authoritative records for a domain (it is the source of truth resolvers ask), and it routes traffic intelligently using policies that go far beyond plain DNS — weighting answers for canary releases, sending users to the lowest-latency Region, failing over to a backup when a health check goes red, and answering differently based on the user’s geography.

Route 53 sits at the very front of almost every architecture. It is the first thing a client touches, before the load balancer, before CloudFront, before any compute you pay for. Get it right and you have a resilient, fast-resolving front door that fails over in roughly the health-check detection time plus one TTL, typically a minute or two. Get it wrong (a stray CNAME at the apex, a TTL of 86,400 seconds on a record you need to fail over, a health check pointed at the wrong port) and you have an outage that DNS caches will keep serving for hours. This lesson is the exhaustive version: every record type, the Alias-versus-CNAME distinction that trips up nearly everyone, seven routing policies with explicit when-to-use guidance, the three kinds of health check, and how TTL governs how fast the world sees your changes.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should be comfortable with the AWS Console and the AWS CLI (see AWS Hands-On First Steps: Console, CLI, CloudShell, SDKs & Access Keys), and you should have met load balancers in AWS Elastic Load Balancing, In Depth: ALB, NLB, GWLB & Target Groups — Route 53 most often points at an ALB or CloudFront distribution. A working mental model of how a domain delegates to name servers helps but is not assumed; we build it here. This lesson belongs in the Networking module of the AWS Zero-to-Hero course, immediately after load balancing, because the natural progression is: distribute traffic within a Region (ELB), then distribute and fail over traffic across Regions and endpoints (Route 53). It also feeds directly into the edge and DNS-security lessons cross-linked at the end.

Core concepts

Before the settings, the mental model. DNS is a hierarchical, distributed database that maps human-readable domain names to machine-usable data, most commonly IP addresses. The hierarchy reads right-to-left: the root (.), then the top-level domain (TLD — .com, .org, .co.uk), then your registered domain (example.com), then any subdomains (api.example.com).

When a user’s machine needs to resolve api.example.com, it does not ask Route 53 directly. It asks a recursive resolver (typically run by the user’s ISP, or a public one like 8.8.8.8). That resolver walks the hierarchy: it asks a root server “who handles .com?”, asks the .com TLD servers “who is authoritative for example.com?”, and they answer with the name servers for your domain — which, if you host the zone in Route 53, are four Route 53 name servers. The resolver then asks one of those Route 53 name servers for api.example.com, gets the record, caches it for the TTL, and returns it to the user. Route 53 is the authoritative server at the end of that chain — it holds the truth. Everything else is asking and caching.

A few load-bearing terms:

With that, the settings.

Hosted zones: every setting

A hosted zone is where you start. There are two kinds, and the difference is who can see the records.

Public hosted zones

A public hosted zone is authoritative on the public internet. Any resolver in the world can query it. You create one for a domain you intend to serve publicly (example.com). On creation Route 53 assigns four name servers (an NS record) and a SOA record, both created automatically — never delete them. To make the zone live, you must delegate to those four name servers from the parent: if you registered the domain through Route 53, you point the registered domain at the zone’s name servers (Route 53 can do this automatically); if you registered elsewhere, you copy the four name servers into your registrar’s control panel.

| Setting | What it does | Choices / default | When to change / gotcha |
| --- | --- | --- | --- |
| Domain name | The apex/root the zone is authoritative for | Any DNS name; immutable after creation | Must match the registered domain. A typo means nothing resolves; delete and recreate. |
| Type | Public vs private | Public (default) | Public = internet-visible. Cannot convert a zone’s type after creation. |
| Comment | Free-text description | Optional | Use it to record owner/ticket; editable later. |
| Name servers (NS) | The four authoritative servers Route 53 assigns | Auto-generated, 4 per zone | Copy exactly these into your registrar. Two zones for the same name get different name servers; only the delegated set is live. |
| SOA | Start-of-authority metadata (primary NS, contact, serial, timers) | Auto-generated | Rarely edited; the minimum TTL field affects negative caching. Do not delete. |

A subtle and very common mistake: deleting a public hosted zone and recreating it gives you a brand-new set of four name servers. The old delegation at the registrar now points at name servers that no longer host the zone, and the domain goes dark until you update the registrar. Treat hosted-zone name servers as something to wire once and leave alone.

Private hosted zones

A private hosted zone is authoritative only inside one or more VPCs you associate with it. Queries from those VPCs (via the Route 53 Resolver at the VPC .2 address) get the private answers; the rest of the internet cannot see the zone at all. This is how you give internal services friendly names — db.internal.example.com resolving to a private IP — without exposing topology publicly.

| Setting | What it does | Choices / default | When to change / gotcha |
| --- | --- | --- | --- |
| Domain name | The zone the records cover | Any name (often a real domain or an internal-only one) | You can run a private zone for the same name as a public zone; this is split-horizon DNS (below). |
| VPCs to associate | Which VPCs see these records | One or more, any Region/account | The VPC must have enableDnsHostnames and enableDnsSupport both on, or resolution fails silently. Cross-account association needs a CLI authorisation step. |
| Region | Where the zone is created | Any; association can span Regions | Private zones are global objects but associations are per-VPC. |

Split-horizon (split-view) DNS is the headline use case: associate a private zone for example.com with your VPCs and keep a public zone for example.com on the internet. Inside the VPC, app.example.com resolves to a private ALB; outside, the same name resolves to a public CloudFront distribution. Route 53 evaluates the most specific matching private zone first for queries from an associated VPC, falling back to public resolution only if no private zone covers the name. The classic gotcha: if a private zone for example.com exists but lacks a record for legacy.example.com, queries from the VPC get NXDOMAIN rather than falling through to the public zone — the private zone is authoritative for the whole name space it covers.

Record types: every type you will meet

A record (resource record set) has a name, a type, a TTL (except Alias), and a value. The type tells resolvers what kind of data to expect. Route 53 supports the full standard set; these are the ones you will actually configure.

| Type | What it holds | Typical value | Notes & gotchas |
| --- | --- | --- | --- |
| A | IPv4 address | 203.0.113.10 | The workhorse. Can be an Alias (see below) instead of a literal IP. |
| AAAA | IPv6 address | 2001:db8::1 | The IPv6 equivalent of A; also Alias-capable. Add it whenever you serve IPv6. |
| CNAME | Canonical name (an alias to another name) | lb-123.eu-west-1.elb.amazonaws.com | Returns a name, not an IP; the resolver must look that up too. Forbidden at the zone apex and must be the only record at its name. |
| NS | Name servers for a zone or delegated subdomain | four ns-xxx.awsdns-xx.* | Created automatically for the zone apex. Add your own NS records to delegate a subdomain to a different zone/provider. |
| SOA | Start of authority: zone metadata and timers | ns-... hostmaster... serial refresh retry expire minTTL | One per zone, auto-created. The last field sets negative caching (how long NXDOMAIN is cached). |
| MX | Mail exchanger: where email for the domain goes | 10 mail.example.com | The number is priority (lower = preferred). Required for receiving email. |
| TXT | Arbitrary text | "v=spf1 include:_spf.google.com ~all" | Used for SPF, DKIM, DMARC, and domain-ownership verification. Quote each string; 255-char chunks. |
| SRV | Service location (priority + weight + port + host) | 10 60 5060 sip.example.com | For protocols that advertise host and port (SIP, LDAP, some game and chat services). |
| CAA | Which Certificate Authorities may issue certs for the domain | 0 issue "amazon.com" | A security control that stops a rogue CA issuing a cert for your domain. Add amazon.com so ACM can issue. |
| PTR | Reverse DNS (IP → name) | host.example.com | Lives in special in-addr.arpa / ip6.arpa zones; used for reverse lookups and mail-server reputation. |
| NAPTR / DS / SPF / others | Telephony rewriting, DNSSEC delegation signer, legacy SPF | varies | Less common; Route 53 supports them. SPF-the-type is deprecated; use TXT for SPF. |

Two practical rules that catch people: a CNAME must be alone at its name (you cannot have a CNAME and an A for www.example.com), and you cannot put a CNAME at the apex (example.com itself), because the apex must also carry the NS and SOA records and the DNS spec forbids a CNAME coexisting with other records. The fix for the apex is the Alias record.

Alias vs CNAME: the distinction that trips everyone

This is the single most-asked Route 53 interview question, so understand it cold.

A CNAME is standard DNS: it says “this name is really that name; go look that up.” It works for any target, AWS or not, but it returns a name, forcing the resolver to do a second lookup, and it is forbidden at the apex and must be alone at its record name.

An Alias record is a Route 53 extension (not standard DNS). It points an A or AAAA record directly at a supported AWS resource — and at resolution time Route 53 substitutes that resource’s current IP address(es) into the answer. To the resolver it looks like a normal A/AAAA answer (it gets IPs, not a name), so there is no second lookup and no charge for Alias queries to AWS resources. Crucially, an Alias works at the zone apex, which is why example.com → CloudFront is always an Alias, never a CNAME.

| Dimension | CNAME | Alias |
| --- | --- | --- |
| Standard? | Yes (RFC) | No, Route 53 only |
| Returns | A name (triggers another lookup) | IP address(es) directly |
| Works at apex? | No | Yes |
| Can coexist with other records at the name? | No (must be alone) | Yes (it is an A/AAAA) |
| Targets | Any DNS name | Specific AWS resources + same-zone records (see below) |
| Query cost | Charged as a normal query | Free when pointing at an AWS resource |
| Health/failover integration | Manual | Evaluate Target Health auto-tracks the target |
| TTL | You set it | Inherited from the target (you cannot set it) |

Alias targets you can point at: CloudFront distributions, ELB load balancers (ALB/NLB/CLB), S3 website endpoints, API Gateway, VPC interface endpoints, Elastic Beanstalk environments, Global Accelerator, AppSync, and — very usefully — another record in the same hosted zone. That last one lets you alias www.example.com to example.com and maintain the IP in one place.

The Evaluate Target Health toggle on an Alias is the quiet superpower: set it to Yes and Route 53 stops returning that Alias if the underlying resource (e.g. all targets behind an ALB) is unhealthy. You get health checking for free, without creating a separate health check, as long as you are aliasing an ELB load balancer, an Elastic Beanstalk environment that has a load balancer, or another Route 53 record that is itself health-checked. It is not available for CloudFront targets, where the setting must stay off.

When to use which: Alias for anything pointing at an AWS resource (always — it is free, faster, and apex-capable); CNAME only for pointing a subdomain at a non-AWS name (a SaaS endpoint, a partner’s host) or where you genuinely need standard-DNS behaviour.
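As a minimal sketch of the apex Alias described above, the following writes a change batch in the same style the lab uses later. The zone name, the ALB DNS name, and the ALB hosted zone ID are placeholders chosen for illustration; the real values come from your load balancer's DNSName and CanonicalHostedZoneId attributes.

```shell
# Placeholder values (assumptions): substitute your own zone and load balancer.
ZONE="example.com"
ALB_DNS="my-alb-1234567890.eu-west-1.elb.amazonaws.com"   # hypothetical ALB DNS name
ALB_ZONE_ID="ZEXAMPLEALBZONE"                             # hypothetical; use the ALB's CanonicalHostedZoneId

# An Alias record carries an AliasTarget instead of TTL + ResourceRecords.
cat > /tmp/r53-alias.json <<JSON
{ "Changes": [ {
  "Action": "UPSERT",
  "ResourceRecordSet": {
    "Name": "$ZONE",
    "Type": "A",
    "AliasTarget": {
      "HostedZoneId": "$ALB_ZONE_ID",
      "DNSName": "$ALB_DNS",
      "EvaluateTargetHealth": true
    }
  } } ] }
JSON

# Show the generated change batch for review before applying it.
cat /tmp/r53-alias.json
```

To apply it, pass the file to aws route53 change-resource-record-sets with --hosted-zone-id set to your own zone's ID and --change-batch file:///tmp/r53-alias.json. Note that HostedZoneId inside AliasTarget is the load balancer's zone ID, not your hosted zone's.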

Routing policies: all seven, with when-to-use

A routing policy is set per record and decides which value Route 53 returns when several records share the same name and type. This is where Route 53 stops being plain DNS and becomes a traffic director. This lesson covers seven; Route 53 also offers IP-based routing, which is not covered here.

1. Simple

One record, one answer (or, if you give multiple values, Route 53 returns them all in random order and the client picks). No health checking. This is ordinary DNS.

2. Weighted

Multiple records, same name/type, each with a weight (0–255). Route 53 returns each in proportion to its weight ÷ total. Weight 0 takes a record out of rotation (unless all are 0, in which case all are returned equally).
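The proportion rule is easy to sanity-check with plain shell arithmetic. This sketch makes no Route 53 call; the helper name weight_share is made up for illustration, and integer division rounds down.

```shell
# weight_share WEIGHT TOTAL -> integer percentage of answers a record receives.
# Hypothetical helper for illustration; integer division truncates.
weight_share() {
  if [ "$2" -eq 0 ]; then
    # All weights are 0: Route 53 treats the records equally, so no single share applies.
    echo "undefined"
  else
    echo $(( $1 * 100 / $2 ))
  fi
}

OLD=90   # weight of the current stack
NEW=10   # weight of the canary
TOTAL=$(( $OLD + $NEW ))

echo "old stack: $(weight_share "$OLD" "$TOTAL")% of answers"
echo "canary:    $(weight_share "$NEW" "$TOTAL")% of answers"
```

These are shares of DNS answers, not of requests: resolvers cache each answer for the TTL, so observed traffic only approximates the weights.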

3. Latency-based

Multiple records, each tagged with an AWS Region. Route 53 returns the record whose Region gives the lowest network latency to the resolver, based on AWS’s continuously-measured latency map.

4. Failover

A primary and a secondary record. Route 53 returns the primary while its health check is healthy, and switches to the secondary when the primary fails. The classic active-passive pattern.

5. Geolocation

Returns a different record based on the geographic location of the user (resolver), matched by continent, country, or — for the US — state. You can set a default record for locations that match no rule.
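A rough sketch of a geolocation pair with a default, assuming a placeholder zone and documentation IP addresses. The record name shop and the SetIdentifier values are invented for the example; the default location is expressed as CountryCode "*".

```shell
ZONE="example.com"   # placeholder zone name (assumption)

# Two geolocation records for the same name: one for Germany, one catch-all.
# CountryCode "*" marks the default record for locations that match no other rule.
cat > /tmp/r53-geo.json <<JSON
{ "Changes": [
  { "Action": "UPSERT", "ResourceRecordSet": {
      "Name": "shop.$ZONE", "Type": "A", "TTL": 60,
      "SetIdentifier": "germany",
      "GeoLocation": { "CountryCode": "DE" },
      "ResourceRecords": [ { "Value": "203.0.113.40" } ] } },
  { "Action": "UPSERT", "ResourceRecordSet": {
      "Name": "shop.$ZONE", "Type": "A", "TTL": 60,
      "SetIdentifier": "default",
      "GeoLocation": { "CountryCode": "*" },
      "ResourceRecords": [ { "Value": "198.51.100.40" } ] } }
] }
JSON

# Show the generated change batch for review before applying it.
cat /tmp/r53-geo.json
```

The file can be applied with aws route53 change-resource-record-sets and --change-batch file:///tmp/r53-geo.json against your own hosted zone ID.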

6. Geoproximity

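Routes based on the geographic distance between the user and your resources, with a bias you can dial (-99 to +99) to expand or shrink the geographic area a resource serves. It has traditionally been configured through Route 53 Traffic Flow (the visual policy editor); check the current documentation, because AWS has since made geoproximity available on ordinary records as well.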

7. Multivalue answer

Returns up to eight healthy records chosen at random from a larger set, each optionally health-checked. It is like a simple record with multiple values plus health checking, giving you crude DNS-level load distribution that automatically omits unhealthy endpoints.

| Policy | Returns based on | Health-check aware? | Signature use case |
| --- | --- | --- | --- |
| Simple | Single config | No | One resource, plain DNS |
| Weighted | Assigned weights | Yes (per record) | Canary / blue-green, A/B |
| Latency | Lowest measured latency | Yes (per record) | Active-active multi-Region for speed |
| Failover | Primary health | Yes (required) | Active-passive DR |
| Geolocation | User’s country/continent/state | Yes (per record) | Localisation, compliance, geo-block |
| Geoproximity | Distance + bias | Yes (per record) | Distance routing with tunable boundaries |
| Multivalue | Up to 8 random healthy values | Yes (per record) | Health-aware spreading without an LB |

You can nest policies with Traffic Flow — e.g. latency-based at the top to pick a Region, then weighted within each Region for a canary, then failover under each weight. That composition is how large multi-Region systems are actually expressed.

Health checks: every kind

A health check is a separate Route 53 object that monitors a target and reports healthy/unhealthy; routing policies consult it to decide whether to return a record. Route 53 health checkers run from multiple AWS locations worldwide and a target is considered up if more than 18% of checkers see it as healthy (this is why you must allow the Route 53 health-checker IP ranges through firewalls). There are three types.
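The 18% rule can be expressed as a one-line integer comparison. The sketch below is only arithmetic: the helper name is_healthy is invented, and the checker count of 16 is an assumed figure for the example, not a documented number.

```shell
# is_healthy HEALTHY_CHECKERS TOTAL_CHECKERS -> prints "healthy" or "unhealthy".
# "More than 18%" is tested as healthy*100 > total*18 to stay in integers.
is_healthy() {
  if [ $(( $1 * 100 )) -gt $(( $2 * 18 )) ]; then
    echo healthy
  else
    echo unhealthy
  fi
}

TOTAL=16            # assumed number of checkers, for illustration only
is_healthy 3 "$TOTAL"   # 3 of 16 is 18.75%, just above the threshold
is_healthy 2 "$TOTAL"   # 2 of 16 is 12.5%, below the threshold
```

The practical consequence is that a firewall which admits only a small subset of checker locations can leave the endpoint reported as unhealthy even though it is serving traffic.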

Endpoint health checks

Monitor an IP or domain name on a chosen protocol. The settings:

| Setting | What it does | Choices / default | When to change / gotcha |
| --- | --- | --- | --- |
| Protocol | How to probe | HTTP, HTTPS, TCP | HTTP(S) lets you check a path and status; TCP only checks the port opens. HTTPS checks do not validate the certificate, so an expired or mismatched certificate will not fail the check. |
| Endpoint | What to probe | IP address or domain name + port | If you use a domain name, Route 53 resolves it each check. Use an IP to pin it. |
| Path (HTTP/S) | Which URL to request | e.g. /health; default / | Point at a deep health endpoint that checks dependencies, not a static page that is “up” while the app is broken. |
| Request interval | Probe frequency | Standard 30 s or Fast 10 s | Fast detects failure sooner but costs more and is noisier. |
| Failure threshold | Consecutive fails before “unhealthy” | 1–10, default 3 | Lower = faster failover, more false positives on a blip. 3 × 30 s ≈ 90 s to flip. |
| String matching | Require a string in the first 5,120 bytes of the response body | Off / on with search string | Catches “200 OK but wrong content”, e.g. require "OK" in the body. |
| Latency graphs | Record response time in CloudWatch | Off (default) / on | Turn on to alarm on slow-but-up endpoints. |
| Invert health status | Treat healthy as unhealthy and vice-versa | Off (default) | Niche, e.g. fail over to a site only when a maintenance flag returns 200. |
| Health checker regions | Which checker locations probe | Default set / custom | Reduce to fewer regions to cut noise, but keep enough for the 18% rule. |
| SNI (HTTPS) | Send the hostname in the TLS handshake | On by default | Required for endpoints that serve multiple certs on one IP. |

Calculated health checks

A health check whose status is derived from other health checks using a Boolean rule — “healthy if at least N of these child checks are healthy”. It probes nothing itself.
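One way to express an “at least 2 of 3” rule is a health-check config file like the sketch below. The three child IDs are made-up placeholders; real ones are the IDs returned when you create the child health checks.

```shell
# Hypothetical child health-check IDs (assumptions): replace with real IDs.
CHILD_A="11111111-aaaa-4aaa-8aaa-111111111111"
CHILD_B="22222222-bbbb-4bbb-8bbb-222222222222"
CHILD_C="33333333-cccc-4ccc-8ccc-333333333333"

# HealthThreshold is N in "healthy if at least N of these children are healthy".
cat > /tmp/r53-calculated-hc.json <<JSON
{
  "Type": "CALCULATED",
  "HealthThreshold": 2,
  "ChildHealthChecks": [ "$CHILD_A", "$CHILD_B", "$CHILD_C" ]
}
JSON

# Show the generated config for review.
cat /tmp/r53-calculated-hc.json
```

The file can then be passed to aws route53 create-health-check with a unique --caller-reference and --health-check-config file:///tmp/r53-calculated-hc.json. Setting the threshold equal to the number of children gives AND semantics; setting it to 1 gives OR.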

CloudWatch-alarm health checks

A health check that mirrors the state of a CloudWatch alarm. The check is unhealthy when the alarm is in ALARM. This lets you health-check anything CloudWatch can measure — DynamoDB throttles, SQS queue depth, ELB 5xx rate, a custom metric — not just an HTTP endpoint.
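A possible config for this kind of check is sketched below. The alarm name and Region are placeholders for an alarm you have already created in CloudWatch, and the choice of LastKnownStatus for missing data is one option among several (Healthy and Unhealthy are the others).

```shell
# Placeholders (assumptions): an existing CloudWatch alarm and its Region.
ALARM_REGION="eu-west-1"
ALARM_NAME="orders-queue-depth-high"   # hypothetical alarm name

# InsufficientDataHealthStatus decides the status when the alarm has no data.
cat > /tmp/r53-cw-hc.json <<JSON
{
  "Type": "CLOUDWATCH_METRIC",
  "AlarmIdentifier": { "Region": "$ALARM_REGION", "Name": "$ALARM_NAME" },
  "InsufficientDataHealthStatus": "LastKnownStatus"
}
JSON

# Show the generated config for review.
cat /tmp/r53-cw-hc.json
```

As with other health checks, the file is passed to aws route53 create-health-check via --health-check-config file:///tmp/r53-cw-hc.json together with a unique --caller-reference.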

Health checks integrate with routing in two ways: associate a health check with a record (failover, weighted, latency, geolocation, multivalue all honour it and stop returning unhealthy records), or use Evaluate Target Health on an Alias to inherit the target’s health automatically. A frequent design is a failover record pair where the primary is an Alias to an ALB with Evaluate Target Health = Yes — no manual health check object needed.

TTL: the propagation lever

TTL (time to live), in seconds, tells every resolver how long it may cache an answer before re-asking Route 53. It is the single biggest control over how fast a DNS change reaches users — and a constant trade-off.

| Record purpose | Sensible TTL | Reasoning |
| --- | --- | --- |
| Stable apex/www pointing at CloudFront (Alias) | n/a (Alias TTL is managed) | Route 53 handles it; you can’t set it. |
| Records you may fail over | 60 s | Fast failover; the small extra query cost is worth the recovery time. |
| Stable MX, TXT (SPF/DKIM), NS | 3,600–86,400 s | Rarely change; cache hard. |
| A record you’re about to migrate | Lower it to 60 s a day before | So the cut-over propagates quickly, then raise it again. |

Two things people miss: Alias records to AWS resources have a TTL managed by Route 53 (you cannot set it), and negative answers (NXDOMAIN) are cached according to the minimum TTL in the SOA record, so a typo that returns NXDOMAIN can stick in caches even after you fix it.
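To see how TTL and health-check settings combine, here is a back-of-the-envelope calculation. It is a simplification under stated assumptions: it ignores resolvers that disregard TTLs and any caching inside the client or application.

```shell
INTERVAL=30    # health-check request interval in seconds (Standard)
THRESHOLD=3    # consecutive failures before the check turns unhealthy
TTL=60         # TTL on the failover record, in seconds

# Time for the health check to flip to unhealthy.
DETECT=$(( $INTERVAL * $THRESHOLD ))
# Add one full TTL for a resolver that cached the old answer just before the flip.
WORST=$(( $DETECT + $TTL ))

echo "detection: ${DETECT}s, rough upper bound until clients move: ${WORST}s"
```

Swapping TTL=60 for TTL=86400 in the same sum shows why a day-long TTL on a failover record defeats the health check: detection still takes 90 s, but cached answers can persist for a day.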

Figure: Amazon Route 53: records, routing policies, health checks. The diagram traces a single query from a client through the recursive resolver to a Route 53 hosted zone, then shows the same name resolving differently under each routing policy and how health checks gate which records are returned.

Hands-on lab

We will create a hosted zone, add records, build a failover pair backed by a health check, and clean up. This uses Route 53 features that incur small charges (see the cost note); there is no perpetual free tier for hosted zones, but the cost of doing this for an hour is a few cents. You do not need to own a domain — we will create a zone and inspect it; you would only delegate a real domain at the registrar step.

1. Set a zone name and create a public hosted zone.

ZONE=kloudvin-lab-$RANDOM.example
aws route53 create-hosted-zone \
  --name "$ZONE" \
  --caller-reference "lab-$(date +%s)" \
  --hosted-zone-config Comment="Route53 deep-dive lab"

Expected output includes a HostedZone.Id like /hostedzone/Z0123456789ABCDEFG and a DelegationSet.NameServers list of four ns-*.awsdns-* servers. Capture the ID:

ZID=$(aws route53 list-hosted-zones-by-name --dns-name "$ZONE" \
  --query 'HostedZones[0].Id' --output text | sed 's#/hostedzone/##')
echo "$ZID"

2. View the auto-created NS and SOA records.

aws route53 list-resource-record-sets --hosted-zone-id "$ZID" \
  --query "ResourceRecordSets[?Type=='NS' || Type=='SOA'].[Name,Type]" --output table

You should see the apex NS (four name servers) and the SOA — both created for you.

3. Add a simple A record.

cat > /tmp/r53-simple.json <<JSON
{ "Changes": [ {
  "Action": "UPSERT",
  "ResourceRecordSet": {
    "Name": "www.$ZONE",
    "Type": "A",
    "TTL": 60,
    "ResourceRecords": [ { "Value": "203.0.113.10" } ]
  } } ] }
JSON
aws route53 change-resource-record-sets --hosted-zone-id "$ZID" \
  --change-batch file:///tmp/r53-simple.json

The response shows a ChangeInfo.Status of PENDING. Route 53 changes are atomic and usually INSYNC within seconds.

4. Create a health check (endpoint, HTTPS to a known-good host) and a failover pair.

HCID=$(aws route53 create-health-check \
  --caller-reference "hc-$(date +%s)" \
  --health-check-config 'Type=HTTPS,FullyQualifiedDomainName=aws.amazon.com,Port=443,RequestInterval=30,FailureThreshold=3,ResourcePath=/' \
  --query 'HealthCheck.Id' --output text)
echo "Health check: $HCID"

cat > /tmp/r53-failover.json <<JSON
{ "Changes": [
  { "Action": "UPSERT", "ResourceRecordSet": {
      "Name": "app.$ZONE", "Type": "A", "TTL": 60,
      "SetIdentifier": "primary",
      "Failover": "PRIMARY",
      "HealthCheckId": "$HCID",
      "ResourceRecords": [ { "Value": "203.0.113.20" } ] } },
  { "Action": "UPSERT", "ResourceRecordSet": {
      "Name": "app.$ZONE", "Type": "A", "TTL": 60,
      "SetIdentifier": "secondary",
      "Failover": "SECONDARY",
      "ResourceRecords": [ { "Value": "198.51.100.30" } ] } }
] }
JSON
aws route53 change-resource-record-sets --hosted-zone-id "$ZID" \
  --change-batch file:///tmp/r53-failover.json

5. Validate. Confirm both failover records exist and check the health-check status:

aws route53 list-resource-record-sets --hosted-zone-id "$ZID" \
  --query "ResourceRecordSets[?Name=='app.$ZONE.'].[SetIdentifier,Failover,HealthCheckId]" \
  --output table

aws route53 get-health-check-status --health-check-id "$HCID" \
  --query 'HealthCheckObservations[].StatusReport.Status' --output table

You should see a primary/PRIMARY record bound to your health-check ID, a secondary/SECONDARY record, and several checker locations reporting Success: HTTP Status Code 200, OK. Because the records use placeholder documentation IPs, do not expect a real dig against them to reach a server — the point is the Route 53 configuration and the health-check signal.

6. Cleanup. Delete the records, the health check, and the zone (a zone with non-default records cannot be deleted):

# delete the failover records (Action must be DELETE with exact current values)
sed 's/"UPSERT"/"DELETE"/g' /tmp/r53-failover.json > /tmp/r53-failover-del.json
aws route53 change-resource-record-sets --hosted-zone-id "$ZID" \
  --change-batch file:///tmp/r53-failover-del.json

sed 's/"UPSERT"/"DELETE"/g' /tmp/r53-simple.json > /tmp/r53-simple-del.json
aws route53 change-resource-record-sets --hosted-zone-id "$ZID" \
  --change-batch file:///tmp/r53-simple-del.json

aws route53 delete-health-check --health-check-id "$HCID"
aws route53 delete-hosted-zone --id "$ZID"

Cost note. A hosted zone costs USD 0.50 per month, and the charge is not pro-rated for partial months; however, a zone deleted within 12 hours of creation is not charged, so run the cleanup promptly. Standard queries are about USD 0.40 per million; Alias queries to AWS resources are free. Health checks cost about USD 0.50 per check per month for AWS endpoints and about USD 0.75 for non-AWS endpoints, with optional features (HTTPS, string matching, fast interval, latency measurement) adding small increments. AWS has offered a free allowance for AWS-endpoint checks, so confirm current figures on the pricing page. This lab, deleted promptly, costs well under a dollar. Always run the cleanup: an orphaned hosted zone quietly bills USD 0.50 every month.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Domain doesn’t resolve at all | Registrar still points at old/auto name servers, not the zone’s four | Copy the zone’s exact NS values into the registrar’s name-server settings; allow propagation. |
| “CNAME at apex not allowed” error | Tried to CNAME example.com to an AWS resource | Use an Alias A/AAAA record at the apex instead. |
| Failover never switches | Primary record has no health check (or Alias Evaluate Target Health is off) | Attach a health check to the primary, or enable Evaluate Target Health on the Alias. |
| Change made but users see old value for ages | TTL was high (e.g. 86,400) | Lower TTL before a planned change; for emergencies you can only wait out the cached TTL. |
| Geolocation users get no answer | No default record for unmatched locations | Add a geolocation record with location Default. |
| Private zone returns NXDOMAIN for a name that resolves publicly | Private zone is authoritative for that name space and lacks the record | Add the record to the private zone, or scope the private zone to a narrower name. |
| Health check flaps / always unhealthy | Firewall blocks Route 53 health-checker IPs, or path returns non-2xx/3xx | Allow the route53-healthchecks IP ranges; point the check at a real 200-returning health path. |
| Records in VPC don’t resolve from instances | VPC enableDnsSupport/enableDnsHostnames off, or zone not associated | Enable both VPC attributes and associate the private zone with the VPC. |

Best practices

Security notes

DNS is a security surface, not just plumbing. Enable DNSSEC signing on public zones to let resolvers cryptographically verify that answers are authentic and unmodified — this defends against DNS spoofing and cache poisoning (Route 53 supports DNSSEC signing with KMS-backed keys; you also add a DS record at the parent). Add CAA records to constrain certificate issuance to authorities you trust. Guard against dangling DNS / subdomain takeover: if a record points at a de-provisioned resource (a deleted S3 bucket or released Elastic IP), an attacker can claim that resource and serve content under your name — audit and remove records whose targets no longer exist. Turn on query logging to spot anomalous lookups (data-exfiltration tunnels, malware C2 patterns). Apply least-privilege IAM: scope route53:ChangeResourceRecordSets to specific hosted-zone ARNs so a compromised credential cannot rewrite every zone you own. Finally, for protecting outbound DNS from your VPCs (filtering what your workloads are allowed to resolve), use Route 53 Resolver DNS Firewall — covered in the resolver lesson linked below.
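As an illustration of the least-privilege point, the sketch below writes an IAM policy document that allows record changes in a single hosted zone only. The zone ID is a placeholder, and including route53:ListResourceRecordSets alongside the change action is a choice made for this example rather than a requirement.

```shell
ZID="Z0123456789ABCDEFG"   # placeholder hosted zone ID (assumption)

# Hosted zone ARNs have no Region or account field: arn:aws:route53:::hostedzone/<id>
cat > /tmp/r53-least-priv.json <<JSON
{
  "Version": "2012-10-17",
  "Statement": [ {
    "Effect": "Allow",
    "Action": [
      "route53:ChangeResourceRecordSets",
      "route53:ListResourceRecordSets"
    ],
    "Resource": "arn:aws:route53:::hostedzone/$ZID"
  } ]
}
JSON

# Show the generated policy for review before attaching it.
cat /tmp/r53-least-priv.json
```

A principal holding only this policy can edit records in that one zone; a leaked credential then cannot touch your other zones.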

Interview & exam questions

  1. What is the difference between an Alias record and a CNAME, and when must you use an Alias? A CNAME is standard DNS that returns another name (forcing a second lookup), is charged as a query, must be alone at its name, and cannot sit at the zone apex. An Alias is a Route 53 extension on an A/AAAA record that returns the target AWS resource’s IPs directly, is free for AWS targets, can coexist as a normal record, and works at the apex. You must use an Alias to point the apex (example.com) at CloudFront, an ELB, S3 website, etc.

  2. You need to roll a new version of a service to 10% of users, then ramp up. Which routing policy? Weighted routing — give the new stack weight 10 and the old weight 90, then shift the weights. Keep TTL low so the proportions track reality and users re-resolve quickly.

  3. A multi-Region app should serve every user from the fastest Region. Which policy, and what’s its limitation? Latency-based routing. Limitation: it optimises for measured network latency from the resolver, not the user, and ignores geography/compliance — a user near a border can be sent across it. For data-residency, use geolocation.

  4. Failover routing isn’t switching to the secondary even though the primary is down. Why? The primary record almost certainly has no associated health check (and, if it’s an Alias, Evaluate Target Health is off). Without a health signal Route 53 keeps returning the primary. Attach a health check or enable Evaluate Target Health.

  5. What are the three types of health check? Endpoint (probe an IP/domain over HTTP/HTTPS/TCP), calculated (Boolean combination of other health checks — “N of M healthy”), and CloudWatch alarm (mirror the state of any CloudWatch alarm, so you can fail over on metrics like error rate or queue depth).

  6. Explain split-horizon DNS in Route 53 and one gotcha. Run a private hosted zone and a public hosted zone for the same domain; queries from associated VPCs hit the private zone, the internet hits the public one. Gotcha: the private zone is authoritative for its whole name space, so a name it lacks returns NXDOMAIN to the VPC rather than falling through to public resolution.

  7. What does TTL control, and what’s a safe value for a record you might need to fail over? TTL is how long resolvers cache an answer before re-querying. For failover-eligible records use a low TTL (~60 s) so a failover propagates quickly; the extra query cost is negligible versus recovery time.

  8. Why can’t you put a CNAME at example.com? The apex must carry the zone’s NS and SOA records, and DNS forbids a CNAME from coexisting with any other record at the same name. Route 53’s Alias record solves this because it is technically an A/AAAA record.

  9. Geolocation vs geoproximity: what’s the difference? Geolocation routes by the user’s named location (continent/country/US state) with a default fallback. Geoproximity routes by distance between user and resource and lets you apply a bias to expand or shrink each resource’s service area; it has traditionally been configured through Traffic Flow.

  10. What is multivalue answer routing and when would you use it over a load balancer? It returns up to eight random, optionally health-checked values, omitting unhealthy ones — health-aware DNS spreading. Use it for a small set of independent endpoints when you want availability without an LB; it is not a true load balancer (no connection awareness, no even distribution).

  11. How does Route 53 decide an endpoint is healthy, and why does it matter for firewalls? Health checkers in multiple global locations probe the endpoint; it’s healthy if more than 18% report success. You must therefore allow the Route 53 health-checker IP ranges through security groups/NACLs/firewalls, or checks fail and traffic drains erroneously.

  12. What is subdomain takeover and how do you prevent it? A “dangling” DNS record points at a de-provisioned resource (deleted bucket, released EIP); an attacker re-creates/claims that resource and serves content under your name. Prevent it by removing records whose targets no longer exist and auditing zones regularly.

Quick check

  1. Which record type points a name at an IPv6 address?
  2. True/false: you can place a CNAME at the zone apex if it is the only record there.
  3. Which routing policy is the right choice for active-passive disaster recovery?
  4. What happens to a geolocation query that matches no rule and has no default record?
  5. Are Alias queries to an AWS resource charged?

Answers

  1. AAAA.
  2. False — a CNAME is never allowed at the apex (the apex must hold NS/SOA); use an Alias.
  3. Failover routing (primary + secondary, with a health check on the primary).
  4. It returns no answer (NODATA) — always configure a Default record for geolocation.
  5. No — Alias queries that resolve to AWS resources are free; standard queries are charged per million.

Exercise

Design the DNS for a two-Region web application (eu-west-1 and us-east-1) fronted by an ALB in each Region, that must (a) serve every user from the lower-latency Region, (b) fail a Region out automatically when its ALB has no healthy targets, and (c) serve EU users only from eu-west-1 for data-residency. Sketch the records: which routing policies, how they nest, what each record’s Alias target and health configuration are, and what TTLs you’d set. Then write the aws route53 change-resource-record-sets change-batch JSON for the apex records. (Hint: geolocation at the top to honour residency, latency for the rest, Alias-to-ALB with Evaluate Target Health for the per-Region failover, 60 s TTLs.)

Certification mapping

Glossary

Next steps
