
Amazon CloudFront, In Depth: Distributions, Origins, Caching, OAC & Edge Functions

Amazon CloudFront is AWS’s content delivery network (CDN) — a globally distributed fleet of caches that sits between your users and your origin (an S3 bucket, a load balancer, an EC2 instance, an API, or any HTTP server) and serves content from whichever location is nearest to the requester. The point is twofold: latency (a user in Mumbai is answered from an edge a few milliseconds away instead of crossing an ocean to your origin) and offload (the cache absorbs the repeat traffic so your origin handles a fraction of the requests and a fraction of the bytes). Along the way CloudFront also becomes the natural place to terminate TLS, enforce a Web Application Firewall, run lightweight code at the edge, restrict access by geography, and lock the origin down so it can only be reached through CloudFront.

The reason CloudFront earns a full deep-dive — and turns up on both the SAA and DVA exams — is that “put a CDN in front of it” hides a surprising amount of machinery. A single distribution can have many origins, route different URL paths to different origins via cache behaviours, decide exactly what goes into the cache key with managed and custom policies, fail over between origins with origin groups, serve a custom domain over HTTPS with an ACM certificate that must live in us-east-1, and run your own code on every request with CloudFront Functions or Lambda@Edge. Get the cache key wrong and your hit ratio collapses; get OAC wrong and your “private” bucket is world-readable; put the certificate in the wrong Region and the console simply will not show it to you.

This lesson is the exhaustive tour. We start with the mental model and the edge network, walk every origin type (including S3 with Origin Access Control, custom origins, and origin groups with failover), take cache behaviours and all three policy types in full, then cover invalidations, TLS/ACM, the two edge-compute options with a clear when-to-use-which, signed URLs and signed cookies, and the security layer of WAF, geo-restriction and Shield — each with the what · choices · default · when · trade-off · gotcha treatment and real aws cloudfront commands throughout. For the resilience and origin-protection architecture that ties failover, Origin Shield and Route 53 together into a defended global front door, this lesson pairs with Global Edge Architecture with CloudFront and Route 53.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You need an AWS account, a basic grasp of DNS (CloudFront gives you a *.cloudfront.net hostname you point your own domain at) and of S3 (the most common origin). Because CloudFront is the front door for security controls, a working understanding of TLS/HTTPS and of how a Web Application Firewall inspects requests will help. If S3 is hazy, read Amazon S3, In Depth first; for the request-routing layer above CloudFront, Amazon Route 53, In Depth covers DNS. This is the content-delivery lesson in the Storage module of the AWS Zero-to-Hero course — it builds on S3 and feeds the edge-architecture, WAF and global-resilience lessons that follow.

Core concepts: the edge network, distributions, origins and behaviours

CloudFront has a small vocabulary, and getting it straight makes everything else fall into place.

A distribution is the top-level CloudFront resource. It has a unique ID (E1ABCDEF2GHIJ), an AWS-assigned domain name (d111111abcdef8.cloudfront.net), and it bundles together your origins, your cache behaviours and all your settings. When you “create a CloudFront”, you create a distribution. A distribution is a global resource — it is not tied to a Region, and you manage it from the us-east-1 control plane regardless of where your users or origins are.

An edge location (a “point of presence”, PoP) is a datacentre where CloudFront caches content close to users. There are 600+ of them worldwide. When a user requests https://d111111abcdef8.cloudfront.net/cat.png, DNS (Anycast) routes them to the nearest healthy edge location, which answers from cache if it can.

A regional edge cache (REC) is a larger, second-tier cache that sits between the edge locations and your origin. A cache miss at an edge does not go straight to the origin — it goes to the regional edge cache first, which has a bigger footprint and a longer retention, so popular-but-not-hot content is often served from the REC without troubling the origin. This tiering is automatic and free; the optional Origin Shield (covered later) adds a third, designated tier in a Region you choose for maximum offload.

An origin is where CloudFront fetches content on a cache miss — an S3 bucket, an Application/Network Load Balancer, an EC2 instance, an API Gateway endpoint, a MediaPackage channel, or any HTTP(S) server anywhere on the internet (it does not have to be in AWS).

A cache behaviour is a rule that says “for requests matching this URL path pattern, use this origin with these caching, policy and security settings”. Every distribution has a default cache behaviour (path pattern *, the catch-all) and may have additional ordered behaviours. CloudFront checks them in the order they are listed and uses the first match, so list the most specific patterns first; the default is always evaluated last.

The cache key is the set of values CloudFront uses to decide whether two requests are “the same object” — by default just the host and URL path, but optionally including specific query strings, headers and cookies. The cache key is the single most important performance lever: include too much (e.g. all headers) and every request looks unique, so nothing is cacheable; include too little and you serve the wrong variant.
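To make the trade-off concrete, here is a minimal Python sketch of how a cache key might be derived. It is a simplified model rather than CloudFront's implementation: the cache_key function and its key_headers / key_query parameters are hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

def cache_key(url, headers, key_headers=(), key_query=()):
    """Build a simplified cache key: host + path, plus any allowlisted values."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {k.lower(): v for k, v in headers.items()}
    return (
        parts.netloc,
        parts.path,
        tuple((q, query.get(q)) for q in sorted(key_query)),
        tuple((h.lower(), lowered.get(h.lower())) for h in sorted(key_headers)),
    )

firefox = {"User-Agent": "Firefox"}
chrome = {"User-Agent": "Chrome"}
url = "https://d111111abcdef8.cloudfront.net/cat.png?utm=x"

# Host + path only: both browsers share one cached object.
shared = cache_key(url, firefox) == cache_key(url, chrome)

# User-Agent in the key: each browser now gets its own cache entry.
fragmented = (cache_key(url, firefox, key_headers=["User-Agent"])
              != cache_key(url, chrome, key_headers=["User-Agent"]))
```

Every extra value in the key multiplies the number of distinct entries the cache has to fill, which is why a stray header can collapse the hit ratio.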

Here is the request flow end to end:

  1. The client resolves your domain to a CloudFront edge via DNS/Anycast and opens a TLS connection to the nearest edge location.
  2. CloudFront matches the request path to a cache behaviour.
  3. If the object is in the edge cache and fresh (within its TTL) → it is returned immediately (a cache hit). Viewer-request edge functions run before this.
  4. On a miss, the request goes to the regional edge cache; if that misses too, CloudFront makes an origin request (an origin-request edge function may run here) to the origin defined by the behaviour.
  5. The origin responds; CloudFront caches it according to the cache policy and the origin’s Cache-Control/Expires headers, runs any origin-response function, then returns it (a viewer-response function runs last) and stores it in the REC and edge.
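The tiered lookup in the steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions: caches are plain dictionaries, TTLs and edge functions are ignored, and the fetch function is a hypothetical name used only for illustration.

```python
def fetch(path, edge, rec, origin):
    """Return (body, served_from), walking edge -> regional edge cache -> origin."""
    if path in edge:                      # cache hit at the edge location
        return edge[path], "edge"
    if path in rec:                       # edge miss, regional edge cache hit
        edge[path] = rec[path]
        return edge[path], "regional-edge-cache"
    body = origin[path]                   # both tiers missed: origin request
    rec[path] = body                      # populate both tiers on the way back
    edge[path] = body
    return body, "origin"

origin = {"/cat.png": b"PNG..."}
rec = {}
mumbai_edge, chennai_edge = {}, {}

first = fetch("/cat.png", mumbai_edge, rec, origin)[1]     # fills REC + Mumbai edge
second = fetch("/cat.png", mumbai_edge, rec, origin)[1]    # served from the edge
neighbour = fetch("/cat.png", chennai_edge, rec, origin)[1]  # another edge, same REC
```

The third call shows why the regional tier matters: a different edge location that has never seen the object can still be answered without touching the origin.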

Two clarifications that trip people up. CloudFront caches GET/HEAD by default (and optionally OPTIONS); POST/PUT/PATCH/DELETE are forwarded but never cached. And CloudFront is a pull CDN — you never upload content to it; it pulls from your origin on demand and caches the result.

Creating a distribution: every setting

Creating a distribution in the console walks you through several groups of settings. We will go through every one with the what / choices / default / when / trade-off / gotcha treatment, then show the aws and infrastructure-as-code equivalents.

Origin settings

These define where CloudFront fetches content. A distribution can have many origins.

Setting | What it is / choices | Default | Notes & gotcha
Origin domain | The hostname CloudFront fetches from; pick from a dropdown of your S3 buckets, ALBs, etc., or type any public DNS name | - | For S3 use the REST endpoint (bucket.s3.region.amazonaws.com), not the website endpoint, when you want OAC
Origin path | An optional path prefix appended to every origin request (e.g. /prod) | empty | Lets one bucket serve multiple distributions from sub-folders
Name | A friendly identifier for the origin within the distribution | auto | Referenced by cache behaviours
Origin access | For S3: Origin access control (OAC, recommended), legacy Origin access identity (OAI), or Public | Public | OAC is the modern, fully-featured option; see the OAC section
Custom headers | Header name/value pairs CloudFront adds to every origin request | none | The standard way to prove “this request came from my CloudFront” to an ALB/custom origin
Enable Origin Shield | Adds a designated caching tier in a chosen Region | Off | Improves offload and origin availability; small per-request cost
Connection attempts / timeout | 1–3 attempts; 1–10 s connection timeout | 3 / 10 s | Lower for fast failover in an origin group
Response timeout / keep-alive (custom origins) | How long to wait for the origin (1–60 s, raise via quota) and keep-alive idle (1–60 s) | 30 s / 5 s | Raise response timeout for slow APIs; a too-low value yields 504s
Minimum origin SSL protocol (custom origins) | TLSv1 / 1.1 / 1.2 between CloudFront and the origin | TLSv1.2 | Keep at 1.2; older values are an audit finding
Protocol (custom origins) | HTTP only / HTTPS only / Match viewer | Match viewer | “Match viewer” forwards on the same protocol the client used

Default cache behaviour settings

The default behaviour (*) is created with the distribution. You can add more behaviours later (next section). Key fields:

Setting | What it is / choices | Default | When to change
Viewer protocol policy | HTTP and HTTPS / Redirect HTTP to HTTPS / HTTPS only | HTTP and HTTPS | Almost always Redirect HTTP to HTTPS for websites; HTTPS only for APIs
Allowed HTTP methods | GET,HEAD / GET,HEAD,OPTIONS / all 7 methods | GET,HEAD | Pick all-methods for APIs/dynamic origins; only GET/HEAD (and optionally OPTIONS) responses are ever cached
Restrict viewer access | Off, or require signed URLs/cookies | Off | Turn on for paid/private content (see signed URLs)
Cache policy | A managed or custom cache policy (defines the cache key + TTLs) | CachingOptimized (recommended) | The heart of caching; see the policies section
Origin request policy | Optional: what gets forwarded to the origin | none | Forward more to the origin than you cache on
Response headers policy | Optional: headers CloudFront adds to the viewer response (CORS, security headers) | none | Add HSTS/CSP/CORS without touching the origin
Compress objects automatically | Gzip/Brotli compression at the edge | Yes | Leave on; needs the cache policy to enable the encoding fields
Function associations | Attach CloudFront Functions / Lambda@Edge to the 4 events | none | Edge compute; see that section
Field-level encryption | Encrypt specific POST fields with a public key | none | For sensitive form fields (card numbers) that even your app servers shouldn’t read

Legacy cache settings vs policies. Older distributions used inline “forward query strings / headers / cookies” toggles. Those still exist as legacy cache settings, but AWS now steers you to cache policies + origin request policies. Use policies on anything new — they are reusable, support more options, and a single cache policy keeps the cache key consistent across behaviours.

Distribution (general) settings

These apply to the whole distribution rather than a single behaviour.

Setting | What it is / choices | Default | Notes
Price class | All edge locations / Most (excl. most expensive) / Only NA + Europe | All | Fewer locations = lower cost but higher latency for excluded regions
Alternate domain names (CNAMEs) | Your own domains (cdn.example.com) | none | Each must be covered by the attached certificate; unique across all CloudFront accounts
Custom SSL certificate | An ACM cert (must be in us-east-1) or imported IAM cert | Default CloudFront cert | Required to serve a custom domain over HTTPS; see TLS section
Security policy (min TLS) | e.g. TLSv1.2_2021 | TLSv1.2_2021 (with SNI) | The minimum TLS version/cipher suite for viewers
Supported HTTP versions | HTTP/2, HTTP/3 (QUIC) | HTTP/2 + HTTP/3 | Leave both on for performance
Default root object | Object returned for / (e.g. index.html) | none | Set it, or / returns an error/listing
Standard logging | Access logs to S3, CloudWatch Logs, or Kinesis Data Firehose (v2) | Off | See logging section
IPv6 | Enable IPv6 for viewers | On | Keep on
WAF web ACL | Attach an AWS WAF (CLOUDFRONT scope) web ACL | none | See WAF section
Default TTL / Min / Max TTL | Fallback TTLs when set inline (legacy); otherwise the cache policy owns TTLs | 24 h / 0 / 1 yr | With a cache policy, set TTLs there

CLI and infrastructure-as-code

The console builds a big DistributionConfig JSON; the CLI takes it directly. A minimal create with an S3 origin and a managed cache policy:

# Managed policy IDs are well-known constants:
#   CachingOptimized          658327ea-f89d-4fab-a63d-7e88639e58f6
#   CachingDisabled           4135ea2d-6df8-44a3-9df3-4b5a84be39ad
#   AllViewerExceptHostHeader b689b0a8-53d0-40ab-baf2-68738e2966ac (origin request)
aws cloudfront create-distribution \
  --distribution-config file://dist-config.json

dist-config.json (abbreviated) references the origin, the default behaviour and a managed cache policy:

{
  "CallerReference": "site-2026-06-15",
  "Comment": "kloudvin static site",
  "Enabled": true,
  "DefaultRootObject": "index.html",
  "Origins": { "Quantity": 1, "Items": [{
    "Id": "s3-site",
    "DomainName": "my-site.s3.ap-south-1.amazonaws.com",
    "OriginAccessControlId": "E2ABCDEFGHIJK",
    "S3OriginConfig": { "OriginAccessIdentity": "" }
  }]},
  "DefaultCacheBehavior": {
    "TargetOriginId": "s3-site",
    "ViewerProtocolPolicy": "redirect-to-https",
    "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
    "Compress": true
  }
}

In CloudFormation/CDK/Terraform, the same shape appears as AWS::CloudFront::Distribution, the L2 cloudfront.Distribution construct, or the aws_cloudfront_distribution resource respectively — all of which let you reference an aws_cloudfront_origin_access_control and managed/custom policies by ID. CDK’s L2 construct is the most ergonomic for beginners because it wires OAC and the bucket policy for you.
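One detail of the JSON shape that is easy to get wrong by hand is that list fields carry an explicit Quantity next to Items, and the two must agree. A small Python sketch that generates the abbreviated config shown earlier; build_dist_config and quantity_list are hypothetical helper names, and the origin ID, bucket domain and OAC ID are the placeholder values used in this lesson:

```python
import json

# Managed CachingOptimized policy ID, as listed earlier in this lesson.
CACHING_OPTIMIZED = "658327ea-f89d-4fab-a63d-7e88639e58f6"

def quantity_list(items):
    """Wrap a list in the Quantity/Items shape so the count always matches."""
    items = list(items)
    return {"Quantity": len(items), "Items": items}

def build_dist_config(caller_reference, bucket_domain, oac_id, comment=""):
    """Return an abbreviated DistributionConfig for one S3 origin behind OAC."""
    origin = {
        "Id": "s3-site",
        "DomainName": bucket_domain,
        "OriginAccessControlId": oac_id,
        "S3OriginConfig": {"OriginAccessIdentity": ""},
    }
    return {
        "CallerReference": caller_reference,
        "Comment": comment,
        "Enabled": True,
        "DefaultRootObject": "index.html",
        "Origins": quantity_list([origin]),
        "DefaultCacheBehavior": {
            "TargetOriginId": origin["Id"],
            "ViewerProtocolPolicy": "redirect-to-https",
            "CachePolicyId": CACHING_OPTIMIZED,
            "Compress": True,
        },
    }

config = build_dist_config(
    "site-2026-06-15", "my-site.s3.ap-south-1.amazonaws.com", "E2ABCDEFGHIJK",
    comment="kloudvin static site",
)
print(json.dumps(config, indent=2))  # write this to dist-config.json
```

Generating the file keeps TargetOriginId and the origin Id in sync, which is the other common hand-editing mistake.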

Origins in depth: S3 + OAC, custom origins, and origin groups

S3 origin with Origin Access Control (OAC)

The classic CloudFront pattern is static site/assets in S3, private bucket, served only through CloudFront. The mechanism that makes the bucket private yet reachable is Origin Access Control (OAC) — the modern replacement for the legacy Origin Access Identity (OAI).

The idea: CloudFront signs every request it makes to S3 with SigV4, S3 verifies that signature, and the bucket policy is written to allow access only from your specific CloudFront distribution. Block Public Access stays fully on; nobody can reach the bucket directly.

Why OAC over OAI:

Feature | OAC (recommended) | OAI (legacy)
Signing | SigV4, supports SSE-KMS encrypted objects | SigV2, cannot read SSE-KMS objects
Methods | All (incl. PUT/DELETE for uploads) | GET/HEAD-oriented
Regions | All, incl. opt-in Regions | Limited
HTTP method to S3 | Supports dynamic requests | Limited
Status | Current, actively developed | Maintenance only

The setup is three steps:

# 1. Create the OAC (control plane is us-east-1 / global)
aws cloudfront create-origin-access-control --origin-access-control-config '{
  "Name":"site-oac","SigningProtocol":"sigv4",
  "SigningBehavior":"always","OriginAccessControlOriginType":"s3"
}'

# 2. Reference its Id on the S3 origin (OriginAccessControlId, S3OriginConfig.OriginAccessIdentity = "")
#    (done in the distribution config above)

# 3. Bucket policy allowing ONLY this distribution:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowCloudFrontOAC",
    "Effect": "Allow",
    "Principal": { "Service": "cloudfront.amazonaws.com" },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-site/*",
    "Condition": { "StringEquals": {
      "AWS:SourceArn": "arn:aws:cloudfront::111122223333:distribution/E1ABCDEF2GHIJ"
    }}
  }]
}

Critical gotchas:

Custom origins (ALB, EC2, API Gateway, any HTTP server)

Anything that is not an S3 REST endpoint is a custom origin — an ALB, an EC2 instance, API Gateway, or a third-party server on the public internet. Custom origins unlock extra settings (protocol policy, SSL protocols, response/keep-alive timeouts, custom origin port) and behave differently from S3: CloudFront forwards the request method, and you decide which headers/cookies/query strings reach the origin via the origin request policy.

The signature security pattern for a custom origin is origin lock-down with a shared secret header: CloudFront adds a secret custom header (e.g. X-Origin-Verify: <random>) to every origin request, and the ALB has a listener rule (or WAF rule) that returns 403 unless that header is present. That stops the public from bypassing CloudFront and hitting the ALB directly. (Pair it with restricting the ALB’s security group to CloudFront’s published prefix list, com.amazonaws.global.cloudfront.origin-facing.)

# Add a secret verification header to a custom origin (excerpt of origin config)
"CustomHeaders": { "Quantity": 1, "Items": [
  { "HeaderName": "X-Origin-Verify", "HeaderValue": "p7Qx...secret" }
]}
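In the pattern above the check normally lives in an ALB listener rule or a WAF rule. If the origin application has to do it itself, the comparison could look like the following Python sketch; is_from_cloudfront is a hypothetical helper, and the secret shown is a placeholder rather than a real value.

```python
import hmac

def is_from_cloudfront(headers, expected_secret, header_name="X-Origin-Verify"):
    """Return True only if the request carries the shared secret header."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    supplied = lowered.get(header_name.lower(), "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())

SECRET = "example-placeholder-secret"  # assumption: loaded from a secrets store in practice

via_cloudfront = is_from_cloudfront({"x-origin-verify": SECRET}, SECRET)
direct_hit = is_from_cloudfront({"Host": "alb.example.com"}, SECRET)
# A request failing this check would be answered with 403.
```

Rotating the secret means briefly accepting both the old and new values at the origin while the distribution update deploys.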

For API Gateway and dynamic origins you almost always pair a CachingDisabled cache policy with an AllViewerExceptHostHeader origin request policy, so CloudFront forwards everything the API needs but caches nothing it shouldn’t.

Origin groups and failover

An origin group binds two origins — a primary and a secondary — into a single failover unit you point a cache behaviour at. If the primary returns one of the configured failover status codes (you choose from 403, 404, 416, 500, 502, 503, 504) or the connection fails/times out, CloudFront automatically retries the request against the secondary.

Aspect | Detail
Members | Exactly 2 origins (primary + secondary)
Trigger | Configured failover status codes and connection errors/timeouts
Scope | Per request: failover is decided each request, not a sticky switch
Methods | Origin failover only happens for GET, HEAD, OPTIONS (idempotent reads)
Common use | An S3 bucket in Region A primary, a replica bucket in Region B secondary (active-passive static content)

# Origin group excerpt: fail over on connection error + 5xx
"OriginGroups": { "Quantity": 1, "Items": [{
  "Id": "og-site",
  "FailoverCriteria": { "StatusCodes": { "Quantity": 3, "Items": [500,502,503] } },
  "Members": { "Quantity": 2, "Items": [
    { "OriginId": "s3-primary" }, { "OriginId": "s3-secondary" }
  ]}
}]}
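The per-request failover decision described above can be written as a small predicate. This Python sketch models the rules as this lesson states them; should_fail_over and its parameters are hypothetical names, and the default status codes mirror the excerpt.

```python
FAILOVER_METHODS = {"GET", "HEAD", "OPTIONS"}

def should_fail_over(method, status=None, connection_failed=False,
                     failover_codes=(500, 502, 503)):
    """Decide, for one request, whether to retry against the secondary origin."""
    if method.upper() not in FAILOVER_METHODS:
        return False                  # writes are never retried on the secondary
    if connection_failed:
        return True                   # connection error or timeout on the primary
    return status in failover_codes   # primary answered with a configured code

retry_read = should_fail_over("GET", status=503)
keep_404 = should_fail_over("GET", status=404)     # 404 not in the configured list
no_retry_post = should_fail_over("POST", status=503)
```

Because the check runs on every request, traffic returns to the primary as soon as it starts answering normally again.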

Origin groups are not a full DR story on their own — they handle origin-level failure behind one front door, but moving between front doors or between Regions at the DNS layer is Route 53’s job. The two compose; see the failover-architecture lesson linked at the end.

Cache behaviours and the three policy types

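A cache behaviour maps a path pattern to an origin plus a set of caching/forwarding/security settings. Behaviours are evaluated in the order they are listed (their precedence) and the first matching pattern wins, so place more specific patterns above broader ones; the default * behaviour is always matched last.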

Typical path patterns: images/*, api/*, *.css, static/*. A request matching api/* might point at an ALB origin with caching disabled, while * points at the S3 origin with aggressive caching.
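A minimal Python sketch of behaviour selection, assuming behaviours are checked in their listed order with the default as the fallback, and that * matches any run of characters while ? matches exactly one. The function names and origin IDs are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Translate a path pattern (* and ? wildcards, case-sensitive) to a regex."""
    out = []
    for ch in pattern.lstrip("/"):        # treat the leading slash as optional
        if ch == "*":
            out.append(".*")
        elif ch == "?":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return re.compile("".join(out))

def pick_origin(path, behaviours, default_origin):
    """Return the origin of the first behaviour whose pattern matches the path."""
    target = path.lstrip("/")
    for pattern, origin in behaviours:
        if pattern_to_regex(pattern).fullmatch(target):
            return origin
    return default_origin                 # the default * behaviour

behaviours = [("api/*", "alb-api"), ("*.css", "s3-static")]

api = pick_origin("/api/users", behaviours, "s3-site")
css = pick_origin("/site.css", behaviours, "s3-site")
page = pick_origin("/index.html", behaviours, "s3-site")
# Order matters: this path matches both patterns, and the first listed wins.
overlap = pick_origin("/api/theme.css", behaviours, "s3-site")
```

Swapping the two entries would send /api/theme.css to the static origin instead, which is the usual source of "why is my API response cached" surprises.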

The modern way to configure caching is three independent, reusable policies. Keeping them separate is the key insight: the cache key and what you forward to the origin are different decisions.

Cache policy — defines the cache key and TTLs

A cache policy controls (a) what goes into the cache key and (b) the TTLs. Whatever is in the cache key is also forwarded to the origin.

Element | Choices | Notes
Headers in cache key | None / Allowlist | Each header added multiplies cache variants; add only what changes the response
Query strings | None / Allowlist / All | All is fine for APIs; for static assets pick the few that matter (e.g. ?v=)
Cookies | None / Allowlist / All | Almost always None for cacheable static content
Min / Default / Max TTL | seconds (0 to 1 year) | The origin’s Cache-Control: max-age is clamped between Min and Max
Encoding fields | Gzip / Brotli | Must be enabled here for Compress to actually serve compressed bytes

AWS-managed cache policies cover the common cases: CachingOptimized (no cookies/headers/query strings in the key, compression on — perfect for static assets), CachingDisabled (nothing cached — for dynamic/API origins), and CachingOptimizedForUncompressedObjects.

The TTL precedence rule (a frequent exam question): CloudFront uses the object’s Cache-Control: max-age / s-maxage (or Expires) from the origin, clamped to the policy’s Min and Max TTL. If the origin sends no cache header, the Default TTL applies. Cache-Control: no-cache/no-store/private from the origin make the object effectively uncacheable when Min TTL is 0; with a Min TTL above 0, CloudFront still caches it for the Min TTL.
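The precedence rule can be expressed as a short function. This is a simplified Python sketch, not CloudFront's code: it ignores the Expires header, treats s-maxage as taking priority over max-age, and returns the Min TTL for no-cache/no-store/private (so 0 means "not cached"). The effective_ttl name is hypothetical.

```python
def effective_ttl(cache_control, min_ttl=0, default_ttl=86400, max_ttl=31536000):
    """Return how many seconds an object would be cached under these rules."""
    directives = {}
    for part in (cache_control or "").split(","):
        name, _, value = part.strip().lower().partition("=")
        if name:
            directives[name] = value

    if directives.keys() & {"no-cache", "no-store", "private"}:
        return min_ttl                                  # 0 means uncacheable
    for name in ("s-maxage", "max-age"):                # shared-cache value wins
        if directives.get(name, "").isdigit():
            return max(min_ttl, min(int(directives[name]), max_ttl))
    return max(min_ttl, min(default_ttl, max_ttl))      # no usable origin header

short = effective_ttl("max-age=60")
capped = effective_ttl("max-age=99999999")
fallback = effective_ttl(None)
uncached = effective_ttl("no-store")
floored = effective_ttl("max-age=10", min_ttl=300)
```

The floored case is the one that surprises people: a non-zero Min TTL overrides a shorter max-age from the origin.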

Origin request policy — what is forwarded to the origin

An origin request policy controls what CloudFront includes when it goes to the origin on a cache miss, independent of the cache key. This is how you forward, say, the Authorization header or all query strings to a backend without fragmenting your cache. Managed examples: AllViewer, AllViewerExceptHostHeader (vital for origins such as API Gateway, which reject requests that carry the viewer’s Host header instead of their own domain), CORS-S3Origin, UserAgentRefererHeaders.

The single most common dynamic-origin recipe: CachingDisabled (cache policy) + AllViewerExceptHostHeader (origin request policy). The origin sees everything; CloudFront caches nothing.

Response headers policy — headers added to the viewer response

A response headers policy lets CloudFront add or override response headers without changing the origin — security headers (HSTS, X-Content-Type-Options, CSP, X-Frame-Options, Referrer-Policy), CORS headers, custom headers, and a Server-Timing header for diagnostics. There is a managed SecurityHeadersPolicy and a CORS-with-preflight family. This is the cleanest way to get an A on a security-headers scan for a static S3 site that can’t set headers itself.

# Create a custom cache policy: cache key = path + ?lang query string, 1h default TTL
aws cloudfront create-cache-policy --cache-policy-config '{
  "Name":"site-lang","DefaultTTL":3600,"MinTTL":0,"MaxTTL":86400,
  "ParametersInCacheKeyAndForwardedToOrigin":{
    "EnableAcceptEncodingGzip":true,"EnableAcceptEncodingBrotli":true,
    "HeadersConfig":{"HeaderBehavior":"none"},
    "CookiesConfig":{"CookieBehavior":"none"},
    "QueryStringsConfig":{"QueryStringBehavior":"whitelist",
      "QueryStrings":{"Quantity":1,"Items":["lang"]}}
  }
}'

Other per-behaviour settings

Invalidations vs versioned object names

When you deploy new content, the edges may still hold the old version until its TTL expires. Two ways to force freshness:

Approach | How | Cost | When
Invalidation | Tell CloudFront to evict paths (/index.html, /css/*, or /*) from all edges | First 1,000 paths/month free, then ~$0.005 per path | Occasional, urgent, or HTML files that keep a stable name
Versioned object names | Deploy assets under new names (app.4f2a1.js) and reference them from HTML | Free | The recommended default for static assets: change the URL, and the old cached copy is simply never requested again

The best practice is a hybrid: give CSS/JS/images content-hashed filenames (so they are immutable and cacheable for a year, Cache-Control: max-age=31536000, immutable) and invalidate only the small set of HTML entry points (often just /index.html or /* if cheap). A wildcard /* invalidation counts as one path for billing but evicts everything — handy and cheap for a small site, but it nukes your hit ratio momentarily.
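Content-hashed filenames are usually produced by the build tool, but the idea fits in a few lines. A minimal Python sketch, assuming a SHA-256 digest truncated to eight hex characters (the hashed_name helper and the digest length are illustrative choices, not a CloudFront requirement):

```python
import hashlib
from pathlib import PurePosixPath

def hashed_name(filename, content, digest_chars=8):
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))

v1 = hashed_name("static/app.js", b"console.log('v1');")
v2 = hashed_name("static/app.js", b"console.log('v2');")
# Different bytes give a different URL, so the old cached copy is never requested;
# identical bytes give the same URL, so unchanged assets stay cached.
```

Because the name changes only when the bytes change, these objects can safely carry the one-year immutable Cache-Control header mentioned above.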

aws cloudfront create-invalidation \
  --distribution-id E1ABCDEF2GHIJ \
  --paths "/index.html" "/css/*"

Gotcha: invalidations are asynchronous (seconds to a few minutes to propagate to all edges) and are matched case-sensitively with exact paths or trailing * wildcards only — no regex.
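The matching and billing rules above are simple enough to model directly. A Python sketch under the assumptions stated in this lesson (case-sensitive paths, a single trailing * wildcard, 1,000 free paths per month, roughly $0.005 per path after that); both function names are hypothetical.

```python
def invalidation_matches(pattern, path):
    """Exact, case-sensitive match, or prefix match when the pattern ends in *."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern

def monthly_invalidation_cost(paths_submitted, free_paths=1000, price_per_path=0.005):
    """Approximate monthly charge in USD; a wildcard counts as one path."""
    return max(0, paths_submitted - free_paths) * price_per_path

hits_css = invalidation_matches("/css/*", "/css/site.css")
wrong_case = invalidation_matches("/index.html", "/Index.html")
within_free_tier = monthly_invalidation_cost(1000)
over_free_tier = monthly_invalidation_cost(1200)   # 200 billable paths
```

A deploy script that invalidates each changed file individually can exceed the free allowance quickly, which is another argument for hashed asset names.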

TLS, ACM and custom domains

By default CloudFront serves over HTTPS on its *.cloudfront.net name using a shared AWS certificate. To serve your own domain (cdn.example.com) over HTTPS you must:

  1. Request or import a certificate in ACM in the us-east-1 Region. This is mandatory and the number-one CloudFront gotcha. CloudFront is a global service whose certificate store lives in us-east-1; a certificate in any other Region simply will not appear in the distribution’s certificate dropdown. (Regional services like ALB use a cert in their own Region; CloudFront is the exception.)
  2. Validate the certificate (DNS validation via a CNAME is easiest and supports auto-renewal).
  3. Add the domain as an Alternate Domain Name (CNAME) on the distribution.
  4. Select the custom SSL certificate and a security policy (minimum TLS version).
  5. Point your domain’s DNS at the distribution: for Route 53, an A/AAAA alias to the CloudFront domain (free, no extra lookup); for other DNS, a CNAME.

Setting | Choices | Default | Notes
Certificate source | ACM (us-east-1) or imported IAM cert | Default CloudFront cert | ACM auto-renews; imported certs you rotate yourself
Clients supported / SNI | SNI (recommended, free) vs dedicated IP ($600/mo) | SNI | Dedicated IP only for ancient clients with no SNI; virtually never needed
Security policy | e.g. TLSv1.2_2021, TLSv1.2_2019 | TLSv1.2_2021 | Sets the minimum viewer TLS version + cipher suite
Origin protocol / SSL (custom origin) | HTTP-only / HTTPS-only / match-viewer; min origin SSL | TLSv1.2 | The leg between CloudFront and the origin; keep at 1.2

Each alternate domain name must be globally unique across all CloudFront distributions in all accounts — you cannot have cdn.example.com on two distributions. The certificate must cover every alternate domain name (an exact match or a matching wildcard such as *.example.com).
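A quick pre-flight check for the coverage rule can catch a mismatch before the distribution update is rejected. This Python sketch assumes the usual certificate wildcard semantics, where *.example.com covers exactly one extra label; cert_covers is a hypothetical helper.

```python
def cert_covers(domain, cert_names):
    """True if any certificate name matches the domain exactly or via a wildcard."""
    domain = domain.lower().rstrip(".")
    for name in cert_names:
        name = name.lower().rstrip(".")
        if name == domain:
            return True
        if name.startswith("*."):
            first_label, _, rest = domain.partition(".")
            if first_label and rest == name[2:]:   # wildcard spans one label only
                return True
    return False

names_on_cert = ["example.com", "*.example.com"]

cdn_ok = cert_covers("cdn.example.com", names_on_cert)
apex_ok = cert_covers("example.com", names_on_cert)
nested_ok = cert_covers("img.cdn.example.com", names_on_cert)  # two labels deep
```

The nested case fails because a wildcard does not reach through multiple labels; such a name needs its own entry or a *.cdn.example.com wildcard on the certificate.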

Edge compute: CloudFront Functions vs Lambda@Edge

CloudFront lets you run your own code on requests/responses at the edge. There are two offerings and choosing correctly matters.

Aspect | CloudFront Functions | Lambda@Edge
Runtime | JavaScript (ECMAScript 5.1-ish, CloudFront JS engine) | Node.js / Python (full Lambda runtimes)
Trigger events | Viewer request, Viewer response only | Viewer request/response, Origin request/response (all 4)
Where it runs | The edge location itself (closest tier) | Regional edge cache tier
Scale / latency | Millions of req/s, sub-millisecond, ~1–2 ms | High, but ~5–30 ms cold/warm
Max execution time | < 1 ms CPU budget | 5 s (viewer) / 30 s (origin)
Memory / package | 2 MB, no external network/disk | Up to 10 GB memory, can call AWS APIs, larger package
Network/AWS SDK access | No network, no file system | Yes: can call S3/DynamoDB, do crypto, etc.
Cost | ~1/6th the price of Lambda@Edge | Higher (Lambda pricing at the edge)

The decision rule:

// CloudFront Function (viewer request): rewrite /blog -> /blog/index.html
function handler(event) {
  var req = event.request;
  if (req.uri.endsWith('/')) req.uri += 'index.html';
  else if (!req.uri.includes('.')) req.uri += '/index.html';
  return req;
}

Both attach to a cache behaviour via function associations; a behaviour can have at most one function per event, and you cannot put a CloudFront Function and a Lambda@Edge on the same event (but you can mix across events, e.g. a Function on viewer-request and Lambda@Edge on origin-response).

Signed URLs and signed cookies

For private/paid content you do not want anyone with the URL to fetch it. Turn on Restrict viewer access on the behaviour and serve content only via signed URLs or signed cookies.

A policy (canned for the simple “expires at” case, or custom for IP restrictions, a date range, and wildcards) is signed. Signing requires either:

Mechanism | What | Recommendation
Trusted key groups | Public keys you upload to CloudFront; you sign with the matching private key | Recommended: managed without root, multiple keys, rotatable
Trusted signer (legacy) | The AWS account root’s CloudFront key pair | Legacy; avoid, because it requires root credentials

Flow: your application authenticates the user, then your backend (or a Lambda@Edge) creates the signed URL/cookie with the private key; CloudFront verifies the signature with the public key in the key group and enforces the expiry/conditions at the edge. The private key never leaves your control.
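The part of that flow your backend owns starts with building the policy document and encoding values for use in a URL. The Python sketch below covers only those two steps with the standard library; the signing step itself is deliberately left out because it needs a cryptography library and your private key. The function names and the sample URL are illustrative assumptions.

```python
import base64
import json

def canned_policy(resource_url, expires_epoch):
    """Build the compact JSON policy for the simple 'expires at' case."""
    policy = {
        "Statement": [{
            "Resource": resource_url,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }]
    }
    return json.dumps(policy, separators=(",", ":"))  # no whitespace

def cloudfront_b64(data):
    """Base64-encode, then swap the characters that are unsafe in a URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")

policy = canned_policy("https://cdn.example.com/video.mp4", 1767225600)
# signature = <sign policy.encode() with the private key of your key group>
# The result would then be passed through cloudfront_b64() before going in the URL.
```

With a canned policy the final URL carries Expires, Signature and Key-Pair-Id query parameters; a custom policy adds the encoded policy document itself.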

Gotcha: signed URLs/cookies protect the CloudFront path. They are pointless if the origin is also publicly reachable — combine them with OAC (S3) or origin lock-down (custom) so the only way in is through CloudFront.

WAF, geo-restriction and Shield

CloudFront is the natural enforcement point for edge security because it terminates the viewer connection before anything reaches your origin.

AWS WAF. Attach a WAF web ACL of CLOUDFRONT scope (these are created in us-east-1) to the distribution to inspect every request and allow/block/count/CAPTCHA/challenge. WAF gives you managed rule groups (Core/OWASP Top 10, Known Bad Inputs, SQLi, IP reputation, anonymous-IP), rate-based rules (block an IP doing more than N requests / 5 min — your cheapest layer-7 DDoS mitigation), geo-match, custom string/regex/size rules and Bot Control. WAF runs before the cache, so blocked requests never even consult the cache or origin.
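To illustrate the idea behind a rate-based rule, here is a sliding-window counter in Python. It is a conceptual sketch only and says nothing about how AWS WAF is implemented; the RateLimiter class, its parameters and the sample IP addresses are hypothetical.

```python
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `limit` requests per source IP within a rolling window."""

    def __init__(self, limit, window_seconds=300):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)   # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        recent = self.hits[ip]
        while recent and recent[0] <= now - self.window:
            recent.popleft()             # drop requests that left the window
        if len(recent) >= self.limit:
            return False                 # over the threshold: block
        recent.append(now)
        return True

limiter = RateLimiter(limit=3)
burst = [limiter.allow("203.0.113.9", t) for t in (0, 1, 2, 3)]
other_client = limiter.allow("198.51.100.7", 3)
after_window = limiter.allow("203.0.113.9", 301)
```

The counter is per source IP, so one noisy client is blocked while others are unaffected, and the block lifts on its own once the window rolls past the burst.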

Geo-restriction. CloudFront has a built-in geo-restriction (separate from WAF): an allow-list or block-list of countries, enforced at the edge using a GeoIP database. Use it for simple “only serve these countries” / “block these countries” licensing or compliance rules. For anything finer (region + other conditions, or returning a custom block page), use WAF’s geo-match rule instead. Built-in geo-restriction is free; WAF geo rules cost per request evaluated.
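The allow-list versus block-list semantics can be summarised in one function. A Python sketch with hypothetical names and mode strings ("allow", "block", "none"), assuming the viewer's country has already been resolved to a two-letter code:

```python
def geo_allowed(country, mode, countries=()):
    """Return True if a viewer from `country` may be served."""
    if mode == "none":
        return True                                  # no restriction configured
    listed = country.upper() in {c.upper() for c in countries}
    if mode == "allow":
        return listed                                # only listed countries pass
    if mode == "block":
        return not listed                            # everyone except listed passes
    raise ValueError(f"unknown mode: {mode}")

india_allow = geo_allowed("IN", "allow", ["IN", "SG"])
france_allow = geo_allowed("FR", "allow", ["IN", "SG"])
france_block = geo_allowed("FR", "block", ["IN", "SG"])
```

An allow-list fails closed for any country you forgot to add, while a block-list fails open, so the choice should follow the licensing or compliance rule you are enforcing.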

AWS Shield. Shield Standard is on automatically and free for all CloudFront distributions, absorbing common network/transport (L3/L4) DDoS attacks at the edge. Shield Advanced (paid subscription) adds enhanced detection, 24/7 access to the Shield Response Team, WAF charges included, cost-protection credits for scaling during an attack, and L7 attack visibility. Fronting your application with CloudFront is itself a DDoS-mitigation best practice because the global edge absorbs and disperses volumetric traffic.

Logging and monitoring

Option | What it captures | Destination | Notes
Standard logs (v2) | Every request (detailed access log fields) | S3, CloudWatch Logs, or Kinesis Data Firehose | Delivered within minutes; the v2 pipeline added CW Logs/Firehose targets and field selection
Real-time logs | Sampled/100% requests, chosen fields, per behaviour | Kinesis Data Streams | Sub-second; for live dashboards/alerting
CloudWatch metrics | Requests, bytes, 4xx/5xx error rates, cache hit rate (additional metrics opt-in) | CloudWatch | Alarm on origin 5xx and on a falling cache-hit ratio
CloudFront Functions metrics | Invocations, compute utilisation, errors | CloudWatch | Watch the 1 ms compute budget
AWS WAF logs | Matched rules, action per request | S3 / CW Logs / Firehose | Separate from CloudFront access logs

The metric to watch for performance is the cache hit ratio — a falling ratio almost always means a cache key got too specific (a stray header or cookie in the policy). The metric to watch for health is origin latency and 5xx rate.
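If you want to compute the ratio yourself from access logs, a rough version is a one-pass count. This Python sketch assumes each request has been reduced to a result-type string modelled on the access-log result field (for example "Hit", "RefreshHit", "Miss", "Error"); it counts every request in the denominator, which is a simplification of the CloudWatch metric. The cache_hit_ratio name is hypothetical.

```python
from collections import Counter

def cache_hit_ratio(result_types):
    """Fraction of requests answered from cache (Hit or RefreshHit)."""
    counts = Counter(result_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["Hit"] + counts["RefreshHit"]) / total

before = cache_hit_ratio(["Hit", "Hit", "RefreshHit", "Miss"])
after = cache_hit_ratio(["Miss", "Miss", "Hit", "Miss"])
# A drop like this after a deploy usually points at a cache-key change.
```

Tracking the ratio per behaviour rather than per distribution makes it easier to see which policy change caused a drop.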

[Figure: Amazon CloudFront deep dive]

The diagram traces a request from the viewer through the nearest edge location, the regional edge cache and optional Origin Shield to the origin (S3 via OAC, or a custom origin), and overlays where each control lives — cache behaviours and policies at the edge, edge functions on the four trigger events, WAF/geo/Shield in front, and signed URLs/OAC protecting the path end to end.

Hands-on lab: a private S3 site behind CloudFront with OAC

This lab creates a private S3 bucket, fronts it with a CloudFront distribution using OAC, serves a page, then invalidates and cleans up. It stays within the AWS Free Tier (CloudFront’s perpetual free tier includes 1 TB data transfer out and 10,000,000 HTTPS requests per month; S3 storage of a few KB is negligible).

1. Create a private bucket and upload a page (use your own globally-unique name):

BUCKET=kloudvin-cf-lab-$RANDOM
REGION=ap-south-1
aws s3api create-bucket --bucket "$BUCKET" --region "$REGION" \
  --create-bucket-configuration LocationConstraint="$REGION"
printf '<h1>Hello from CloudFront + OAC</h1>' > index.html
aws s3 cp index.html "s3://$BUCKET/index.html" --content-type text/html
# Bucket stays fully private — Block Public Access remains on (the default).

2. Create the OAC:

OAC_ID=$(aws cloudfront create-origin-access-control \
  --origin-access-control-config \
  "Name=lab-oac,SigningProtocol=sigv4,SigningBehavior=always,OriginAccessControlOriginType=s3" \
  --query 'OriginAccessControl.Id' --output text)
echo "OAC: $OAC_ID"

3. Create the distribution referencing the bucket REST endpoint, the OAC and the managed CachingOptimized policy. Build the config and create:

cat > dist.json <<JSON
{
  "CallerReference": "lab-$RANDOM",
  "Comment": "oac lab",
  "Enabled": true,
  "DefaultRootObject": "index.html",
  "Origins": { "Quantity": 1, "Items": [{
    "Id": "s3-lab",
    "DomainName": "$BUCKET.s3.$REGION.amazonaws.com",
    "OriginAccessControlId": "$OAC_ID",
    "S3OriginConfig": { "OriginAccessIdentity": "" }
  }]},
  "DefaultCacheBehavior": {
    "TargetOriginId": "s3-lab",
    "ViewerProtocolPolicy": "redirect-to-https",
    "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
    "Compress": true
  }
}
JSON
DIST_JSON=$(aws cloudfront create-distribution --distribution-config file://dist.json)
DIST_ID=$(echo "$DIST_JSON" | python3 -c 'import sys,json;print(json.load(sys.stdin)["Distribution"]["Id"])')
DIST_DOMAIN=$(echo "$DIST_JSON" | python3 -c 'import sys,json;print(json.load(sys.stdin)["Distribution"]["DomainName"])')
echo "Distribution: $DIST_ID  ->  https://$DIST_DOMAIN"

4. Attach the bucket policy allowing only this distribution (the account ID is looked up with sts get-caller-identity, so nothing needs substituting by hand):

ACCT=$(aws sts get-caller-identity --query Account --output text)
cat > bp.json <<JSON
{ "Version":"2012-10-17","Statement":[{
  "Sid":"AllowCloudFrontOAC","Effect":"Allow",
  "Principal":{"Service":"cloudfront.amazonaws.com"},
  "Action":"s3:GetObject","Resource":"arn:aws:s3:::$BUCKET/*",
  "Condition":{"StringEquals":{
    "AWS:SourceArn":"arn:aws:cloudfront::$ACCT:distribution/$DIST_ID"}}}]}
JSON
aws s3api put-bucket-policy --bucket "$BUCKET" --policy file://bp.json

5. Validate. The distribution takes a few minutes to deploy. Then:

# Wait until Deployed, then fetch through CloudFront (expect 200 + your HTML):
aws cloudfront wait distribution-deployed --id "$DIST_ID"
curl -s "https://$DIST_DOMAIN/" ; echo
# Prove the bucket itself is private (expect 403 AccessDenied):
curl -s -o /dev/null -w '%{http_code}\n' "https://$BUCKET.s3.$REGION.amazonaws.com/index.html"

You should see your HTML through the CloudFront domain and a 403 when hitting S3 directly — exactly the OAC objective.

6. Update + invalidate:

printf '<h1>Updated via invalidation</h1>' > index.html
aws s3 cp index.html "s3://$BUCKET/index.html" --content-type text/html
aws cloudfront create-invalidation --distribution-id "$DIST_ID" --paths "/index.html"

Cleanup (a distribution must be disabled before it can be deleted):

# Disable: fetch config + ETag, flip Enabled=false, update, wait, delete.
# The ETag is the version token; every update or delete must present the current one.
ETAG=$(aws cloudfront get-distribution-config --id "$DIST_ID" --query ETag --output text)
aws cloudfront get-distribution-config --id "$DIST_ID" --query DistributionConfig > dc.json
# Flip Enabled to false in the saved config.
python3 -c 'import json;d=json.load(open("dc.json"));d["Enabled"]=False;json.dump(d,open("dc.json","w"))'
# Push the disabled config and keep the new ETag that the update returns.
ETAG=$(aws cloudfront update-distribution --id "$DIST_ID" --distribution-config file://dc.json --if-match "$ETAG" --query ETag --output text)
# Wait for the change to propagate, then delete the distribution.
aws cloudfront wait distribution-deployed --id "$DIST_ID"
aws cloudfront delete-distribution --id "$DIST_ID" --if-match "$ETAG"
# Delete the OAC (also ETag-guarded), then empty and remove the bucket.
aws cloudfront delete-origin-access-control --id "$OAC_ID" \
  --if-match "$(aws cloudfront get-origin-access-control --id "$OAC_ID" --query ETag --output text)"
aws s3 rm "s3://$BUCKET" --recursive
aws s3api delete-bucket --bucket "$BUCKET"

Cost note. For a quick lab, everything above fits within the CloudFront and S3 free tiers; a few requests and a few KB of storage cost pennies at most. The one trap is a forgotten distribution: an enabled distribution that serves no traffic incurs no data-transfer charge, but disable and delete lab distributions anyway so that one cannot later serve traffic on your dime. In production, the line items that add cost are invalidations beyond 1,000 paths/month, Origin Shield, real-time logs and WAF.
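The invalidation line item is easy to estimate. The sketch below assumes the 1,000 free paths per month mentioned above and the approximate $0.005 per additional path quoted later in this lesson; the function name `invalidation_cost_usd` is hypothetical, and current pricing should be checked before relying on the figure.

```python
from decimal import Decimal

FREE_PATHS_PER_MONTH = 1_000
# Approximate per-path price quoted in this lesson; treat it as an assumption.
PRICE_PER_EXTRA_PATH_USD = Decimal("0.005")

def invalidation_cost_usd(paths_this_month):
    """Estimate the monthly invalidation charge for a given number of submitted paths."""
    billable = max(0, paths_this_month - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_PATH_USD

# A wildcard such as /* counts as a single path, so it is cheap to bill
# even though it evicts everything it matches.
for paths in (1, 1_000, 1_500, 20_000):
    print(paths, invalidation_cost_usd(paths))
```

Decimal is used instead of float so that the fractional-cent price does not pick up binary rounding noise.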

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| 403 on every object from an OAC S3 origin | Bucket policy missing the AWS:SourceArn allow, the website endpoint was used, or the SSE-KMS key policy doesn’t allow CloudFront | Use the REST endpoint, attach the OAC bucket policy, and grant cloudfront.amazonaws.com on the KMS key |
| Custom domain cert not showing in the dropdown | ACM certificate is not in us-east-1 | Re-request/import the cert in us-east-1 |
| Cache hit ratio collapsed after a change | A header/cookie/query string was added to the cache policy, fragmenting the key | Move “must forward but shouldn’t cache on” values to the origin request policy |
| 502/504 from CloudFront | Origin SSL/cert mismatch, origin timeout, or wrong origin protocol | Check origin’s cert/SNI, raise response timeout, set protocol to match-viewer/HTTPS-only correctly |
| New deploy still serves old content | Cached within TTL; no invalidation and stable filenames | Versioned filenames for assets + invalidate the HTML entry points |
| ALB origin returns 403/wrong host | The viewer’s Host header is forwarded to an origin that expects its own hostname, so the origin rejects it | Use the AllViewerExceptHostHeader origin request policy |
| Cannot delete distribution | It is still Enabled | Set Enabled=false, wait for Deployed, then delete |
| Direct origin hit bypasses CloudFront/WAF | Origin still publicly reachable | Lock down with OAC (S3) or a secret header + CloudFront prefix list SG rule (custom) |

Best practices

Security notes

CloudFront is where most of your edge security posture is set. Always redirect HTTP to HTTPS (or HTTPS-only for APIs) so viewers never transit in clear text, and keep the minimum TLS at 1.2+. Lock the origin so it can only be reached via CloudFront — OAC for S3 (and remember the KMS key-policy grant for SSE-KMS buckets), and a secret verification header validated by the ALB/WAF plus a security-group rule scoped to the com.amazonaws.global.cloudfront.origin-facing managed prefix list for custom origins. Use signed URLs/cookies with trusted key groups (not the legacy root signer) for private content, and pair them with origin lock-down so the protection can’t be bypassed. Attach an AWS WAF CLOUDFRONT-scope web ACL with the AWS managed rule groups, a rate-based rule as cheap L7 DDoS mitigation, and Bot Control if you face scraping; Shield Standard is automatic, and Shield Advanced is worth it for high-value, attack-prone properties. Finally, scope IAM tightly: cloudfront:CreateInvalidation and distribution-config changes are powerful, and field-level encryption keeps sensitive POSTed fields unreadable to your own app tier when only a downstream service should see them.
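For the signed URL and signed cookie piece, the part that most often trips people up is the policy document and its encoding rather than the signature itself. The sketch below builds a whitespace-free custom policy and applies the URL-safe character substitutions CloudFront expects. The helper names, the example domain and the expiry timestamp are hypothetical, and the signing step (an RSA signature made with the private key whose public half is in your trusted key group) is deliberately left out.

```python
import base64
import json

def custom_policy(resource_url, expires_epoch):
    """Build a compact custom policy document (illustrative helper)."""
    policy = {
        "Statement": [{
            "Resource": resource_url,  # may contain wildcards, e.g. a whole path prefix
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }]
    }
    # The policy is sent without whitespace, hence the compact separators.
    return json.dumps(policy, separators=(",", ":"))

def cloudfront_safe_b64(data):
    """Base64-encode bytes, then swap the characters that are unsafe in URLs."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")

# Hypothetical resource and expiry, for illustration only.
policy = custom_policy("https://d111111abcdef8.cloudfront.net/private/*", 1767225600)
encoded_policy = cloudfront_safe_b64(policy.encode("utf-8"))
print(encoded_policy)
```

The same encoding would be applied to the signature bytes. In a signed URL the encoded policy travels in the Policy query parameter; with signed cookies it goes in the CloudFront-Policy cookie.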

Interview & exam questions

  1. What is the difference between OAC and OAI, and why is OAC preferred? Both restrict an S3 origin to CloudFront only. OAC signs origin requests with SigV4 and supports SSE-KMS objects, all Regions and all HTTP methods; OAI is the legacy mechanism and cannot read KMS-encrypted objects. Always use OAC for new distributions.

  2. Where must a custom-domain TLS certificate live for CloudFront, and why? In ACM in us-east-1. CloudFront is a global service whose certificate store is in us-east-1; a cert in any other Region will not appear in the distribution. (Regional services like ALB use a cert in their own Region.)

  3. Explain the cache key and how to fix a low cache-hit ratio. The cache key is the set of values (host, path, plus chosen query strings/headers/cookies) CloudFront uses to identify “the same object”. A low hit ratio usually means the cache policy includes too much; move “forward to origin but don’t vary cache on” values into the origin request policy instead.

  4. Cache policy vs origin request policy vs response headers policy? Cache policy = cache key + TTLs (and what’s cached on); origin request policy = what is forwarded to the origin on a miss, independent of the key; response headers policy = headers CloudFront adds to the viewer response (security/CORS) without touching the origin.

  5. When would you use an origin group? For origin failover: a primary and a secondary origin, where CloudFront retries the secondary on connection errors or configured failover status codes. Failover applies only to idempotent GET/HEAD/OPTIONS requests, and it switches origins behind a single distribution; failing over the front door itself or a whole Region is Route 53’s job.

  6. CloudFront Functions vs Lambda@Edge — when each? Functions: JavaScript, viewer request/response only, sub-millisecond, no network — header/URL rewrites, simple auth, cache-key normalisation. Lambda@Edge: Node/Python, all four triggers incl. origin-side, can make AWS/network calls, longer runtime — dynamic origin selection, lookups, image processing.

  7. How do invalidations differ from versioned object names, and what does an invalidation cost? Invalidations evict cached paths (first 1,000 paths/month free, then ~$0.005 each); versioned filenames are free because the new URL simply isn’t cached yet. Best practice: immutable hashed asset names + invalidate only HTML; a /* invalidation counts as one path but wipes the whole cache.

  8. How do you stop the public from bypassing CloudFront and hitting the origin directly? For S3, OAC + a bucket policy scoped to the distribution ARN (Block Public Access on). For ALB/custom, a secret custom header CloudFront adds, validated by an ALB/WAF rule, plus a security group limited to the CloudFront origin-facing prefix list.

  9. What’s the TTL precedence between the origin headers and the cache policy? CloudFront honours the origin’s Cache-Control: max-age/s-maxage (or Expires), clamped to the policy’s Min and Max TTL; if the origin sends no cache header, the Default TTL applies; no-cache/no-store/private make the object effectively uncacheable when Min TTL is 0 (with a Min TTL above 0, CloudFront still caches it for the Min TTL).

  10. Signed URL vs signed cookie? A signed URL grants time-limited access to one file (great for downloads and players that won’t send cookies); signed cookies grant access to many files by path pattern without changing URLs (great for whole-library streaming). Both should use trusted key groups, not the legacy root signer.

  11. What does Origin Shield do? It adds a single designated caching tier in a Region of your choice between the regional edge caches and the origin, increasing the cache-hit ratio at the origin and reducing origin load — useful when many edges/RECs hit one origin, at a small per-request cost.

  12. Which DDoS/WAF protections does CloudFront give you? Shield Standard is automatic and free (L3/L4). Attach an AWS WAF CLOUDFRONT-scope web ACL for L7 — managed rule groups, rate-based rules, geo-match, Bot Control. Built-in geo-restriction is a free allow/block by country; Shield Advanced adds the response team, cost protection and enhanced detection.
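
The TTL precedence in question 9 reduces to a clamp. The following is a simplified Python model, not CloudFront's code: it treats the origin's freshness lifetime as a single number (ignoring the s-maxage over max-age preference and the Expires header), and the function name and the example policy values are hypothetical.

```python
def effective_ttl(min_ttl, default_ttl, max_ttl, origin_max_age=None, uncacheable=False):
    """Return the number of seconds an object would stay cached (simplified model).

    origin_max_age: freshness lifetime from the origin's Cache-Control, or None if absent.
    uncacheable:    True if the origin sent no-cache, no-store or private.
    """
    if uncacheable:
        # With Min TTL 0 the directive is honoured; a Min TTL above 0 still applies.
        return min_ttl
    if origin_max_age is None:
        # No cache header from the origin: fall back to the policy's Default TTL.
        return max(min_ttl, min(default_ttl, max_ttl))
    # Origin value clamped into the [Min TTL, Max TTL] window.
    return max(min_ttl, min(origin_max_age, max_ttl))

# Hypothetical policy: Min 60 s, Default 1 day, Max 1 year.
POLICY = dict(min_ttl=60, default_ttl=86_400, max_ttl=31_536_000)
print(effective_ttl(**POLICY, origin_max_age=10))      # raised to Min TTL
print(effective_ttl(**POLICY, origin_max_age=3_600))   # origin value used as-is
print(effective_ttl(**POLICY))                         # no header: Default TTL
print(effective_ttl(**POLICY, uncacheable=True))       # Min TTL above 0 still caches
```

The last case is the one that surprises people: a non-zero Min TTL overrides an origin that asked not to be cached, which is why the managed caching-disabled setup relies on all TTLs being 0.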

Quick check

  1. You need a custom domain on CloudFront over HTTPS — in which Region must the ACM certificate be?
  2. Your cache-hit ratio just dropped after adding a value to a policy. Which policy did you likely change, and where should that value go instead?
  3. You must restrict an S3 origin so it is reachable only through CloudFront and the bucket holds SSE-KMS objects — name the two policies you must get right.
  4. You need an edge function that looks up a redirect target in DynamoDB on a cache miss — Functions or Lambda@Edge, and on which trigger?
  5. You deployed new app.js content but users still get the old one. Give the two standard ways to fix freshness and which is free.

Answers

  1. us-east-1 — CloudFront’s certificate store is global but lives in N. Virginia; a cert elsewhere won’t appear.
  2. You changed the cache policy (it fragmented the cache key). Move “forward but don’t cache on” values to the origin request policy.
  3. The S3 bucket policy (allow cloudfront.amazonaws.com scoped to the distribution AWS:SourceArn) and the KMS key policy (allow kms:Decrypt/GenerateDataKey for the same principal/condition).
  4. Lambda@Edge on the origin-request trigger. The lookup needs an AWS/network call, which CloudFront Functions can’t make, and the origin-request trigger fires only on a cache miss.
  5. Versioned/hashed object names (free) and invalidations (first 1,000 paths/month free). Versioned names are the free, preferred default for assets.
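
For answer 3, the bucket policy appears in the lab above, but the KMS side does not. Here is a hedged Python sketch of the key policy statement and a safe way to merge it into an existing key policy. The function names, the Sid and the example IDs are hypothetical; the actions follow the answer above, and a read-only origin may need only kms:Decrypt.

```python
import copy

def oac_kms_statement(account_id, distribution_id):
    """Key policy statement letting one distribution use the key via OAC (illustrative)."""
    return {
        "Sid": "AllowCloudFrontOAC",  # hypothetical Sid
        "Effect": "Allow",
        "Principal": {"Service": "cloudfront.amazonaws.com"},
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": "*",  # in a key policy, "*" refers to the key the policy is attached to
        "Condition": {"StringEquals": {
            "AWS:SourceArn": f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
        }},
    }

def with_statement(key_policy, statement):
    """Return a copy of key_policy with statement added, replacing any entry with the same Sid."""
    merged = copy.deepcopy(key_policy)
    kept = [s for s in merged.get("Statement", []) if s.get("Sid") != statement["Sid"]]
    merged["Statement"] = kept + [statement]
    return merged

# Hypothetical existing key policy and IDs.
existing = {"Version": "2012-10-17", "Statement": [{
    "Sid": "EnableRootAccess", "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
    "Action": "kms:*", "Resource": "*",
}]}
updated = with_statement(existing, oac_kms_statement("111122223333", "EDFDVBD6EXAMPLE"))
print([s["Sid"] for s in updated["Statement"]])
```

Merging rather than overwriting matters here: a key policy is replaced as a whole, so writing only the new statement would drop the administrators' access to the key.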

Exercise

Stand up a small two-behaviour distribution that demonstrates the full toolkit, then prove each control:

  1. Create a private S3 origin with OAC for static assets (default behaviour *, CachingOptimized) and add a second origin — an HTTP API / ALB — bound to an api/* behaviour with CachingDisabled + AllViewerExceptHostHeader.
  2. Attach a response headers policy that adds HSTS and X-Content-Type-Options: nosniff, and confirm the headers appear on the static path with curl -I.
  3. Add a CloudFront Function on viewer-request that rewrites directory URIs (/about/) to /about/index.html, and verify it works.
  4. Request an ACM certificate in us-east-1, attach a custom domain (CNAME), and serve over HTTPS with TLSv1.2_2021.
  5. Attach a WAF web ACL with the Core managed rule group and a rate-based rule (e.g. 2,000 req / 5 min), then confirm a flood gets blocked while normal traffic passes.
  6. Bonus: configure an origin group with a secondary replica bucket and force a failover by making the primary return a 503.
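
For step 3, it helps to pin down the rewrite rule before porting it to a CloudFront Function (which would be written in JavaScript). The sketch below is a Python model of one possible rule, with a hypothetical function name. Treating an extensionless last segment such as /about as a directory is an assumption that goes beyond what the exercise strictly asks for.

```python
def rewrite_uri(uri):
    """Map directory-style URIs to their index.html object key (model of the edge rule)."""
    if uri.endswith("/"):
        # /about/ -> /about/index.html, and / -> /index.html
        return uri + "index.html"
    last_segment = uri.rsplit("/", 1)[-1]
    if "." not in last_segment:
        # Assumption: no file extension means the viewer meant a directory.
        return uri + "/index.html"
    return uri  # looks like a file; leave it untouched

for candidate in ("/", "/about/", "/about", "/app.js", "/docs/v1.2/"):
    print(candidate, "->", rewrite_uri(candidate))
```

The checks make a convenient acceptance list for the exercise: request each candidate through the distribution with curl and confirm it resolves to the object the model predicts.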

Certification mapping

| Exam | Objective area this supports |
|---|---|
| SAA-C03 (Solutions Architect – Associate) | Design high-performing and resilient architectures: CDN/edge caching, origin failover (origin groups), TLS/ACM, OAC origin protection, and choosing CloudFront + S3/ALB for global delivery. Design secure architectures: WAF, geo-restriction, Shield and signed URLs at the edge. |
| DVA-C02 (Developer – Associate) | Develop, deploy and secure with AWS services: cache behaviours and policies, invalidations vs versioned objects, CloudFront Functions/Lambda@Edge, signed URLs/cookies, and OAC for serving private S3 content from applications. |

Glossary

Next steps

Now that you know CloudFront end to end, move from the single distribution to the defended global front door in Global Edge Architecture with CloudFront and Route 53.

Then continue the course into AWS KMS & Encryption, In Depth to master the encryption that protects content at the origin and the keys CloudFront’s OAC must be granted to use.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
