AWS Enterprise Architecture: Media Streaming / VOD

Video-on-demand on AWS goes wrong in a very recognisable way. A team uploads an MP4 to S3, slaps CloudFront in front of it, hands the URL to the app, and ships. It works in the demo. Then reality arrives: a viewer on hotel Wi-Fi gets endless buffering because there is exactly one bitrate, an iPhone refuses to play because the file isn’t fragmented for HLS, the content team discovers anyone who once had the link can share it forever, and the first DMCA email lands because there is no entitlement on the asset at all. None of that is CloudFront’s fault. It is the absence of an architecture — a deliberate separation between the mezzanine source, the transcode tier that produces an adaptive ladder, the packaging layer that speaks HLS and DASH, the edge that delivers globally, and the entitlement layer that decides who may watch what. This article is that architecture, built end to end on AWS Elemental MediaConvert, Amazon S3, AWS Elemental MediaPackage, Amazon CloudFront, and CloudFront signed URLs/cookies.

The single most important idea here is that a VOD asset is not a file you serve; it is a pipeline output you protect. One uploaded mezzanine becomes an adaptive ladder of renditions, those renditions become time-aligned segments and manifests, those segments are cached at the edge, and every request for them is gated by a short-lived, cryptographically signed token tied to a viewer’s entitlement. Get that pipeline right and the system is boring in the best way: a phone, a 4K TV, and a laptop on bad Wi-Fi each get the best stream their connection can sustain; nothing playable leaks; and you can re-package or re-secure the whole catalogue without ever asking a creator to re-upload.

The business scenario

Picture an organisation that has video but no reliable, secure, scalable way to deliver it to a screen. This is the same shape at 5 titles and at 50,000.

The early version: a content owner — a training company, a sports league, a media startup, a university, a corporate comms team — has a library of source files sitting in a drive or a bucket. Today they “stream” by handing out a direct download link or embedding a single-bitrate file. It plays acceptably on the office network and falls apart everywhere else. There is no quality adaptation, no device coverage, no access control, and no telemetry — they cannot even tell whether anyone watched to the end.

Then the requirements that a static file structurally cannot satisfy start arriving, and they are the requirements the business actually cares about:

Subscription media / OTT: “We’re launching a paid streaming service. A viewer on a 4K TV and a viewer on a train must both get a watchable stream, playback must start in under two seconds, and a non-subscriber — or someone whose trial expired ten minutes ago — must not be able to play, even if they captured a URL yesterday.”
Corporate / training (LMS): “All-hands recordings and compliance courses must play only for logged-in employees, never appear on the public internet, and we need to prove who completed which module.”
Sports / events: “We post match replays. Traffic is spiky — near-zero between events, then a million views in an hour after the final whistle. It has to scale to that with zero pre-provisioning and not bankrupt us at idle.”
Education / publishing: “Lectures are licensed per-institution. Geography and entitlement matter; a paywalled lecture cannot be hotlinked into someone else’s site.”

Every one of these shares the same structural requirements, and they are the requirements that define the architecture: one source file must be turned into a device-agnostic, bandwidth-adaptive set of streams, delivered globally with low start-up latency, and every byte of it must be gated by a verifiable entitlement that expires. Adaptive bitrate (ABR) is non-negotiable — without a ladder of renditions and a manifest, the player cannot downshift on a bad connection or upshift on a good one. Packaging is non-negotiable — Apple devices want HLS, much of the web and Android want DASH, and you do not want to store the catalogue twice. Entitlement is non-negotiable — an open CloudFront URL is a public URL forever.

The scale-invariance is why this belongs in an architecture center. A five-title startup runs this with on-demand MediaConvert jobs, a single S3 bucket, MediaPackage VOD packaging, one CloudFront distribution, and a CloudFront key group for signing. A global OTT platform runs the identical topology with thousands of concurrent MediaConvert jobs across queues, multi-Region S3 with replication, MediaPackage at fleet scale, CloudFront with hundreds of edge locations and Origin Shield, and DRM layered on top of the same signed-URL gate. The shape — mezzanine → transcode → package → edge → entitlement — never changes. Only the dials move.

The promise to the business: a source file becomes a secure, adaptive, globally playable stream; idle costs almost nothing and peaks absorb themselves; and the entire catalogue can be re-encoded or re-secured without going back to the creators.

Architecture overview

The architecture is a content-preparation pipeline feeding a protected delivery edge. Read it as two halves joined by S3 — an asynchronous ingest-and-prepare path, and a synchronous request-and-deliver path — with an entitlement service straddling the boundary.

Stage 1 — Mezzanine ingest (S3). A source (“mezzanine”) file lands in a private S3 source bucket — uploaded by a CMS, a multipart upload from a desktop tool, or pushed from on-prem. This is the high-quality master: a ProRes, a high-bitrate H.264/H.265 MP4, whatever the creator produced. The bucket is private (Block Public Access on), encrypted, and the arrival of an object is the trigger for everything downstream. Nothing here is ever served to viewers directly.

Stage 2 — Transcode into an adaptive ladder (MediaConvert). An S3 ObjectCreated event (via EventBridge → Lambda, or Step Functions) submits an AWS Elemental MediaConvert job. MediaConvert is the file-based, broadcast-grade transcoder: from the one mezzanine it produces a bitrate ladder — e.g. 240p, 360p, 480p, 720p, 1080p, 2160p — each at a target bitrate, plus audio renditions and extracted captions/thumbnails. Crucially, it produces segmented output (fragmented MP4 / CMAF or TS) with aligned segment boundaries and a manifest, which is what makes adaptive streaming possible: the player can switch renditions at any segment boundary. The job writes all of this to a private S3 packaged/output bucket. MediaConvert can emit HLS and DASH directly; in this reference we deliberately have it emit a CMAF/fMP4 ladder and let MediaPackage do the protocol packaging, so we store one set of media and serve many protocols.
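The submit step described above can be sketched as a small pure function that maps the S3 event to the arguments of a MediaConvert job. This is an illustration under stated assumptions, not a definitive implementation: the role ARN, the template name `vod-cmaf-ladder`, the bucket name, and the rule that the asset id is the file name without its extension are all hypothetical, and how the per-asset output destination is set (inside the template or as an override) is deliberately left out.

```python
from pathlib import PurePosixPath


def build_job_request(event: dict, role_arn: str, job_template: str) -> dict:
    """Map an EventBridge S3 "Object Created" event to MediaConvert CreateJob arguments."""
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    key = detail["object"]["key"]

    # Hypothetical convention: the asset id is the file name without its extension.
    asset_id = PurePosixPath(key).stem

    return {
        "Role": role_arn,             # service role MediaConvert assumes for S3/KMS access
        "JobTemplate": job_template,  # saved template that holds the ladder definition
        "Settings": {"Inputs": [{"FileInput": f"s3://{bucket}/{key}"}]},
        # UserMetadata travels with the job and comes back on its state-change events.
        "UserMetadata": {"asset_id": asset_id, "source_key": key},
    }


# Trimmed example of the event shape S3 delivers through EventBridge.
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "vod-source-dev"},
        "object": {"key": "masters/lecture-attention.mov"},
    },
}

request = build_job_request(
    sample_event,
    role_arn="arn:aws:iam::123456789012:role/vod-mediaconvert",  # placeholder ARN
    job_template="vod-cmaf-ladder",                              # placeholder template name
)
print(request["Settings"]["Inputs"][0]["FileInput"])
```

Inside the Lambda, the returned dictionary would be passed to the SDK as `boto3.client("mediaconvert").create_job(**request)`. Keeping the mapping separate from the SDK call makes it easy to unit-test without AWS credentials.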

Stage 3 — Just-in-time packaging (MediaPackage VOD). The transcoded ladder in S3 is registered as a MediaPackage VOD asset against a packaging group and its packaging configurations. MediaPackage repackages the stored CMAF segments just-in-time into whatever protocol the requesting player needs (HLS for Apple, DASH for the web/Android, CMAF, even Microsoft Smooth) from a single stored copy. It is also where content protection is centralised: MediaPackage can apply encryption and integrate with a DRM provider through the SPEKE key-exchange API, so the same asset can be served clear, AES-encrypted, or full-DRM (Widevine/PlayReady/FairPlay) without re-transcoding. MediaPackage becomes the origin for the edge.

Stage 4 — Global delivery (CloudFront). Amazon CloudFront sits in front of MediaPackage as the CDN. It caches manifests and segments at edge locations close to viewers, so a popular title is served from the edge rather than hammering the origin, and start-up latency is low worldwide. CloudFront connects to the MediaPackage origin securely (origin access + a shared secret header), and Origin Shield can be enabled to give MediaPackage a single consolidated caching layer and a higher offload ratio. CloudFront is also where the entitlement check is enforced at the edge.

Stage 5 — Entitlement (CloudFront signed URLs / signed cookies). This is the gate. The viewer’s player never gets a bare CloudFront URL. Instead, after the app authenticates the user and confirms their entitlement (active subscription, valid license, allowed geography), a small entitlement service (API Gateway + Lambda, or your app backend) mints a CloudFront signed URL (for a single manifest/file) or, more commonly for HLS/DASH, a signed cookie (which covers the manifest and all the segment requests that follow). The signature is produced with a private key whose public key is registered in a CloudFront key group; the distribution requires signed requests, so CloudFront itself rejects any request without a valid, unexpired signature at the edge, before it ever reaches MediaPackage. The token is short-lived (minutes to a session) and can be scoped by URL path and even by IP. No valid signature, no playback — full stop.

The end-to-end data path, following one title from upload to a viewer pressing play:

Ingest: a producer uploads lecture-attention.mov to the private S3 source bucket. The ObjectCreated event fires.
Prepare: EventBridge routes the event to a Lambda that submits a MediaConvert job using a saved job template (the ladder definition). MediaConvert transcodes the master into a CMAF ladder (240p→1080p), extracts captions and a thumbnail, and writes segments + a base manifest to the private output bucket. On COMPLETE, MediaConvert emits an EventBridge event.
Register: a second Lambda, triggered by the MediaConvert COMPLETE event, creates a MediaPackage VOD asset pointing at the S3 output, associated with the packaging group, and records the asset’s playback endpoint URLs (HLS + DASH) in the catalogue database (DynamoDB). The title is now “publishable.”
Watch — entitlement: a logged-in subscriber opens the title. The app calls the entitlement service, which checks the subscription/license/geo, then returns a signed cookie (or signed URL) plus the CloudFront playback URL for the title.
Watch — delivery: the player requests the HLS/DASH manifest from CloudFront with the signed cookie attached. CloudFront validates the signature against the key group at the edge; if valid and unexpired, it serves the manifest — from cache if warm, otherwise it fetches from the MediaPackage origin (which just-in-time packages from the stored CMAF in S3) and caches the result.
Watch — adaptation: the player reads the manifest, sees the ladder, and begins pulling segments — each segment request also carries the signed cookie and is also validated at the edge. As the viewer’s bandwidth changes, the player switches rendition at segment boundaries; on a train it drops to 360p, on fibre it climbs to 1080p, all from the same manifest.
Expire: the cookie expires at the end of the entitlement window. A captured URL or a copied cookie is useless minutes later. If the user’s subscription lapses mid-session, the next cookie refresh is denied and playback stops at the boundary.
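The Register step in the path above can be sketched the same way: a pure function that turns MediaConvert's COMPLETE event into the arguments for creating a MediaPackage VOD asset. The packaging group id, role ARN, file names, and the choice to pick the `.m3u8` playlist as the source manifest are assumptions made for illustration; check which source manifest types MediaPackage VOD accepts before relying on this.

```python
def s3_uri_to_arn(uri: str) -> str:
    """Convert s3://bucket/key into the arn:aws:s3:::bucket/key form."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri!r}")
    return "arn:aws:s3:::" + uri[len("s3://"):]


def build_asset_request(event: dict, packaging_group_id: str, source_role_arn: str):
    """Map a MediaConvert job state-change event to MediaPackage VOD CreateAsset arguments.

    Returns None for any status other than COMPLETE.
    """
    detail = event["detail"]
    if detail.get("status") != "COMPLETE":
        return None

    # Collect every playlist the job reported, across all output groups.
    playlists = [
        path
        for group in detail.get("outputGroupDetails", [])
        for path in group.get("playlistFilePaths", [])
    ]
    # Assumption: the HLS playlist is the manifest handed to MediaPackage.
    manifest = next((p for p in playlists if p.endswith(".m3u8")), None)
    if manifest is None:
        raise ValueError("job reported no .m3u8 playlist")

    return {
        "Id": detail["userMetadata"]["asset_id"],  # set when the job was submitted
        "PackagingGroupId": packaging_group_id,
        "SourceArn": s3_uri_to_arn(manifest),
        "SourceRoleArn": source_role_arn,          # role MediaPackage assumes to read S3
    }


complete_event = {
    "source": "aws.mediaconvert",
    "detail-type": "MediaConvert Job State Change",
    "detail": {
        "status": "COMPLETE",
        "jobId": "1234567890123-abcdef",
        "userMetadata": {"asset_id": "lecture-attention"},
        "outputGroupDetails": [
            {
                "type": "CMAF_GROUP",
                "playlistFilePaths": [
                    "s3://vod-packaged-dev/lecture-attention/index.mpd",
                    "s3://vod-packaged-dev/lecture-attention/index.m3u8",
                ],
            }
        ],
    },
}

asset_request = build_asset_request(
    complete_event,
    packaging_group_id="vod-dev",                                         # placeholder
    source_role_arn="arn:aws:iam::123456789012:role/vod-mediapackage",    # placeholder
)
print(asset_request["SourceArn"])
```

In the register Lambda this dictionary would go to `boto3.client("mediapackage-vod").create_asset(**asset_request)`, after which the returned endpoint URLs are written to the catalogue.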

The diagram, in words. Picture two horizontal bands joined in the middle by an S3 cylinder. On the left band (ingest/prepare, asynchronous): a producer/CMS uploads into a private S3 “source” bucket; an EventBridge clock-spark fires into a Lambda, which arrows into a large MediaConvert box (drawn with a stacked “ladder” icon — 240p…2160p); MediaConvert arrows down into a private S3 “packaged” bucket holding CMAF segments + manifest. A second Lambda (triggered by MediaConvert’s complete event) arrows into a MediaPackage VOD box, registering the asset and writing endpoint URLs into a DynamoDB catalogue. On the right band (deliver, synchronous): a viewer/player on the far right; between the player and the system sits an API Gateway → Lambda “entitlement” box wired to Cognito/IdP and the DynamoDB catalogue; the entitlement box hands back a signed cookie/URL. The player then arrows into a CloudFront cloud (edge locations, optional Origin Shield), which has a lock badge labelled “key group — verify signature at edge” on its inbound side, and arrows back to MediaPackage as its origin; MediaPackage reads from the S3 “packaged” bucket. Cross-cutting boxes underneath everything — IAM, KMS, CloudWatch, WAF (on CloudFront), AWS Organizations/SCP — touch every tier. The defining visual: a one-way prepare pipeline on the left, a protected delivery edge on the right, and a signed-token gate stamped on CloudFront’s front door.

AWS VOD reference architecture: an asynchronous prepare pipeline (S3 source → EventBridge → Lambda → MediaConvert → S3 packaged → MediaPackage → DynamoDB) feeding a synchronous protected delivery edge (Cognito/API Gateway entitlement → signed cookie → CloudFront with a key-group gate → MediaPackage origin), over a cross-cutting IAM/KMS/WAF/CloudWatch governance layer.

Component breakdown

Component	AWS service	Role in the pipeline	Key configuration choices
Mezzanine source	Amazon S3 (source bucket)	Private master/mezzanine store; arrival triggers the pipeline	Block Public Access on; SSE-KMS; multipart upload for large masters; EventBridge `ObjectCreated` notifications enabled; lifecycle to Glacier for cold masters.
Transcoder	AWS Elemental MediaConvert	One mezzanine → an adaptive bitrate ladder of segmented renditions + captions + thumbnails	Job templates + output presets define the ladder; QVBR rate control for quality-per-bit; CMAF/fMP4 output with aligned GOP/segment boundaries; on-demand vs reserved queues; accelerated transcoding for long-form.
Packaged store	Amazon S3 (output bucket)	Holds the stored CMAF segments + base manifest that MediaPackage repackages from	Private; SSE-KMS; lifecycle/Intelligent-Tiering; partitioned by asset id; the durable packaged copy.
Just-in-time packager / origin	AWS Elemental MediaPackage (VOD)	Repackages stored CMAF → HLS/DASH/CMAF on demand from one copy; central content protection point	Packaging group/configuration per protocol; SPEKE + DRM provider for Widevine/PlayReady/FairPlay or AES-128; the CDN origin.
Edge / CDN	Amazon CloudFront	Global caching delivery of manifests + segments; enforces the signature at the edge	Origin access + secret header to MediaPackage; Origin Shield for high offload; cache policy tuned for manifests (short TTL) vs segments (long TTL); HTTPS-only viewer protocol policy; WAF attached.
Entitlement (signing)	CloudFront signed URLs / signed cookies + key group	The gate: mints short-lived signed tokens tied to a viewer’s entitlement; CloudFront verifies them	Signed cookies for HLS/DASH (cover manifest + all segments); signed URL for single files; key group (public key) on the distribution; trusted key groups, not the legacy account-wide trusted signer.
Entitlement (logic)	API Gateway + Lambda, Cognito/IdP, DynamoDB	AuthN the viewer, check subscription/license/geo, then sign	Verify session/JWT; look up entitlement + catalogue; set short expiry; optionally bind to viewer IP; rate-limit the mint endpoint.
Orchestration	Amazon EventBridge, AWS Lambda, optional Step Functions	Event-drive the asynchronous pipeline (S3 → transcode → register)	EventBridge rules on S3 + MediaConvert state changes; Step Functions for multi-step/retry-heavy workflows; DLQs on the Lambdas.
Cross-cutting	IAM, KMS, CloudWatch, WAF	Identity, encryption, observability, edge protection	Least-privilege roles per stage; CMKs on both buckets; CloudWatch + CloudFront real-time logs; WAF rate-based + geo rules.

A few component-level decisions carry disproportionate weight:

Signed cookies — not signed URLs — are usually the right gate for streaming. A VOD stream is not one request; it is a manifest request followed by hundreds of segment requests. A signed URL authorises a single file, so you would have to sign every segment — impossible, since the segment list lives inside the manifest the player parses at runtime. A signed cookie authorises a path (e.g. /v1/asset-1234/*) for a time window, so the manifest and every segment under it are covered by one token the player attaches automatically. Reach for signed URLs only for the rare single-file case (a downloadable MP4, a one-off); reach for signed cookies for adaptive streaming. This single choice is the difference between an entitlement gate that works and one that falls apart at the first segment.

Why MediaConvert emits CMAF and MediaPackage does the protocol packaging — instead of MediaConvert emitting HLS+DASH directly. MediaConvert can output HLS and DASH straight to S3, and for the simplest catalogues that is a perfectly valid, lower-cost topology (CloudFront → S3, no MediaPackage). But you then store and manage two packaged copies (HLS and DASH), you re-run jobs to add a protocol or change segment duration, and DRM/encryption is baked into the stored output. By emitting a single CMAF/fMP4 ladder and letting MediaPackage package just-in-time, you store the media once, serve every protocol from it, add or change packaging without re-transcoding, and centralise content protection/DRM at the packager. The reference favours the MediaPackage path because the moment you need multi-protocol + DRM + agility, it pays for itself; the “MediaConvert-direct-to-S3” path is the right de-scope for small, clear-content catalogues (covered in “When to use it”).

QVBR is the rate-control choice that quietly saves the most money. MediaConvert’s Quality-Defined Variable Bitrate targets a perceptual quality level and spends bits only where the content needs them — a static talking-head rendition uses far fewer bytes than a fast-motion sports clip at the same visual quality. Compared to fixed CBR, QVBR typically cuts storage and egress for the same quality, and egress (CloudFront) is the dominant cost in VOD. Define the ladder with QVBR + a max-bitrate cap per rendition and you get the best quality-per-byte without hand-tuning every title.
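As a rough sketch of "QVBR plus a max-bitrate cap per rendition", the ladder can be generated as the video portion of MediaConvert output settings. The resolutions, bitrate caps, and quality levels below are hypothetical placeholders, not recommendations; only the setting names (`RateControlMode`, `MaxBitrate`, `QvbrSettings.QvbrQualityLevel`) come from MediaConvert's H.264 job settings.

```python
# (width, height, max_bitrate_bps, qvbr_quality_level): illustrative values only.
LADDER = [
    (426, 240, 400_000, 6),
    (640, 360, 800_000, 7),
    (854, 480, 1_400_000, 7),
    (1280, 720, 3_000_000, 8),
    (1920, 1080, 5_000_000, 8),
]


def qvbr_video_description(width: int, height: int, max_bitrate: int, quality: int) -> dict:
    """Build the video description for one rung: QVBR targets quality, MaxBitrate caps spend."""
    if not 1 <= quality <= 10:
        raise ValueError("QVBR quality level must be between 1 and 10")
    return {
        "Width": width,
        "Height": height,
        "CodecSettings": {
            "Codec": "H_264",
            "H264Settings": {
                "RateControlMode": "QVBR",
                "MaxBitrate": max_bitrate,
                "QvbrSettings": {"QvbrQualityLevel": quality},
            },
        },
    }


rungs = [qvbr_video_description(*rung) for rung in LADDER]
for rung in rungs:
    h264 = rung["CodecSettings"]["H264Settings"]
    print(f'{rung["Height"]}p cap={h264["MaxBitrate"]} level={h264["QvbrSettings"]["QvbrQualityLevel"]}')
```

Each dictionary would sit under an output's `VideoDescription` in the job template. Dropping the top rung for a catalogue is then a one-line change to `LADDER`.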

Origin Shield is the difference between a calm origin and a melted one at peak. Without it, every CloudFront edge location that gets a cache miss goes back to MediaPackage independently — so a viral title can hit the packager from dozens of edges at once. Origin Shield inserts a single consolidating cache between the edges and MediaPackage: misses collapse to one origin fetch, MediaPackage packages each segment once, and offload ratio climbs sharply. For spiky VOD (sports replays, launches) this is the cheap insurance that keeps the origin from being the bottleneck precisely when traffic spikes.

Implementation guidance

Region, accounts, and isolation. Put the media pipeline (S3 buckets, MediaConvert, MediaPackage) in the account that owns the workload, in a Region close to your editorial team and audience. In a multi-account org (AWS Organizations / Control Tower), keep delivery concerns — CloudFront, WAF, the entitlement service — cleanly separated from content prep, and consider a dedicated media account so the (large, sensitive) mezzanine and packaged buckets are governed apart from app workloads. CloudFront is global; the key-pair/private key used for signing is a secret and belongs in Secrets Manager / Parameter Store (SecureString), never in code or a public bucket.

Infrastructure as Code (Terraform sketch). Everything here is declarative; do not click jobs, packaging groups, or distributions into existence. The core resources and the wiring people most often get wrong:

# 1. Private buckets: mezzanine source (triggers pipeline) and packaged output.
resource "aws_s3_bucket" "source"   { bucket = "vod-source-${var.env}" }
resource "aws_s3_bucket" "packaged" { bucket = "vod-packaged-${var.env}" }

resource "aws_s3_bucket_public_access_block" "source" {
  bucket                  = aws_s3_bucket.source.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
resource "aws_s3_bucket_notification" "source_events" {
  bucket      = aws_s3_bucket.source.id
  eventbridge = true                       # drive transcode off ObjectCreated via EventBridge
}

# 2. MediaConvert queue (use a RESERVED queue for steady volume; on-demand otherwise).
resource "aws_media_convert_queue" "vod" {
  name         = "vod-${var.env}"
  pricing_plan = "ON_DEMAND"
}
# (The ladder itself lives in a MediaConvert *job template* — CMAF/fMP4 outputs,
#  QVBR rate control, aligned segment duration, captions + thumbnails — referenced
#  by the orchestration Lambda when it submits each job.)

# 3. MediaPackage VOD packaging group + HLS/DASH configs (just-in-time, from one CMAF copy).
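# NOTE: aws_media_packagev2_channel_group is a MediaPackage v2 *live* channel group, not a
# VOD packaging group. The resource below assumes the AWSCC provider; verify the resource
# and attribute names against that provider's documentation before use.
resource "awscc_mediapackage_packaging_group" "vod" {
  packaging_group_id = "vod-${var.env}"
}
# packaging_configuration(s) attach HLS and DASH (and DRM via SPEKE) to the group.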

# 4. CloudFront key group — the public half of the signing key pair.
resource "aws_cloudfront_public_key" "signing" {
  name        = "vod-signing-${var.env}"
  encoded_key = file("${path.module}/keys/vod_signing_public.pem")
}
resource "aws_cloudfront_key_group" "signing" {
  name  = "vod-keys-${var.env}"
  items = [aws_cloudfront_public_key.signing.id]
}

# 5. CloudFront distribution: MediaPackage origin + Origin Shield + REQUIRE signed requests.
resource "aws_cloudfront_distribution" "vod" {
  enabled         = true
  is_ipv6_enabled = true

  origin {
    origin_id   = "mediapackage"
    domain_name = var.mediapackage_endpoint_host
    custom_origin_config {
      origin_protocol_policy = "https-only"
      http_port              = 80
      https_port             = 443
      origin_ssl_protocols   = ["TLSv1.2"]
    }
    custom_header {                                   # shared secret: only CF may hit the origin
      name  = "X-MediaPackage-CDNIdentifier"          # header name MediaPackage CDN authorization checks
      value = var.origin_secret
    }
    origin_shield {
      enabled              = true                     # consolidate misses; protect MediaPackage
      origin_shield_region = var.region
    }
  }

  default_cache_behavior {
    target_origin_id       = "mediapackage"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD"]
    trusted_key_groups     = [aws_cloudfront_key_group.signing.id]  # <-- the gate
    cache_policy_id        = var.caching_optimized_policy_id
    compress               = true
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
  viewer_certificate { cloudfront_default_certificate = true }
  web_acl_id = var.waf_acl_arn
}

The high-value, frequently-missed lines: trusted_key_groups on the cache behaviour is what makes CloudFront reject any unsigned/expired request at the edge; omit it and your “secure” distribution is wide open. The custom_header shared secret (validated on the MediaPackage side) stops anyone from bypassing CloudFront and hitting the origin directly. origin_shield is the offload/scale lever for spiky catalogues. And the cache policy must distinguish manifests (short TTL, so a re-published title updates quickly) from segments (long/immutable TTL, because segments never change, so cache them hard). The sketch above attaches a single policy to the default behaviour; in practice add an ordered_cache_behavior per path pattern (for example one for manifests and one for segments), each with its own cache policy and the same trusted_key_groups. The MediaConvert ladder lives in a job template referenced at submit time, so editorial can evolve the ladder without code changes.

The signing flow (entitlement Lambda, conceptual). The mint endpoint is small and is the security crux:

POST /play/{assetId}      (Authorization: Bearer <viewer JWT>)
  1. Verify the JWT (Cognito/IdP). Reject if invalid/expired.
  2. Look up entitlement: active subscription? license to THIS asset? allowed geo?  -> else 403
  3. Look up the asset's CloudFront playback path from the catalogue (DynamoDB).
  4. Build a CloudFront SIGNED COOKIE with a custom policy:
       Resource:  https://cdn.example.com/v1/<assetId>/*     (path-scoped: manifest + all segments)
       DateLessThan: now + 5 min  (short window; refreshed while the session is entitled)
       (optional) IpAddress: <viewer IP /32 or CIDR>
     Sign with the PRIVATE key (Secrets Manager); key id matches the CloudFront key group.
  5. Return Set-Cookie (CloudFront-Policy / -Signature / -Key-Pair-Id) + the playback URL.

The player then loads the manifest URL with credentials, and the browser attaches the cookie to the manifest and every segment under /v1/<assetId>/* automatically — one token, whole stream. The window is short on purpose: a leaked cookie dies in minutes, and a lapsed subscription is denied at the next refresh.
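Step 4 of the flow above can be sketched with the standard library alone. The code below builds the custom policy, applies CloudFront's URL-safe base64 substitutions, and assembles the three cookie values. The actual signing is injected as a `sign` callable, which is assumed to produce an RSA SHA-1 signature over the policy bytes with the private key held in Secrets Manager; the stand-in used in the demo is not a real signature, and the key id and `cdn.example.com` host are placeholders.

```python
import base64
import json
import time


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap the characters CloudFront treats as invalid in cookies/URLs."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def build_custom_policy(resource: str, expires_epoch: int, ip_cidr: str = None) -> str:
    """Return the custom policy as compact JSON (no whitespace)."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if ip_cidr:
        condition["IpAddress"] = {"AWS:SourceIp": ip_cidr}
    policy = {"Statement": [{"Resource": resource, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


def build_signed_cookies(asset_id, key_pair_id, sign, now=None, ttl_seconds=300,
                         ip_cidr=None, cdn_base="https://cdn.example.com"):
    """Assemble the three CloudFront cookies for one asset's path prefix."""
    now = int(time.time()) if now is None else now
    resource = f"{cdn_base}/v1/{asset_id}/*"  # covers the manifest and every segment
    policy = build_custom_policy(resource, now + ttl_seconds, ip_cidr).encode("utf-8")
    return {
        "CloudFront-Policy": cloudfront_b64(policy),
        "CloudFront-Signature": cloudfront_b64(sign(policy)),
        "CloudFront-Key-Pair-Id": key_pair_id,  # id of the public key in the key group
    }


def fake_sign(policy_bytes: bytes) -> bytes:
    """Stand-in signer for the demo only; replace with a real private-key signature."""
    return b"\x00\xff" * 8


cookies = build_signed_cookies("asset-1234", "KEXAMPLEKEYID", fake_sign, now=1_700_000_000)
for name in cookies:
    print(name)
```

The entitlement Lambda would return each entry as a `Set-Cookie` header (with `Secure`, `HttpOnly`, and a domain the CDN hostname shares) only after steps 1 to 3 have passed. Injecting `sign` keeps the private key handling in one place and lets the policy logic be tested without any key material.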

Networking and identity wiring.

CloudFront is the only public surface. S3 buckets and MediaPackage are never public. CloudFront reaches MediaPackage over HTTPS with the shared-secret header; MediaPackage reads the packaged S3 bucket via its service role. Block Public Access stays on for both buckets.
Least-privilege IAM, one role per stage. The orchestration Lambda gets mediaconvert:CreateJob + iam:PassRole for the MediaConvert role (which gets s3:GetObject on source, s3:PutObject on output, and kms on both). The register Lambda gets the MediaPackage create-asset permissions. The MediaConvert and MediaPackage service roles are distinct and scoped to exactly their buckets/keys. The entitlement Lambda gets secretsmanager:GetSecretValue on the signing key only and read on the catalogue table — and nothing that touches the media buckets.
KMS CMKs on both buckets; the MediaConvert and MediaPackage roles need explicit kms:Decrypt/GenerateDataKey grants — a forgotten KMS grant is the most common “the job submitted but failed reading the input” error.
WAF on CloudFront with a rate-based rule on the mint and playback paths (a single IP requesting thousands of tokens is an abuse signal) and geo rules where licensing requires them. Bot Control if scraping is a concern.

Schema/catalogue discipline. Keep a catalogue (DynamoDB) keyed by asset_id recording: source key, MediaConvert job id + status, MediaPackage asset id, the CloudFront playback paths for HLS and DASH, DRM flag, and publish state. The entitlement Lambda reads it; the app reads it; the register Lambda writes it. The playback URL handed to clients is always a CloudFront path under a per-asset prefix (/v1/<assetId>/...) so signed-cookie scoping is clean.
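One way to keep that discipline honest is to model the catalogue record in code and reject playback paths that fall outside the asset's prefix. The field names and manifest file names below are hypothetical; they mirror the attributes listed above rather than any fixed schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CatalogueItem:
    """One catalogue record, keyed by asset_id (field names are illustrative)."""
    asset_id: str
    source_key: str
    mediaconvert_job_id: str
    mediaconvert_status: str
    mediapackage_asset_id: str
    hls_path: str
    dash_path: str
    drm: bool = False
    published: bool = False

    def __post_init__(self):
        # Every playback path must live under the per-asset prefix so that a
        # single path-scoped cookie covers it.
        prefix = f"/v1/{self.asset_id}/"
        for path in (self.hls_path, self.dash_path):
            if not path.startswith(prefix):
                raise ValueError(f"{path!r} is outside {prefix!r}")

    @property
    def cookie_resource_path(self) -> str:
        """Path pattern the entitlement service scopes the signed cookie to."""
        return f"/v1/{self.asset_id}/*"


item = CatalogueItem(
    asset_id="asset-1234",
    source_key="masters/lecture-attention.mov",
    mediaconvert_job_id="1234567890123-abcdef",
    mediaconvert_status="COMPLETE",
    mediapackage_asset_id="asset-1234",
    hls_path="/v1/asset-1234/index.m3u8",
    dash_path="/v1/asset-1234/index.mpd",
)
print(item.cookie_resource_path)
print(sorted(asdict(item)))
```

`asdict(item)` gives a plain dictionary that can be written to the table. The validation in `__post_init__` is where a mis-registered path would be caught before it reaches a client.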

Enterprise considerations

Security & Zero Trust. The model is defence in layers, with the edge as the gate. (1) Entitlement at the edge: CloudFront with trusted key groups rejects every unsigned/expired request before it reaches the origin — the network is never trusted, every request carries a verifiable, short-lived token. (2) Private origins: both S3 buckets and MediaPackage are non-public; CloudFront authenticates to MediaPackage with a rotated shared secret; Block Public Access is enforced org-wide via SCP so nobody can accidentally expose a mezzanine. (3) Encryption everywhere: TLS in transit (HTTPS-only on the distribution and origin), KMS CMKs at rest on both buckets. (4) Content protection by tier: for premium content, layer DRM (Widevine/PlayReady/FairPlay via MediaPackage + SPEKE) on top of signed URLs — the signed token controls access to the stream, DRM controls use of the decrypted content (output protection, license rules); for most enterprise/training content, signed cookies + AES is sufficient. (5) Secret hygiene: the signing private key lives in Secrets Manager with rotation; the key group lets you rotate keys with zero downtime by registering the new public key alongside the old. (6) Abuse controls: WAF rate-limits token minting and playback; the mint endpoint requires a valid session. The throughline: no bare URLs, no public origins, short-lived tokens, and DRM where the content value warrants it.

Cost optimization. In VOD, CloudFront egress is almost always the dominant line item, so the levers target bytes delivered and bytes stored:

QVBR + a sane ladder. QVBR spends bits only where quality needs them, and not putting a 4K rung on content nobody watches in 4K is free money. Cap the top rung to your real audience; every avoided high-bitrate byte is saved egress and storage forever.
Maximise CloudFront offload. A high cache-hit ratio is the single biggest egress saver — long/immutable TTLs on segments (they never change), Origin Shield to collapse origin fetches, and CloudFront price classes if you can exclude the most expensive geographies. Cache hits are cheaper than origin fetches and spare MediaPackage’s per-request packaging cost.
MediaConvert queue choice. On-demand for spiky/low volume; reserved queues (reserved transcoding slots, RTS) trade a fixed commitment for a lower effective cost per minute when throughput is predictable and steady. Use accelerated transcoding judiciously: it speeds long-form but costs more per minute.
Store once. The CMAF-once + MediaPackage-JIT topology avoids storing HLS and DASH copies; S3 Intelligent-Tiering on the packaged bucket and Glacier on cold mezzanines trim the storage tail.
Right-size the pipeline, not the edge. Transcode is a one-time cost per title; delivery is forever. Spend engineering effort on cache-hit ratio and ladder design before micro-optimising transcode.

Scalability. Each half scales on its own axis. Ingest/transcode is embarrassingly parallel — MediaConvert runs many jobs concurrently (bounded by queue limits you can raise), so a catalogue backfill is “submit 10,000 jobs and wait,” not a capacity problem. Delivery scales with CloudFront, which is built for internet-scale fan-out; the spiky-traffic problem (a replay going viral) is absorbed by the edge + Origin Shield so MediaPackage packages each hot segment once regardless of how many viewers request it. The entitlement service scales as a normal stateless Lambda behind API Gateway. The governing question for “is the origin protected at peak?” is CloudFront cache-hit ratio and MediaPackage request rate — if hit ratio is high, a million concurrent viewers of one title cost the origin almost nothing.
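The relationship between hit ratio and origin load is simple arithmetic, sketched below under deliberately crude assumptions: every viewer requests one segment per segment duration, and manifest and separate audio requests are ignored. The viewer count and 4-second segment length are illustrative inputs, not measurements.

```python
def origin_requests_per_second(concurrent_viewers: int, segment_seconds: float,
                               cache_hit_ratio: float) -> float:
    """Estimate origin request rate: edge request rate times the miss ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    edge_rps = concurrent_viewers / segment_seconds  # one segment per viewer per interval
    return edge_rps * (1.0 - cache_hit_ratio)


for ratio in (0.90, 0.99, 0.999):
    estimate = origin_requests_per_second(1_000_000, 4, ratio)
    print(f"hit ratio {ratio:.3f}: about {round(estimate)} origin requests/s")
```

With a million viewers on 4-second segments the edge sees roughly 250,000 requests per second; each extra "nine" of hit ratio cuts what reaches MediaPackage by a factor of ten, which is why hit ratio is the number to watch.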

Reliability & DR (RTO/RPO). Durability lives in S3: the mezzanine and packaged buckets are designed for 11 nines of durability, and the mezzanine is the true source of truth. If the packaged output or even the MediaPackage assets are lost, you re-run the pipeline from the mezzanine and rebuild, so RPO for derived assets is effectively zero as long as masters are retained. For Regional DR, replicate the mezzanine (and optionally packaged) buckets with S3 Cross-Region Replication, and stand up MediaConvert/MediaPackage in the second Region; CloudFront is global and can fail over between origins (an origin group), so a Regional origin outage fails over with no client change. Pin concrete numbers: delivery-tier failover RTO in minutes (CloudFront origin failover); full asset re-prepare RTO in the low hours per title via the pipeline; data-loss RPO ≈ 0 while masters are retained (CRR narrows the Regional gap, though replication is asynchronous, so very recent uploads may lag). The DLQs on the orchestration Lambdas mean a single failed transcode does not silently strand a title: it lands in the DLQ for re-drive.

Observability. Watch the right signals per stage: MediaConvert job state changes (errored/complete via EventBridge), job duration, and queue depth (backlog = under-provisioned queue); MediaPackage request count and 4xx/5xx (origin health); CloudFront cache-hit ratio (the cost-and-scale canary), 4xx (a spike in 403s often means signing is broken or a key rotation went wrong), origin latency, and real-time logs for delivery analytics; the entitlement Lambda’s error rate and 403 rate (denied-entitlement vs bug). Wire CloudWatch alarms on cache-hit-ratio dropping, CloudFront 5xx rising, and MediaConvert errored jobs as the highest-signal pages. For playback quality (rebuffering, start-up time, errors) measured from the client, use CloudFront real-time logs joined with player-side QoE beacons — origin metrics alone don’t tell you what the viewer experienced.
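The two CloudFront alarms named above can be expressed as plain argument dictionaries for CloudWatch's `PutMetricAlarm`. This is a sketch: the distribution id, thresholds, and evaluation windows are hypothetical and should be tuned to your traffic, and `CacheHitRate` is only published once additional metrics are enabled on the distribution.

```python
def cloudfront_alarm(distribution_id: str, metric: str, threshold: float,
                     comparison: str) -> dict:
    """Build PutMetricAlarm arguments for one CloudFront distribution metric."""
    return {
        "AlarmName": f"vod-{distribution_id}-{metric}",
        "Namespace": "AWS/CloudFront",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "DistributionId", "Value": distribution_id},
            {"Name": "Region", "Value": "Global"},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # breach must persist for three datapoints
        "Threshold": threshold,
        "ComparisonOperator": comparison,
        "TreatMissingData": "notBreaching",  # idle periods should not page anyone
    }


alarms = [
    # Hit ratio falling is the cost-and-scale canary (threshold is a placeholder).
    cloudfront_alarm("E123EXAMPLE", "CacheHitRate", 85.0, "LessThanThreshold"),
    # Rising 5xx rate points at origin trouble (threshold is a placeholder).
    cloudfront_alarm("E123EXAMPLE", "5xxErrorRate", 1.0, "GreaterThanThreshold"),
]
for alarm in alarms:
    print(alarm["AlarmName"], alarm["ComparisonOperator"], alarm["Threshold"])
```

Each dictionary would be passed to `boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**alarm)`, since CloudFront publishes its metrics in US East (N. Virginia). In an IaC-first setup the same values would more naturally live in Terraform alongside the distribution.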

Governance. Tag every resource by content-classification, owner, cost-center, and env. Enforce org-wide guardrails with SCPs: no public S3 buckets, no CloudFront distribution without WAF, no unencrypted media buckets. Content lifecycle is policy: mezzanine retention (keep masters → you can always rebuild), packaged-asset lifecycle, and takedown/expiry (an unpublish flag in the catalogue + cookie expiry removes access). Manage signing-key rotation and DRM-license policy centrally. Keep the MediaConvert job templates and MediaPackage packaging configs in version control so the ladder and protocols are auditable, reproducible artifacts — not console clicks.

Reference enterprise example

Stagelight is a fictional mid-market streaming startup: a niche sports-and-fitness VOD service with a catalogue of ~8,000 titles (match replays, training programmes, documentaries), ~120,000 subscribers, and brutally spiky traffic — a few thousand concurrent viewers most of the day, spiking to ~90,000 concurrent in the hour after a marquee event posts. Their MVP was a single 1080p MP4 per title behind CloudFront with public URLs. Buffering complaints flooded support, iPhone playback was flaky, and finance discovered (via a Reddit thread) that paywalled replays were being hotlinked freely. The board wanted adaptive playback, sub-two-second start, and real entitlement — without a per-event ops scramble.

What they built. They stood up the reference exactly as above:

Ingest: producers upload masters (high-bitrate H.264/ProRes) to a private S3 source bucket with Block Public Access on and SSE-KMS. ObjectCreated → EventBridge → a small Lambda submits a MediaConvert job.
Transcode: a MediaConvert job template produces a QVBR CMAF ladder — 240p, 360p, 480p, 720p, 1080p (no 4K rung; their content and audience didn’t justify it, saving egress on every stream) — plus AAC audio, WebVTT captions, and a poster thumbnail, with 4-second aligned segments, written to a private S3 packaged bucket. Steady backfill volume justified a reserved (RTS) queue; ad-hoc new titles use on-demand.
Package: on MediaConvert COMPLETE, a register Lambda creates a MediaPackage VOD asset in the packaging group (HLS + DASH configs) and writes the CloudFront playback paths into a DynamoDB catalogue. One CMAF copy serves both Apple (HLS) and web/Android (DASH); adding DASH later required zero re-transcoding.
Deliver: a CloudFront distribution fronts MediaPackage with a shared-secret origin header, Origin Shield enabled in their primary Region, long immutable TTLs on segments and short TTLs on manifests, and WAF with a rate-based rule on the mint and playback paths.
Entitle: an API Gateway + Lambda mint endpoint verifies the subscriber’s Cognito JWT, checks the active subscription and licence, and returns a CloudFront signed cookie scoped to /v1/<assetId>/* with a 5-minute expiry, refreshed while the session stays entitled. The signing private key lives in Secrets Manager; the public key is in a CloudFront key group, and the distribution requires trusted_key_groups.
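The ingest step can be sketched as a pure function that turns the EventBridge "Object Created" event into a MediaConvert job request. The role ARN, template name, and queue ARN below are hypothetical, and the sketch assumes the job template carries the full output ladder so the request only needs to supply the input file; a real job may need more input settings than shown.

```python
from typing import Any, Dict


def build_job_request(event: Dict[str, Any], role_arn: str,
                      job_template: str, queue_arn: str) -> Dict[str, Any]:
    """Map an S3 'Object Created' EventBridge event to create-job arguments."""
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    # Assumption: the key arrives as a plain string; verify the encoding
    # for your event source before building the S3 URI.
    key = detail["object"]["key"]
    return {
        "Role": role_arn,
        "JobTemplate": job_template,
        "Queue": queue_arn,
        "Settings": {"Inputs": [{"FileInput": f"s3://{bucket}/{key}"}]},
        "UserMetadata": {"sourceKey": key},  # lets the register step find the title
    }


sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {"bucket": {"name": "stagelight-masters"},       # hypothetical bucket
               "object": {"key": "masters/fight-night.mov"}},  # hypothetical key
}
request = build_job_request(
    sample_event,
    role_arn="arn:aws:iam::111122223333:role/mediaconvert-job",       # hypothetical
    job_template="vod-cmaf-ladder",                                   # hypothetical
    queue_arn="arn:aws:mediaconvert:us-east-1:111122223333:queues/Default",
)
```

In the Lambda itself, a dict like `request` could be passed as keyword arguments to the MediaConvert client's `create_job` call. Keeping the mapping pure makes it testable without AWS credentials.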

The decisions that mattered. They explicitly chose signed cookies over signed URLs after a first cut tried to sign the manifest URL alone and every segment request came back 403 — the cookie, path-scoped to the whole asset, fixed it in one change. They chose the MediaPackage JIT path over MediaConvert-direct-HLS specifically so they could add DASH (and later evaluate DRM) without re-encoding 8,000 titles — a one-line packaging-config change instead of a multi-week, multi-thousand-dollar re-transcode. They dropped the 4K rung after analytics showed <1% of sessions on 4K-capable screens, trimming both storage and the egress bill. And they turned on Origin Shield after a load test of the post-event spike showed MediaPackage taking direct hits from dozens of edges at once; Shield collapsed those to single origin fetches and pushed cache-hit ratio past 96% during the spike.
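To make the cookie decision concrete, here is a minimal sketch of minting path-scoped signed cookies with a custom policy. It uses only the standard library and takes the signer as an injected function, on the assumption that the real one performs an RSA-SHA1 signature over the policy with the private key held in Secrets Manager. The domain, asset ID, and key ID are hypothetical.

```python
import base64
import json
import time
from typing import Callable, Dict, Optional


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap the characters CloudFront does not accept."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def build_policy(resource: str, expires_epoch: int) -> str:
    """Custom policy: allow `resource` (wildcards permitted) until the expiry."""
    policy = {"Statement": [{
        "Resource": resource,
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
    }]}
    return json.dumps(policy, separators=(",", ":"))  # compact, no whitespace


def mint_cookies(domain: str, asset_id: str, key_id: str,
                 sign: Callable[[bytes], bytes],
                 ttl_seconds: int = 300,
                 now: Optional[int] = None) -> Dict[str, str]:
    """Return the three CloudFront cookies, scoped to one asset's path."""
    issued = int(time.time()) if now is None else now
    policy = build_policy(f"https://{domain}/v1/{asset_id}/*", issued + ttl_seconds)
    raw = policy.encode("utf-8")
    return {
        "CloudFront-Policy": cloudfront_b64(raw),
        "CloudFront-Signature": cloudfront_b64(sign(raw)),  # sign() is injected
        "CloudFront-Key-Pair-Id": key_id,
    }


# Placeholder signer for illustration only; NOT a real signature.
cookies = mint_cookies("d111example.cloudfront.net", "asset123", "K2EXAMPLE",
                       sign=lambda payload: b"\xff\xfe", now=1000)
```

Because the policy's resource ends in `/*`, one cookie set covers the manifest and every segment under the asset path, which is exactly what a signed manifest URL could not do.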

The event that proved it. Three months in, a marquee fight replay posted at 22:00. Concurrency went from ~3,000 to ~88,000 in twelve minutes. CloudFront absorbed it: cache-hit ratio held at ~97%, so MediaPackage packaged each hot segment once and served the rest from cache/Shield; the origin barely moved. Start-up time stayed under two seconds at the p95, and players on poor connections silently rode the ladder down to 360p instead of buffering. Meanwhile the hotlinking simply stopped working — a copied URL without a fresh signed cookie returned 403 at the edge in milliseconds. No pre-provisioning, no 2 a.m. scaling call.
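A back-of-envelope calculation shows why the origin barely moved. The sketch below assumes each viewer requests one video segment per segment duration and ignores audio segments, manifest refreshes, and the extra collapsing Origin Shield provides, so it is an illustration of the shape of the maths rather than a measured figure.

```python
def origin_requests_per_second(concurrent_viewers: int,
                               segment_seconds: float,
                               cache_hit_ratio: float) -> float:
    """Estimate origin load as edge request rate times the cache-miss share."""
    edge_rps = concurrent_viewers / segment_seconds  # one segment per viewer per interval
    return edge_rps * (1.0 - cache_hit_ratio)


# Figures from the event: ~88,000 viewers, 4-second segments, ~97% hit ratio.
with_cache = origin_requests_per_second(88_000, 4, 0.97)
without_cache = origin_requests_per_second(88_000, 4, 0.0)
print(round(with_cache), round(without_cache))
```

Under these assumptions the edge sees about 22,000 segment requests per second while the origin sees about 660, which is the difference between a non-event and an outage.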

The outcome. Playback quality complaints fell by roughly 80%; device coverage went from “iPhone is flaky” to “plays everywhere”; paywall leakage went to effectively zero. Steady-state cost landed around $4,200/month dominated by CloudFront egress (~$2,600), with MediaPackage (~$500), MediaConvert reserved + on-demand (~$700), S3 (~$250), and the entitlement/Lambda/WAF/DynamoDB tier (~$150) — and crucially, idle cost between events is a few hundred dollars because transcode is one-time and delivery is pay-per-byte. The entire pipeline is ~700 lines of Terraform plus two small Lambdas and a job template; nobody hand-rolls ABR ladders, manifests, multi-protocol packaging, or token verification, because MediaConvert, MediaPackage, and CloudFront own all of it.
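The cost breakdown is worth checking as arithmetic, because the proportions drive where optimisation effort pays off. The sketch below uses only the monthly figures quoted above.

```python
from typing import Dict

# Approximate monthly figures from the Stagelight example (USD).
MONTHLY_COST_USD: Dict[str, int] = {
    "CloudFront egress": 2600,
    "MediaConvert (reserved + on-demand)": 700,
    "MediaPackage": 500,
    "S3": 250,
    "Entitlement/Lambda/WAF/DynamoDB": 150,
}


def cost_shares(costs: Dict[str, int]) -> Dict[str, float]:
    """Return each line item as a fraction of the total."""
    total = sum(costs.values())
    return {name: amount / total for name, amount in costs.items()}


total = sum(MONTHLY_COST_USD.values())
shares = cost_shares(MONTHLY_COST_USD)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

The line items sum to the quoted $4,200, and egress accounts for roughly 62% of it, which is why cache-hit ratio and ladder discipline are the levers that matter most.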

When to use it

Use this architecture when you must deliver pre-recorded video to many viewers, on many devices, with adaptive quality, low start-up latency, and real entitlement — and you want idle cost near zero while peaks absorb themselves. The sweet spot: subscription OTT and media, corporate comms and LMS/training, sports/event replays, education and e-learning, publishing, and any “this video must play well everywhere and only for people allowed to watch it” problem. It shines because content prep and delivery scale and fail independently, because CloudFront + Origin Shield turn viral spikes into a non-event, and because the signed-token gate keeps a paywalled catalogue genuinely paywalled.

Trade-offs to go in with eyes open. This is a multi-service media pipeline — MediaConvert ladders, MediaPackage packaging configs, CloudFront cache/signing behaviour, and key management each carry a learning curve; budget for that expertise. There is a real prepare latency: a freshly uploaded title isn’t instantly playable — it must transcode and register first (minutes to longer for long-form), so plan publish workflows around it. And egress can be expensive at scale — VOD economics live and die on cache-hit ratio and ladder discipline, so cost is something you engineer, not something that just happens.

Anti-patterns to avoid. Do not serve a single-bitrate file and call it streaming: without an ABR ladder and manifest, players cannot adapt and bad-network viewers just buffer. Do not rely on signed URLs for adaptive streaming: a signed URL covers only the object it was minted for, and the player fetches the segments the manifest references at runtime without that signature; use signed cookies path-scoped to the asset. Do not leave the CloudFront distribution without trusted_key_groups thinking the app “won’t share the URL”; an open CloudFront URL is public forever. Do not make S3 or MediaPackage public to “simplify”; CloudFront is the only public surface, full stop. Do not store HLS and DASH as separate transcoded copies when MediaPackage can package both from one CMAF ladder. Do not skip Origin Shield for spiky catalogues; without it a viral title hammers the origin from every edge at once. And do not put a 4K (or even 1080p) top rung on content or an audience that never uses it, because every high-bitrate byte is paid for in storage and egress forever.

Alternatives, and when they win.

MediaConvert → S3 → CloudFront (drop MediaPackage) when your catalogue is clear content, single-or-dual protocol, and stable — MediaConvert outputs HLS (and/or DASH) straight to S3, CloudFront serves it, and you sign with the same key-group gate. Simpler and cheaper; you give up just-in-time multi-protocol agility and centralised DRM. The right de-scope for many training/internal-comms libraries.
Amazon IVS instead of this entire stack when your need is interactive live streaming (low-latency live, chat, real-time) rather than VOD — IVS is purpose-built for live; this reference is the file-based VOD answer. (For live-to-VOD, IVS can auto-record to S3, which then feeds this pipeline.)
MediaLive + MediaPackage (live) when you ingest live broadcast feeds; MediaConvert is file/VOD, MediaLive is the live encoder. Many platforms run both — MediaLive for the live event, this pipeline for the replay.
A third-party online video platform (OVP) — Mux, Brightcove, JW Player, Vimeo OTT — when you want player + analytics + DRM + delivery as one managed product and would rather not own the AWS plumbing. They win on time-to-market and bundled QoE analytics; you trade cost control, deep customisation, and data ownership. Choose by whether video delivery is core to your business (own it on AWS) or a feature of it (buy an OVP).
DRM (Widevine/PlayReady/FairPlay via MediaPackage + SPEKE) layered onto this reference — not an alternative but an upgrade — when content licensing mandates hardware-backed protection (premium studio/sports rights). Signed cookies alone control access; DRM controls use (output protection, license rules). Add it when the rights holders require it; for most enterprise content, signed cookies + AES is enough.

The decision rule in one line: if you have a library of pre-recorded video that must play adaptively on every device, scale to spikes without pre-provisioning, and stay genuinely gated to entitled viewers, this S3 → MediaConvert → MediaPackage → CloudFront pipeline with signed cookies is the AWS-native answer, and the retained mezzanine underneath it is what lets you rebuild or re-secure the whole catalogue without ever asking a creator to upload again.

Written by Vinod
