A regional sports broadcaster — the kind that holds the streaming rights to a national football league and a cricket board, and whose entire business is a few dozen marquee weekends a year — gets handed a non-negotiable from its content-licensing counsel and its head of distribution at the same meeting. The league’s media-rights contract now mandates studio-grade content protection on every premium title: hardware-backed DRM, forensic watermarking, and a documented chain of custody from mezzanine ingest to the subscriber’s screen, with a clause that says a single confirmed piracy leak traceable to a missing control can trigger penalties and, on renewal, loss of the rights entirely. At the same time the product team wants a back-catalogue VOD service — full matches, condensed replays, documentaries — that has to play on a 4K living-room TV, an old Android phone, and an iPhone, all at once, and survive the traffic spike when a championship final drops at full time. A media server in a rack and a “just put it behind a login” plan will not clear the rights-holder’s audit. This article is the reference architecture for building that VOD platform properly on AWS — a transcoded, just-in-time-packaged, multi-DRM, identity-gated service that a content owner’s anti-piracy team and a CISO will both sign.
The pressures stack the way they always do in media. Rights compliance means every premium asset must be encrypted with hardware-backed DRM and the keys must never be reachable by anyone outside a tightly held boundary. Device reach means the same title has to play on Widevine devices (Android, Chrome), FairPlay devices (Safari, iOS, tvOS), and PlayReady devices (Edge, Xbox, smart TVs) — three incompatible DRM ecosystems, one asset. Scale means a quiet Tuesday catalogue and a Sunday-evening final that is 50× the baseline in fifteen minutes. And cost means egress and transcoding bills that have to stay rational when 80% of the catalogue is watched 5% of the time. The pattern that satisfies all four is transcode once, package and encrypt just-in-time, deliver from cache: you produce a single set of adaptive-bitrate renditions, and only at request time do you wrap them in the streaming format and DRM the specific device needs.
Why not the obvious shortcuts
The naive fixes each fail predictably, and naming why matters because someone on the project will propose all three.
A self-managed media server (an Nginx-RTMP or Wowza box) with HLS output and a password gives you no hardware DRM, no FairPlay/PlayReady/Widevine key exchange, and no path through the rights-holder’s audit — “behind a login” is access control, not content protection, and a logged-in user can still pull the unencrypted segments straight from the network tab. Pre-packaging every title into every format × every DRM combination at ingest multiplies your storage and your transcode bill by the number of device permutations, and the day you add a new packaging format or rotate a key you re-process the entire catalogue. A single-DRM strategy (Widevine only, say) simply locks out a third of your audience — every Apple device — which for a consumer sports product is commercially fatal.
The AWS media stack threads the needle. You transcode each title once into mezzanine-quality adaptive renditions, store those, and let MediaPackage package and encrypt them just-in-time at request — HLS+FairPlay for the iPhone, DASH+Widevine for the Android, DASH/HLS+PlayReady for the Xbox, all from the same stored renditions. Encryption keys come from a DRM platform over the SPEKE standard, so the keys live with the DRM provider and AWS only ever holds them transiently. And entitlement — is this subscriber allowed to watch this title right now — is enforced before a license is ever issued, gated on the subscriber’s identity.
Architecture overview
The platform runs two distinct paths that share storage but live on completely different schedules: an asynchronous ingest-and-transcode pipeline that turns a mezzanine master into streamable renditions, and a synchronous playback pipeline that authenticates a viewer, checks entitlement, packages-and-encrypts on the fly, and streams. Keeping them separate in your head is the first step to operating this well.
The defining property of the topology is the one the rights-holder’s auditor cares about most: the stored renditions are useless on their own, and the keys that unlock them never leave the DRM boundary. S3 holds only unencrypted-but-private mezzanine renditions behind Origin Access Control; MediaPackage encrypts at request time using a content key fetched over SPEKE from the DRM platform; and the actual license that contains the decryption key is only issued to a player after entitlement passes. No single stolen artifact — not an S3 object, not a CDN cache entry, not a manifest — yields playable premium video.
Ingest / transcode path, following the asset:
- A mezzanine master (a high-bitrate ProRes or mezzanine H.264 file from the production house) lands in an S3 ingest bucket. An S3 event fires.
- The event triggers a Step Functions workflow that submits an AWS Elemental MediaConvert job. MediaConvert transcodes the master into a single ABR ladder — say 240p through 2160p HEVC/H.264 renditions — with the GOP structure, segment boundaries, and audio renditions that downstream packaging needs. The output is a set of fragmented MP4 renditions written to an S3 origin bucket.
- The workflow optionally invokes a forensic-watermarking step (a partner pre-processor or MediaConvert’s overlay) so each session can later be traced, then writes asset metadata — title id, available renditions, languages, rights window — to DynamoDB, marking the asset ready to publish.
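The first hop of the ingest path, from S3 event to workflow input, can be sketched as below. This is a minimal illustration, not the platform's actual handler: the `masters/<title_id>/<file>` key convention, the function name, and the shape of the workflow input are all assumptions made for the example. Only the S3 notification structure (`Records[].s3.bucket.name`, `Records[].s3.object.key`, URL-encoded keys) is standard.

```python
from urllib.parse import unquote_plus


def workflow_inputs_from_s3_event(event: dict) -> list[dict]:
    """Turn an S3 ObjectCreated notification into Step Functions inputs."""
    inputs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        # Hypothetical convention: masters land at masters/<title_id>/<file>.
        parts = key.split("/")
        if len(parts) < 3 or parts[0] != "masters":
            continue  # ignore objects outside the ingest prefix
        inputs.append({
            "title_id": parts[1],
            "source_uri": f"s3://{bucket}/{key}",
            "watermark": True,  # assumed default for premium titles
        })
    return inputs
```

A Lambda handler would pass each returned dict as the input to a Step Functions execution; that call is left out so the sketch stays free of SDK dependencies.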
Playback path, following the control flow:
- A subscriber opens the app. The app federates identity through Okta as the consumer IdP (OIDC), which returns a token carrying the subscriber’s id and entitlement claims — subscription tier, geographic region, and which content windows they are allowed into. (Workforce access to the operations console federates the same way, brokered to Entra ID for the staff who run the platform.)
- The app requests a playback URL from the entitlement / playback API — your application on ECS Fargate behind an Application Load Balancer, fronted by API Gateway. The API validates the Okta JWT, checks DynamoDB for the title’s rights window and the subscriber’s tier, enforces concurrent-stream limits, and — critically — decides whether to mint a session. The few secrets it needs that are not IAM roles (the Okta introspection secret, the DRM platform’s API credentials, signing keys for CloudFront) come from HashiCorp Vault via the Vault Agent sidecar with IAM-backed auth, so nothing sensitive sits in a task definition or environment variable.
- If entitled, the API returns a playback URL for the MediaPackage endpoint, served through CloudFront and protected by short-lived signed cookies/URLs scoped to that asset and session. The player fetches the manifest through the CDN: CloudFront as the primary distribution, with Akamai as a second CDN for multi-CDN resilience and peak offload (more on why below).
- On a cache miss the CDN pulls from MediaPackage, which performs just-in-time packaging: it reads the stored renditions from the S3 origin, assembles the exact format the device asked for (HLS for Apple, DASH for the rest), and applies encryption. To get the content key it calls the DRM platform over SPEKE (Secure Packager and Encoder Key Exchange): MediaPackage sends a key request, the DRM key server returns the content key plus the per-DRM PSSH/pssh signaling for Widevine, FairPlay, and PlayReady, and MediaPackage emits an encrypted manifest that signals all three DRM systems from one package.
- The player detects which DRM its platform supports, extracts the license challenge, and calls the DRM platform’s license server directly. That license request carries the session token, and the license server calls back to your entitlement API (or validates a signed entitlement message you embedded) before it releases the key. Entitlement is checked again at license issuance, so even a leaked manifest URL yields no playable key without a valid, current entitlement.
- The player decrypts inside the device’s hardware-backed Content Decryption Module (TEE / Secure Enclave) and renders. Segments stream from cache; the manifest, keys, and entitlement decisions are the only request-time work.
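The entitlement decision in the playback path reduces to a handful of ordered checks. The sketch below shows one possible shape, assuming the token signature has already been validated upstream. The tier names, field names, and reason strings are hypothetical and not taken from any Okta or DRM API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claims:
    subscriber_id: str
    tier: str
    region: str


@dataclass(frozen=True)
class Title:
    title_id: str
    required_tier: str
    regions: frozenset
    window_start: int  # epoch seconds, inclusive
    window_end: int    # epoch seconds, exclusive


TIER_RANK = {"free": 0, "standard": 1, "premium": 2}  # hypothetical tiers


def decide(claims, title, active_streams, max_streams, now):
    """Return (allowed, reason). Token signature checks are assumed done upstream."""
    if not (title.window_start <= now < title.window_end):
        return False, "outside_rights_window"
    if claims.region not in title.regions:
        return False, "region_blocked"
    if TIER_RANK.get(claims.tier, -1) < TIER_RANK[title.required_tier]:
        return False, "tier_too_low"
    if active_streams >= max_streams:
        return False, "concurrency_limit"
    return True, "ok"
```

Returning a reason alongside the boolean keeps the audit log and the player's error message in step with the actual rule that fired.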
Component breakdown
| Component | Service / tool | Role in the platform | Key configuration choices |
|---|---|---|---|
| Identity / SSO | Okta + Entra ID | Consumer SSO (Okta OIDC) with entitlement claims; staff console federated to Entra | OIDC; tier/region/window claims; staff conditional access on Entra |
| Entitlement API | ECS Fargate + API Gateway | Validate token, check rights window, enforce concurrency, mint signed session | JWT validation; DynamoDB rights lookup; CloudFront signing |
| Secrets | HashiCorp Vault | DRM API creds, Okta introspection secret, CloudFront signing keys | IAM auth method; dynamic leases; Vault Agent sidecar |
| Transcode | AWS Elemental MediaConvert | Mezzanine → single ABR ladder (HEVC/H.264) | One ladder; QVBR; aligned GOP/segments for JIT packaging |
| Orchestration | Step Functions + Lambda | Drive ingest → transcode → watermark → publish | Retry/backoff per job; DLQ; status to DynamoDB |
| Storage | S3 (ingest + origin) | Mezzanine masters and transcoded renditions | Origin Access Control; private; lifecycle to IA/Glacier |
| Just-in-time packaging | AWS Elemental MediaPackage | Package + encrypt renditions per device at request | HLS + DASH endpoints; SPEKE encryption; origin = S3 |
| Multi-DRM keys | SPEKE → DRM platform | Content-key exchange for Widevine/FairPlay/PlayReady | SPEKE v2; CPIX key requests; rotate per content |
| Delivery | CloudFront + Akamai | Cache and stream segments globally; multi-CDN failover | Signed URLs/cookies; OAC to origin; Akamai as peak overflow |
| Catalogue / state | DynamoDB | Asset metadata, rights windows, session and concurrency state | On-demand capacity; TTL on sessions; GSI on title/window |
| CSPM / data posture | Wiz + Wiz Code | Cloud posture, public-exposure detection, IaC scanning pre-merge | Agentless scan of S3/MediaPackage; Wiz Code gate in PRs |
| Runtime security | CrowdStrike Falcon | Runtime protection on Fargate tasks and any EC2 transcode helpers | Sensor on tasks; detections to the SOC |
| Observability | Datadog (with Dynatrace option) | QoE metrics, traces, real-user playback telemetry | RUM SDK in players; APM on Fargate; rebuffer/start-time SLOs |
| ITSM / approvals | ServiceNow | Title-publish approvals (rights cleared), incidents, change gates | Change gate before a premium title goes live; auto-ticket on DRM failure |
| CI / IaC | GitHub Actions + Argo CD + Terraform | Build/test, GitOps deploy, infra as code | OIDC to AWS (no stored creds); Argo CD syncs the cluster |
A few of these choices deserve the why, because they are the ones teams get wrong.
Why just-in-time packaging, not pre-packaging. Pre-packaging into HLS+FairPlay, DASH+Widevine, and DASH+PlayReady at ingest means you store three encrypted copies of every rendition of every title, and you re-encrypt the world every time you rotate a key or add a format. MediaPackage packages and encrypts at request time from one stored ABR ladder, so storage stays linear in titles, key rotation is a configuration change, and adding a new device format is an endpoint, not a re-encode of the catalogue. The cost is a little request-time compute and a slightly warmer cache strategy — a trade worth making at catalogue scale.
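The storage argument is simple arithmetic, and a back-of-envelope sketch makes it concrete. The catalogue size and per-title ladder size below are invented for illustration, and the model ignores container overhead and storage-class pricing.

```python
def stored_terabytes(titles: int, ladder_gb_per_title: float, packaged_copies: int) -> float:
    """Total rendition storage in TB for a catalogue kept in N packaged copies."""
    return titles * ladder_gb_per_title * packaged_copies / 1000


# Hypothetical catalogue: 5,000 titles at 20 GB per ABR ladder.
jit = stored_terabytes(5000, 20, packaged_copies=1)            # one ladder, packaged on request
pre_packaged = stored_terabytes(5000, 20, packaged_copies=3)   # one copy per format/DRM pair
print(jit, pre_packaged)  # 100.0 300.0
```

The multiplier grows with every format or DRM combination added, while the just-in-time figure stays fixed.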
Why SPEKE, and why three DRMs from one package. The reason a single asset can serve Apple, Android, and Xbox is Common Encryption (CENC): the segments are encrypted once with a content key, and the manifest signals multiple DRM systems, each of which delivers that key through its own license flow. (This holds when all three share an encryption scheme, in practice cbcs on CMAF; where FairPlay is served as TS with SAMPLE-AES, the Apple path is a separate endpoint that uses the same key exchange.) SPEKE is the AWS-defined protocol by which the packager asks a DRM key provider for that content key and the per-system signaling (the Widevine/PlayReady PSSH boxes and the FairPlay key URI) in one CPIX exchange. The keys live with the DRM provider; AWS holds them only long enough to encrypt. This is exactly the property the rights-holder’s auditor is looking for: a documented boundary the streaming platform itself cannot cross.
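To make the CPIX exchange less abstract, here is a simplified sketch of the request document a packager sends: one content key id, plus one `DRMSystem` entry per DRM the key provider should return signaling for. The system IDs are the publicly registered DASH-IF identifiers. Everything else (the function name and the omission of the SPEKE namespace, encryption contract, and usage rules that a real SPEKE v2 request carries) is a simplification for illustration.

```python
import uuid
import xml.etree.ElementTree as ET

CPIX_NS = "urn:dashif:org:cpix"
DRM_SYSTEM_IDS = {
    "WIDEVINE": "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",
    "PLAYREADY": "9a04f079-9840-4286-ab92-e65be0885f95",
    "FAIRPLAY": "94ce86fb-07ff-4f43-adb8-93d2fa968ca2",
}


def build_cpix_request(resource_id: str, key_id: str, drm_systems: list) -> str:
    """Build a minimal CPIX key request (simplified; not a complete SPEKE payload)."""
    ET.register_namespace("cpix", CPIX_NS)
    root = ET.Element(f"{{{CPIX_NS}}}CPIX", {"id": resource_id})
    keys = ET.SubElement(root, f"{{{CPIX_NS}}}ContentKeyList")
    ET.SubElement(keys, f"{{{CPIX_NS}}}ContentKey", {"kid": key_id})
    systems = ET.SubElement(root, f"{{{CPIX_NS}}}DRMSystemList")
    for name in drm_systems:
        # One entry per DRM system that should be able to unlock this key.
        ET.SubElement(systems, f"{{{CPIX_NS}}}DRMSystem",
                      {"kid": key_id, "systemId": DRM_SYSTEM_IDS[name]})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cpix_request("title-1234", str(uuid.uuid4()),
                             ["WIDEVINE", "PLAYREADY", "FAIRPLAY"]))
```

The key provider's response fills in the same document with the encrypted content key and the per-system signaling, which is what the packager writes into the manifest.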
Why multi-CDN — CloudFront and Akamai. A championship final is a synchronized demand spike where the whole audience presses play inside the same five minutes. Relying on one CDN means a single point of capacity, a single peering footprint, and a single failure domain at the exact moment the business cannot afford one. CloudFront is the primary (tight AWS integration, OAC to the S3/MediaPackage origin, cheap in-region egress); Akamai runs as a second CDN for peak offload and failover, with steering at the manifest/DNS layer so a CloudFront degradation or a regional saturation reroutes viewers rather than buffering them. Multi-CDN is insurance you buy specifically for the handful of weekends that are the entire business.
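The steering decision itself can be small. The sketch below picks a CDN from real-user health metrics; the metric names and thresholds are assumptions for illustration, and a production steering service would add hysteresis and per-region scoping.

```python
def choose_cdn(health, primary="cloudfront", secondary="akamai",
               max_error_rate=0.02, max_rebuffer_ratio=0.01):
    """Pick a CDN. health maps CDN name -> {"error_rate": float, "rebuffer_ratio": float}."""
    def healthy(name):
        metrics = health.get(name)
        return (metrics is not None
                and metrics["error_rate"] <= max_error_rate
                and metrics["rebuffer_ratio"] <= max_rebuffer_ratio)

    if healthy(primary):
        return primary
    if healthy(secondary):
        return secondary
    return primary  # both degraded: stay on primary rather than flap
```

The chosen name would then decide which hostname the playback API writes into the manifest URL it returns.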
Implementation guidance
Provision with Terraform, and treat IAM and the origin lockdown as the first deliverable. Get the origin access wrong and you have published unencrypted renditions to the open internet; get the signing wrong and either nobody can play or anybody can. The first Terraform pass should cover:
- The S3 ingest and origin buckets, private, with Origin Access Control so only CloudFront (and MediaPackage’s origin role) can read them — never public.
- MediaConvert job templates encoding the single ABR ladder with QVBR and segment/GOP alignment that MediaPackage can package cleanly.
- The MediaPackage endpoints (one HLS, one DASH) with SPEKE encryption pointed at the DRM key provider’s SPEKE endpoint.
- CloudFront distributions with signed-URL/cookie key groups, OAC to the origins, and a cache policy tuned for segment immutability.
- The Fargate entitlement service, API Gateway, DynamoDB tables, and Step Functions workflow — with IAM roles scoped to exactly what each needs.
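On the signing side, the part most often gotten wrong is the policy document rather than the cryptography. The sketch below builds a CloudFront custom policy scoped to one asset path and applies CloudFront's URL-safe base64 substitutions. The RSA signature over the policy is deliberately left out because it needs a crypto library and the private key from the secrets store. The function names and the example URL are hypothetical.

```python
import base64
import json


def custom_policy(resource_url: str, expires_epoch: int) -> str:
    """Build a CloudFront custom policy limiting access to one resource until a deadline."""
    statement = {
        "Resource": resource_url,
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
    }
    # CloudFront expects the policy JSON without whitespace.
    return json.dumps({"Statement": [statement]}, separators=(",", ":"))


def cloudfront_b64(data: bytes) -> str:
    """Base64 with CloudFront's URL-safe substitutions (+ to -, = to _, / to ~)."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


if __name__ == "__main__":
    policy = custom_policy("https://cdn.example.com/title-1234/*", 1767225600)
    print(cloudfront_b64(policy.encode("utf-8")))
```

A wildcard resource scoped to the title's path lets one signed cookie cover the manifest and every segment of that asset, while a short expiry limits how long a shared URL stays useful.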
A minimal Terraform shape for a SPEKE-encrypted MediaPackage HLS endpoint communicates the intent — encryption is not optional, and the key comes from the DRM provider:
```hcl
# Shape only: confirm resource and attribute names against your AWS provider version.
resource "aws_media_packagev2_origin_endpoint" "hls" {
  channel_group_name   = aws_media_packagev2_channel_group.vod.name
  channel_name         = aws_media_packagev2_channel.title.channel_name
  origin_endpoint_name = "hls-fairplay"
  container_type       = "TS"

  hls_manifests { manifest_name = "index" }

  encryption {
    encryption_method { ts_encryption_method = "SAMPLE_AES" } # FairPlay on TS

    speke_key_provider {
      resource_id = "title-1234"               # per-asset key scope
      url         = var.drm_speke_endpoint_url # DRM platform SPEKE URL
      role_arn    = aws_iam_role.speke.arn     # role MediaPackage assumes
      drm_systems = ["FAIRPLAY"]               # DASH endpoint lists WIDEVINE/PLAYREADY
    }
  }
}
```
The pipeline that applies this runs in GitHub Actions, authenticating to AWS via OIDC federation so there is no stored access key to leak. Argo CD handles the GitOps sync of the entitlement service: desired state in git, the cluster reconciled to match, every change reviewable and revertable. Argo CD reconciles Kubernetes clusters, so this assumes the service is deployed to Kubernetes (for example EKS with Fargate profiles); if it stays on ECS Fargate as described earlier, the same GitHub Actions pipeline would update the ECS service instead. Terraform owns the cloud infrastructure; Ansible configures any long-lived EC2 transcode helpers or appliance-style components (codec licensing, agent installation) where a container is not the right fit.
Identity and entitlement: federate the viewer, gate twice. Consumers authenticate through Okta over OIDC; the token carries the subscription tier, region, and rights-window claims the entitlement API consumes. Staff who operate the platform log in through Okta federated to Entra ID, picking up conditional-access policies and native AWS IAM-role mapping for the console. The entitlement check happens twice on purpose: once when the playback API mints the signed session URL, and again when the DRM license server is asked for the actual key — because a signed manifest URL can be shared, but a key released only against a live, validated entitlement cannot be replayed. This double gate is what makes “is this person allowed to watch this right now” enforceable at the moment that matters, not just at the front door.
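One way to picture the second gate is a signed entitlement message: the playback API mints it at session time, and the license path verifies it before a key is released. The sketch below uses an HMAC over a small JSON payload. The token layout, field names, and shared-secret approach are assumptions for illustration; real DRM platforms define their own entitlement-token formats, and those should be used instead.

```python
import base64
import hashlib
import hmac
import json


def mint_entitlement(secret: bytes, subscriber_id: str, title_id: str, expires_epoch: int) -> str:
    """Create a signed, expiring entitlement message for one subscriber and title."""
    payload = json.dumps(
        {"sub": subscriber_id, "title": title_id, "exp": expires_epoch},
        separators=(",", ":"), sort_keys=True,
    ).encode()
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(tag).decode())


def verify_entitlement(secret: bytes, token: str, title_id: str, now: int) -> bool:
    """Check signature, title binding, and expiry before a license is issued."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong part count or invalid base64
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    claims = json.loads(payload)
    return claims["title"] == title_id and now < claims["exp"]
```

Because the message is bound to a title and expires quickly, a shared manifest URL without a fresh message gets no further than an encrypted stream.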
Watermarking and traceability. For premium titles under a strict rights contract, encryption proves the stream was protected; forensic watermarking proves where a leak came from. Embed a session- or subscriber-derived watermark (via a MediaConvert overlay or a partner pre-processor in the Step Functions workflow) so a pirated copy recovered in the wild can be traced to the account that leaked it. Since the ladder is transcoded before any session exists, the usual approach is to prepare marked variants at ingest and let the delivery path pick a per-session sequence of them. This traceability is the deterrent the rights-holder’s anti-piracy team actually asks for, beyond “it was encrypted.”
Enterprise considerations
Security & content protection. The architecture is defense-in-depth around the asset: private origins behind OAC, encryption-at-rest in S3, JIT encryption with keys that never leave the DRM boundary, hardware-backed CDM decryption on-device, signed and short-lived delivery URLs, and entitlement enforced at both session-mint and license-issuance. Layer on top: (a) Wiz running continuous CSPM and sensitive-data-exposure scanning across S3, MediaPackage, and CloudFront, alerting the instant any bucket or distribution drifts toward public exposure, as the posture backstop behind the policy controls; (b) Wiz Code scanning the Terraform and application repos in the pull request, so a misconfiguration (a public-read bucket policy, an over-broad IAM role) is caught before merge, not in production; (c) CrowdStrike Falcon sensors on the Fargate tasks and any EC2 transcode helpers for runtime threat detection, feeding the broadcaster’s SOC; (d) a DRM key-exchange failure, a spike in license denials, or a watermark-traced leak auto-raises a ServiceNow incident so security and anti-piracy have a ticket, not just a log line. AWS Organizations SCPs and Config rules deny any S3 bucket created with public access, and Wiz independently verifies the guardrail is actually holding.
Cost optimization. Egress and transcode dominate, and 80% of the catalogue is rarely watched, so engineer for the long tail.
| Lever | Mechanism | Typical effect |
|---|---|---|
| Transcode once | Single ABR ladder + JIT packaging, never pre-package per DRM | Avoids N× re-encode and N× storage |
| Encoding efficiency | QVBR + HEVC for high renditions where devices support it | Lower bitrate at equal quality → less egress |
| Storage tiering | Lifecycle cold renditions to S3 IA/Glacier; keep hot titles in Standard | Cheap storage for the unwatched tail |
| Multi-CDN steering | Send baseline to cheapest CDN; burst peaks to the second | Caps the price of the spike |
| Cache hit ratio | Long-TTL immutable segments; high offload from origin | Less MediaPackage + origin compute per view |
Meter egress and packaging cost per title and pipe it to Datadog, which the platform team uses for the cost-per-view dashboard the CFO sees; a title that is expensive to serve relative to its viewership is a tiering candidate.
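The cost-per-view check behind that dashboard is a one-line ratio per title. A minimal sketch follows, assuming per-title monthly cost and view counts are already aggregated; the field names and the threshold are hypothetical.

```python
def tiering_candidates(titles, max_cost_per_view):
    """Flag titles whose serving cost per view exceeds a threshold.

    titles: iterable of dicts with title_id, monthly_cost_usd, monthly_views.
    Returns (title_id, cost_per_view) pairs, most expensive first.
    """
    flagged = []
    for t in titles:
        views = t["monthly_views"]
        # A title with cost but no views is the clearest candidate of all.
        cost_per_view = t["monthly_cost_usd"] / views if views else float("inf")
        if cost_per_view > max_cost_per_view:
            flagged.append((t["title_id"], cost_per_view))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

The output feeds the lifecycle decision: flagged titles move to a colder storage class, and the rest stay hot.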
Scalability. Each tier scales independently. The Fargate entitlement service scales tasks on request concurrency; API Gateway and CloudFront absorb edge load natively. MediaPackage JIT packaging scales with cache-miss volume — which is why a high cache-hit ratio is a scaling lever, not just a cost one: every cache hit is packaging you did not do. MediaConvert scales by job concurrency and is naturally elastic since ingest is asynchronous and bursty. DynamoDB on-demand absorbs the session/concurrency write spike at a final’s full-time whistle. The natural ceiling at peak is CDN and DRM-license-server throughput, which is precisely why multi-CDN and a license server sized (or autoscaled) to the synchronized spike are planned before the first big event, not after.
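The claim that cache-hit ratio is a scaling lever can be checked with rough arithmetic. The sketch below estimates the segment request rate that reaches the packager at peak. It assumes one segment request per viewer per segment duration and ignores manifests and separate audio tracks, so treat it as an order-of-magnitude model; the viewer count and segment length are invented.

```python
def origin_requests_per_second(viewers: int, segment_seconds: float, cache_hit_ratio: float) -> float:
    """Approximate segment requests per second that miss the CDN and reach the origin."""
    edge_rps = viewers / segment_seconds  # one segment per viewer per segment duration
    return edge_rps * (1.0 - cache_hit_ratio)


# Hypothetical final: one million concurrent viewers on 6-second segments.
for ratio in (0.95, 0.99):
    print(ratio, round(origin_requests_per_second(1_000_000, 6, ratio)))
```

Moving from 95% to 99% offload cuts origin load by a factor of five in this model, which is why the cache policy deserves as much attention as the autoscaling settings.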
Failure modes, and what each one looks like. Name them before they page you.
- DRM key-exchange (SPEKE) failure — MediaPackage cannot fetch a content key, so the manifest cannot be encrypted and playback fails for everyone on that title. Mitigation: health-check the DRM SPEKE endpoint, alert on key-request error rate, and keep a documented break-glass runbook with the DRM provider.
- License-server saturation at peak — the encrypted stream is fine but devices cannot get a license, so playback stalls at start. Mitigation: autoscale/over-provision the license server for the synchronized spike and load-test it against the final’s expected concurrency.
- Single-CDN degradation — one CDN saturates or has a regional incident mid-event. Mitigation: multi-CDN steering to Akamai as overflow/failover, driven by real-user QoE telemetry, not just synthetic health checks.
- Public-origin misconfiguration — a bucket policy or distribution change exposes unencrypted renditions. Mitigation: OAC by default, SCP deny on public S3, and Wiz/Wiz Code catching it in posture scans and in the PR.
- Stale rights window — a title whose license window has expired is still playable. Mitigation: the entitlement API checks the rights window in DynamoDB at both session-mint and license-issuance, and a publish job tombstones expired titles.
Reliability & DR (RTO/RPO). Decide the numbers per tier. The transcoded renditions in S3 (cross-region replicated, the durable source of truth) give near-zero RPO for content, and the catalogue is rebuildable. DynamoDB global tables give multi-region session/catalogue state with an RTO of seconds. MediaPackage and the entitlement service are deployed in a paired region behind multi-CDN steering, so an AWS regional event fails traffic over to the paired region rather than going dark. A pragmatic target for this platform: RTO 15 minutes; RPO near-zero for content and about 1 minute for session state. The rendition store and DynamoDB are the real recovery guarantees, and the DRM provider’s own multi-region SLA underwrites the license path. The asynchronous ingest pipeline can tolerate a longer RTO than playback: viewers feel an outage; they do not feel a delayed transcode.
Observability. Instrument quality-of-experience end to end in Datadog (Dynatrace is the equivalent option where a shop standardizes on it): a Real-User-Monitoring SDK in each player emits start-up time, rebuffer ratio, bitrate distribution, DRM-license success rate, and playback-failure reasons, while APM traces the entitlement service’s validate-token → rights-check → sign-URL path. Emit the metrics the business actually cares about (concurrent streams, start failures by device/DRM, rebuffer ratio at peak, and cost-per-view by title) and alert on DRM-license success rate specifically, because it is the leading indicator that a key-exchange or license-server problem is about to ruin an event. New premium titles pass a ServiceNow change approval (rights cleared, watermarking confirmed) before going live, giving licensing a documented gate.
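An alert on DRM-license success rate can be expressed as a rolling-window check. The sketch below is an in-process illustration of the rule, not a Datadog monitor definition. The window size, threshold, and minimum sample count are assumptions to tune against real traffic.

```python
from collections import deque


class LicenseSuccessMonitor:
    """Track recent license outcomes and flag when the success rate drops."""

    def __init__(self, window=500, threshold=0.995, min_samples=50):
        self.results = deque(maxlen=window)  # oldest outcomes fall off automatically
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, success: bool) -> None:
        self.results.append(bool(success))

    def success_rate(self):
        return sum(self.results) / len(self.results) if self.results else None

    def should_alert(self) -> bool:
        # Too few samples gives a noisy rate, so stay quiet until the window fills a little.
        if len(self.results) < self.min_samples:
            return False
        return self.success_rate() < self.threshold
```

The minimum sample count keeps a handful of failures on a quiet Tuesday from paging anyone, while the same rule fires within seconds during a final.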
Governance. Pin MediaConvert job templates and DRM configurations in version control so encoding and protection behavior do not drift; promote changes through the pipeline. Keep the entitlement logic and IaC in git, reviewable and instantly revertable, with Argo CD as the single source of deployed truth. Apply Organizations SCPs and Config rules to deny public S3 and require encryption, with Wiz as the independent check that the controls are real. Log every entitlement decision and license issuance for audit and anti-piracy forensics — the chain of custody the rights contract demands.
Explicit tradeoffs
Accept these or do not build it. Multi-DRM is genuinely complex: three DRM ecosystems with three license flows, the SPEKE/CPIX exchange to operate, and device-specific quirks (FairPlay’s TS/SAMPLE-AES vs. CENC on the rest) that you will spend time on. JIT packaging trades a little request-time compute and a careful cache strategy for linear storage and painless key rotation — usually right, but a tiny catalogue with no key-rotation needs might pre-package and skip the moving part. The double entitlement gate adds a callback hop at license time that a free, ungated service would not need. And the multi-CDN, watermarking, license-server-autoscaling, and posture-scanning machinery are all overhead you can skip for an internal demo and absolutely cannot skip for a premium sports product under a rights contract with penalty clauses.
The alternatives, and when they win. If you are streaming non-premium, unprotected content (user-generated clips, internal training video, a corporate Moodle-hosted course library where access control at the LMS is sufficient), you do not need multi-DRM at all: plain HLS/DASH behind signed URLs is simpler and cheaper, and the platform here is overkill. If your protection requirement is single-ecosystem (an Android-TV-only set-top product), one DRM removes a great deal of this complexity. If you need ultra-low-latency live rather than VOD, the packaging and protection ideas carry over but the encoder/packager path and latency budget differ materially. And if you want a managed quick start, a plain MediaConvert + MediaPackage workflow (adding MediaTailor only if you need server-side ad insertion) gets a VOD service running quickly; graduate to this full multi-DRM, multi-CDN, identity-gated platform when a rights contract, device reach, peak scale, or anti-piracy obligations demand it.
The shape of the win
For the broadcaster, the payoff is not “a video player.” It is that the league’s anti-piracy team audits the platform — keys that never leave the DRM boundary, hardware-backed decryption, forensic watermarking, entitlement enforced at the license, a documented chain of custody — and clears it to carry the premium rights for another cycle, which a media-server-and-a-login plan would never have cleared. That renewal is the entire business. Everything upstream — the transcode-once ladder, the SPEKE multi-DRM, the just-in-time packaging, the Okta-gated double entitlement check, the Vault-held DRM credentials, the multi-CDN failover to Akamai, the Wiz posture scanning, the Datadog QoE telemetry — exists to make a rights-holder, a CISO, and a CFO each say yes, and to keep saying yes on the one Sunday evening a quarter when the whole audience presses play at once. The architecture here is the destination; start narrower if your content is unprotected, but this is where a premium, at-scale VOD service under a real rights contract has to land.