
Change Data Capture with DynamoDB Streams: Lambda Triggers, EventBridge Pipes, and Exactly-Once Processing

Change data capture off DynamoDB looks deceptively simple — turn on a stream, point a Lambda at it, ship the deltas somewhere. The trouble starts the first time a single poison record wedges a shard, an OpenSearch index drifts out of sync, or someone discovers that “exactly-once” was never a property the stream promised. DynamoDB Streams give you an ordered, at-least-once feed of item-level mutations. Everything downstream — ordering across reprocessing, deduplication, partial-batch recovery — is yours to engineer. This is how to build that pipeline so it survives throttles, redrives, and the occasional malformed item, using Lambda event source mappings where you need code and EventBridge Pipes where you do not.

The reason this is hard is that the failure modes are invisible until load. A pipeline that processes ten records a minute in dev never reveals that you have no idempotency, no bisect, no DLQ — and then a flash sale drives one partition key hot, the iterator age on that shard climbs past forty minutes, a malformed item silently blocks every update behind it, and your “real-time” search index is now lying about stock. The platform did exactly what you configured; you simply never configured it for the bad day. So this article treats DynamoDB CDC the way a senior engineer learns it the hard way: not “how do I move a change” but “how do I move a change such that it survives replay, isolates poison, preserves per-key order, and never silently drops data.”

By the end you will know which of the four delivery surfaces (native Stream + Lambda ESM, native Stream + EventBridge Pipes, Kinesis Data Streams for DynamoDB, or a glue Lambda fan-out) fits a given freshness/retention/ordering requirement; exactly which event-source-mapping knobs (BatchSize, ParallelizationFactor, MaximumRecordAgeInSeconds, BisectBatchOnFunctionError, ReportBatchItemFailures) prevent a stalled shard; and how to make every downstream write idempotent so at-least-once delivery is a non-event. Because this is a reference you will return to mid-incident, the knobs, the error/limit numbers, the view types, and the symptom→cause→confirm→fix playbook are all laid out as scannable tables — read the prose once, then keep the tables open when the iterator age starts climbing.

What problem this solves

A relational database hands you a transaction log and a dozen mature CDC tools that tail it. DynamoDB hands you a Stream: an ordered, append-only feed of every INSERT, MODIFY, and REMOVE, retained for 24 hours and sharded to match your table’s partitions. That is a genuine gift (you get item-level deltas with per-partition-key ordering and no triggers to write), but it is a primitive, not a pipeline. The delivery contract is at-least-once, the retention is hard-capped at 24 hours, and the ordering guarantee is per partition key only, never table-wide. Everything that makes a production CDC pipeline trustworthy sits in the gap between that primitive and your sinks.

What breaks without this knowledge is depressingly consistent. A team points a single Lambda at the stream with default settings, fans out to OpenSearch + S3 + a warehouse from that one function, and ships it. Then: (1) a malformed item throws, the whole batch retries forever, and the shard stalls — every change behind the poison record is blocked for hours; (2) a redrive replays a batch and the search index double-applies a delete, or a stale replay overwrites a newer document; (3) one slow sink (OpenSearch under merge pressure) stalls the shared function, so the data lake and warehouse go stale too; (4) a record that needed reprocessing ages past the 24-hour window before anyone noticed, and it is gone for good. None of these are exotic. All of them are the default outcome of treating the stream as if it were exactly-once, infinitely retained, and globally ordered.

Who hits this: anyone projecting DynamoDB state into a search index, a cache, a data lake, an audit trail, or another service. It bites hardest on high-write, hot-key workloads (a single partition going hot starves the one-poller-per-shard model), multi-sink fan-out (coupling sinks behind one function), and first-time CDC builders who assume the stream gives them guarantees it explicitly does not. The fix is almost never “make the stream stronger” — the stream is what it is — it is “engineer idempotency, bisect, DLQ, and bus-level fan-out around it.” Here is the whole field at a glance: every guarantee the stream gives you, every one it withholds, and where you make up the difference.

| Property | What the native Stream guarantees | What it does NOT guarantee | Where you close the gap |
| --- | --- | --- | --- |
| Delivery | At-least-once to the consumer | Exactly-once | Idempotent downstream write (conditional-on-SequenceNumber) |
| Ordering | Per partition key, within a shard lineage | Global / table-wide ordering | Design so consumers only assume per-key order |
| Retention | 24 hours, rolling | Anything beyond 24h | S3 system-of-record via Firehose; one-time scan for backfill |
| Failure isolation | Records stay readable for 24h | Auto-skip of a poison record | BisectBatchOnFunctionError + on-failure destination |
| Partial success | None (whole-batch retry by default) | Per-record checkpointing | ReportBatchItemFailures |
| Fan-out | One stream, up to ~2 ESMs efficiently | Independent retry per sink from one Lambda | EventBridge bus/Pipe, one rule+DLQ per sink |
| Dedup across replay | None | Suppression of duplicates | Sink-native upsert + external version from SequenceNumber |

Learning objectives

By the end of this article you can:

Prerequisites & where this fits

You should already understand DynamoDB’s data model — partition keys, sort keys, items, and that a table’s throughput is sharded by partition key. Familiarity with the basics of Amazon DynamoDB, In Depth: Tables, Keys, Capacity Modes, Indexes & Streams is assumed — this article picks up where the “Streams” section there ends and goes deep on consuming them. You should be comfortable with AWS Lambda at the level of AWS Lambda, In Depth: Runtimes, Triggers, Layers, Concurrency & Every Setting, reading IAM-scoped CLI/Terraform, and JSON event shapes.

This sits in the event-driven / serverless integration track. It is downstream of table design — if your partition keys create hot shards, no consumer tuning saves you, so DynamoDB Single-Table Design: Modeling Access Patterns, GSIs, and Hot Partition Avoidance is upstream of everything here. It pairs tightly with Designing Event-Driven Architectures with Amazon EventBridge: Buses, Rules, Schemas, and Archive/Replay (the bus you fan out to) and Resilient Messaging with SQS and SNS: Fan-Out, FIFO Ordering, DLQs, and Poison-Message Handling (the DLQ and poison-handling patterns reused wholesale). The “land it in a lake” sink is the front half of A Structured Logging Pipeline on AWS: JSON Logs, CloudWatch Metric Filters, and Firehose to OpenSearch.

A quick map of who owns what during an incident, so you escalate to the right layer fast:

| Layer | What lives here | Typical owner | Failure classes it causes |
| --- | --- | --- | --- |
| Table + key design | Partition keys, write distribution | Data / app team | Hot shard, throttles → consumer falls behind |
| Stream | Shards, view type, 24h retention | Platform + app | Records age out; wrong view type starves consumers |
| Event source mapping / Pipe | The managed poller + its knobs | Platform / app | Stalled shard, whole-batch retry, no DLQ |
| Processor (Lambda) | Your code, idempotency, error handling | App team | Poison records, non-idempotent writes |
| Sinks (OpenSearch / S3 / DW) | Durability, freshness, versioning | App + data team | Index drift, duplicate apply, small-object tax |
| Observability | IteratorAge, DLQ depth, throttles | Platform / SRE | Silent stalls nobody alarmed on |

Core concepts

Six mental models make every later decision obvious.

A Stream is an ordered, append-only log, sharded like the table. Records for a given partition key always land in the same shard lineage and are delivered in commit order. That per-partition ordering is the single most important property to internalize — there is no global ordering across the table, only within a partition key. Shard topology mirrors the table’s partition structure and changes as the table splits/merges partitions, so your consumer must tolerate shards appearing and closing.

Delivery is at-least-once; “exactly-once” is something you build downstream. Redrives, bisect retries, and poller restarts all mean a sink can see the same change more than once. There is no setting that makes the stream exactly-once. The good news: every record hands you a stable item key and a monotonic SequenceNumber — exactly the tools an idempotent write needs.

Retention is 24 hours, hard. A record not consumed within 24 hours is gone. This drives two rules: set MaximumRecordAgeInSeconds below 24h so poison records are evicted to a DLQ while still readable, and treat S3 (fed via Firehose) as your replayable system of record once the window rolls off — the stream cannot backfill history.

The poller is managed; you tune it, you don’t write it. A Lambda event source mapping (ESM) is a poller AWS runs for you. You never write the GetShardIterator/GetRecords loop — you set BatchSize, ParallelizationFactor, batching window, retries, and failure handling, and the platform polls, batches, invokes, and checkpoints.

A stalled shard is the master failure mode. By default, if your function throws on a batch, the ESM retries the entire batch, and because the shard checkpoint has not advanced, it keeps retrying until success, age-out, or retry-exhaustion. One un-processable record blocks everything behind it. You see it as a climbing iterator age. Two mechanisms fix it: partial-batch responses and bisect-on-error.

Fan-out belongs at the bus, not in one function. If a single mega-Lambda writes to OpenSearch + S3 + a warehouse, one slow sink stalls the shard for all of them. Land changes on an EventBridge bus (often via a Pipe) and let independent rules deliver to each sink with its own retry budget and DLQ.

The vocabulary in one table

Pin down every moving part before the deep sections; the glossary repeats these for lookup.

| Term | One-line definition | Where it lives | Why it matters to CDC |
| --- | --- | --- | --- |
| Stream | Ordered, append-only log of item changes | On the table (enable per table) | The source of every change |
| Shard | An ordered partition of the stream | Within the stream | Unit of ordering and parallelism |
| StreamViewType | What each record carries (keys/new/old/both) | Stream spec | Dictates what consumers can compute |
| SequenceNumber | Monotonic per-shard record id | On every record | The idempotency / versioning anchor |
| Event source mapping (ESM) | Managed Lambda poller | On the Lambda | Where you tune batch/parallelism/failure |
| ParallelizationFactor | Concurrent invokes per shard (1–10) | ESM config | Drains a hot shard without resharding |
| ReportBatchItemFailures | Per-record checkpoint contract | ESM + function response | Stops one bad record blocking good ones |
| Bisect-on-error | Split-and-retry a failing batch | ESM config | Isolates a poison record |
| On-failure destination / DLQ | Where failure context lands | SQS/SNS (ESM), SQS (Pipe) | Points you at records to redrive |
| EventBridge Pipes | Managed source→filter→enrich→target | Standalone resource | No-code filter-enrich-route |
| Iterator age | How far behind the consumer is | CloudWatch metric | The leading indicator of a stall |
| Kinesis Data Streams for DynamoDB | Alternative change feed | Kinesis stream you own | Long retention, many fan-out consumers |

Delivery surfaces: native Streams, Pipes, Kinesis, or glue Lambda

Before any tuning, pick the right surface. Four options carry DynamoDB changes, and they differ on retention, ordering, dedup, freshness, and how much code you own. Choosing wrong here is the most expensive mistake in the article — you cannot tune your way out of a surface that doesn’t give you the retention or ordering you need.

| Surface | Retention | Ordering | Dedup | Freshness | Code you own | Best for |
| --- | --- | --- | --- | --- | --- | --- |
| Native Stream + Lambda ESM | 24h | Per partition key (strict) | At-least-once (you dedupe) | Sub-second to seconds | The processor | Real compute, transactional multi-sink writes |
| Native Stream + EventBridge Pipes | 24h | Per partition key (strict) | At-least-once | Seconds | Only enrichment (optional) | Filter-enrich-route with no poller code |
| Kinesis Data Streams for DynamoDB | Up to 1 year | Best-effort (reorder on seq) | May duplicate (you dedupe) | Seconds | Consumer(s) | Long retention, many independent fan-out consumers |
| Glue Lambda fan-out (one ESM, code fans out) | 24h | Per partition key | At-least-once | Sub-second | The whole fan-out | Small scale, tight code control, few sinks |

The native Stream’s headline advantage is strict per-key ordering and no duplicate records in the stream itself (delivery to your consumer is still at-least-once); its costs are the 24-hour cap and a practical limit on how many consumers can read it efficiently. Kinesis Data Streams for DynamoDB inverts that: configurable retention up to a year and clean multi-consumer fan-out, at the price of best-effort ordering and possible duplicates, so you must reorder and dedupe on a sequence number. The trade is real and worth stating plainly:

| Dimension | Native DynamoDB Stream | Kinesis Data Streams for DynamoDB |
| --- | --- | --- |
| Max retention | 24 hours | Up to 365 days (configurable) |
| Ordering | Strict per partition key | Best-effort; reorder on ApproximateCreationDateTime/seq |
| Duplicates | At-least-once (rare in practice) | Possible; dedupe required |
| Consumers | ~2 efficient ESMs per stream | Many (enhanced fan-out, shared, Firehose, KCL) |
| Throughput model | Managed, tied to table | Provisioned/on-demand Kinesis shards |
| When to pick | Strict order, short window | Long replay, many sinks, analytics teams |

Enable the Kinesis surface when you need any of: retention beyond 24 hours, several independent teams consuming the same changes, or direct Firehose delivery to a lake without a Lambda in the path. Stay on the native Stream when strict per-key ordering matters most or your consumer count is small. A quick decision table:

| If you need… | It’s probably… | Choose |
| --- | --- | --- |
| Strict per-key order, one or two consumers | A real-time projector | Native Stream + Lambda ESM |
| Filter + reshape + route, no poller code | A glue pipeline | Native Stream + EventBridge Pipes |
| Replay older than 24h / many fan-out consumers | An analytics backbone | Kinesis Data Streams for DynamoDB |
| A handful of sinks, tight code control, small scale | A bespoke processor | Glue Lambda fan-out |
| Both real-time projection AND long replay | A hybrid | Native Stream for projection + Kinesis for lake |

Stream internals: shards, records, and view types

A DynamoDB Stream is an ordered, append-only log of item-level changes, retained for 24 hours, composed of shards whose topology mirrors the table’s partition structure. Enable it and choose a StreamViewType, which controls what each record carries:

aws dynamodb update-table \
  --table-name orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

The same stream settings in Terraform:

resource "aws_dynamodb_table" "orders" {
  name             = "orders"
  hash_key         = "pk"
  billing_mode     = "PAY_PER_REQUEST"
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"
  attribute {
    name = "pk"
    type = "S"
  }
}

The four view types are not cosmetic — they dictate what your consumers can compute. For CDC you almost always want NEW_AND_OLD_IMAGES: the diff is what lets you route on transitions (“status went from PENDING to SHIPPED”) and reconstruct deletes — on a REMOVE event there is no new image, only the old one.

| StreamViewType | Record contains | Record size impact | Use when | Cannot do |
| --- | --- | --- | --- | --- |
| KEYS_ONLY | Key attributes only | Smallest | You re-read the item yourself, or only need invalidation signals | Compute diffs; reconstruct deletes |
| NEW_IMAGE | Item after the write | Medium | Projecting current state to a search index or cache | See what changed from; capture old value on delete |
| OLD_IMAGE | Item before the write | Medium | Audit “what changed from”, soft-delete capture | Project current state on insert/modify |
| NEW_AND_OLD_IMAGES | Both images | Largest | Computing diffs, conditional routing on field transitions | (nothing; fullest fidelity) |
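
To make the "route on transitions" and "reconstruct deletes" points concrete, here is a minimal sketch that reads both images from a record. It assumes the NEW_AND_OLD_IMAGES view type and an item attribute named `status`; the helper names and the sample record are illustrative, not part of any AWS SDK.

```python
from typing import Optional


def status_transition(record: dict) -> Optional[tuple]:
    """Return (old_status, new_status) if a MODIFY changed `status`, else None."""
    if record.get("eventName") != "MODIFY":
        return None
    images = record.get("dynamodb", {})
    # Stream images carry DynamoDB-typed values, e.g. {"status": {"S": "PENDING"}}.
    old = images.get("OldImage", {}).get("status", {}).get("S")
    new = images.get("NewImage", {}).get("status", {}).get("S")
    if old is None or new is None or old == new:
        return None
    return (old, new)


def deleted_item(record: dict) -> Optional[dict]:
    """On REMOVE there is no NewImage; the old image is the only copy of the item."""
    if record.get("eventName") != "REMOVE":
        return None
    return record.get("dynamodb", {}).get("OldImage")


# Illustrative record, trimmed to the fields this sketch reads.
sample = {
    "eventName": "MODIFY",
    "dynamodb": {
        "OldImage": {"pk": {"S": "order#1"}, "status": {"S": "PENDING"}},
        "NewImage": {"pk": {"S": "order#1"}, "status": {"S": "SHIPPED"}},
    },
}
print(status_transition(sample))
```

With KEYS_ONLY or a single-image view type, one or both lookups return nothing, which is the "cannot do" column in practice.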

Each stream record carries an eventName of INSERT, MODIFY, or REMOVE, plus the keys, the requested image(s), the SequenceNumber, and (on TTL deletes) a userIdentity block. Know the record anatomy cold — it is what your filters and idempotency keys read:

| Record field | Meaning | Present when | CDC use |
| --- | --- | --- | --- |
| eventName | INSERT / MODIFY / REMOVE | Always | Route by mutation type |
| dynamodb.Keys | The item’s key attributes (typed) | Always | The stable idempotency key |
| dynamodb.NewImage | Item after the write | Not on REMOVE; not with KEYS_ONLY or OLD_IMAGE | Project current state |
| dynamodb.OldImage | Item before the write | Not on INSERT; not with KEYS_ONLY or NEW_IMAGE | Diffs; reconstruct deletes |
| dynamodb.SequenceNumber | Monotonic per-shard id | Always | Idempotency + external version |
| dynamodb.ApproximateCreationDateTime | When the change happened | Always | Latency measurement; Kinesis reorder |
| userIdentity | Set to dynamodb.amazonaws.com for TTL | TTL deletes only | Filter housekeeping deletes |
| eventSourceARN | The stream ARN | Always | Redrive target |
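
Because every image in the record uses DynamoDB's typed JSON, most processors start by flattening it. The sketch below is a hand-rolled converter covering only the S, N, BOOL, NULL, L, and M type tags, with a hypothetical `summarize` helper; in real code, boto3's `TypeDeserializer` (in `boto3.dynamodb.types`) covers the full type set.

```python
from decimal import Decimal


def from_typed(value: dict):
    """Convert one DynamoDB-typed value such as {"S": "x"} to a plain Python value."""
    type_tag, raw = next(iter(value.items()))
    if type_tag == "S":
        return raw
    if type_tag == "N":
        return Decimal(raw)  # numbers arrive as strings; Decimal avoids float rounding
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_typed(v) for v in raw]
    if type_tag == "M":
        return {k: from_typed(v) for k, v in raw.items()}
    raise ValueError(f"unsupported DynamoDB type tag in this sketch: {type_tag}")


def _image(body: dict, name: str):
    """Flatten an optional image; returns None when the record does not carry it."""
    if name not in body:
        return None
    return {k: from_typed(v) for k, v in body[name].items()}


def summarize(record: dict) -> dict:
    """Pull out the fields a CDC processor usually needs from one stream record."""
    body = record["dynamodb"]
    return {
        "event": record["eventName"],
        "key": {k: from_typed(v) for k, v in body["Keys"].items()},
        "seq": body["SequenceNumber"],
        "new": _image(body, "NewImage"),
        "old": _image(body, "OldImage"),
    }
```

Note that `seq` is kept as the string the record supplies; converting or padding it is a decision for the sink that compares versions.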

A subtle trap: a TTL deletion arrives as REMOVE with userIdentity.principalId = dynamodb.amazonaws.com and type = Service. If you mirror deletes downstream, filter TTL expiries explicitly or you replicate housekeeping you did not intend to.
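
A minimal sketch of that filter, using only the `eventName` and `userIdentity` fields described above. The function names and the two sample records are illustrative.

```python
def is_ttl_delete(record: dict) -> bool:
    """True when a REMOVE was issued by the DynamoDB TTL process, not by a caller."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def deletes_to_mirror(records: list) -> list:
    """Keep only the REMOVE events a downstream sink should replicate."""
    return [
        r for r in records
        if r.get("eventName") == "REMOVE" and not is_ttl_delete(r)
    ]


# Illustrative records, trimmed to the fields the filter reads.
ttl_expiry = {
    "eventName": "REMOVE",
    "userIdentity": {"type": "Service", "principalId": "dynamodb.amazonaws.com"},
    "dynamodb": {"Keys": {"pk": {"S": "session#1"}}},
}
user_delete = {
    "eventName": "REMOVE",
    "dynamodb": {"Keys": {"pk": {"S": "order#9"}}},
}
mirrored = deletes_to_mirror([ttl_expiry, user_delete])
```

Whether TTL expiries should be dropped or mirrored is a product decision; the point is to make the choice explicit rather than inherit it.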

The stream’s own limits are worth memorizing because several of them shape your tuning. These are the numbers that bite:

| Limit / quota | Value | Consequence if you hit it |
| --- | --- | --- |
| Retention | 24 hours | Unconsumed records lost permanently |
| Max record size | ~400 KB (item size) | Large items still fit; very wide items inflate batches |
| Stream view type changes | Requires disable + re-enable stream | New stream ARN; consumers must re-point |
| Efficient simultaneous readers | ~2 processes per shard | More readers risk GetRecords throttling |
| Shard lineage | Splits/merges with table partitions | Consumer must handle new/closed shards |
| GetRecords throughput | 5 calls/sec, up to 1,000 records or 1 MB | Over-reading throttles; ESM manages this for you |

Consuming with Lambda: batch size, window, and parallelization

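The native stream is consumed through a Lambda event source mapping, a managed poller AWS runs on your behalf. You tune it; you do not write the polling loop. Three knobs dominate throughput and latency: BatchSize, MaximumBatchingWindowInSeconds, and ParallelizationFactor. The remaining flags in the command below govern failure handling: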

aws lambda create-event-source-mapping \
  --function-name orders-cdc-processor \
  --event-source-arn "$STREAM_ARN" \
  --starting-position LATEST \
  --batch-size 100 \
  --maximum-batching-window-in-seconds 5 \
  --parallelization-factor 4 \
  --maximum-retry-attempts 3 \
  --bisect-batch-on-function-error \
  --maximum-record-age-in-seconds 21600 \
  --function-response-types ReportBatchItemFailures \
  --destination-config '{"OnFailure":{"Destination":"'"$DLQ_ARN"'"}}'

Every tunable on the mapping, what it does, its default, its bounds, and the trade-off — this is the table you keep open when you create or adjust an ESM:

| Setting | What it controls | Default | Range / values | Trade-off / gotcha |
| --- | --- | --- | --- | --- |
| BatchSize | Max records per invocation | 100 | 1–10,000 (streams) | Bigger amortizes overhead but widens failure blast radius and raises duration |
| MaximumBatchingWindowInSeconds | Wait before invoking | 0 | 0–300 | Higher = fewer/fuller invokes but more latency |
| ParallelizationFactor | Concurrent invokes per shard | 1 | 1–10 | Drains a hot shard; cross-key order within a shard relaxes (per-key holds) |
| StartingPosition | Where to begin | (none; required) | LATEST / TRIM_HORIZON | LATEST ignores history; TRIM_HORIZON replays the 24h window |
| MaximumRetryAttempts | Retries before giving up on a record | -1 (infinite) | -1, or 0–10,000 | Infinite + no bisect = a poison record stalls forever |
| MaximumRecordAgeInSeconds | Evict records older than this | -1 (= 24h) | -1, or 60–604,800 (capped at 24h effective) | Set below 24h so poison hits DLQ while readable |
| BisectBatchOnFunctionError | Split a failing batch and retry halves | false | true/false | The safety net for unhandled throws |
| FunctionResponseTypes | Enables partial-batch responses | (none) | ReportBatchItemFailures | Precise per-record checkpointing |
| DestinationConfig.OnFailure | Where failure context goes | (none) | SQS or SNS ARN | Without it, exhausted records vanish silently |
| TumblingWindowInSeconds | Stateful aggregation window | 0 | 0–900 | Enables windowed aggregation across invokes |
| Enabled | Poller on/off | true | true/false | Disable to pause without deleting the mapping |
| FilterCriteria | Pre-invoke event filtering | (none) | Up to 5 patterns | Drops non-matching records before you pay |

ParallelizationFactor deserves a warning. Ordering is preserved per partition key even with a factor above 1: the poller hashes each record’s partition key to one of the concurrent invocations, so all records for a key still go to the same invocation in order. What you lose is ordering across different keys within the same shard — almost always fine, because cross-key ordering was never guaranteed table-wide. Use it freely as long as your processing logic only assumes per-key order. How the parallelization factor interacts with order and throughput:

| Parallelization factor | Concurrency per shard | Per-key order | Cross-key order in a shard | When to use |
| --- | --- | --- | --- | --- |
| 1 (default) | 1 | Preserved | Preserved | Strict in-shard order needed; low throughput |
| 2–4 | 2–4 | Preserved | Relaxed | Moderate hot-shard catch-up |
| 5–10 | 5–10 | Preserved | Relaxed | Drain a badly hot shard during a spike |
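
The hashing behavior described above can be modeled in a few lines. This is an illustrative model only: the "lanes" stand for the concurrent invocations on one shard, and `crc32` is an assumed stand-in because the poller's actual hash function is not published.

```python
import zlib
from collections import defaultdict


def assign_lanes(records, factor):
    """Route (partition_key, sequence) pairs to one of `factor` lanes by key hash."""
    lanes = defaultdict(list)
    for key, seq in records:
        # crc32 is a stand-in; it only needs to be stable for a given key.
        lane = zlib.crc32(key.encode("utf-8")) % factor
        lanes[lane].append((key, seq))
    return dict(lanes)


def per_key_order_holds(lanes):
    """True if each key lives in exactly one lane and its sequences only increase."""
    home_lane = {}
    for lane, records in lanes.items():
        last_seq = {}
        for key, seq in records:
            if home_lane.setdefault(key, lane) != lane:
                return False
            if key in last_seq and seq <= last_seq[key]:
                return False
            last_seq[key] = seq
    return True


# One shard's records in commit order: (partition key, sequence number).
shard = [("order#1", 1), ("order#2", 2), ("order#1", 3),
         ("order#3", 4), ("order#2", 5), ("order#1", 6)]
lanes = assign_lanes(shard, factor=4)
print(per_key_order_holds(lanes))  # prints True
```

Records for different keys may now finish in any relative order across lanes, which is exactly the cross-key relaxation the table describes.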

--starting-position LATEST begins at the tip and ignores history; TRIM_HORIZON replays everything still in the 24-hour window. For a brand-new pipeline that must backfill, TRIM_HORIZON plus a one-time scan-and-load for older data is the usual combination — the stream alone cannot reach beyond 24 hours. The starting-position choices side by side:

| Starting position | Begins at | Replays history? | Use for |
| --- | --- | --- | --- |
| LATEST | The tip (new changes only) | No | Steady-state pipelines; you backfill separately |
| TRIM_HORIZON | Oldest record still in the 24h window | Yes (≤24h) | New pipeline catching up recent changes |
| (one-time Scan + load) | The full table snapshot | N/A (not the stream) | Backfilling data older than 24h |

Rather than reason about each knob in isolation, start from the workload and read off a recipe. A decision table mapping the requirement to the ESM configuration:

| If your requirement is… | Set | Because |
| --- | --- | --- |
| Lowest possible latency | MaximumBatchingWindowInSeconds=0, small BatchSize | Invoke as soon as records arrive |
| Highest throughput / fewest invokes | BatchSize=100–1000, window 5 | Full batches amortize overhead |
| Catch up a hot shard fast | ParallelizationFactor=8–10 | Drain one shard across many invokes |
| Strict in-shard cross-key order | ParallelizationFactor=1 | No concurrent invokes per shard |
| Survive a poison record | BisectBatchOnFunctionError + finite MaxRetryAttempts + DLQ | Isolate and evict it |
| Never reprocess good records | FunctionResponseTypes=ReportBatchItemFailures | Per-record checkpointing |
| Evict before the 24h cliff | MaximumRecordAgeInSeconds=21600 | DLQ while still readable |
| Backfill a brand-new pipeline | StartingPosition=TRIM_HORIZON + one-time Scan | Replay 24h + load older data |
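
The stall-related rows of that recipe can be turned into a pre-deploy check. The sketch below lints a mapping config expressed as a plain dict whose keys mirror the event source mapping parameter names used in this section; `lint_esm` and its messages are hypothetical, not an AWS tool.

```python
DAY_SECONDS = 24 * 60 * 60


def lint_esm(config: dict) -> list:
    """Return human-readable findings for settings that can stall or drop records."""
    findings = []
    retries = config.get("MaximumRetryAttempts", -1)            # -1 means infinite
    bisect = config.get("BisectBatchOnFunctionError", False)
    max_age = config.get("MaximumRecordAgeInSeconds", -1)       # -1 means 24h
    response_types = config.get("FunctionResponseTypes", [])
    on_failure = (
        config.get("DestinationConfig", {}).get("OnFailure", {}).get("Destination")
    )

    if retries == -1 and not bisect:
        findings.append("infinite retries without bisect: a poison record can stall the shard")
    if max_age == -1 or max_age >= DAY_SECONDS:
        findings.append("record age not capped below 24h: records may age out before eviction")
    if not on_failure:
        findings.append("no on-failure destination: exhausted records leave no trace")
    if "ReportBatchItemFailures" not in response_types:
        findings.append("no partial-batch responses: good records are reprocessed on failure")
    return findings


# The hardened settings from the CLI example, as a dict (the ARN is a placeholder).
hardened = {
    "MaximumRetryAttempts": 3,
    "BisectBatchOnFunctionError": True,
    "MaximumRecordAgeInSeconds": 21600,
    "FunctionResponseTypes": ["ReportBatchItemFailures"],
    "DestinationConfig": {"OnFailure": {"Destination": "arn:aws:sqs:REGION:ACCOUNT:cdc-dlq"}},
}
print(lint_esm(hardened))  # prints []
print(len(lint_esm({})))   # prints 4 (all defaults)
```

A check like this fits naturally in CI against the rendered Terraform plan or the output of a mapping lookup, so the defaults never reach production unnoticed.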

The concurrency limits that bound how far you can push parallelism — real numbers, because they cap your catch-up:

| Limit | Value | What hits it |
| --- | --- | --- |
| ParallelizationFactor max | 10 per shard | Hot-shard catch-up ceiling |
| BatchSize max (streams) | 10,000 records | Very large batches widen failure blast radius |
| MaximumBatchingWindowInSeconds max | 300 | Latency vs invoke-count trade |
| Account concurrent executions (default) | 1,000 (raisable) | Total Lambda concurrency across functions |
| Effective concurrency per stream | shards × ParallelizationFactor | The real parallel ceiling for one stream |
| Efficient readers per shard | ~2 | More risks GetRecords throttling |
| GetRecords | 5/sec, ≤1,000 records or 1 MB | The poller manages this for you |

Ordering and partial-batch failures: bisect-on-error and checkpointing

Here is the failure mode that bites everyone. By default, if your function throws on a batch, the ESM retries the entire batch — and because the shard’s checkpoint has not advanced, it keeps retrying the same batch until it succeeds, expires past MaximumRecordAgeInSeconds, or exhausts MaximumRetryAttempts. A single un-processable record blocks every record behind it in that shard. That is a stalled shard, and you see it as a climbing iterator age.

Two mechanisms fix this; use both.

Partial batch responses let the function tell the poller exactly which records failed so the rest are checkpointed and never reprocessed. Set --function-response-types ReportBatchItemFailures and return the failed sequence numbers:

def handler(event, context):
    failures = []
    for record in event["Records"]:
        seq = record["dynamodb"]["SequenceNumber"]
        try:
            process(record)
        except Exception:
            failures.append({"itemIdentifier": seq})
    return {"batchItemFailures": failures}

The contract has a sharp edge worth memorizing: because ordering matters, the poller treats the earliest reported failure as the checkpoint. Everything at or after that sequence number is retried; everything before it is considered done. So return failures honestly — do not report a late record as failed while silently dropping an earlier one, or you will skip data. Returning an empty list ({"batchItemFailures": []}) means the whole batch succeeded. The exact response contract and how the poller interprets each shape:

| Function returns | Poller interprets as | Records reprocessed | Trap |
| --- | --- | --- | --- |
| {"batchItemFailures": []} | Whole batch succeeded | None | (none) |
| {"batchItemFailures":[{"itemIdentifier": seq}]} | Fail from earliest listed seq | That seq and everything after | List the earliest real failure |
| Throws an exception | Whole batch failed | Entire batch | One bad record burns the whole budget (without bisect) |
| Returns null / nothing | Whole batch succeeded | None | Swallowing an error here silently drops data |
| Reports a late seq, drops an earlier one | Checkpoints past the dropped one | Only from the late seq | Data loss; never do this |
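
The checkpoint rule in that table is easy to get wrong, so here is a small model of it. It is a simplified illustration of the contract as described above, not the poller's implementation; `records_to_retry` is a hypothetical name, and treating an identifier that is not in the batch as a whole-batch failure is an assumption of this sketch.

```python
def records_to_retry(batch_seqs, response):
    """Model which sequence numbers are retried for a given handler response.

    batch_seqs: sequence numbers of the batch, in shard order.
    response:   what the handler returned (None, or a dict with batchItemFailures).
    """
    if not response or not response.get("batchItemFailures"):
        return []  # null or empty list: the whole batch is checkpointed
    reported = {f["itemIdentifier"] for f in response["batchItemFailures"]}
    positions = [i for i, seq in enumerate(batch_seqs) if seq in reported]
    if not positions:
        # Assumption in this sketch: nothing recognizable was reported,
        # so the whole batch is treated as failed.
        return list(batch_seqs)
    # The earliest reported failure becomes the checkpoint; it and everything
    # after it are retried, even records that actually succeeded.
    return list(batch_seqs[min(positions):])


batch = ["100", "101", "102", "103"]
print(records_to_retry(batch, {"batchItemFailures": [{"itemIdentifier": "102"}]}))
# prints ['102', '103']
print(records_to_retry(batch, {"batchItemFailures": [{"itemIdentifier": "103"},
                                                     {"itemIdentifier": "101"}]}))
# prints ['101', '102', '103']
```

The second call shows why record 102 gets reprocessed even though it succeeded: it sits after the earliest failure, which is one more reason every downstream write has to be idempotent.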

Bisect-on-error (--bisect-batch-on-function-error) handles the case where the function throws instead of reporting failures. On error the poller splits the batch in two and retries each half, recursively narrowing down to the single offending record. Once retries for that record are exhausted, the poller sends failure context identifying it (shard and sequence-number range, not the record payload) to the on-failure destination, and the rest of the batch proceeds. Without bisect, one bad record can burn your entire retry budget on a 100-record batch. The two mechanisms compared (ship both, but know which does what):

| Mechanism | Precision | Reprocesses good records? | Handles unhandled throw? | Role |
| --- | --- | --- | --- | --- |
| ReportBatchItemFailures | Per record (exact seq) | No | Only if you catch and report | Primary: precise checkpointing |
| BisectBatchOnFunctionError | Narrows to one record via halving | Yes (re-runs halves) | Yes | Safety net for exceptions you didn’t anticipate |
| MaximumRetryAttempts | Batch/record budget | n/a | n/a | Bounds how long a failure retries |
| MaximumRecordAgeInSeconds | Time-based eviction | n/a | n/a | Evicts to DLQ before 24h age-out |
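
The halving behavior can be simulated to see both the isolation and the cost in extra invocations. This is a toy model under stated assumptions: it ignores retry counts and timing, the handler throws whenever its chunk contains the poison record, and `run_with_bisect` is a hypothetical name.

```python
def run_with_bisect(batch, handler):
    """Simulate bisect-on-error: returns (delivered, isolated, invocations)."""
    delivered, isolated, invocations = [], [], 0
    pending = [list(batch)]
    while pending:
        chunk = pending.pop(0)
        invocations += 1
        try:
            handler(chunk)
            delivered.extend(chunk)
        except Exception:
            if len(chunk) == 1:
                # Narrowed to one record; in the real system its failure context
                # would go to the on-failure destination once retries run out.
                isolated.extend(chunk)
            else:
                mid = len(chunk) // 2
                # Retry the halves in order, ahead of anything still pending.
                pending[:0] = [chunk[:mid], chunk[mid:]]
    return delivered, isolated, invocations


POISON = 5


def handler(chunk):
    """Stand-in for a function that throws on one malformed record."""
    if POISON in chunk:
        raise ValueError("malformed item")


delivered, isolated, invocations = run_with_bisect(list(range(8)), handler)
print(delivered)    # prints [0, 1, 2, 3, 4, 6, 7]
print(isolated)     # prints [5]
print(invocations)  # prints 7
```

Seven invocations for eight records is the price of the safety net, and a real handler may have partially processed good records before throwing, so they can be seen more than once.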

Rule of thumb: ReportBatchItemFailures is the primary mechanism — precise, avoids reprocessing good records. BisectBatchOnFunctionError is the safety net for unexpected exceptions and serialization bugs you did not anticipate. Ship both.

EventBridge Pipes: filtering, enrichment, and routing without glue Lambdas

A large fraction of CDC “processors” are glue: filter the events you care about, reshape them, send them to a target. EventBridge Pipes does exactly that as a managed integration — source, optional filter, optional enrichment, target — with no poller code to own. The source can be a DynamoDB Stream directly. The four stages and what each is for:

| Pipe stage | Purpose | Optional? | Cost timing | Typical implementation |
| --- | --- | --- | --- | --- |
| Source | Read the DynamoDB Stream (batch/window/retry like an ESM) | Required | n/a | Stream ARN + dynamodb_stream_parameters |
| Filter | Drop non-matching records before you pay | Optional | Runs first, cheapest | Up to 5 EventBridge patterns |
| Enrichment | Hydrate/transform the event | Optional | After filter | Lambda, Step Functions Express, API GW, API destination |
| Target | Where the final payload goes | Required | After enrichment | EventBridge bus, SQS, SNS, Step Functions, Lambda, … |

The filter stage runs before you pay for enrichment or invocation, so push as much selectivity into it as you can. Pipes filter patterns match the stream record shape, including the DynamoDB-typed attribute values:

{
  "eventName": ["MODIFY"],
  "dynamodb": {
    "NewImage": {
      "status": { "S": ["SHIPPED"] }
    }
  }
}
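
To reason about which records a pattern like this lets through, a tiny local matcher helps when unit-testing filters before deploying them. The sketch below handles only the exact-match form used above (nested objects whose leaves are lists of allowed values); real EventBridge patterns support more operators, so treat `matches` as a simplified, hypothetical stand-in.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Exact-match subset of event pattern semantics: every pattern field must match."""
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            # Nested object: recurse into the corresponding part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the event value must equal one of the listed values.
            return False
    return True


pattern = {
    "eventName": ["MODIFY"],
    "dynamodb": {"NewImage": {"status": {"S": ["SHIPPED"]}}},
}

shipped = {
    "eventName": "MODIFY",
    "dynamodb": {"NewImage": {"pk": {"S": "order#1"}, "status": {"S": "SHIPPED"}}},
}
pending = {
    "eventName": "MODIFY",
    "dynamodb": {"NewImage": {"pk": {"S": "order#2"}, "status": {"S": "PENDING"}}},
}
print(matches(pattern, shipped))  # prints True
print(matches(pattern, pending))  # prints False
```

Fields in the event that the pattern does not mention are ignored, which is why the extra `pk` attribute does not affect the result.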

The Terraform wiring is compact, and the dynamodb_stream_parameters block carries the same batch/window/retry semantics as a Lambda ESM:

resource "aws_pipes_pipe" "orders_shipped" {
  name     = "orders-shipped-to-bus"
  role_arn = aws_iam_role.pipe.arn
  source   = aws_dynamodb_table.orders.stream_arn
  target   = aws_cloudwatch_event_bus.commerce.arn

  source_parameters {
    dynamodb_stream_parameters {
      starting_position                  = "LATEST"
      batch_size                         = 100
      maximum_batching_window_in_seconds = 5
      maximum_retry_attempts             = 4
      on_partial_batch_item_failure      = "AUTOMATIC_BISECT"
      dead_letter_config {
        arn = aws_sqs_queue.pipe_dlq.arn
      }
    }
    filter_criteria {
      filter {
        pattern = jsonencode({
          eventName = ["MODIFY"]
          dynamodb  = { NewImage = { status = { S = ["SHIPPED"] } } }
        })
      }
    }
  }
}

The enrichment stage can hydrate the event — look up a customer record, call a pricing service — and return a transformed payload, all without you running a poller. The target transformer then reshapes the final payload with an input template. The net effect: a filter-enrich-route pipeline that used to be 150 lines of Lambda becomes declarative infrastructure, and the only code you keep is genuine business logic in the enrichment. The decision between Pipes and a Lambda ESM, made explicit:

| If the work is… | Lean toward | Why |
| --- | --- | --- |
| Filter + reshape + route | EventBridge Pipes | No poller code; filter runs before you pay |
| Non-trivial transformation needing libraries | Lambda ESM | Real code, real error handling |
| Transactional writes to multiple sinks | Lambda ESM | Coordinated, conditional writes |
| Enrich from one service, then route | Pipes (Lambda enrichment) | Keep only business logic as code |
| Fan-out to many independent consumers | Pipes → EventBridge bus | One rule + DLQ per consumer |
| Windowed aggregation across batches | Lambda ESM (tumbling window) | Pipes has no tumbling window |

Choose Pipes when the work is route and reshape. Keep a Lambda ESM when the work is compute — non-trivial transformation, transactional writes to multiple sinks, or anything needing libraries and real error handling.

The Pipes target options, end to end, with what each is good for as a CDC destination:

| Pipe target | Delivery shape | Good CDC use | Note |
| --- | --- | --- | --- |
| EventBridge bus | Event onto a bus | Fan-out to many independent rules/sinks | The canonical decoupling target |
| SQS queue | Message to a queue | Buffer + a single consumer with backpressure | Pair with a DLQ |
| SNS topic | Pub/sub fan-out | Push to multiple subscribers | No replay/buffer |
| Step Functions (Standard) | Start an execution | Long-running orchestration per change | Async; durable |
| Step Functions (Express) | Synchronous workflow | Short enrichment/route logic | Also valid as enrichment |
| Lambda | Invoke a function | Real compute on the routed event | When you do want code |
| Kinesis / Firehose | Record to a stream | Land in a lake / re-stream | Firehose → S3 for the system of record |
| API destination | HTTP call | Push to an external SaaS/webhook | Built-in retry + auth |

Idempotency and exactly-once semantics in downstream sinks

DynamoDB Streams are at-least-once. Redrives, bisect retries, and poller restarts all mean a downstream sink can see the same change more than once. “Exactly-once” downstream is therefore not a delivery setting you toggle — it is an idempotent write you design. The good news is every stream record hands you the tools: a stable item key and a monotonic SequenceNumber.

The cleanest pattern is a conditional write keyed on sequence. Store the last applied sequence number per item and reject anything not strictly newer. This collapses duplicates and protects against out-of-order application during reprocessing:

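import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("orders-projection")  # hypothetical target table

def apply_idempotent(key, new_image, sequence):
    # `sequence` must be numeric, e.g. int(record["dynamodb"]["SequenceNumber"]):
    # comparing the raw strings would be lexicographic, not numeric.
    # Note that DynamoDB numbers hold at most 38 significant digits.
    try:
        table.put_item(
            Item={**new_image, "_seq": sequence},
            # Write only if no version is stored yet, or the stored one is older.
            ConditionExpression="attribute_not_exists(#s) OR #s < :seq",
            ExpressionAttributeNames={"#s": "_seq"},
            ExpressionAttributeValues={":seq": sequence},
        )
    except ClientError as e:
        if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
            raise
        # Duplicate or stale replay -> safe no-op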

DynamoDB SequenceNumber values are strictly increasing within a shard for a given key, which is exactly the scope this guard needs. For sinks that lack conditional writes, fall back to a deterministic idempotency key — partitionKey#sortKey#sequenceNumber — and let the sink’s natural upsert (OpenSearch _id, an S3 object key, a primary-key MERGE) absorb the duplicate. The principle is invariant: make replay a no-op, never an additive side effect. The idempotency strategy per sink type:

| Sink | Idempotency mechanism | Dedup key | Out-of-order protection |
| --- | --- | --- | --- |
| DynamoDB (another table) | Conditional PutItem on _seq | Item key | _seq < :seq rejects stale |
| OpenSearch | index with external version | _id = item key | _version_type=external rejects lower version |
| S3 (data lake) | Deterministic object key | pk#sk#seq.json | Same key overwrites; immutable log |
| Relational warehouse | MERGE / upsert on PK | Primary key | Compare a version/seq column |
| SQS → consumer | Idempotency table at consumer | MessageDeduplicationId / pk#sk#seq | Consumer-side last-seq check |
| Cache (Redis/Memcached) | SET keyed by item key | Item key | Conditional SET on stored seq |
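
The deterministic pk#sk#seq key is simple enough to sketch. The version below is one possible shape, not a prescribed format: the function names and the choice to percent-encode each part (so a `#` inside a key value cannot be confused with the separator) are assumptions added for illustration.

```python
from urllib.parse import quote


def idempotency_key(pk, sk, sequence_number):
    """Build a deterministic pk#sk#seq key: the same record always yields the same key."""
    # Percent-encode each part so a '#' inside a value stays distinguishable
    # from the '#' used as the separator.
    parts = [quote(str(part), safe="") for part in (pk, sk, sequence_number)]
    return "#".join(parts)


def s3_object_key(prefix, pk, sk, sequence_number):
    """Object key for the lake copy of a change; a replay overwrites the same object."""
    return f"{prefix}/{idempotency_key(pk, sk, sequence_number)}.json"


first = s3_object_key("cdc", "ORDER#1", "LINE#2", "700")
replay = s3_object_key("cdc", "ORDER#1", "LINE#2", "700")
print(first)
print(first == replay)  # a redelivered record maps to the identical key
```

Because the key is a pure function of the record, a duplicate delivery lands on the same object or document id and the sink's upsert absorbs it.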

The failure modes idempotency defends against, so you can reason about why each guard exists:

| Without idempotency you get… | Caused by | The guard that prevents it |
| --- | --- | --- |
| Double-applied insert/modify | Redrive or bisect replay | Conditional-on-seq / upsert by id |
| Re-issued delete on an already-deleted item | Replay of a REMOVE | Idempotent delete (no-op if absent) |
| Newer doc overwritten by older replay | Out-of-order reprocessing | External version / _seq < :seq |
| Duplicate rows in the warehouse | At-least-once + naive INSERT | MERGE on PK |
| Inflated counters/aggregates | Additive side effect on replay | Make the write a set, not an increment |
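
The difference between a guarded set and an additive write shows up in a few lines of plain Python. This is an in-memory simulation with hypothetical names (`apply_set`, `apply_increment`, the `sku-1` data); it stands in for a real sink only to show how each style behaves when a record is delivered twice.

```python
def apply_set(store, key, value, seq):
    """Apply a change only if it is strictly newer than what the store holds."""
    current = store.get(key)
    if current is not None and current["seq"] >= seq:
        return False  # duplicate or stale replay: no-op
    store[key] = {"value": value, "seq": seq}
    return True


def apply_increment(counters, key, delta):
    """Naive additive write: every replay changes the result."""
    counters[key] = counters.get(key, 0) + delta


changes = [("sku-1", 10, 101), ("sku-1", 7, 102)]  # (key, stock, sequence)
replayed = changes + changes[:1]  # the first change is delivered again, late

store, counters = {}, {}
for key, stock, seq in replayed:
    apply_set(store, key, stock, seq)
    apply_increment(counters, key, 1)

print(store["sku-1"])     # {'value': 7, 'seq': 102}: the stale replay was ignored
print(counters["sku-1"])  # 3: three deliveries counted for two real changes
```

The guarded set converges to the same state no matter how often the batch is replayed; the counter drifts by one on every redelivery.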

Fan-out to OpenSearch, S3, and analytics targets

Real pipelines feed several sinks with different durability and freshness needs. Resist the urge to fan out from a single mega-Lambda — couple them and one slow target stalls the shard for all of them. Fan out at the bus instead: a Pipe lands changes on an EventBridge bus, and independent rules deliver to each consumer with its own retry budget and DLQ.

Each sink has a different freshness/durability/cost profile — match the sink to the requirement:

| Sink | Freshness | Durability role | Cost driver | Per-sink failure handling |
| --- | --- | --- | --- | --- |
| OpenSearch | Sub-second to seconds | Queryable projection (rebuildable) | Cluster + indexing | Bus rule retry + DLQ; _version guard |
| Firehose → S3 | Buffered (≈60s / 5MB) | System of record (replayable) | Per-GB + PUTs | Firehose retry + S3 error bucket |
| Redshift / Athena | Minutes (after S3 land) | Analytical store | Compute + scan | Reload from immutable S3 |
| Another DynamoDB table | Sub-second | Materialized view | Write capacity (WCU / write request units) | Conditional-on-seq write |
| ElastiCache | Sub-second | Cache (rebuildable) | Node hours | Conditional SET; TTL |

A single Lambda target can also do the OpenSearch fan-out directly when you want code-level control — emit per-document failures via ReportBatchItemFailures so a 409 on one doc does not block the batch:

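from boto3.dynamodb.types import TypeDeserializer
from opensearchpy import helpers

_deserializer = TypeDeserializer()

def deserialize(image):
    # Convert DynamoDB-typed attributes ({"S": "x"}) into plain Python values.
    return {k: _deserializer.deserialize(v) for k, v in image.items()}

def to_action(record):
    key = record["dynamodb"]["Keys"]["pk"]["S"]
    if record["eventName"] == "REMOVE":
        return {"_op_type": "delete", "_index": "orders", "_id": key}
    img = deserialize(record["dynamodb"]["NewImage"])
    # Caveat: an external version must fit a signed 64-bit long, and stream
    # SequenceNumbers can be longer than that. Verify the range for your stream,
    # or derive the version from a counter stored on the item.
    version = int(record["dynamodb"]["SequenceNumber"])
    return {
        "_op_type": "index", "_index": "orders", "_id": key,
        "_version": version, "_version_type": "external", "_source": img,
    }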
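
The snippet above builds the bulk actions but stops short of turning per-document outcomes into a partial-batch response. A minimal sketch of that last step follows. It assumes the results arrive as (ok, item) pairs in the same order the actions were sent, which is the shape `opensearchpy.helpers.streaming_bulk` yields with `raise_on_error=False`; the function name `batch_item_failures` and the sample statuses are hypothetical.

```python
def batch_item_failures(sequence_numbers, bulk_results):
    """Map per-document bulk outcomes to a ReportBatchItemFailures response.

    sequence_numbers: stream SequenceNumber for each action, in the order sent.
    bulk_results: (ok, item) pairs in that same order (assumed shape).
    """
    failures = []
    for seq, (ok, item) in zip(sequence_numbers, bulk_results):
        # item looks like {"index": {"_id": "...", "status": 409, ...}}
        op_type, result = next(iter(item.items()))
        status = result.get("status")
        if ok or status == 409:
            continue  # 409 = version conflict: a stale replay, the guard worked
        if op_type == "delete" and status == 404:
            continue  # deleting an absent document is an idempotent no-op
        failures.append({"itemIdentifier": seq})
    return {"batchItemFailures": failures}


results = [
    (True, {"index": {"_id": "a", "status": 201}}),
    (False, {"index": {"_id": "b", "status": 409}}),
    (False, {"index": {"_id": "c", "status": 429}}),
    (False, {"delete": {"_id": "d", "status": 404}}),
]
response = batch_item_failures(["101", "102", "103", "104"], results)
print(response)  # {'batchItemFailures': [{'itemIdentifier': '103'}]}
```

Only the throttled document is reported, so the poller retries from that record onward while the conflict and the redundant delete count as done.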

Mapping each eventName to the right sink action — the table that keeps your projector correct:

| eventName | Image present | OpenSearch action | S3 action | Warehouse action |
| --- | --- | --- | --- | --- |
| INSERT | NewImage | index (upsert by _id) | Write pk#sk#seq.json | MERGE insert |
| MODIFY | New + Old | index (upsert by _id) | Write new version object | MERGE update |
| REMOVE (user) | OldImage only | delete by _id | Write tombstone object | MERGE soft-delete flag |
| REMOVE (TTL) | OldImage + userIdentity | Filter out (or delete) | Skip or tag as housekeeping | Skip |
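
The table reduces to a small classifier. The sketch below is one way to express it for the search projection; the function names are hypothetical, and treating TTL deletes as "skip" is a policy choice you may want to flip if expiry should propagate to the index.

```python
def is_ttl_delete(record):
    """True when DynamoDB's TTL process, not a user, removed the item."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def sink_action(record):
    """Return 'index', 'delete', or 'skip' for the search projection."""
    if is_ttl_delete(record):
        return "skip"  # housekeeping; return 'delete' instead to propagate expiry
    if record["eventName"] == "REMOVE":
        return "delete"
    return "index"  # INSERT and MODIFY are both upserts by _id


user_delete = {"eventName": "REMOVE"}
ttl_delete = {
    "eventName": "REMOVE",
    "userIdentity": {"type": "Service", "principalId": "dynamodb.amazonaws.com"},
}
print(sink_action({"eventName": "MODIFY"}), sink_action(user_delete), sink_action(ttl_delete))
```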

Handling poison records with on-failure destinations and DLQs

A poison record is one that will never succeed no matter how many times you retry — a schema your code cannot parse, a downstream constraint it will always violate. The entire point of MaximumRetryAttempts, MaximumRecordAgeInSeconds, and an on-failure destination is to evict that record so the shard keeps moving.

For a Lambda ESM, the on-failure destination (SQS or SNS) receives metadata about the failed batch, not the record bodies themselves — shard ID, the failing sequence-number range, and an error summary. That is deliberate: the records still live in the stream for 24 hours, so the destination’s job is to point you at them. Your redrive tooling reads the DynamoDBStreamArn, shard, and sequence-number window from the failure message and re-reads the offending records with GetShardIterator at AT_SEQUENCE_NUMBER. An EventBridge Pipes DLQ behaves the same way — it captures failure context after retries are exhausted.

{
  "requestContext": { "functionArn": "arn:aws:lambda:...:orders-cdc-processor" },
  "responseContext": { "statusCode": 200, "functionError": "Unhandled" },
  "DDBStreamBatchInfo": {
    "shardId": "shardId-00000001700000000000-abcdef01",
    "startSequenceNumber": "700000000000000000001",
    "endSequenceNumber": "700000000000000000099",
    "approximateArrivalOfFirstRecord": "2026-06-08T14:00:00Z",
    "batchSize": 99,
    "streamArn": "arn:aws:dynamodb:...:table/orders/stream/2026-06-08T00:00:00.000"
  }
}

What the failure message contains, and what you do with each field during a redrive:

| Field in DDBStreamBatchInfo | Meaning | Redrive use |
| --- | --- | --- |
| shardId | The shard that failed | GetShardIterator target |
| startSequenceNumber | First failing seq | Iterator at AT_SEQUENCE_NUMBER |
| endSequenceNumber | Last failing seq | Stop boundary for the re-read |
| streamArn | The stream | Which stream to read |
| approximateArrivalOfFirstRecord | When it arrived | How close to the 24h cliff you are |
| batchSize | How many records | Scope of the redrive |
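
A redrive reader built from those fields might look like the sketch below. It expects a client with the `get_shard_iterator` and `get_records` calls of boto3's `dynamodbstreams` client; `reread_failed_batch`, `max_calls`, and the `FakeStreams` stand-in (included only so the example runs without AWS) are hypothetical names introduced here.

```python
import json


def reread_failed_batch(streams_client, dlq_body, max_calls=20):
    """Re-read the records named by an on-failure message (illustrative sketch)."""
    info = json.loads(dlq_body)["DDBStreamBatchInfo"]
    end_seq = int(info["endSequenceNumber"])
    iterator = streams_client.get_shard_iterator(
        StreamArn=info["streamArn"],
        ShardId=info["shardId"],
        ShardIteratorType="AT_SEQUENCE_NUMBER",
        SequenceNumber=info["startSequenceNumber"],
    )["ShardIterator"]

    records = []
    for _ in range(max_calls):  # bound the loop: an open shard never "ends"
        page = streams_client.get_records(ShardIterator=iterator, Limit=1000)
        for record in page["Records"]:
            if int(record["dynamodb"]["SequenceNumber"]) > end_seq:
                return records  # past the failed window
            records.append(record)
        iterator = page.get("NextShardIterator")
        if not iterator:
            break  # shard is closed and fully read
    return records


class FakeStreams:
    """Offline stand-in for boto3.client('dynamodbstreams')."""

    def __init__(self, seqs):
        self.seqs = seqs

    def get_shard_iterator(self, **kwargs):
        return {"ShardIterator": kwargs["SequenceNumber"]}

    def get_records(self, ShardIterator, Limit):
        start = int(ShardIterator)
        recs = [{"dynamodb": {"SequenceNumber": s}} for s in self.seqs if int(s) >= start]
        return {"Records": recs[:Limit]}  # no NextShardIterator: closed shard


body = json.dumps({"DDBStreamBatchInfo": {
    "shardId": "shardId-0001",
    "startSequenceNumber": "100000000000000000001",
    "endSequenceNumber": "100000000000000000003",
    "streamArn": "arn:aws:dynamodb:us-east-1:123456789012:table/orders/stream/example",
}})
seqs = [f"10000000000000000000{i}" for i in range(1, 6)]
recovered = reread_failed_batch(FakeStreams(seqs), body)
print([r["dynamodb"]["SequenceNumber"][-1] for r in recovered])  # ['1', '2', '3']
```

Against a real stream you would pass `boto3.client("dynamodbstreams")` instead of the fake, and run the re-read as a reviewed, manual step rather than an automatic retry.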

Alarm on ApproximateAgeOfOldestRecord and on DLQ depth, and make the redrive a deliberate, reviewed action — not an automatic retry that re-poisons the pipeline. Set MaximumRecordAgeInSeconds comfortably below 24 hours (e.g. 6 hours) so records are evicted to the DLQ while they are still readable in the stream; if a record ages past the 24-hour retention before you redrive, it is gone for good. The on-failure surface differs slightly between ESM and Pipes:

| Aspect | Lambda ESM on-failure | EventBridge Pipes DLQ |
| --- | --- | --- |
| Destination types | SQS or SNS | SQS (or SNS) |
| Payload | Batch metadata (shard + seq range) | Failure context after retries |
| Records included | No (still in stream 24h) | No (still in stream 24h) |
| Trigger | After MaximumRetryAttempts / age | After maximum_retry_attempts |
| Redrive method | Re-read shard at AT_SEQUENCE_NUMBER | Re-read shard at AT_SEQUENCE_NUMBER |
| Bisect setting | --bisect-batch-on-function-error | on_partial_batch_item_failure = AUTOMATIC_BISECT |

The metrics you alarm on are the difference between catching a stall in minutes and discovering it from a customer complaint. The CloudWatch reference for a DynamoDB-Streams CDC pipeline — namespace, what it measures, and a starting alarm threshold:

| Metric | Namespace | What it tells you | Starting alarm threshold |
| --- | --- | --- | --- |
| IteratorAge / GetRecords.IteratorAgeMilliseconds | AWS/Lambda / AWS/Kinesis (Kinesis Data Streams for DynamoDB surface) | How far behind the consumer is (the stall signal) | > 60,000 ms sustained 5 min |
| ApproximateAgeOfOldestRecord | AWS/Lambda (ESM) | Oldest unprocessed record’s age | > 1/4 of MaximumRecordAgeInSeconds |
| Errors | AWS/Lambda | Function invocation failures | > 0 sustained (with retries looping) |
| Throttles | AWS/Lambda | Concurrency ceiling hit | > 0 sustained 5 min |
| ConcurrentExecutions | AWS/Lambda | Headroom vs account/reserved limit | > 80% of reserved |
| DestinationDeliveryFailures / DLQ ApproximateNumberOfMessagesVisible | AWS/Lambda / AWS/SQS | Failed deliveries to the on-failure destination / poison batches captured | ≥ 1 (page on any) |
| ReadThrottleEvents | AWS/DynamoDB (stream) | Too many readers per shard | > 0 sustained |
| ProvisionedConcurrencySpilloverInvocations | AWS/Lambda | Provisioned-concurrency pool overflow (if using provisioned concurrency) | > 0 |
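
As a starting point, the first row of the table can be expressed as arguments for CloudWatch's `put_metric_alarm`. The sketch below only builds the keyword arguments; the helper name, the alarm name format, and the sample function and topic names are hypothetical, and the 60-second period with five evaluation periods is one way to encode "sustained 5 min".

```python
def iterator_age_alarm(function_name, sns_topic_arn, threshold_ms=60_000):
    """Keyword arguments for cloudwatch.put_metric_alarm on Lambda IteratorAge."""
    return {
        "AlarmName": f"{function_name}-iterator-age",
        "Namespace": "AWS/Lambda",
        "MetricName": "IteratorAge",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 5,      # 5 x 60 s = sustained 5 minutes
        "Threshold": threshold_ms,   # IteratorAge is reported in milliseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


alarm = iterator_age_alarm(
    "orders-cdc-processor", "arn:aws:sns:us-east-1:123456789012:oncall"
)
print(alarm["AlarmName"], alarm["Threshold"])
```

With credentials in place, `boto3.client("cloudwatch").put_metric_alarm(**alarm)` would create it. Using the Maximum statistic keeps one lagging shard from being averaged away.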

The error and exception strings you will actually see, what each means on this path, and the first move:

| Error / condition | Where it surfaces | Likely cause | How to confirm | First fix |
| --- | --- | --- | --- | --- |
| ConditionalCheckFailedException | Idempotent PutItem | Duplicate or stale replay (expected) | It’s a no-op by design | Swallow it; do NOT re-raise |
| version_conflict_engine_exception (409) | OpenSearch index | Older version rejected by external versioning | Stale replay arrived after newer | Treat as success; the guard worked |
| ProvisionedThroughputExceededException | GetRecords | Too many readers per shard | ReadThrottleEvents > 0 | ≤2 readers; move to Kinesis fan-out |
| TrimmedDataAccessException | GetShardIterator/redrive | Record aged past 24h before redrive | ApproximateAgeOfOldestRecord near 24h | Lower MaximumRecordAgeInSeconds; alarm earlier |
| ExpiredIteratorException | Long-running consumer | Shard iterator used more than ~15 min after it was issued | Iterator reused too late | Re-acquire the iterator |
| ResourceNotFoundException (stream) | ESM create / Pipe | Stream ARN stale after view-type change | View type changed → new ARN | Re-point the ESM/Pipe to the new ARN |
| Unhandled (in DLQ functionError) | On-failure destination | Function threw; batch exhausted retries | DLQ message present | Redrive from shard + seq range |
| Lambda Task timed out | CloudWatch logs | Per-record work exceeds function timeout | Duration ≈ configured timeout | Smaller batch; raise timeout; optimize |

Architecture at a glance

The diagram traces a change as it actually flows, then maps each failure class onto the exact hop where it bites. Read it left to right. A write commits to the orders table and lands on its Stream (NEW_AND_OLD_IMAGES, sharded, 24-hour retention) — badge 1 marks where records age out of the window if a stall lets them. The capture/poll zone holds the two managed pollers: the Lambda ESM (batch 100, parallelization 1–10) and EventBridge Pipes (filter, no code) — badge 2 sits on the ESM, where a hot shard stalls the single poller per shard. In process, the CDC processor writes idempotently (conditional on _seq) — badge 3 is the duplicate / out-of-order apply class — and the DLQ captures the shardId + sequence range when a poison record wedges the shard (badge 4). The fan-out sinks — OpenSearch (upsert by _id/_version), Firehose→S3 (NDJSON, gzip), encrypted with a KMS CMK — carry badge 5, the sink-drift / fan-out-coupling class. Finally CloudWatch watches IteratorAge, the leading indicator for every stall.

The whole method is in that left-to-right path: a change is captured once, applied effectively once through an idempotent write (delivery itself stays at-least-once), fanned out at the bus so each sink owns its retries, and observed so a climbing iterator age pages you before the 24-hour cliff. The five numbered badges are the five ways it goes wrong, and the legend narrates each as symptom · confirm · fix, so when the pager fires you localize the failure to a hop, read the cause, run the named confirm, and apply the fix.

Figure: DynamoDB Streams change-data-capture architecture. A write to the orders table lands on its Stream (NEW_AND_OLD_IMAGES, sharded, 24-hour retention); a Lambda event source mapping (batch size 100, parallelization factor 1 to 10) and EventBridge Pipes (filtering, no code) capture the changes; the CDC processor Lambda writes idempotently with a conditional-on-sequence expression while an SQS dead-letter queue captures the shard id and sequence range of poison records; changes fan out to OpenSearch (upsert by id with external version) and Kinesis Data Firehose to S3 as gzipped newline-delimited JSON, all encrypted with a KMS customer-managed key; CloudWatch alarms on iterator age. Five numbered badges mark the failure classes, each narrated as symptom, confirm, fix in the legend.

Real-world scenario

A retail platform team — call them MeridianStock — ran inventory on a single DynamoDB table fronting 2,000 stores, with a Lambda ESM projecting every change into an OpenSearch index that powered the storefront’s “in stock near you” search. Average load was 600 writes/second; the team was five engineers; the monthly bill for the table + stream + Lambda + OpenSearch ran about ₹95,000. On a normal day it was fine.

During a flash sale, one SKU’s partition went hot. The ESM’s iterator age on that shard climbed past 40 minutes, and search results went stale right when traffic peaked — the exact failure that costs money. Worse, a malformed bundle item (a nested map their deserializer choked on) had been silently stuck in that shard for hours, blocking every inventory update behind it because the whole batch kept failing and retrying. Their original ESM had BatchSize=500, ParallelizationFactor=1, no partial-batch reporting, and no bisect — so a single bad record consumed the entire retry budget on a 500-record batch, over and over.

The constraint was twofold: they needed the hot shard to catch up without resharding a 2,000-store table mid-sale, and they could not let one un-parseable item hold the rest hostage. The breakthrough was reading the right metric. GetRecords.IteratorAgeMilliseconds was pinned near 2,400,000 (40 minutes) on exactly one shard while the others sat near zero — a textbook hot-shard-plus-poison signature, not a capacity problem. Scaling the table would have done nothing.

They made three changes, all configuration. They raised ParallelizationFactor to 8 so the hot shard drained across eight concurrent invocations (ordering held, because all records for a given SKU still hashed to one invocation). They switched the function to ReportBatchItemFailures and returned precise failed sequence numbers, so good records checkpointed immediately instead of being reprocessed behind the poison item. And they enabled BisectBatchOnFunctionError with MaximumRecordAgeInSeconds=21600, so the malformed bundle item was narrowed down and evicted to a DLQ within minutes instead of wedging the shard for hours.

aws lambda update-event-source-mapping \
  --uuid "$ESM_UUID" \
  --parallelization-factor 8 \
  --batch-size 100 \
  --maximum-record-age-in-seconds 21600 \
  --bisect-batch-on-function-error \
  --function-response-types ReportBatchItemFailures \
  --destination-config '{"OnFailure":{"Destination":"'"$DLQ_ARN"'"}}'

The outcome: iterator age on the hot shard fell back under 30 seconds during the next sale, the poison bundle item surfaced in the DLQ with its exact sequence range for an engineer to fix offline, and the storefront search stopped lying about stock during the one window where accuracy mattered most. The following sprint they also moved the data-lake sink off the projector Lambda and onto an EventBridge Pipe → bus, so a slow OpenSearch never again stalled the lake. No table redesign, no resharding: just the ESM knobs used the way they were designed. The incident as a timeline, because the order of moves is the lesson:

| Time | Symptom | Action taken | Effect | What it should have been |
| --- | --- | --- | --- | --- |
| T+0 | Search stale during sale | (alert on stale results, not the pipeline) |  | Alarm on IteratorAge, not user reports |
| T+10m | One shard’s age at 40 min | Considered scaling the table | Would not help | Read per-shard IteratorAge first |
| T+20m | Confirmed hot shard + poison | ParallelizationFactor 1 → 8 | Hot shard drains 8-way | Correct catch-up lever |
| T+25m | Poison still blocking | ReportBatchItemFailures on | Good records checkpoint | Stop reprocessing good behind bad |
| T+30m | Poison record isolated | Bisect + MaximumRecordAgeInSeconds=21600 | Bundle item to DLQ in minutes | Evict while still readable |
| +1 sprint | Lake coupled to projector | Pipe → EventBridge bus | Sinks decoupled | Fan out at the bus from day one |
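
The ordering claim behind the ParallelizationFactor change (per-key order holds because every record for a key goes to the same concurrent invocation) can be illustrated with a toy model. Lambda's actual assignment is internal to the service; the CRC32 hash, the `assign_lanes` name, and the sample SKUs below are stand-ins chosen only to show the property.

```python
import zlib
from collections import defaultdict


def assign_lanes(records, parallelization_factor):
    """Group (key, seq) records into lanes by a stable hash of the key."""
    lanes = defaultdict(list)
    for key, seq in records:  # records arrive in shard order
        lane = zlib.crc32(key.encode()) % parallelization_factor
        lanes[lane].append((key, seq))
    return lanes


shard = [("sku-1", 1), ("sku-2", 2), ("sku-1", 3),
         ("sku-3", 4), ("sku-2", 5), ("sku-1", 6)]
lanes = assign_lanes(shard, parallelization_factor=8)

# Whatever order the lanes run in, each key's records stay in commit order,
# because a key never spans two lanes.
per_key = defaultdict(list)
for lane_records in lanes.values():
    for key, seq in lane_records:
        per_key[key].append(seq)

for key in sorted(per_key):
    print(key, per_key[key])
```

What is given up is order across different keys in the same shard, which the stream never guaranteed to consumers in the first place.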

Advantages and disadvantages

DynamoDB Streams give you a genuinely powerful change feed, but the model’s strengths and its sharp edges are two sides of the same coin. Weigh them honestly:

| Advantages (why this model helps you) | Disadvantages (why it bites) |
| --- | --- |
| Ordered, item-level deltas with per-key ordering and no triggers to write | Ordering is per key only, never table-wide; consumers must not assume global order |
| Managed pollers (Lambda ESM / Pipes): you tune, you don’t write the loop | At-least-once delivery means you must build idempotency; nothing is exactly-once |
| ReportBatchItemFailures + bisect give precise poison isolation | Defaults are unsafe: infinite retries, no bisect, no DLQ → a poison record stalls a shard for hours |
| 24-hour retention buffers consumers and enables redrive from the stream | The 24-hour cap is hard: miss the window and the record is gone; you need S3 as the system of record |
| EventBridge Pipes turns glue into declarative infrastructure | Pipes is route-and-reshape only; real compute still needs a Lambda |
| ParallelizationFactor drains a hot shard without resharding | A hot partition key still bottlenecks one shard; tuning helps, key design fixes |
| Native Stream never duplicates at the source (vs Kinesis) | Only ~2 efficient readers per shard; many consumers force the Kinesis surface |
| Tight integration with KMS, IAM, CloudWatch | Several failure modes are invisible until load (hot shard, sink coupling, ageing out) |

The model is right whenever you need item-level CDC with strict per-key ordering and a short replay window — search projection, cache invalidation, audit trails, materialized views. It bites hardest on hot-key workloads, multi-sink fan-out done in one function, and teams that ship the defaults. Every disadvantage is manageable — but only if you know it exists, which is the entire point of building idempotency, bisect, DLQ, and bus-level fan-out in from the start.

Hands-on lab

Build a minimal CDC pipeline end to end — a table with a stream, a Lambda ESM with the safety knobs on, an idempotent projection, and a deliberate poison record — then watch it recover. Free-tier-friendly; teardown at the end. Run in CloudShell.

Step 1 — Variables and a table with a stream.

export AWS_PAGER=""
TABLE=cdc-orders-lab
DLQ=cdc-orders-dlq
FN=cdc-orders-processor

aws dynamodb create-table \
  --table-name "$TABLE" \
  --attribute-definitions AttributeName=pk,AttributeType=S \
  --key-schema AttributeName=pk,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
aws dynamodb wait table-exists --table-name "$TABLE"
STREAM_ARN=$(aws dynamodb describe-table --table-name "$TABLE" \
  --query 'Table.LatestStreamArn' --output text)
echo "$STREAM_ARN"

Expected: a stream ARN ending in a timestamp. Note it — the ESM points at it.

Step 2 — Create the DLQ and capture its ARN.

DLQ_URL=$(aws sqs create-queue --queue-name "$DLQ" --query QueueUrl --output text)
DLQ_ARN=$(aws sqs get-queue-attributes --queue-url "$DLQ_URL" \
  --attribute-names QueueArn --query 'Attributes.QueueArn' --output text)
echo "$DLQ_ARN"

Step 3 — A processor that is idempotent and reports per-record failures. Save as app.py and zip it. It throws on a deliberately malformed item so you can watch bisect isolate it:

def handler(event, context):
    failures = []
    for r in event["Records"]:
        seq = r["dynamodb"]["SequenceNumber"]
        try:
            img = r["dynamodb"].get("NewImage", {})
            if img.get("poison", {}).get("BOOL"):     # the malformed item
                raise ValueError("poison record")
            print("applied", r["dynamodb"]["Keys"]["pk"]["S"], seq)
        except Exception as e:
            # Report the record instead of failing the whole invocation.
            print("FAILED", seq, str(e))
            failures.append({"itemIdentifier": seq})
    # Lambda checkpoints at the lowest reported sequence number, so records after
    # a failure are delivered again on retry; a real write must be idempotent.
    return {"batchItemFailures": failures}

Step 4 — Deploy the function (assumes an existing Lambda execution role with stream + SQS + logs permissions; see Security notes for the exact policy).

# Assumes LAMBDA_ROLE_ARN is already exported and holds the execution role's ARN.
zip fn.zip app.py
aws lambda create-function --function-name "$FN" \
  --runtime python3.12 --handler app.handler \
  --role "$LAMBDA_ROLE_ARN" --zip-file fileb://fn.zip

Step 5 — Create the event source mapping with every safety knob on.

aws lambda create-event-source-mapping \
  --function-name "$FN" --event-source-arn "$STREAM_ARN" \
  --starting-position TRIM_HORIZON \
  --batch-size 10 --maximum-batching-window-in-seconds 5 \
  --parallelization-factor 2 \
  --maximum-retry-attempts 2 --maximum-record-age-in-seconds 3600 \
  --bisect-batch-on-function-error \
  --function-response-types ReportBatchItemFailures \
  --destination-config '{"OnFailure":{"Destination":"'"$DLQ_ARN"'"}}'

Expected: a mapping with State progressing to Enabled. Confirm: aws lambda list-event-source-mappings --function-name "$FN" --query 'EventSourceMappings[0].State'.

Step 6 — Write a few good items and one poison item, then watch.

aws dynamodb put-item --table-name "$TABLE" --item '{"pk":{"S":"ORDER#1"},"status":{"S":"PENDING"}}'
aws dynamodb put-item --table-name "$TABLE" --item '{"pk":{"S":"ORDER#2"},"poison":{"BOOL":true}}'
aws dynamodb put-item --table-name "$TABLE" --item '{"pk":{"S":"ORDER#3"},"status":{"S":"SHIPPED"}}'
sleep 30
aws logs tail "/aws/lambda/$FN" --since 5m

Expected in the logs: applied ORDER#1, a FAILED line for ORDER#2, and applied ORDER#3 — proof the good records checkpointed while the poison one was isolated and routed to the DLQ after its retries.

Step 7 — Confirm the poison landed in the DLQ.

aws sqs receive-message --queue-url "$DLQ_URL" --wait-time-seconds 5 \
  --query 'Messages[0].Body' --output text

Expected: a JSON body with DDBStreamBatchInfo carrying the shardId and the failing sequence range. The lab steps mapped to what each proves:

| Step | What you did | What it proves |
| --- | --- | --- |
| 1 | Table + NEW_AND_OLD_IMAGES stream | The source feed and its view type |
| 5 | ESM with bisect + partial-batch + DLQ | The safety knobs are real and composable |
| 6 | Good + poison items | Good records checkpoint; poison is isolated |
| 7 | Read the DLQ message | Failure context (shard + seq) is what you redrive from |

Cleanup.

ESM=$(aws lambda list-event-source-mappings --function-name "$FN" --query 'EventSourceMappings[0].UUID' --output text)
aws lambda delete-event-source-mapping --uuid "$ESM"
aws lambda delete-function --function-name "$FN"
aws sqs delete-queue --queue-url "$DLQ_URL"
aws dynamodb delete-table --table-name "$TABLE"

Cost note. On-demand DynamoDB, a tiny Lambda, and an SQS queue for an hour cost a few rupees; deleting the four resources stops everything. There is no charge for enabling a stream, and GetRecords calls made by a Lambda event source mapping are not billed, so the stream itself adds nothing at this scale.

Common mistakes & troubleshooting

This is the playbook — the part you bookmark. First as a scannable table you can read while the iterator age climbs, then the detail underneath for the entries that bite hardest.

| # | Symptom | Root cause | Confirm (exact metric / command) | Fix |
| --- | --- | --- | --- | --- |
| 1 | One shard’s iterator age climbs to minutes; downstream stale | Hot partition key + one-poller-per-shard | GetRecords.IteratorAgeMilliseconds high on one shard, flat on others | ParallelizationFactor up to 10; ReportBatchItemFailures on |
| 2 | Whole pipeline wedged; same batch retried forever | Poison record + no bisect + infinite retries | Invocation errors loop; iterator age climbs everywhere | BisectBatchOnFunctionError + on-failure DLQ + finite MaximumRetryAttempts |
| 3 | Search index double-applies or shows stale doc | Non-idempotent write; out-of-order replay | OpenSearch _version regresses; counts drift | Conditional-on-_seq; external versioning from SequenceNumber |
| 4 | Records vanished; never processed | Aged past 24h before redrive | ApproximateAgeOfOldestRecord was near 24h | MaximumRecordAgeInSeconds well below 24h; alarm earlier |
| 5 | Housekeeping deletes replicated downstream | TTL REMOVE not filtered | Record has userIdentity.principalId=dynamodb.amazonaws.com | Filter TTL deletes in the function/filter |
| 6 | One slow sink stalls all sinks | Mega-Lambda fan-out couples sinks | Iterator age climbs when only OpenSearch is slow | Fan out at an EventBridge bus/Pipe; one rule+DLQ per sink |
| 7 | REMOVE events have no new data to project | Expecting NewImage on delete | eventName=REMOVE has only OldImage | Use OldImage; reconstruct delete from keys |
| 8 | Consumers see no field-transition info | Wrong StreamViewType (KEYS_ONLY/NEW_IMAGE) | Record lacks OldImage | Re-enable stream as NEW_AND_OLD_IMAGES |
| 9 | DLQ message has no record body | Expecting records, got metadata | Body is DDBStreamBatchInfo, not items | Re-read shard at AT_SEQUENCE_NUMBER |
| 10 | GetRecords throttling; lag despite low function load | Too many readers per shard | ReadThrottleEvents / throttled GetRecords | ≤2 efficient readers; move to Kinesis fan-out |
| 11 | Reported failures but data still skipped | Reported a late seq, dropped an earlier one | Gap in processed sequence numbers | Always report the earliest real failure |
| 12 | Kinesis CDC shows duplicates / out of order | Kinesis is best-effort, not native Stream | Same record id twice; out-of-order seq | Dedupe + reorder on seq; or use native Stream |

The expanded form for the entries that cause the most lost hours:

1. One shard’s iterator age climbs to minutes; downstream goes stale. Root cause: A hot partition key funnels writes into one shard, and the default one-poller-per-shard model falls behind. Confirm: aws cloudwatch get-metric-statistics --namespace AWS/DynamoDB --metric-name ... — or simpler, the Lambda IteratorAge metric high on the one shard. The signature is one shard high while the rest are near zero. Fix: Raise ParallelizationFactor (up to 10) so the shard drains across concurrent invocations; per-key order still holds because each key hashes to one invocation. Enable ReportBatchItemFailures so good records checkpoint immediately. The durable fix is key design — see single-table modeling — but tuning buys you the incident.

2. The whole pipeline is wedged; the same batch retries forever. Root cause: A poison record with default settings (infinite retries, no bisect) — the batch fails, the checkpoint never advances, it retries indefinitely. Confirm: Function invocation errors loop on the same batch; iterator age climbs across shards. Fix: BisectBatchOnFunctionError to isolate the record by halving, a finite MaximumRetryAttempts, and an on-failure destination so the isolated record lands in a DLQ. Combine with ReportBatchItemFailures so you rarely reach bisect at all.

3. The search index double-applies a change or shows a stale document. Root cause: A non-idempotent write, or an out-of-order replay overwriting a newer document during reprocessing. Confirm: OpenSearch _version regresses; row counts in the warehouse drift from the table. Fix: Conditional write keyed on SequenceNumber (attribute_not_exists(_seq) OR _seq < :seq); for OpenSearch use _version_type=external with the sequence-derived version so a lower version is rejected.

4. Records vanished and were never processed. Root cause: They aged past the 24-hour retention before anyone redrove them. Confirm: ApproximateAgeOfOldestRecord was approaching 86,400 seconds. Fix: Set MaximumRecordAgeInSeconds well below 24h (e.g. 21,600) so poison records hit the DLQ while still readable; alarm on iterator age at minutes, not hours.

6. One slow sink stalls all sinks. Root cause: A single mega-Lambda fans out to every sink, so one slow target (OpenSearch under merge pressure) stalls the shard for all of them. Confirm: Iterator age climbs precisely when one sink is slow; the others are healthy but starved. Fix: Land changes on an EventBridge bus (via a Pipe) and give each sink its own rule, retry budget, and DLQ. The projector’s slowness no longer blocks the lake.
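
Entry 11 in the table (a late failure reported while an earlier one is dropped) is easy to guard against with a small helper. The sketch below is an assumption about how you might structure it, with a hypothetical function name; the one detail worth copying is comparing sequence numbers as integers, since they are numeric strings and plain string comparison can pick the wrong one.

```python
def earliest_failure_response(failed_sequence_numbers):
    """Build a ReportBatchItemFailures response from the lowest failed sequence number."""
    if not failed_sequence_numbers:
        return {"batchItemFailures": []}  # whole batch succeeded
    # Compare numerically: as strings, "1000" sorts before "900".
    earliest = min(failed_sequence_numbers, key=int)
    return {"batchItemFailures": [{"itemIdentifier": earliest}]}


failed = ["1000", "900", "950"]
print(min(failed))                        # "1000": the lexicographic trap
print(earliest_failure_response(failed))  # reports "900", the true earliest
```

Reporting only the earliest failure is enough, because the poller checkpoints at the lowest reported sequence number and retries everything from there.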

Iterator age is the master signal, but its shape tells you which failure you have. Read the pattern, not just the number:

| Iterator-age pattern | It’s probably… | Confirm | Do this |
| --- | --- | --- | --- |
| High on one shard, flat on the rest | Hot partition key (± poison) | Per-shard IteratorAge | ParallelizationFactor up; fix key design |
| Climbing across all shards together | Poison + no bisect, or function too slow | Looping Errors; duration ≈ timeout | Bisect + DLQ; smaller batch; optimize |
| Climbs only when one sink is slow | Mega-Lambda fan-out coupling | Sink dependency latency vs age | Fan out at the bus/Pipe |
| Sawtooth (climbs, drops, repeats) | Under-provisioned concurrency vs bursts | Throttles spiking on bursts | Raise concurrency; ParallelizationFactor |
| Flat-high near MaximumRecordAgeInSeconds then drops | Records being evicted to DLQ | DLQ depth rising | Expected; go fix the poison offline |

Best practices

Security notes

DynamoDB Streams inherit the table’s encryption and are governed by their own IAM action set; the consumer needs least-privilege access to read the stream and write the DLQ, nothing more.

The minimal consumer policy — what to grant and why each action is needed:

| IAM action | On resource | Why the consumer needs it |
| --- | --- | --- |
| dynamodb:DescribeStream | The stream ARN | Discover shards and topology |
| dynamodb:GetShardIterator | The stream ARN | Position a read (incl. AT_SEQUENCE_NUMBER for redrive) |
| dynamodb:GetRecords | The stream ARN | Read the change records |
| dynamodb:ListStreams | The stream ARN | Enumerate streams on the table |
| sqs:SendMessage | The DLQ ARN | Write failure context on exhaustion |
| kms:Decrypt | The table’s CMK | Decrypt encrypted stream records |
| (sink-specific writes) | Each sink ARN | Project the change (scoped per sink) |

The same policy expressed in Terraform:

data "aws_iam_policy_document" "cdc_consumer" {
  statement {
    actions   = ["dynamodb:DescribeStream", "dynamodb:GetRecords",
                 "dynamodb:GetShardIterator", "dynamodb:ListStreams"]
    resources = [aws_dynamodb_table.orders.stream_arn]
  }
  statement {
    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.cdc_dlq.arn]
  }
  statement {
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.table.arn]
  }
}

Cost & sizing

The bill for a CDC pipeline is dominated by the consumers and sinks, not the stream itself. Enabling a stream is free, GetRecords calls made by a Lambda event source mapping are not billed, and other readers pay per stream read request unit.

A rough monthly picture for a mid-size pipeline (600 writes/s projected to OpenSearch + a lake):

| Cost driver | What you pay for | Rough INR / month | How to right-size |
| --- | --- | --- | --- |
| Stream GetRecords | Read-request units the poller makes | ~₹500–2,000 | Sensible batch size + window (few, full reads) |
| Lambda (processor) | Invocations × duration × memory | ~₹4,000–12,000 | Larger batches; tune memory to per-record work |
| OpenSearch domain | Cluster hours (data + master nodes) | ~₹40,000–90,000 | Size to query load; rebuild from S3, run lean |
| Firehose → S3 | Per-GB ingest + storage + PUTs | ~₹2,000–6,000 | Let Firehose batch/compress; partition by date |
| SQS DLQ | Requests (tiny) | < ₹200 | Negligible |
| Kinesis (if used) | Shard-hours or per-GB + retention | ~₹6,000–20,000 | Only when >24h replay / many consumers |
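
The "larger batches" advice in the Lambda row is easier to weigh with the arithmetic written out. This sketch assumes a steady 600 writes per second and that every batch fills completely, so it gives a lower bound on invocations, not a forecast; the function name is hypothetical.

```python
def monthly_invocations(writes_per_second, batch_size, days=30):
    """Lower bound on processor invocations per month if every batch is full."""
    records = writes_per_second * 86_400 * days
    invocations = -(-records // batch_size)  # integer ceiling division
    return records, invocations


for batch_size in (10, 100, 500):
    records, invocations = monthly_invocations(600, batch_size)
    print(f"batch {batch_size:>3}: {records:,} records -> at least {invocations:,} invocations")
```

At 600 writes per second the stream carries 1,555,200,000 records in 30 days, so moving from a batch size of 10 to 100 cuts the invocation floor from 155,520,000 to 15,552,000. Real batches are often partial, which is why the batching window matters as much as the batch size.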

The lesson MeridianStock learned: most “we need a bigger pipeline” instincts are actually “we mis-tuned the poller” or “we coupled our sinks.” Fix the configuration and the fan-out topology first; the bill usually goes down, not up. Stream reads are the smallest line item (reads made by a Lambda trigger are not billed at all), so the consumers and sinks dominate every realistic bill.

Interview & exam questions

1. What ordering guarantee does a DynamoDB Stream actually provide? Strict ordering per partition key, within a shard lineage — records for a given key are delivered in commit order. There is no global/table-wide ordering. Consumers must only assume per-key order; this is what makes ParallelizationFactor > 1 safe.

2. Is delivery exactly-once? How do you achieve exactly-once downstream? No — delivery is at-least-once; redrives and retries can replay a record. Exactly-once downstream is an idempotent write you design: a conditional write keyed on SequenceNumber in DynamoDB, external versioning in OpenSearch, or a deterministic pk#sk#seq key with the sink’s natural upsert.

3. A single poison record has stalled a shard for hours. Why, and what two settings fix it? With defaults (infinite retries, no bisect) the failing batch never checkpoints, so it is retried until its records age out of the stream and blocks everything behind it in the meantime. Fix with BisectBatchOnFunctionError (isolate the record by halving) plus an on-failure destination, and ReportBatchItemFailures so good records checkpoint immediately. Bound it with a finite MaximumRetryAttempts and a MaximumRecordAgeInSeconds below 24h.

4. What does ParallelizationFactor do, and what does it cost you? It runs up to 10 concurrent invocations per shard, draining a hot shard without resharding. Per-key order is preserved (each key hashes to one invocation); you relax ordering across keys within a shard — almost always acceptable since cross-key order was never guaranteed.

5. How does the ReportBatchItemFailures checkpoint contract work, and what’s the trap? You return the failed sequence numbers; the poller treats the earliest reported failure as the checkpoint — everything at or after it is retried, everything before is done. The trap: reporting a late failure while dropping an earlier record silently skips data. Always report the earliest real failure.

6. When do you choose EventBridge Pipes over a Lambda ESM? When the work is route and reshape — filter, optionally enrich, deliver — with no poller code to own. Keep a Lambda ESM when the work is genuine compute: non-trivial transformation needing libraries, transactional writes to multiple sinks, or windowed aggregation (tumbling windows are ESM-only).

7. Why is a 24-hour stream not enough, and what do you pair with it? Records age out after 24 hours and are gone; you cannot backfill history from the stream. Pair it with S3 (via Firehose) as the immutable, replayable system of record, and a one-time Scan-and-load for data older than the window. Set MaximumRecordAgeInSeconds below 24h so poison records hit the DLQ while still readable.

8. How do you keep an OpenSearch projection from being corrupted by an out-of-order replay? Derive a monotonic version from the record’s SequenceNumber and index with version_type=external, so OpenSearch rejects any write whose version is not higher than the stored one. Set _id to the item key so writes are idempotent upserts and a REMOVE is a delete by the same _id.

9. What’s the difference between native DynamoDB Streams and Kinesis Data Streams for DynamoDB? Native Streams: 24-hour retention, strict per-key order, at-least-once, ~2 efficient readers per shard. Kinesis for DynamoDB: retention up to a year, best-effort order (reorder on sequence), possible duplicates, and many fan-out consumers. Pick Kinesis for long replay or many consumers; native for strict order and a short window.

10. The on-failure destination has a message but no record body. Why, and how do you recover the records? The destination receives batch metadata (shardId, failing sequence range, stream ARN), not the records — they still live in the stream for 24h. Recover by GetShardIterator at AT_SEQUENCE_NUMBER with the range from the message and re-reading the offending records for a deliberate redrive.

11. How do you handle TTL deletes you don’t want to replicate? A TTL deletion arrives as REMOVE with userIdentity.principalId=dynamodb.amazonaws.com. Filter those explicitly (in the function or the Pipe filter pattern) so housekeeping expiries are not mirrored to downstream sinks.

12. Iterator age is climbing on exactly one shard while others are flat. What does that tell you? A hot partition key is funneling writes into that shard and the poller is falling behind (often compounded by a poison record). Confirm with per-shard IteratorAge; mitigate with ParallelizationFactor and ReportBatchItemFailures, and fix the root cause in the data model (key design).
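Questions 5 and 11 are easier to reason about with code in hand. What follows is a minimal Python sketch, not a definitive implementation. It assumes the event source mapping has ReportBatchItemFailures enabled, and `project_to_sink` is a hypothetical placeholder for whatever writes to your sink. The sketch skips TTL expiries and stops at the first failing record, reporting its sequence number so the poller checkpoints everything before it and retries everything from it onward.

```python
from typing import Any, Callable, Dict, List

TTL_PRINCIPAL = "dynamodb.amazonaws.com"


def is_ttl_delete(record: Dict[str, Any]) -> bool:
    """True when a REMOVE came from the DynamoDB TTL service, not a caller."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("principalId") == TTL_PRINCIPAL
    )


def process_batch(
    records: List[Dict[str, Any]],
    apply_change: Callable[[Dict[str, Any]], None],
) -> Dict[str, List[Dict[str, str]]]:
    """Apply records in order and report the earliest failure, if any."""
    for record in records:
        if is_ttl_delete(record):
            continue  # housekeeping expiry: do not mirror it downstream
        try:
            apply_change(record)
        except Exception:
            # Stop here. Records after this one were never attempted, and the
            # poller retries from this sequence number onward anyway.
            failed_seq = record["dynamodb"]["SequenceNumber"]
            return {"batchItemFailures": [{"itemIdentifier": failed_seq}]}
    return {"batchItemFailures": []}


def project_to_sink(record: Dict[str, Any]) -> None:
    """Hypothetical placeholder: replace with an idempotent sink write."""


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda entry point for a DynamoDB Streams event source mapping."""
    return process_batch(event["Records"], project_to_sink)
```

Stopping at the first failure is a deliberate choice here: it makes it impossible to report a late failure while silently dropping an earlier one, which is the trap described in question 5.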

The questions above map to the AWS Certified Developer – Associate (DVA-C02) objectives (develop event-driven solutions, Lambda event source mappings, DynamoDB Streams) and to the Solutions Architect – Associate / Professional exams for the integration and fan-out patterns. A compact cert mapping:

| Question theme | Primary cert | Objective area |
|---|---|---|
| Stream ordering / view types | DVA-C02 | Develop with DynamoDB; event-driven |
| ESM tuning, bisect, partial-batch | DVA-C02 | Lambda event source mappings |
| EventBridge Pipes vs Lambda | DVA-C02 / SAA-C03 | Event-driven integration |
| Idempotency / exactly-once | DVA-C02 / SAP-C02 | Resilient, idempotent design |
| Kinesis vs native Streams | SAA-C03 / DAS | Choosing a streaming surface |
| KMS / IAM scoping for streams | SCS-C02 / DVA-C02 | Secure data + least privilege |

Quick check

  1. A redrive replays a batch and your OpenSearch index now shows an older document than before. What single mechanism prevents this, and what value do you derive it from?
  2. One shard’s IteratorAge is at 30 minutes while every other shard is near zero. What is the most likely root cause, and which ESM knob buys you time?
  3. True or false: leaving MaximumRetryAttempts at its default of infinite retries, with no bisect, is a safe configuration for a CDC processor.
  4. You need to replay changes from eight days ago and have three independent teams consuming the same feed. Native Stream or Kinesis Data Streams for DynamoDB — and why?
  5. Your on-failure SQS message contains a DDBStreamBatchInfo block but none of the actual records. How do you get the records to fix them?

Answers

  1. External versioning derived from the record’s SequenceNumber: index into OpenSearch with version set to the sequence (cast to a number) and version_type=external, so a write whose version is not higher than the stored one is rejected and a stale replay can’t overwrite a newer doc. In DynamoDB the equivalent is a conditional write attribute_not_exists(_seq) OR _seq < :seq.
  2. A hot partition key is funneling writes into that one shard, so the single poller per shard falls behind (often with a poison record compounding it). Raise ParallelizationFactor (up to 10) to drain the shard across concurrent invocations — per-key order still holds — and enable ReportBatchItemFailures. The durable fix is the data model.
  3. False. With infinite retries and no bisect, a single poison record fails the batch, the checkpoint never advances, and the batch is retried until its records age out of the stream, stalling every record behind it. Use BisectBatchOnFunctionError, a finite MaximumRetryAttempts, MaximumRecordAgeInSeconds below 24h, and an on-failure DLQ.
  4. Kinesis Data Streams for DynamoDB. The native Stream caps retention at 24 hours (so eight days is impossible) and supports only ~2 efficient readers per shard. Kinesis offers retention up to a year and clean multi-consumer fan-out — at the cost of best-effort ordering and possible duplicates, so dedupe and reorder on a sequence number.
  5. The destination carries batch metadata, not records — the records still live in the stream for 24 hours. Read the shardId and startSequenceNumber/endSequenceNumber from the message, call GetShardIterator at AT_SEQUENCE_NUMBER, and re-read the offending records for a deliberate, reviewed redrive (provided they haven’t aged past 24h).
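Answer 5 can be sketched in Python as follows. This is an illustration under stated assumptions: `streams` is a boto3 `dynamodbstreams` client (or any object with the same two methods), `message_body` is the raw JSON body of the on-failure message, and `max_pages` is a hypothetical safety bound that is not part of any AWS API.

```python
import json
from typing import Any, Dict, List


def reread_failed_batch(
    streams: Any, message_body: str, max_pages: int = 20
) -> List[Dict[str, Any]]:
    """Re-read the records named by an on-failure message's DDBStreamBatchInfo."""
    info = json.loads(message_body)["DDBStreamBatchInfo"]
    # Sequence numbers are numeric strings; compare them as integers.
    end_seq = int(info["endSequenceNumber"])

    iterator = streams.get_shard_iterator(
        StreamArn=info["streamArn"],
        ShardId=info["shardId"],
        ShardIteratorType="AT_SEQUENCE_NUMBER",
        SequenceNumber=info["startSequenceNumber"],
    )["ShardIterator"]

    found: List[Dict[str, Any]] = []
    for _ in range(max_pages):  # bound the loop: open shards never run out of iterators
        page = streams.get_records(ShardIterator=iterator, Limit=100)
        for record in page["Records"]:
            seq = int(record["dynamodb"]["SequenceNumber"])
            if seq > end_seq:
                return found  # past the failed range
            found.append(record)
            if seq == end_seq:
                return found  # last record of the failed range
        iterator = page.get("NextShardIterator")
        if not iterator:
            break  # shard is closed and fully read
    return found
```

In real use you would pass `boto3.client("dynamodbstreams")` as `streams`, inspect the returned records, and only then feed the repaired ones back through the processor as a deliberate redrive.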

Glossary

Next steps

You can now build a CDC pipeline off DynamoDB Streams that survives replay, isolates poison, preserves per-key order, and fans out cleanly. Build outward:
