Almost every DynamoDB problem I have been called in to fix traces back to a decision someone made in the first five minutes — a key they chose because it was “the obvious id”, a capacity mode they left on the default, an index they bolted on later to make one slow query go away. DynamoDB punishes those early decisions harder than a relational database does, because it gives you almost none of the escape hatches you are used to: no joins, no ad-hoc WHERE on any column, no “just add an index and the optimiser will sort it out”. A customerId partition key looks fine in the demo and then melts under a Black-Friday hot partition. A table left on provisioned capacity at five write units throttles the moment a campaign launches, and the team blames “DynamoDB being slow” when DynamoDB was doing exactly what it was told. Someone enables a Global Secondary Index with the wrong projection and quietly doubles their write bill. The service is extraordinary — single-digit-millisecond latency at any scale, genuinely hands-off operations — but only if you understand the handful of concepts underneath the friendly Create table button.
This is the deep dive that closes that gap. Amazon DynamoDB is AWS’s fully managed, serverless, key-value and document NoSQL database. You do not provision servers, choose instance types, patch anything, or manage replication; you create a table, define its keys, and read and write items via an API, and AWS spreads your data across a fleet of storage nodes that scales horizontally to effectively unlimited size and throughput. By the end of this lesson you will know the full data model (tables, items, attributes, and the all-important partition key + sort key); exactly how partitioning and hashing distribute your data and what causes hot partitions; both capacity modes (on-demand and provisioned with auto scaling) and the RCU/WCU arithmetic behind them; the difference between a Local Secondary Index and a Global Secondary Index and when to reach for each; DynamoDB Streams and change data capture; TTL; the DAX in-memory cache; transactions; the eventual-versus-strong consistency model; global tables for multi-Region; and point-in-time recovery, backups, and encryption. Every concept comes with the real aws CLI commands to drive it.
Learning objectives
By the end of this lesson you will be able to:
- Model data with DynamoDB’s partition key and optional sort key, and explain exactly how DynamoDB hashes the partition key to place items, what creates hot partitions, and how to avoid them.
- Choose between on-demand and provisioned capacity (with auto scaling), and calculate read capacity units (RCUs) and write capacity units (WCUs) for a workload, including the effect of item size, strong vs eventual reads, and transactions.
- Decide between a Local Secondary Index (LSI) and a Global Secondary Index (GSI), choose the right projection, and understand the cost and consistency implications of each.
- Build change-data-capture and event-driven pipelines on DynamoDB Streams (and Kinesis Data Streams), with Lambda triggers and the right StreamViewType.
- Apply the operational features: TTL, DAX caching, transactions (TransactWriteItems/TransactGetItems), eventual vs strong consistency, global tables, PITR/on-demand backups, and encryption at rest.
- Drive all of the above with the real aws dynamodb CLI and reason about the bill.
Prerequisites & where this fits
You should be comfortable with IAM users, roles, and policies, because every DynamoDB call is authorised by IAM (there is no separate database login), and a sense of what AWS Lambda does will help when we wire up Streams. No prior NoSQL experience is assumed — every term is defined as we go. This lesson sits in the Databases module of the AWS Zero-to-Hero course, alongside the relational RDS & Aurora deep dive; think of the two as the relational and the NoSQL halves of the same chapter. It is the foundation for the two advanced DynamoDB lessons it links at the end: single-table design and access patterns and change data capture with DynamoDB Streams.
Core concepts
Key-value and document, not relational. A relational database stores rows in tables with a fixed schema and lets you query any column, join tables, and let an optimiser figure out the plan. DynamoDB does almost none of that. It stores items (think “rows”, but schemaless beyond the key) addressed by a primary key, and it is ruthlessly optimised for one thing: fetching items by their key in single-digit milliseconds, at any scale, with predictable cost. The trade is that you must know your access patterns up front and design your keys and indexes around them — there is no SELECT * FROM t WHERE anyColumn = ? that stays fast as the table grows. This is why people say you “model for your queries, not for your entities” in DynamoDB; the single-table design lesson is entirely about doing that well.
Tables, items, and attributes. A table is a collection of items; an item is a collection of attributes (name/value pairs); an attribute has a data type. The only thing every item in a table must share is the primary key attributes — everything else is free-form, so two items in the same table can have completely different attributes. Items are limited to 400 KB each (the sum of attribute names and values), which is a hard design constraint: large blobs go in S3 with a pointer stored in DynamoDB. Attribute types are scalar (S string, N number, B binary, BOOL, NULL), document (M map, L list — these nest arbitrarily, which is the “document database” part), and set (SS string set, NS number set, BS binary set — unordered, no duplicates).
The primary key: partition key, optionally plus a sort key. This is the single most important decision you make. The primary key takes one of two forms:
- Simple primary key: just a partition key (also called the hash key). Each item is uniquely identified by its partition-key value, which must be unique across the table. GetItem needs exactly that value.
- Composite primary key: a partition key plus a sort key (also called the range key). Items are grouped into the same partition by partition-key value and then sorted within the partition by sort-key value. The combination must be unique. This unlocks the most useful operation in DynamoDB, Query, which fetches a whole partition (or a contiguous slice of it) efficiently and in sort order.
Query vs Scan (learn this before anything else). A Query targets a single partition key and optionally a range of sort-key values; it reads only matching items and is fast and cheap. A Scan reads every item in the table (or index) and filters afterwards; it is slow and expensive and you should treat it as a code smell in any hot path. The whole art of DynamoDB modelling is arranging your keys and indexes so every access pattern is a Query (or a GetItem/BatchGetItem) and never a Scan.
Serverless and horizontally scaled. DynamoDB has no instances to size. Behind the scenes your table’s data is spread across many partitions (storage units, each on solid-state storage and replicated across three Availability Zones for durability), and DynamoDB adds partitions automatically as your data grows past ~10 GB per partition or as you push more throughput. You never see partitions directly, but understanding that they exist is the key to understanding both performance and hot partitions.
How partitioning and hashing work (and hot partitions)
DynamoDB decides which physical partition an item lives on by running the partition-key value through an internal hash function; the hash output maps the item to one partition. Items with the same partition-key value always land on the same partition (that is what makes Query efficient — they are physically together and sorted by sort key). Items with different partition-key values are spread across partitions roughly uniformly if the key values are diverse.
That last clause is everything. A partition has finite limits — historically a guideline of ~3,000 RCUs and ~1,000 WCUs and ~10 GB per partition. If your access concentrates on one partition-key value, all that traffic hits one partition and you get a hot partition: throttling on that key even though the table’s total provisioned (or on-demand) capacity is nowhere near exhausted. Classic causes:
- A low-cardinality partition key (e.g. status = "ACTIVE", or country = "IN"): a handful of values means a handful of partitions absorbing all traffic.
- A time-based partition key (e.g. date = "2026-06-15") where today’s date takes all of today’s writes (a “hot tail”).
- A single popular item (one celebrity user, one viral product) concentrating reads.
Adaptive capacity mitigates this somewhat: DynamoDB automatically reallocates throughput toward partitions that need it (and can isolate a single hot item onto its own partition), so transient skew often “just works”. But adaptive capacity cannot save a fundamentally bad key — if all your traffic targets one value, there is nothing to rebalance. The design fixes are: choose a high-cardinality partition key (user id, order id — something with millions of distinct values), and where a naturally skewed key is unavoidable, write-shard it by appending a suffix (date#0…date#9) and fanning reads across the shards. The single-table design lesson covers hot-partition avoidance in depth.
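The write-sharding fix described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the helper names (shard_for, sharded_pk, all_shard_pks) are hypothetical, and cksum is a stand-in for "some stable hash", not DynamoDB's internal hash function.

```shell
# Hypothetical helpers for write-sharding a skewed partition key.
# cksum is only a stand-in for a stable hash; DynamoDB's own internal
# hash is not exposed and is not what this computes.

# shard_for ITEM_ID SHARDS -> a stable shard number in 0..SHARDS-1
shard_for() {
  sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( sum % $2 ))
}

# sharded_pk BASE_KEY ITEM_ID SHARDS -> e.g. 2026-06-15#7 (the value to store as PK)
sharded_pk() {
  echo "$1#$(shard_for "$2" "$3")"
}

# all_shard_pks BASE_KEY SHARDS -> one PK per line, for fanning reads out
all_shard_pks() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo "$1#$i"
    i=$(( i + 1 ))
  done
}
```

A writer would store `sharded_pk "2026-06-15" "$order_id" 10` as the partition key; a reader would run one Query per line of `all_shard_pks "2026-06-15" 10` and merge the results client-side. Deriving the suffix from the item id, rather than picking it at random, keeps a given item on a predictable shard, so a single-item lookup does not need to fan out.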
Capacity modes: on-demand vs provisioned
Every table runs in one of two capacity modes, which determine how you pay for throughput and whether you manage it.
| | On-demand | Provisioned |
|---|---|---|
| You specify | Nothing (it scales itself) | RCUs and WCUs (a target throughput) |
| Pricing | Per request (per million reads/writes) | Per provisioned unit-hour, whether used or not |
| Scaling | Instant, automatic, unlimited (up to table/account limits) | Fixed unless auto scaling adjusts it; bursts use a token bucket |
| Best for | Spiky/unpredictable traffic, new apps, dev/test, “set and forget” | Steady, predictable traffic where you can forecast load |
| Cost shape | More per request, zero when idle | Cheaper per request if well-utilised, pays even when idle |
| Throttling | Rare (only at very high sudden scale beyond previous peak) | Happens when demand exceeds provisioned + burst |
| Switching | You can switch modes once every 24 hours | Same |
Read & write capacity units (the arithmetic you must know). In provisioned mode you buy throughput in units, and the same units describe what on-demand requests cost:
- 1 WCU = one write of up to 1 KB per second. A 3 KB item write costs 3 WCUs (round up to the next KB). A transactional write costs 2× WCUs (it is done twice under the hood). BatchWriteItem costs the sum of its individual writes (no discount).
- 1 RCU = one strongly consistent read of up to 4 KB per second, or two eventually consistent reads of up to 4 KB per second (eventual reads are half the cost). A transactional read costs 2 RCUs per 4 KB. A 4 KB item read strongly = 1 RCU; eventually = 0.5 RCU; transactionally = 2 RCUs. An 8 KB item read strongly = 2 RCUs. Round item size up to the next 4 KB.
So a Query returning ten 4 KB items eventually consistent costs 10 × 0.5 = 5 RCUs; the same Query strongly consistent costs 10 RCUs; fetching the same ten items with TransactGetItems costs 20 RCUs. Internalise the strong = full, eventual = half, transactional = double rule and the 1 KB-write / 4 KB-read granularity: it is exam gold and it is how you forecast a bill.
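The rounding rules above are easy to get wrong by hand, so here is a small calculator sketch. The function names are illustrative, sizes are given in bytes (taking 1 KB as 1024 bytes), and the sketch only encodes the rules stated in this section.

```shell
# Capacity-unit arithmetic as described in this section.
# Sizes are in bytes; function names are illustrative only.

# wcu_for BYTES [standard|transactional]
wcu_for() {
  units=$(( ($1 + 1023) / 1024 ))            # round up to the next 1 KB
  if [ "${2:-standard}" = "transactional" ]; then
    units=$(( units * 2 ))                   # transactional writes cost double
  fi
  echo "$units"
}

# rcu_for BYTES [eventual|strong|transactional]   (eventual is the default)
rcu_for() {
  awk -v bytes="$1" -v mode="${2:-eventual}" 'BEGIN {
    units = int((bytes + 4095) / 4096)       # round up to the next 4 KB
    if (mode == "eventual")      units = units * 0.5
    if (mode == "transactional") units = units * 2
    print units
  }'
}
```

For a Query, pass the combined size of the items read: `rcu_for 40960 eventual` prints 5 and `rcu_for 40960 strong` prints 10, matching the ten-item example. For per-item operations such as a transactional get, compute each item separately and add the results (ten times `rcu_for 4096 transactional`).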
Burst capacity and the token bucket (provisioned mode). Provisioned mode is not a hard wall. DynamoDB accumulates unused capacity (up to the last 5 minutes / 300 seconds’ worth) into a burst bucket and lets short spikes draw it down, so brief overruns don’t throttle. But burst is best-effort and finite; sustained overload throttles once the bucket empties. (On-demand has its own behaviour: it serves up to double your previous peak instantly, and ramps higher within ~30 minutes — so a brand-new table or a never-before-seen spike can still throttle until it “learns” the new peak. You can pre-warm with warm throughput settings.)
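To make the burst behaviour concrete, the following is a toy token-bucket model in shell. It is a deliberate simplification and not DynamoDB's actual algorithm: it assumes one-second ticks, a bucket capped at 300 seconds of provisioned capacity, and a hypothetical function name simulate_burst.

```shell
# Toy model only: per-second ticks, unused capacity accumulates in a bucket
# capped at 300 seconds' worth, and demand above (provisioned + bucket) is
# counted as throttled.

# simulate_burst PROVISIONED STARTING_BUCKET DEMAND_PER_SECOND...
simulate_burst() {
  provisioned=$1
  bucket=$2
  shift 2
  cap=$(( provisioned * 300 ))
  throttled=0
  for demand in "$@"; do
    available=$(( provisioned + bucket ))
    if [ "$demand" -le "$available" ]; then
      served=$demand
    else
      served=$available
    fi
    throttled=$(( throttled + demand - served ))
    bucket=$(( available - served ))         # leftover carries into the next second
    if [ "$bucket" -gt "$cap" ]; then
      bucket=$cap
    fi
  done
  echo "throttled=$throttled bucket=$bucket"
}
```

For example, `simulate_burst 5 20 15 15 15` models a table provisioned at 5 units with 20 units banked, hit with 15 units per second for three seconds: the first two seconds are absorbed by the bucket and the third is partly throttled, so it prints `throttled=10 bucket=0`.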
Auto scaling (provisioned mode). Rather than guess a fixed number, you enable Application Auto Scaling, which watches a target utilisation (default 70%) of consumed-to-provisioned capacity and raises or lowers provisioned RCUs/WCUs between a min and max you set, via CloudWatch alarms. It reacts in minutes, not seconds, so it is great for daily cycles but not for instantaneous spikes — for those, on-demand is usually the better answer. You can also buy reserved capacity (a 1- or 3-year commitment on a baseline of provisioned units) for a steep discount on steady workloads.
Which mode? Start new and unpredictable workloads on on-demand — it is the safe default and you never throttle from under-provisioning. Move to provisioned + auto scaling (and consider reserved capacity) once traffic is steady and predictable enough that the per-request maths favours it. Because you can switch only once per 24 hours, treat the switch as a deliberate decision, not a knob to fiddle.
Secondary indexes: LSI vs GSI
By default you can only efficiently fetch items by the primary key. A secondary index lets you query by other attributes by maintaining an alternate key structure that DynamoDB keeps in sync automatically. There are two kinds, and choosing wrongly is a common and expensive mistake.
| | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
|---|---|---|
| Partition key | Same as the table’s partition key | Any attribute (different partition key allowed) |
| Sort key | A different attribute (alternate sort key) | Any attribute (optional sort key) |
| When created | Only at table creation — cannot add/remove later | Anytime — add or delete on a live table |
| How many | Up to 5 per table | Up to 20 per table (default; raisable) |
| Consistency | Supports strong and eventual reads | Eventual only (never strongly consistent) |
| Capacity | Shares the base table’s RCUs/WCUs | Its own provisioned RCUs/WCUs (or on-demand) |
| Size limit | 10 GB per partition-key value (item collection limit) | No item-collection size limit |
| Key uniqueness | Index keys need not be unique | Index keys need not be unique |
The mental model. An LSI is “same partition, different sort order” — it lets you query the same set of items grouped by the same partition key, but ordered/filtered by a different attribute (e.g. items for a user sorted by lastUpdated instead of by itemId). Because it shares the partition, it can be strongly consistent, and it counts against the 10 GB per-partition item-collection limit — which is the LSI’s biggest gotcha (a single partition key with an LSI can never exceed 10 GB of items). A GSI is a genuinely different table-like view: any attribute as the partition key, its own throughput, eventually consistent, addable anytime. GSIs are the workhorse — single-table designs are built on a handful of overloaded GSIs.
Projections (what attributes the index copies). An index stores a copy of certain attributes from the base item; you choose how much via the projection type:
| Projection | What’s copied into the index | Trade-off |
|---|---|---|
| KEYS_ONLY | Only the index keys + the base table keys | Smallest/cheapest; but a query often needs a follow-up GetItem on the base table to get other attributes |
| INCLUDE | Keys + a named list of extra attributes | Balanced — project exactly the attributes your queries return |
| ALL | Every attribute of the item | Most convenient (queries are self-contained), largest storage and highest write cost |
If a query asks for an attribute that is not projected into a GSI, DynamoDB does not fetch it from the base table: you get only the projected attributes. (An LSI can fetch non-projected attributes from the base table, at extra read cost.) So prefer INCLUDE with exactly the attributes your queries return: ALL is convenient but you pay to write a full copy on every base-item write, and KEYS_ONLY saves storage but forces extra reads.
GSI write amplification and throttling (the costly gotcha). Every write to the base table that touches a projected attribute is also a write to each affected GSI, billed separately against that GSI’s capacity. Five GSIs with ALL projection means roughly 6× the write cost of an un-indexed table. Worse, on a provisioned GSI, if the GSI’s own write capacity can’t keep up, writes to the base table are throttled too (because DynamoDB won’t let the index fall arbitrarily behind). The fixes: provision the GSI generously (or use on-demand), and project only what you need.
DynamoDB Streams and change data capture
A DynamoDB Stream is a time-ordered log of item-level changes in a table (every create, update, and delete), retained for 24 hours. Turning it on gives you a change-data-capture (CDC) feed to drive event-driven architectures: replicate to another store, maintain an aggregate, send a notification, index into OpenSearch, and so on.
What each record contains — the StreamViewType. When you enable a stream you pick how much of the change it carries:
| StreamViewType | Record contains | Use when |
|---|---|---|
| KEYS_ONLY | Only the key attributes of the changed item | You just need to know which item changed and will re-fetch it |
| NEW_IMAGE | The entire item after the change | You need the new state (e.g. to project/replicate it) |
| OLD_IMAGE | The entire item before the change | You need the prior state (e.g. audit, undo, diff) |
| NEW_AND_OLD_IMAGES | Both before and after | You need to diff (compute exactly what changed) — the richest, most common choice for CDC |
Ordering and processing. Stream records are organised into shards that mirror the table’s partitions, and DynamoDB guarantees ordering per partition key (records for the same item are delivered in the order the changes happened) — but not a single global order across the whole table. You consume a stream two ways: with the DynamoDB Streams Kinesis-style API (and the Kinesis Client Library) for custom consumers, or — far more commonly — with a Lambda trigger via an event source mapping, where Lambda polls the shards for you and invokes your function with batches of records. Because delivery is at-least-once, your consumer must be idempotent. The Streams CDC lesson goes deep on ordering, idempotency, batching/parallelisation, error handling (bisect-on-error, on-failure destinations), and EventBridge Pipes.
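Wiring a Lambda trigger to a table's stream might look like the sketch below. The function name wire_stream_to_lambda is hypothetical, and the batch size and retry count are placeholder values rather than recommendations; the Lambda function is assumed to exist already and to have permission to read the stream.

```shell
# Hypothetical wrapper: look up the table's stream ARN, then create the
# event source mapping that lets Lambda poll the stream's shards.

# wire_stream_to_lambda TABLE LAMBDA_FUNCTION REGION
wire_stream_to_lambda() {
  stream_arn=$(aws dynamodb describe-table --table-name "$1" \
    --query 'Table.LatestStreamArn' --output text --region "$3") || return 1

  aws lambda create-event-source-mapping \
    --function-name "$2" \
    --event-source-arn "$stream_arn" \
    --starting-position LATEST \
    --batch-size 100 \
    --bisect-batch-on-function-error \
    --maximum-retry-attempts 3 \
    --region "$3"
}
```

Calling `wire_stream_to_lambda LabOrders my-stream-handler ap-south-1` (the function name is a placeholder) would start delivery from new records only. Because delivery is at-least-once, the handler itself still has to be idempotent; the mapping settings only bound how failures are retried.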
Streams vs Kinesis Data Streams for DynamoDB. As an alternative you can stream changes to an Amazon Kinesis Data Stream instead of (or as well as) the native stream. The difference: native DynamoDB Streams retain 24 hours and are consumed via Lambda/KCL with per-partition ordering; Kinesis Data Streams offer longer retention (up to 365 days), more/larger consumers, and integration with the broader Kinesis ecosystem (Firehose, Data Analytics), at the cost of running and paying for the Kinesis stream and accepting Kinesis’s at-least-once, possibly-duplicated, possibly-out-of-order-on-resharding semantics. Choose native Streams for tight, ordered Lambda triggers; choose Kinesis for fan-out to many consumers, long retention, or analytics pipelines.
Time to Live (TTL): automatic expiry
TTL lets DynamoDB delete expired items automatically and for free. You designate one numeric attribute as the TTL attribute and store an epoch timestamp (seconds since 1970, UTC) in it; a background process deletes items once that time passes. Key facts that trip people up:
- Deletion is not instantaneous — it typically happens within 48 hours of expiry (a background sweep), so do not rely on TTL for precise timing. To read as if expired items were gone, add a filter expression excluding items whose TTL is in the past.
- TTL deletes are free (no WCUs consumed) — a big reason to use it for session data, caches, and time-series cleanup instead of scanning-and-deleting.
- TTL deletions appear in DynamoDB Streams (with a distinguishing userIdentity of principalId: dynamodb.amazonaws.com), so you can react to expiry (e.g. archive to S3 on expiry).
- One TTL attribute per table; the attribute must be a Number; items missing the attribute or with a non-numeric value are simply never expired.
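The TTL mechanics above can be sketched as follows. The attribute name expiresAt, the SESSION sort key, and all function names are assumptions made for the example; the only fixed requirement from the text is that the TTL attribute holds a Number in epoch seconds.

```shell
# ttl_epoch NOW_EPOCH DAYS -> epoch seconds DAYS after NOW_EPOCH
ttl_epoch() {
  echo $(( $1 + $2 * 86400 ))
}

# session_item PK EXPIRES_AT -> DynamoDB-JSON item with a numeric TTL attribute
# (the attribute name expiresAt and the SESSION sort key are hypothetical)
session_item() {
  printf '{"PK":{"S":"%s"},"SK":{"S":"SESSION"},"expiresAt":{"N":"%s"}}\n' "$1" "$2"
}

# enable_ttl TABLE REGION -> designate expiresAt as the table's TTL attribute
enable_ttl() {
  aws dynamodb update-time-to-live --table-name "$1" --region "$2" \
    --time-to-live-specification "Enabled=true,AttributeName=expiresAt"
}

# put_session TABLE REGION PK DAYS -> write an item that expires DAYS from now
put_session() {
  aws dynamodb put-item --table-name "$1" --region "$2" \
    --item "$(session_item "$3" "$(ttl_epoch "$(date +%s)" "$4")")"
}
```

After `enable_ttl LabOrders ap-south-1`, a call such as `put_session LabOrders ap-south-1 "SESS#abc" 7` writes an item due to expire in seven days. Since the background delete can lag, reads that must ignore expired items would still compare expiresAt against the current epoch time in a filter expression.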
DAX: the in-memory cache
DynamoDB Accelerator (DAX) is a fully managed, in-memory, write-through cache that sits in front of DynamoDB and speaks the DynamoDB API, so adopting it is largely a client-library swap: point the DAX client at the DAX cluster endpoint instead of DynamoDB and your GetItem/Query/Scan calls are cached. On cache hits it turns single-digit-millisecond reads into microsecond-scale reads, and it absorbs read-heavy/hot-key traffic so that traffic never reaches the table.
| | DAX |
|---|---|
| What it accelerates | Reads (item cache for GetItem/BatchGetItem; query cache for Query/Scan) |
| Writes | Write-through: writes go to DynamoDB and update the cache |
| Consistency | Eventually consistent only — DAX cannot serve strongly consistent reads (those bypass DAX) |
| Form factor | A cluster of nodes inside your VPC (a primary + read replicas across AZs) — you size the node type and count |
| When it helps | Read-heavy, repeated reads, hot keys, microsecond latency targets |
| When it does not | Write-heavy workloads, strongly-consistent read needs, low cache-hit ratios, very large items |
DAX is the right tool when reads dominate and slight staleness is acceptable; it is the wrong tool if you need strong consistency or your workload is write-heavy. Note it runs as provisioned nodes (not serverless), so it has an always-on cost — size it to your working set.
Transactions: all-or-nothing across items
DynamoDB supports ACID transactions across multiple items and multiple tables in a single Region via two APIs:
- TransactWriteItems: up to 100 Put/Update/Delete/ConditionCheck operations that all succeed or all fail atomically. Use it for “move money from A to B”, “create order and decrement inventory”, or “claim a unique username”.
- TransactGetItems: up to 100 Get operations returning a consistent snapshot across items.
Two essentials: transactional operations cost double the normal capacity (a transactional write = 2 WCUs per KB, a transactional read = 2 RCUs per 4 KB), and a transaction can fail with a TransactionCanceledException if a condition check fails or two transactions conflict on the same item — your code must handle and retry as appropriate. Transactions are scoped to one Region (they do not span global-table replicas). For single-item conditional logic you usually don’t need a full transaction — a plain PutItem/UpdateItem with a condition expression (e.g. attribute_not_exists(pk) to create-only, or optimistic locking with a version attribute) is cheaper and sufficient.
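As an illustration of the “create order and decrement inventory” case, the sketch below builds the request body for a two-operation transaction and passes it to the CLI. The key shapes (CUST#, ORDER#, PRODUCT#), the unitsLeft attribute, and the function names are all hypothetical.

```shell
# order_txn TABLE CUSTOMER ORDER_ID PRODUCT -> JSON for --transact-items.
# Operation 1 creates the order only if it does not exist yet;
# operation 2 decrements stock only if at least one unit is left.
order_txn() {
  cat <<EOF
[
  {"Put": {
    "TableName": "$1",
    "Item": {"PK": {"S": "CUST#$2"}, "SK": {"S": "ORDER#$3"}, "status": {"S": "PLACED"}},
    "ConditionExpression": "attribute_not_exists(PK)"}},
  {"Update": {
    "TableName": "$1",
    "Key": {"PK": {"S": "PRODUCT#$4"}, "SK": {"S": "STOCK"}},
    "UpdateExpression": "SET unitsLeft = unitsLeft - :one",
    "ConditionExpression": "unitsLeft >= :one",
    "ExpressionAttributeValues": {":one": {"N": "1"}}}}
]
EOF
}

# place_order TABLE REGION CUSTOMER ORDER_ID PRODUCT
place_order() {
  aws dynamodb transact-write-items --region "$2" \
    --transact-items "$(order_txn "$1" "$3" "$4" "$5")"
}
```

If either condition fails, neither write is applied and the CLI call should exit non-zero with a TransactionCanceledException, which is the signal for the caller to decide whether to retry or give up. Each of the two writes is billed at the doubled transactional rate described above.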
Read consistency: eventual vs strong
DynamoDB replicates every item across three copies in different Availability Zones for durability. That replication is why reads come in two flavours:
- Eventually consistent reads (the default) may not reflect a very recent write (a write that hasn’t yet propagated to the replica you happened to read) — but typically catch up within a second. They cost half an RCU per 4 KB. Use them everywhere you can tolerate momentary staleness (the vast majority of reads).
- Strongly consistent reads always return the most recent committed write, by reading the leader replica. They cost a full RCU per 4 KB, have slightly higher latency, are not available on GSIs, and can fail with an error if the leader replica is unavailable (so they are less resilient). Use them only where read-after-write correctness is required (e.g. “did my write land?”).
Two caveats worth memorising: GSIs are always eventually consistent (you cannot request a strong read on a GSI), and global tables are always eventually consistent across Regions (a strong read is only ever “strong” within a single Region). So “strongly consistent” never means “globally consistent”.
Global tables: multi-Region, active-active
A global table is a single DynamoDB table replicated across multiple AWS Regions, with active-active read and write in every Region. DynamoDB asynchronously propagates writes between Regions (typically within a second), giving you low-latency local access for users in each Region and a Region-level disaster-recovery posture out of the box. Essentials:
- Replication is built on DynamoDB Streams, so the table must have streams enabled (NEW_AND_OLD_IMAGES); current (“v2”) global tables are managed for you.
- Cross-Region replication is eventually consistent: a write in ap-south-1 shows up in us-east-1 after a short lag.
- Conflicts (the same item written in two Regions at almost the same time) are resolved last-writer-wins using a reconciliation timestamp. If your design can’t tolerate that, partition writes by Region (each Region “owns” certain keys).
- You can add or remove replica Regions on a live table; each replica is billed for its own storage and replicated write capacity (rWCUs).
Global tables give multi-Region resilience and locality cheaply, provided your application can live with eventual cross-Region consistency and last-writer-wins.
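Adding or removing a replica on a live table can be sketched as below. This assumes the current (“v2”) global tables version and a table that already has a NEW_AND_OLD_IMAGES stream; the function names are hypothetical.

```shell
# add_replica TABLE HOME_REGION NEW_REGION -> start replicating TABLE into NEW_REGION
add_replica() {
  aws dynamodb update-table --table-name "$1" --region "$2" \
    --replica-updates "[{\"Create\": {\"RegionName\": \"$3\"}}]"
}

# remove_replica TABLE HOME_REGION OLD_REGION -> stop replicating and drop that replica
remove_replica() {
  aws dynamodb update-table --table-name "$1" --region "$2" \
    --replica-updates "[{\"Delete\": {\"RegionName\": \"$3\"}}]"
}
```

For instance, `add_replica AppData ap-south-1 us-east-1` would turn the table into a two-Region global table. The new replica is billed for its own storage and replicated writes from the moment it is created.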
Backup, restore, and point-in-time recovery
DynamoDB offers two complementary protections:
- Point-in-time recovery (PITR) — when enabled, DynamoDB continuously backs up the table so you can restore to any second in the last 35 days (a rolling window). It protects against accidental writes/deletes and “bad deploy” data corruption. Restoring always creates a new table (it never overwrites the source); you then repoint your app. PITR adds a per-GB storage charge.
- On-demand backups — a full, manual (or AWS Backup-scheduled) snapshot retained until you delete it, for long-term/compliance retention beyond the 35-day PITR window. Backups and restores do not consume table capacity and have no performance impact.
Both restore to a new table; you can also do cross-Region and cross-account restores via AWS Backup. For DR, PITR covers “oops” within 35 days while global tables cover Region loss in real time — they solve different problems and are often used together.
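Both protections are driven by short CLI calls, sketched here as two wrappers. The function names and the backup naming scheme are assumptions; PITR is assumed to be enabled already on the source table.

```shell
# restore_latest SOURCE_TABLE TARGET_TABLE REGION
# Restores the most recent restorable state into a NEW table (TARGET_TABLE);
# the source table is left untouched.
restore_latest() {
  aws dynamodb restore-table-to-point-in-time \
    --source-table-name "$1" \
    --target-table-name "$2" \
    --use-latest-restorable-time \
    --region "$3"
}

# backup_now TABLE REGION -> on-demand backup named TABLE-YYYYMMDD-HHMMSS (UTC)
backup_now() {
  aws dynamodb create-backup \
    --table-name "$1" \
    --backup-name "$1-$(date -u +%Y%m%d-%H%M%S)" \
    --region "$2"
}
```

A typical recovery drill would run `restore_latest AppData AppData-restored ap-south-1`, wait for the new table to become ACTIVE, verify the data, and only then repoint the application. To restore to a specific earlier moment instead, the same command accepts a restore date and time in place of the latest-restorable flag.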
Encryption, security, and access control
Encryption at rest is always on — every DynamoDB table is encrypted, you cannot turn it off. You choose the key:
| Key option | Who owns/manages | Cost | When |
|---|---|---|---|
| AWS owned key (default) | AWS, fully transparent | Free | Default; you don’t need key control or an audit trail |
| AWS managed key (aws/dynamodb) | AWS, in your account’s KMS | KMS charges | You want CloudTrail visibility of key use without managing a key |
| Customer managed key (CMK) | You, in KMS | KMS + per-request | You need control over rotation, key policies, and the ability to disable the key (which disables table access) |
In transit, all API calls are over HTTPS/TLS. Access control is pure IAM; there is no database user/password. IAM policies authorise actions (dynamodb:GetItem, Query, PutItem, …) on table and index ARNs, and DynamoDB supports remarkably fine-grained access control: you can restrict a principal to specific items using the dynamodb:LeadingKeys condition key (e.g. “a user may only read items whose partition key equals their own user id”), and to specific attributes using the dynamodb:Attributes condition key. This is the backbone of multi-tenant designs. Add VPC endpoints (Gateway type) to keep traffic off the public internet; CloudTrail logs the control plane and (optionally, via data events) the data plane.
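A per-user item restriction of the kind described above might look like the policy generated below. This is a sketch under assumptions: callers are assumed to authenticate through Amazon Cognito identity pools, the partition-key value is assumed to equal the caller's identity id, and the function name is hypothetical.

```shell
# leading_keys_policy TABLE_ARN -> IAM policy JSON that allows reads only on
# items whose partition key equals the caller's Cognito identity id.
# The \$ keeps the IAM policy variable literal instead of letting the shell expand it.
leading_keys_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["dynamodb:GetItem", "dynamodb:Query"],
    "Resource": "$1",
    "Condition": {
      "ForAllValues:StringEquals": {
        "dynamodb:LeadingKeys": ["\${cognito-identity.amazonaws.com:sub}"]
      }
    }
  }]
}
EOF
}
```

Running `leading_keys_policy arn:aws:dynamodb:ap-south-1:123456789012:table/AppData` (the account id is a placeholder) prints a policy document that could be attached to the role your users assume. With a prefixed key design such as CUST#42, the stored partition-key value would have to match the identity value exactly, so the key format needs to be chosen with this condition in mind.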
The DynamoDB landscape at a glance
The diagram above ties the pieces together: items hashed by partition key onto partitions (with the sort key ordering items inside a partition), the two capacity modes feeding throughput, GSIs/LSIs as alternate query views, Streams emitting an ordered change log into Lambda/Kinesis (and powering global tables), DAX caching reads in front, and PITR/backups and KMS encryption wrapping the table — the same mental map to keep while you read the rest of this lesson.
Creating a table: every setting
Whether you use the console, CLI, or IaC, a table is defined by the same set of choices. Here is every one, with the what/choices/default/when/gotcha treatment.
| Setting | What it is / choices | Default | When / gotcha |
|---|---|---|---|
| Table name | Unique per Region per account | — | Immutable; choose a convention (app-env-entity) |
| Partition key | Name + type (S/N/B), the hash key | required | Immutable after creation; pick a high-cardinality value |
| Sort key | Optional name + type — the range key | none | Adds Query/range power; immutable; the combination must be unique |
| Capacity mode | On-demand or Provisioned | On-demand (console default) | Switchable once per 24 h; on-demand = safe default |
| Provisioned RCU/WCU | Throughput numbers (provisioned mode) | 5/5 (console) | Enable auto scaling with min/max + target % instead of fixing |
| Table class | Standard or Standard-IA (Infrequent Access) | Standard | Standard-IA: cheaper storage, pricier throughput — for large, rarely-read tables |
| Secondary indexes | LSIs (creation-time only) and GSIs (anytime) | none | LSIs share base capacity + 10 GB collection limit; GSIs have own capacity |
| Encryption | AWS owned / AWS managed / CMK | AWS owned | Always on; CMK for control + audit |
| DynamoDB Streams | Off, or on with a StreamViewType | Off | Required for global tables & CDC; 24 h retention |
| Kinesis data stream | Optionally also stream to Kinesis | Off | For long retention / fan-out / analytics |
| TTL | Optional TTL attribute (Number, epoch seconds) | Off | Free deletes within ~48 h; deletions appear in Streams |
| PITR | Continuous backup (35-day restore) | Off (on by default for new tables in console as of recent updates) | Per-GB cost; restores to a new table |
| Deletion protection | Block accidental DeleteTable | Off | Turn on for any production table |
| Tags | Key/value metadata | none | For cost allocation & governance |
Create a table with the CLI (composite key, on-demand, streams, PITR, deletion protection).
REGION=ap-south-1
aws dynamodb create-table \
--table-name AppData \
--attribute-definitions \
AttributeName=PK,AttributeType=S \
AttributeName=SK,AttributeType=S \
--key-schema \
AttributeName=PK,KeyType=HASH \
AttributeName=SK,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST \
--stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
--deletion-protection-enabled \
--tags Key=env,Value=lab \
--region $REGION
aws dynamodb wait table-exists --table-name AppData --region $REGION
aws dynamodb update-continuous-backups --table-name AppData \
--point-in-time-recovery-specification PointInTimeRecoveryEnabled=true --region $REGION
Provisioned mode with auto scaling instead uses --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5, then aws application-autoscaling register-scalable-target + put-scaling-policy on dynamodb:table:ReadCapacityUnits/WriteCapacityUnits with a TargetTrackingScaling policy at 70%.
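The auto scaling step summarised above can be spelled out as a pair of calls per dimension. The wrapper below is a sketch: the function names and the generated policy name are hypothetical, and the min, max, and target values are whatever you pass in.

```shell
# scaling_policy_json TARGET_PERCENT read|write -> target-tracking configuration
scaling_policy_json() {
  if [ "$2" = "read" ]; then
    metric=DynamoDBReadCapacityUtilization
  else
    metric=DynamoDBWriteCapacityUtilization
  fi
  printf '{"TargetValue": %s, "PredefinedMetricSpecification": {"PredefinedMetricType": "%s"}}\n' "$1" "$metric"
}

# enable_autoscaling TABLE read|write MIN MAX TARGET_PERCENT REGION
enable_autoscaling() {
  if [ "$2" = "read" ]; then
    dimension=dynamodb:table:ReadCapacityUnits
  else
    dimension=dynamodb:table:WriteCapacityUnits
  fi
  # Step 1: tell Application Auto Scaling which dimension it may adjust, and within what bounds.
  aws application-autoscaling register-scalable-target \
    --service-namespace dynamodb \
    --resource-id "table/$1" \
    --scalable-dimension "$dimension" \
    --min-capacity "$3" --max-capacity "$4" \
    --region "$6" || return 1
  # Step 2: attach a target-tracking policy that aims for TARGET_PERCENT utilisation.
  aws application-autoscaling put-scaling-policy \
    --service-namespace dynamodb \
    --resource-id "table/$1" \
    --scalable-dimension "$dimension" \
    --policy-name "$1-$2-target-tracking" \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration "$(scaling_policy_json "$5" "$2")" \
    --region "$6"
}
```

Reads and writes scale independently, so a table normally needs both `enable_autoscaling AppData read 5 100 70 ap-south-1` and `enable_autoscaling AppData write 5 100 70 ap-south-1`. Provisioned GSIs have their own capacity and would need the same treatment with an index resource id and the index scalable dimensions.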
Add a GSI to a live table (any attribute, INCLUDE projection).
aws dynamodb update-table --table-name AppData \
--attribute-definitions AttributeName=GSI1PK,AttributeType=S AttributeName=GSI1SK,AttributeType=S \
--global-secondary-index-updates '[{"Create":{
"IndexName":"GSI1",
"KeySchema":[{"AttributeName":"GSI1PK","KeyType":"HASH"},{"AttributeName":"GSI1SK","KeyType":"RANGE"}],
"Projection":{"ProjectionType":"INCLUDE","NonKeyAttributes":["status","total"]}}}]' \
--region $REGION
The GSI back-fills in the background (the table stays available); watch IndexStatus go CREATING → ACTIVE.
After creation: what you can (and can’t) change
| Operation | Can you? | Notes |
|---|---|---|
| Change the partition/sort key | No | Keys are immutable — you must create a new table and migrate (export → transform → import). |
| Change capacity mode | Yes, once per 24 h | On-demand ⇄ provisioned. |
| Adjust provisioned RCU/WCU | Yes, anytime | Decreases are limited per day; auto scaling handles this for you. |
| Add/remove a GSI | Yes, anytime | Adding back-fills online; you can have GSIs in different states. |
| Add/remove an LSI | No | LSIs exist only from table creation. |
| Change a GSI’s projection | No | Delete and recreate the GSI with the new projection. |
| Enable/disable Streams | Yes (re-enabling starts a new stream, no history) | Required for global tables. |
| Enable/disable TTL, PITR, deletion protection | Yes, anytime | — |
| Change table class | Yes | Standard ⇄ Standard-IA. |
| Change encryption key | Yes | Switch among AWS-owned/managed/CMK. |
| Add a replica Region (global table) | Yes | Streams must be on; each replica billed separately. |
Hands-on lab
In this lab you create an on-demand table (so it costs essentially nothing), write and read items, run a Query, add a GSI, enable TTL, take a backup, and clean up. It uses the aws CLI (CloudShell or local).
1. Create an on-demand table with a composite key and a stream.
REGION=ap-south-1
aws dynamodb create-table --table-name LabOrders \
--attribute-definitions AttributeName=PK,AttributeType=S AttributeName=SK,AttributeType=S \
--key-schema AttributeName=PK,KeyType=HASH AttributeName=SK,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST \
--stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
--region $REGION
aws dynamodb wait table-exists --table-name LabOrders --region $REGION
Expected: the wait returns once the table is ACTIVE.
2. Write a few items (two orders for one customer).
aws dynamodb put-item --table-name LabOrders --region $REGION --item '{
"PK":{"S":"CUST#42"},"SK":{"S":"ORDER#2026-06-15#1001"},
"status":{"S":"PLACED"},"total":{"N":"1299"},"city":{"S":"Mumbai"}}'
aws dynamodb put-item --table-name LabOrders --region $REGION --item '{
"PK":{"S":"CUST#42"},"SK":{"S":"ORDER#2026-06-15#1002"},
"status":{"S":"PLACED"},"total":{"N":"499"},"city":{"S":"Mumbai"}}'
Expected: both calls return with no error.
3. GetItem (one item by full key) and Query (all orders for the customer).
aws dynamodb get-item --table-name LabOrders --region $REGION \
--key '{"PK":{"S":"CUST#42"},"SK":{"S":"ORDER#2026-06-15#1001"}}' \
--consistent-read # strongly consistent read
aws dynamodb query --table-name LabOrders --region $REGION \
--key-condition-expression "PK = :c AND begins_with(SK, :p)" \
--expression-attribute-values '{":c":{"S":"CUST#42"},":p":{"S":"ORDER#2026-06-15"}}' \
--query "Items[].SK.S" --output table
Expected: the get-item returns the 1001 order (strongly consistent); the query returns both SKs in sort order — and note we used begins_with on the sort key, the canonical DynamoDB range pattern.
4. Add a GSI to query by status (a different access pattern), then query it.
aws dynamodb update-table --table-name LabOrders --region $REGION \
--attribute-definitions AttributeName=status,AttributeType=S AttributeName=total,AttributeType=N \
--global-secondary-index-updates '[{"Create":{
"IndexName":"byStatus",
"KeySchema":[{"AttributeName":"status","KeyType":"HASH"},{"AttributeName":"total","KeyType":"RANGE"}],
"Projection":{"ProjectionType":"ALL"}}}]'
# wait for the GSI to finish back-filling:
aws dynamodb describe-table --table-name LabOrders --region $REGION \
--query "Table.GlobalSecondaryIndexes[0].IndexStatus" --output text
# once it prints ACTIVE:
aws dynamodb query --table-name LabOrders --index-name byStatus --region $REGION \
--key-condition-expression "#s = :v" \
--expression-attribute-names '{"#s":"status"}' \
--expression-attribute-values '{":v":{"S":"PLACED"}}' \
--query "Items[].SK.S" --output table
Expected: IndexStatus transitions CREATING → ACTIVE; the GSI query returns both orders by status — an access pattern the base key could not serve. (Note GSI reads are eventually consistent — --consistent-read is rejected here.)
5. Enable TTL on an expiresAt attribute.
aws dynamodb update-time-to-live --table-name LabOrders --region $REGION \
--time-to-live-specification "Enabled=true,AttributeName=expiresAt"
aws dynamodb describe-time-to-live --table-name LabOrders --region $REGION
Expected: TTL status ENABLED on expiresAt. (Items get deleted within ~48 h of their epoch timestamp passing — free of charge.)
6. Take an on-demand backup, then list it.
aws dynamodb create-backup --table-name LabOrders --backup-name LabOrders-snap --region $REGION
aws dynamodb list-backups --table-name LabOrders --region $REGION \
--query "BackupSummaries[].{name:BackupName,status:BackupStatus}" --output table
Expected: a backup with status AVAILABLE.
7. Cleanup.
# delete the backup
BK=$(aws dynamodb list-backups --table-name LabOrders --region $REGION \
--query "BackupSummaries[0].BackupArn" --output text)
aws dynamodb delete-backup --backup-arn "$BK" --region $REGION
# delete the table (this also removes its GSIs and stream)
aws dynamodb delete-table --table-name LabOrders --region $REGION
aws dynamodb wait table-not-exists --table-name LabOrders --region $REGION
Validation: aws dynamodb describe-table --table-name LabOrders --region $REGION eventually returns ResourceNotFoundException. (If delete-table is blocked, deletion protection is on; run aws dynamodb update-table --table-name LabOrders --no-deletion-protection-enabled --region $REGION first.)
Cost note (INR-aware): an on-demand table with a handful of items and requests costs effectively nothing. DynamoDB’s free tier includes 25 GB of storage, and on-demand requests are billed per request, so the few dozen calls in this lab come to a negligible amount. The things that quietly cost money: provisioned capacity left running (you pay per unit-hour even when idle, which is why this lab used on-demand), GSIs with ALL projection (extra storage plus an index write per base write), PITR (per-GB), on-demand backups (they persist until deleted; step 7 removes this one), and a DAX cluster (always-on nodes). Deleting the table and the backup leaves nothing billing.
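One gap in the lab is worth noting: step 5 enables TTL on expiresAt, but the items written in step 2 carry no such attribute, so nothing would ever expire. The sketch below shows one way to compute the value TTL expects, a Number holding epoch seconds, and to wrap it in the DynamoDB JSON shape used by put-item. The helper names are hypothetical, and the 90-day figure in the demo is only an example.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def ttl_epoch_seconds(start: datetime, days: int) -> int:
    """Epoch seconds for `days` after `start` (start must be timezone-aware)."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware to avoid local-time surprises")
    return int((start + timedelta(days=days)).timestamp())


def ttl_attribute(start: datetime, days: int,
                  name: str = "expiresAt") -> Dict[str, Dict[str, str]]:
    """DynamoDB-JSON fragment to merge into a put-item --item document."""
    # TTL only acts on attributes of type Number ("N"), serialised as a string.
    return {name: {"N": str(ttl_epoch_seconds(start, days))}}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(ttl_attribute(now, 90))
```

Merging the returned fragment into the --item JSON from step 2 would make those items eligible for expiry. A value written as a String, or in milliseconds, is the usual reason TTL appears to do nothing.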
Common mistakes & troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| ProvisionedThroughputExceededException / throttling on one key | Hot partition from a low-cardinality or time-based partition key | Choose a high-cardinality key; write-shard skewed keys; rely on adaptive capacity for transient skew. |
| Throttling though total capacity looks fine (provisioned) | A single partition/key is the bottleneck, not the table total | Same as above; for steady spikes, switch to on-demand. |
| A query is slow and expensive and reads the whole table | You’re using Scan, not Query | Redesign keys/indexes so the access pattern is a Query/GetItem; never Scan in a hot path. |
| ValidationException: ... ConsistentRead ... not supported on ... index | Asked for a strongly consistent read on a GSI | GSIs are eventually consistent only; drop --consistent-read or use an LSI/base table. |
| Writes to the base table suddenly throttle after adding a GSI | The GSI’s (provisioned) write capacity can’t keep up | Provision the GSI generously or use on-demand; project fewer attributes. |
| “I need to add an LSI but can’t” | LSIs are creation-time only | Use a GSI instead, or recreate the table with the LSI. |
| Query returns items missing some attributes | Those attributes aren’t projected into the GSI | Use INCLUDE (the needed attrs) or ALL projection; recreate the GSI to change projection. |
| Item won’t save: Item size has exceeded the maximum | Item exceeds the 400 KB limit | Store the large blob in S3, keep a pointer in DynamoDB; split the item. |
| TTL items still showing up | TTL deletes within ~48 h, not instantly; or wrong attribute type | Add a filter expression excluding expired items; ensure the TTL attribute is a Number epoch. |
| Cross-Region reads seem stale | Global tables are eventually consistent across Regions | Expected; design for it (last-writer-wins / Region-owned keys). |
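The "write-shard skewed keys" fix in the first rows of the table can be sketched as follows. This is a minimal illustration, assuming a suffix of the form #0..#N-1 derived from a hash of some per-item identifier. The function names, the choice of SHA-256, and the default of 10 shards are all assumptions, not anything DynamoDB prescribes.

```python
import hashlib
from typing import List


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Append a deterministic '#<n>' suffix so one hot key spreads over shard_count keys."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> List[str]:
    """Every partition key a reader must Query to see the whole logical key."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


if __name__ == "__main__":
    print(sharded_partition_key("DATE#2026-06-15", "ORDER#1001"))
    print(all_shard_keys("DATE#2026-06-15", 3))
```

The trade-off is on the read side: writes spread across shard_count partition keys, but reading the full logical key now takes one Query per shard with the results merged in the application, so keep the shard count as small as the write rate allows.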
Best practices
- Model from access patterns, not entities. List every query the app must serve first, then design keys and a small number of GSIs to make each one a Query/GetItem — never a Scan. (See the single-table design lesson.)
- Pick a high-cardinality partition key and write-shard any naturally skewed (time-based, low-cardinality) key to avoid hot partitions.
- Default new workloads to on-demand; move to provisioned + auto scaling (and reserved capacity) only once traffic is steady and the per-request maths favours it.
- Project only what you query into GSIs (INCLUDE over ALL where you can) to control write amplification and cost, and provision GSIs generously so they never throttle the base table.
- Use eventually consistent reads by default (half the cost) and reserve strong reads for genuine read-after-write needs.
- Use TTL for expiring data (free deletes), DAX for read-heavy/hot-key/microsecond workloads, and transactions only when you genuinely need multi-item atomicity (they cost double).
- Turn on PITR and deletion protection for every production table; take on-demand backups before risky migrations; use global tables for multi-Region resilience and locality.
- Drive Streams from Lambda for CDC, make consumers idempotent, and choose NEW_AND_OLD_IMAGES when you need to diff changes.
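The last practice, an idempotent consumer that diffs old and new images, can be sketched in plain Python. The record fields used here (Records, eventID, eventName, dynamodb.OldImage, dynamodb.NewImage) follow the shape Lambda receives from a DynamoDB stream. The in-memory set of seen event IDs is only a stand-in assumed for illustration; a real consumer would need a durable store for that. The function names are hypothetical.

```python
from typing import Any, Dict, List, Set

Record = Dict[str, Any]


def changed_attributes(record: Record) -> List[str]:
    """Attribute names whose value differs between OldImage and NewImage."""
    payload = record.get("dynamodb", {})
    old_image = payload.get("OldImage", {})
    new_image = payload.get("NewImage", {})
    names = set(old_image) | set(new_image)
    return sorted(n for n in names if old_image.get(n) != new_image.get(n))


def handle_batch(event: Dict[str, Any], seen_event_ids: Set[str]) -> List[Dict[str, Any]]:
    """Apply each stream record at most once, skipping redelivered eventIDs."""
    applied = []
    for record in event.get("Records", []):
        event_id = record["eventID"]
        if event_id in seen_event_ids:
            continue  # at-least-once delivery: this record was already handled
        applied.append({
            "eventID": event_id,
            "eventName": record.get("eventName"),
            "changed": changed_attributes(record),
        })
        seen_event_ids.add(event_id)  # assumption: stand-in for a durable dedupe store
    return applied


if __name__ == "__main__":
    update = {
        "eventID": "evt-1",
        "eventName": "MODIFY",
        "dynamodb": {
            "OldImage": {"status": {"S": "PLACED"}, "total": {"N": "1299"}},
            "NewImage": {"status": {"S": "SHIPPED"}, "total": {"N": "1299"}},
        },
    }
    seen: Set[str] = set()
    print(handle_batch({"Records": [update, update]}, seen))
    print(handle_batch({"Records": [update]}, seen))
```

The diff only works when the stream carries both images, which is why the view type matters. With KEYS_ONLY or NEW_IMAGE the consumer would have to read the table to find out what changed.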
Security notes
- Encryption at rest is always on; use a customer-managed KMS key when you need rotation control, key policies, and the ability to revoke access (disabling the key disables the table). All API traffic is TLS.
- Authorise with least-privilege IAM scoped to specific table/index ARNs and actions; use dynamodb:LeadingKeys for fine-grained, per-item access (e.g. a user reads only their own partition) — the backbone of multi-tenant designs.
- Restrict attributes where needed with the dynamodb:Attributes condition and projection expressions so callers can’t read columns they shouldn’t.
- Keep traffic private with a Gateway VPC endpoint for DynamoDB; there is no public database endpoint to lock down, but the endpoint keeps requests off the internet.
- Audit with CloudTrail — control-plane events always, and enable data events for sensitive tables to log item-level access.
- Lock down backups and exports (S3 export buckets) — they contain all your data; control who can RestoreTable, ExportTableToPointInTime, and read the export bucket.
- Beware Scan in IAM-permissive roles — a broad dynamodb:Scan permission can exfiltrate an entire table; grant Query/GetItem instead where possible.
Interview & exam questions
- What is the difference between a partition key and a sort key, and what does each enable? The partition key (hash key) determines which physical partition an item lives on (via an internal hash) and, alone, identifies an item in a simple primary key. Adding a sort key makes a composite key: items with the same partition key are stored together and sorted by sort key, which enables the efficient Query (fetch a partition or a contiguous range). The partition key drives distribution; the sort key drives ordering/range within a partition.
- What causes a hot partition and how do you avoid it? Concentrating traffic on one (or few) partition-key values — a low-cardinality key, a time-based key (today’s date), or a single popular item. The table’s total capacity may be fine while one partition throttles. Avoid it with a high-cardinality partition key and by write-sharding skewed keys (suffix #0..#N); adaptive capacity absorbs transient skew but can’t fix a fundamentally bad key.
- On-demand vs provisioned capacity — when each? On-demand scales automatically and you pay per request — best for spiky/unpredictable or new workloads and to avoid throttling from under-provisioning. Provisioned (with auto scaling and optionally reserved capacity) is cheaper per request for steady, predictable traffic if well-utilised, but you pay for provisioned units even when idle and can throttle when demand exceeds provisioned + burst. You can switch modes once per 24 h.
- How do you compute RCUs and WCUs? 1 WCU = one write per second of an item up to 1 KB (round the size up to the next 1 KB; transactional writes cost 2×). 1 RCU = one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads (eventual = half the cost); a transactional read costs 2 RCUs per 4 KB. Round the item size up to the next 4 KB. E.g. a strongly consistent read of an 8 KB item = 2 RCUs; eventually consistent = 1 RCU.
- LSI vs GSI — the key differences? An LSI shares the table’s partition key with a different sort key, must be created with the table, shares the base table’s capacity, supports strong reads, and is bound by the 10 GB item-collection limit. A GSI can use any attribute as its key, can be added/removed anytime, has its own capacity, is eventually consistent only, and has no collection-size limit. GSIs are the everyday tool; LSIs are for “same partition, alternate sort, strong read” cases.
- What are index projections and why do they matter? A projection controls which attributes are copied into the index: KEYS_ONLY (keys only — smallest, may force a base-table follow-up read), INCLUDE (keys + named attributes — balanced), ALL (every attribute — convenient but largest, and you pay a full index write per base write). For a GSI, queries see only projected attributes. Choose INCLUDE with exactly what your queries return.
- What is a DynamoDB Stream and what are the StreamViewType options? An ordered, 24-hour log of item-level changes (create/update/delete) for CDC. StreamViewType selects the payload: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, or NEW_AND_OLD_IMAGES (both — needed to diff). Ordering is guaranteed per partition key; delivery is at-least-once, so consumers (often Lambda via an event source mapping) must be idempotent.
- Eventual vs strong consistency — and where can’t you get strong? Eventual reads (default, half the cost) may briefly miss the latest write; strong reads (full cost, slightly higher latency, less resilient) always return the latest committed write. You cannot get a strong read on a GSI, and global tables are eventually consistent across Regions — “strong” is only ever within one Region.
- What does DAX accelerate, and what can’t it do? DAX is a write-through, in-memory cache in front of DynamoDB that turns millisecond reads into microsecond reads for GetItem/Query/Scan. It cannot serve strongly consistent reads (those bypass it) and doesn’t help write-heavy workloads; it runs as provisioned nodes in your VPC (always-on cost).
- How do DynamoDB transactions work and what do they cost? TransactWriteItems/TransactGetItems give all-or-nothing ACID across up to 100 items/multiple tables in one Region. They cost double the normal capacity and can fail with TransactionCanceledException on a failed condition or a conflict (retry). For single-item atomicity, a condition expression on PutItem/UpdateItem is cheaper.
- What is a global table and what consistency does it offer? A multi-Region, active-active replicated table (built on Streams) giving local low-latency reads/writes per Region and Region-level DR. Cross-Region replication is eventually consistent with last-writer-wins conflict resolution — so design for eventual consistency or give each Region ownership of certain keys.
- PITR vs on-demand backups — when each? PITR continuously backs up and restores to any second in the last 35 days (great for “oops” recovery); on-demand backups are manual snapshots retained indefinitely for long-term/compliance needs. Both restore to a new table and don’t consume table capacity. For Region loss, use global tables, not backups.
Quick check
- You need every order for one customer, newest first, in one efficient call. What primary-key shape and which operation?
- True or false: you can add a Local Secondary Index to an existing table.
- A strongly-consistent read of a 10 KB item costs how many RCUs? Eventually consistent?
- Your GSI write capacity is too low in provisioned mode — what happens to writes on the base table?
- Which StreamViewType do you choose when you need to compute exactly what changed on each update?
Answers
- A composite primary key (partition key = customerId, sort key = something time/order-ordered like ORDER#<timestamp>), queried with Query (optionally with ScanIndexForward=false for newest-first). Same partition + sorted = one efficient range read.
- False. LSIs can be created only at table creation; for an existing table use a GSI.
- 10 KB rounds up to 12 KB → 3 × 4 KB = 3 RCUs strongly consistent; eventually consistent is half, so 1.5 RCUs.
- The base table’s writes are throttled too — DynamoDB won’t let the GSI fall arbitrarily behind. Provision the GSI generously or use on-demand.
- NEW_AND_OLD_IMAGES — it carries both the before and after item so the consumer can diff them.
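The capacity arithmetic in the third answer is easy to get wrong by hand, so here is a small calculator sketch. It applies the rules stated in this lesson: reads are metered in 4 KB blocks and writes in 1 KB blocks, sizes round up, eventual reads cost half, and transactional operations cost double. The function and mode names are chosen for this example only.

```python
import math


def read_capacity_units(item_kb: float, mode: str = "strong") -> float:
    """RCUs consumed by one read per second of an item of item_kb kilobytes.

    mode is "strong", "eventual" or "transactional" (labels used by this sketch).
    """
    blocks = max(1, math.ceil(item_kb / 4))  # 4 KB read blocks, rounded up
    multiplier = {"strong": 1.0, "eventual": 0.5, "transactional": 2.0}[mode]
    return blocks * multiplier


def write_capacity_units(item_kb: float, transactional: bool = False) -> int:
    """WCUs consumed by one write per second of an item of item_kb kilobytes."""
    blocks = max(1, math.ceil(item_kb))  # 1 KB write blocks, rounded up
    return blocks * (2 if transactional else 1)


if __name__ == "__main__":
    print(read_capacity_units(10, "strong"))    # 3.0
    print(read_capacity_units(10, "eventual"))  # 1.5
    print(write_capacity_units(1.5))            # 2
```

Multiply the per-operation figure by the expected operations per second to size a provisioned table. For example, 200 eventually consistent reads per second of 10 KB items would need 200 × 1.5 = 300 RCUs.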
Exercise
Design the DynamoDB table(s) and indexes for a multi-tenant SaaS task tracker that must serve these access patterns: (a) get a single task by id; (b) list all tasks in a project, sorted by due date; (c) list all tasks assigned to a user across projects, filtered by status; (d) expire tasks 90 days after completion automatically; (e) react to every task change to update a per-project “open task count”. For each access pattern, state the key or index and the operation (GetItem/Query) you would use — and justify why none of them needs a Scan. Specify: the partition/sort key for the base table and how you avoid a hot partition for a very large project; the GSI(s) with their keys and projection (and why that projection); how you implement (d) with TTL; and how you implement (e) with Streams (which StreamViewType and why, and how you keep the consumer idempotent). Then choose a capacity mode with a one-line justification, and decide whether this design warrants DAX and/or a global table. Finally, write the aws dynamodb create-table and update-table (GSI) commands to build it.
Certification mapping
- AWS Certified Developer – Associate (DVA-C02): core territory — the data model and partition/sort keys, Query vs Scan, LSI vs GSI and projections, RCU/WCU maths, eventual vs strong reads, DynamoDB Streams + Lambda triggers, transactions, conditional writes/optimistic locking, TTL, and DAX.
- AWS Certified Solutions Architect – Associate (SAA-C03): when to choose DynamoDB vs RDS, on-demand vs provisioned (+ auto scaling), GSIs for access patterns, global tables for multi-Region, DAX for read acceleration, PITR/backup, and encryption.
- AWS Certified SysOps Administrator – Associate (SOA-C02): operating tables — capacity/auto scaling, CloudWatch metrics (ThrottledRequests, ConsumedRead/WriteCapacityUnits), PITR/backups, and adaptive-capacity/hot-partition troubleshooting.
- AWS Certified Solutions Architect – Professional / Data Engineer: deeper modelling (single-table design), Streams/Kinesis CDC pipelines, global-table conflict design, and cost optimisation (reserved capacity, table classes, projection tuning).
Glossary
- Table / item / attribute — a collection of items; an item is a set of attributes (name/value); the only shared structure is the primary key.
- Partition key (hash key) — the attribute hashed to choose an item’s partition; drives data distribution.
- Sort key (range key) — the second part of a composite key; orders items within a partition and enables Query ranges.
- Partition — an internal storage unit (≈10 GB, ~3,000 RCU / ~1,000 WCU) replicated across 3 AZs; tables grow by adding partitions.
- Hot partition — a partition receiving disproportionate traffic (from a skewed key), causing throttling despite spare table capacity.
- RCU / WCU — read/write capacity units: 1 RCU = one strong 4 KB read/s (or two eventual); 1 WCU = one 1 KB write/s.
- On-demand / provisioned — pay-per-request auto-scaling mode vs pre-provisioned throughput (with auto scaling/reserved capacity).
- LSI / GSI — Local (same PK, alt sort, creation-time, shares capacity, strong reads) / Global (any key, anytime, own capacity, eventual) Secondary Index.
- Projection — which attributes an index copies: KEYS_ONLY, INCLUDE, or ALL.
- DynamoDB Stream — ordered 24-h change log (per-partition order); StreamViewType = KEYS_ONLY / NEW_IMAGE / OLD_IMAGE / NEW_AND_OLD_IMAGES.
- TTL — automatic, free deletion of items past an epoch-seconds timestamp attribute (within ~48 h).
- DAX — DynamoDB Accelerator, a managed in-memory write-through read cache (microsecond reads; no strong reads).
- Transaction — TransactWriteItems/TransactGetItems: ACID across ≤100 items in one Region, at 2× capacity cost.
- Eventual vs strong consistency — default half-cost reads that may lag vs full-cost reads of the latest write (no strong on GSIs/across Regions).
- Global table — a multi-Region, active-active replicated table (eventually consistent, last-writer-wins).
- PITR — point-in-time recovery: restore to any second in the last 35 days (to a new table).
Next steps
You now know DynamoDB end to end — the data model and partition/sort keys, how partitioning and hashing cause and prevent hot partitions, both capacity modes and the RCU/WCU maths, LSIs vs GSIs and projections, Streams and CDC, TTL, DAX, transactions, the consistency model, global tables, and PITR/backups/encryption. From here:
- Turn this into real schemas with DynamoDB Single-Table Design: Modeling Access Patterns, GSIs, and Hot Partition Avoidance.
- Build reliable event-driven pipelines off the change log in Change Data Capture with DynamoDB Streams: Lambda Triggers, EventBridge Pipes, and Exactly-Once Processing.
- Next in the course we move from the data store to the API in front of it: Amazon API Gateway, In Depth: REST vs HTTP vs WebSocket APIs, Integrations & Authorizers.