DynamoDB Internals (Interview Deep Dive)
DynamoDB is a fully managed key-value and document store that scales horizontally via partitioning. Your interview narrative should always anchor on how keys route work, what consistency you buy, and how hot partitions happen.
Logical model: tables of items (rows) with attributes (columns). Every item includes primary key attributes—either a simple partition key (hash key) or composite (partition key + sort key).
Partition keys and adaptive capacity
DynamoDB splits data across partitions behind the scenes based on partition key distribution and storage size. Each partition has a throughput budget (Provisioned mode) or shares on-demand burst characteristics (with caveats). If one partition key value dominates traffic (a hot key), that partition saturates and you see throttling (ProvisionedThroughputExceededException) even if table-wide capacity looks fine.
Adaptive capacity (and successors in newer accounts/configurations) redistributes unused capacity toward hot partitions—helps, but does not erase bad key design.
Mitigations:
- Salt/diversify keys: prepend a random shard (USER#<shard>#123) for writes and merge results on read (expensive), or use write-sharding patterns judiciously.
- Caching (DAX or application-side) for read-heavy hot keys.
- SQS/Kinesis buffering for write spikes.
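The key-sharding mitigation above can be sketched in a few lines. A minimal illustration in Python, assuming a hypothetical SHARD_COUNT sized to the hot key's write rate (the helper names are illustrative, not an AWS API):

```python
import random

SHARD_COUNT = 8  # hypothetical; size to the hot key's write rate

def write_pk(user_id: str) -> str:
    # Pick a shard per write so a single hot user_id spreads across
    # SHARD_COUNT physical partitions instead of saturating one.
    return f"USER#{random.randrange(SHARD_COUNT)}#{user_id}"

def read_pks(user_id: str) -> list[str]:
    # Reads must fan out to every shard and merge client-side
    # (this is the "expensive on read" part of the trade).
    return [f"USER#{s}#{user_id}" for s in range(SHARD_COUNT)]
```

The trade is explicit: writes get cheap distribution, reads pay a fan-out of SHARD_COUNT queries, so shard only keys that actually run hot.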
See sharding for the general pattern; DynamoDB enforces it whether or not you planned it.
Sort keys: adjacency and access patterns
When you add a sort key, items sharing a partition key form an ordered collection—ideal for 1:N relationships (user → orders by time), version streams, and range queries.
Design composite keys so your query patterns become Query on a single partition, not Scan. Scan is rarely acceptable at scale.
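To make the "Query, not Scan" point concrete, here is a small simulation of how a sort-key prefix query works against an ordered item collection. The key format and data are made up; the point is that a begins_with condition is a contiguous range read, not a full table walk:

```python
from bisect import bisect_left

def order_sk(ts: str, order_id: str) -> str:
    # Sort key encodes time first, so time ranges are contiguous on disk.
    return f"ORDER#{ts}#{order_id}"

# Items within one partition (PK = USER#123), sorted by SK as DynamoDB stores them.
items = sorted([
    order_sk("2024-01-05", "a1"),
    order_sk("2024-02-11", "b2"),
    order_sk("2024-02-20", "c3"),
])

def query_begins_with(sorted_sks: list[str], prefix: str) -> list[str]:
    # Mimics Query with KeyConditionExpression begins_with(SK, :prefix):
    # binary-search to the start of the range, then read forward.
    i = bisect_left(sorted_sks, prefix)
    out = []
    while i < len(sorted_sks) and sorted_sks[i].startswith(prefix):
        out.append(sorted_sks[i])
        i += 1
    return out
```

A Scan, by contrast, would touch every item in every partition to answer the same question.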
LSI (Local Secondary Index): same partition key, alternate sort key—shares base table partition layout; limited flexibility.
GSI (Global Secondary Index): projects attributes under a new partition/sort scheme—different routing; asynchronous replication; eventual consistency on index reads.
Index design drives cost (storage + WCUs/RCUs on GSIs) and duplicate write amplification.
Consistency model on reads
DynamoDB defaults to eventually consistent reads (cheaper). Strongly consistent reads return the latest acknowledged write for that key (extra cost). Frame this with consistency models: “strong” here is not cross-item transactions unless you use Transact APIs with explicit constraints.
TransactWriteItems / TransactGetItems provide ACID across multiple items within an account/region with limits—great for invariant enforcement, but watch item collection sizes, latency, and conflict behaviors.
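The atomicity contract is worth internalizing: either every conditioned write lands or none does. A toy in-memory sketch of that all-or-nothing behavior (not the real API; conditions here are plain Python callables over the current item):

```python
class TransactionCanceled(Exception):
    """Raised when any condition fails; mirrors transaction cancellation."""

def transact_write(table: dict, ops: list) -> None:
    # ops: (key, new_item, condition) triples; condition sees current item or None.
    # Phase 1: check every condition before touching anything.
    for key, _, cond in ops:
        if not cond(table.get(key)):
            raise TransactionCanceled(key)
    # Phase 2: apply all writes; nothing landed if phase 1 raised.
    for key, item, _ in ops:
        table[key] = item
```

This is exactly the shape of invariant enforcement such as a balance transfer: debit and credit succeed together or not at all.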
DynamoDB Streams and change data capture
Streams emit ordered records of item-level changes (keys, new/old images as configured). Typical uses:
- Materialized views in Elasticsearch/OpenSearch
- Fan-out to analytics (Kinesis/S3 via pipelines)
- Trigger-style Lambdas (mind partial failure and idempotency)
Streams are not a replacement for Kafka at cross-org scale, but they offer tight integration for AWS-native designs.
Link failure handling to idempotency: consumers must tolerate duplicates and at-least-once delivery.
DAX (DynamoDB Accelerator)
DAX is an in-memory, write-through cache for DynamoDB that serves cache hits at microsecond read latencies. Its consistency semantics suit many read-mostly paths; still treat it as a caching layer and plan cluster sizing and TTL behaviors.
Mention DAX when someone asks how to cut read latency without rewriting the data model.
Capacity modes and cost psychology
- On-demand: pay per request; great for spiky/unknown loads; watch steady-state cost.
- Provisioned: requires RCU/WCU tuning; use auto-scaling policies; hot partitions still matter.
Burst buckets and throttling behavior appear in case studies—pair with latency and throughput.
Interview phrase
“I start from access patterns, then choose PK/SK so each query is a targeted partition read; I only add GSIs I can afford to feed; I watch hot partitions and use cache/DAX or key sharding when a celebrity key spikes; streams feed downstream projections with idempotent consumers.”
Single-table design vs relational normalization
Advanced DynamoDB modeling often uses single-table design: multiple entity types coexist in one table, distinguished by SK prefixes (USER#123, ORDER#2024-01#...) and sparse GSIs for alternative access paths. The benefit is fewer round trips and prejoined item collections; the cost is cognitive load and migration rigor.
Contrast with normalized SQL in Postgres deep dive: pick DynamoDB when access patterns are known and stable; reach for SQL when ad hoc analytics and joins dominate.
Time to live (TTL) and background deletion
TTL attributes let DynamoDB delete items asynchronously—great for session artifacts and ephemeral logs. Deletion is not millisecond-precise; do not use TTL as a safety interlock without margin.
Backups, PITR, and restores
Point-in-time recovery (PITR) offers restores to any second within the retention window: operational insurance against accidental writes and deletes. Use on-demand backups for milestones. Interview mention: restore time grows with table size, and cross-table coordination still needs application planning.
Global tables and cross-region
Global Tables provide multi-master style replication with last-writer-wins conflict resolution—powerful, but you must own causal expectations. Link to replication and your product’s RPO/RTO story from availability math.
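What last-writer-wins means for your data can be shown in one function, assuming each replicated item carries a write timestamp ts (a simplification of the real resolution mechanism):

```python
def lww_merge(local: dict, remote: dict) -> dict:
    # The write with the later timestamp wins; the loser is silently
    # discarded, which is why applications on Global Tables must own
    # their causal expectations rather than assume merge semantics.
    return local if local["ts"] >= remote["ts"] else remote
```

If two regions update the same item concurrently, one region's update disappears; design writes to be commutative or region-partitioned where that is unacceptable.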
Cost gotchas
- Throttled requests that clients retry naively amplify cost and load.
- GSI projection choices (ALL vs KEYS_ONLY vs INCLUDE) change storage and read patterns.
- Scans in production batch jobs can dominate spend—prefer segmented parallel scans with care, or upstream pre-aggregation.
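The fix for naive retries is jittered exponential backoff, so throttled clients spread out instead of retrying in lockstep. A minimal sketch (parameter defaults are illustrative):

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 0.05, cap: float = 2.0) -> list[float]:
    # Full-jitter exponential backoff: each attempt waits a random amount
    # up to an exponentially growing (but capped) ceiling, so retry storms
    # don't hammer an already-throttled partition in synchronized waves.
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(max_retries)]
```

The AWS SDKs do a variant of this by default; the point in an interview is knowing why raw retry loops make throttling worse.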
When interviewers ask for “SQL on Dynamo”
PartiQL and DynamoDB console queries help with ad hoc tasks; they are not magic cross-partition joins. For analytics, export to S3 + Athena, or stream to a warehouse. Positioning this clearly shows platform maturity.
Conditional expressions and optimistic locking
ConditionExpression guards writes (attribute_exists, numeric compares)—essential for optimistic concurrency with version attributes. Combine with idempotency for safe retries: distinguish business conflicts from transient errors.
DynamoDB local vs production parity
DynamoDB Local aids unit tests but differs in latency, capacity enforcement, and edge semantics—run integration tests against real endpoints in staging for correctness confidence.
Interview recap checklist
- Partition key spreads writes; sort key enables range queries per partition.
- GSIs trade storage/WCU/RCU for new access paths—projection discipline matters.
- Streams export changes—consumers must be idempotent.
- Hot keys need cache, key diversification, or adaptive insights—not blind retries.