THN Interview Prep

Databases & caching

Data systems are where constraints become real: latency, correctness, money, and compliance all meet the storage layer. Staff-level answers name consistency guarantees, failure modes (replica lag, split brain, hot keys), and operational habits (migrations, backups, observability)—not only “pick Postgres.”

How to use this page

  • Use Core database ideas as a vocabulary checklist (ACID, isolation, plans, replication).
  • Use Caching alongside databases to keep source of truth vs derived copies explicit.
  • Skim Diagrams before mock interviews—practice narrating arrows out loud.
  • Run Things to prepare as a literal checklist for projects or interview stories.
  • Staff / SDE3 bar (Node.js + AWS): pair RDS (B-tree OLTP) with Dynamo-style partitioned thinking where relevant; treat ElastiCache as a consistency problem, not only latency—internals live under Storage engines, CAP & distributed data and Application caching & consistency.

Topic study plan (deep pages)

Each topic under /databases/topics/... follows: Core details → Understanding → Senior understanding → Diagram.

TopicFocus
Storage models & access pathsSQL vs NoSQL, document modeling, data warehouse, use cases, when to choose which
Storage engines, CAP & distributed dataB-tree vs LSM, CAP/PACELC, quorum, read repair, sharding—RDS / Dynamo mental models
AWS data services & Node.jsRDS Proxy, DynamoDB GSI/LSI/streams, ElastiCache cluster, Memcached vs Redis
CDC & outbox projectionsTransactional outbox, CDC, dual-write antipattern, projections
Transactions & isolationACID vs BASE, isolation & MVCC, 2PC vs saga, scopes & locks
Indexing & query tuningNormalization, index patterns, query optimization, indexing Q&A
Replication, sharding & operationsLeaders, lag, partitions, backups, RPO/RTO
Application caching & consistencyElastiCache/Redis topologies, eviction (LRU/LFU/W-TinyLFU), penetration/herd/avalanche, dual-write

Core database ideas

Storage models (when you’d reach for each)

Deep page: Storage models & access paths (SQL vs NoSQL, warehouse vs OLTP, document modeling, when to choose which).

FamilyStrengthTypical costs / caveats
Relational (OLTP)joins, constraints, transactionsschema migration discipline
Wide-column / KVhuge scale, simple access patternsawkward ad-hoc analytics
Documentflexible shapes, nested readscross-document consistency harder
Search / analytics storesrankings, aggregationsnot a source of truth for money
Time-seriesmetrics, retention policiescardinality & retention ops

Interview line: “I’d choose the model after access paths and consistency requirements, not the logo on the box.”

ACID (know what each letter guards)

LetterGuards againstImplication
Atomicitypartial writesbatch work or explicit compensation
Consistencyinvalid states your rules forbidconstraints + app invariants together
Isolationconcurrent anomaliespick level per transaction, not globally maximal
Durabilityacknowledged lossWAL / replication; fsync semantics matter

Isolation anomalies (name → fix intuition)

AnomalyOne-line intuitionCommon levers
Dirty readseeing uncommitted workisolation ↑, shorter txns
Non-repeatable readrow changes between reads in one txnlocking, snapshot isolation
Phantom readnew rows appear in rangerange locks / higher isolation sparingly
Lost updatetwo writers clobberoptimistic version column, SELECT … FOR UPDATE

Not every endpoint needs SERIALIZABLE—cost and deadlock risk are real; scope transactions to the smallest critical section.

Indexing & the query path

Deep page: Indexing & query tuning (normalization, index Q&A, optimization patterns).

  • B-tree (and friends) power most default indexes; composite index column order matches filter + sort patterns.
  • Covering indexes avoid heap/table lookups when the planner can satisfy the query from the index alone.
  • Cardinality & statistics skew plans—stale stats → wrong nested loops vs seq scans.

Operational loop:

  1. Identify hot query text (slow log / pg_stat_statements class metrics).
  2. EXPLAIN (ANALYZE) or vendor equivalent—compare estimated vs actual rows.
  3. Fix with query rewrite before index soup; add indexes only when the plan proves it.

Schema evolution

Treat migrations as deploy trains: additive-first columns, dual-write / backfill phases, backward-compatible reads until cutover. Breaking changes are timelines, not single PRs.

Connections & pools

Exhausted connection pools look like “the database is slow.” Watch wait time to acquire a connection, not only query duration.

Related depth: Database query path, Profiling services & async, /backend.
AWS + integration: AWS data services & Node.js, CDC & outbox.


Distributed data (production realism)

Replication basics

PatternRead freshnessOps story
Single leaderstrong on primary; replicas may lagfailover, promotion drills
Multi-leaderconflict handling requiredrare; careful merge rules
quorum / consensuspricey; correctness-critical configsoperational complexity

Replica lag surfaces as intermittent “wrong data” unless you engineer read-your-writes (sticky to primary, version checks, or user-visible staleness).

Partitioning / sharding

Choose a partition key from real access paths—resharding later is expensive. Watch hot partitions (celebrity keys) and plan salting or key redesign early.

Backup & recovery (you should be able to narrate)

  • RPO/RTO targets tied to product risk.
  • Restore drills on a schedule—not “we assume backups work.”
  • Logical vs physical backup tradeoffs (portability vs speed).

Caching alongside databases

Caches are derived state. The database (or durable event log) remains authority unless you consciously accept staleness with product guardrails.

Layers (stack north to south)

LayerLatency winStaleness / safety traps
Browser / CDNhuge for assets & staticpersonalized HTML at edge
Application memorymicrosecond classper-instance inconsistency
Shared remote (Redis class)cross-instance sharingthundering herd, key poisoning
DB buffer poolautomaticnot a substitute for app cache design

Patterns (who owns invalidation?)

PatternFlowOwns freshness
Cache-asideapp reads cache, loads DB on miss, populatesapp + TTL/events
Read-throughcache library loads on misscache service + TTL
Write-throughwrites go to cache + DB togethertighter but write latency
Write-behindwrites acknowledged before durable flushrisk—queue loss

Mitigate cache stampede: TTL jitter, request coalescing / single-flight, probabilistic early refresh.

Related depth: Caching & consistency, Caching & retry amplification.


Diagrams

Authority vs derived cache

Loading diagram…

Cache-aside read path

Loading diagram…

Transaction scope vs long work

Loading diagram…

Things to prepare

Before you design or pitch a data layer

  • Access paths: list the 5–10 hottest queries and their filters/sorts.
  • Consistency: which reads must be linearizable vs OK with seconds of lag?
  • Growth: data volume curve, retention, archival, legal holds.
  • Multi-tenancy: tenant key in every table/index story; noisy neighbor controls.
  • Migration runway: additive schema steps, backfills, dual reads/writes.

Before production hardening

  • Backups with documented restore drill frequency and owner.
  • Replication lag dashboards and alerts tied to user-facing symptoms.
  • Connection pool limits per service; saturation metrics.
  • Slow query logging and top N review cadence.
  • Secrets rotation path; least privilege DB roles.
  • Idempotency for writers (see /backend/topics/idempotency-at-least-once).

Interview prep (90 minutes, repeatable)

BlockMinutesOutput
Draw read + write path with cache + primary + replica15can narrate every arrow
Walk one slow query from log → plan → fix20EXPLAIN vocabulary
Cache invalidation story for one feature15events, TTL, stampede
Failure: replica lag + dual write15UX + detection
Schema change without downtime15additive phases

Memorize one crisp line for each: isolation level choice, stampede mitigation, RPO/RTO, when not to cache.


Pitfalls

  • Treating read replicas as strongly consistent without version checks or routing rules.
  • Giant transactions holding locks while calling HTTP or queues.
  • Unbounded caches without eviction, TTL, or memory caps—OOM incidents.
  • Cache as source of truth for balances or inventory without durable reconciliation.
  • Indexing every column without examining plans—write amplification and bloated storage.

  • /backend — lifecycle, timeouts, messaging, caching entry points.
  • /performance — tail latency, retries, thundering herds, DB performance notes.
  • /security — injection, least privilege, secrets.

Mark this page when you finish learning it.

Spotted something unclear or wrong on this page?

On this page