THN Interview Prep

Storage models & access paths

Core knowledge (memorize the spine)

IdeaOne sentence
Access pathHow a query reaches rows with the fewest round-trips for your hottest patterns.
OLTPMany short read/write transactions; row-oriented; strong constraints.
Warehouse / OLAPHeavy scans & aggregates over history; columnar-friendly; batch / interactive BI.
SQLDeclarative language; relational engines implement it with tables, joins, transactions.
NoSQLFamily of stores (document, KV, wide-column, graph)—different consistency and query models, not “no schema.”
Polyglot persistenceMultiple stores on purpose, each owning a bounded workload with clear sync rules.

Interviews reward: name workloads first, then vendor. “We need Postgres” without QPS shape + consistency + query shape is weak signal.


SQL vs NoSQL (what the distinction actually means)

SQL usually implies a relational engine: tabular data, joins, ACID transactions, constraints, mature query planner. You think in schemas and normal forms (then denormalize where measured).

NoSQL is an umbrella:

CategoryData unitSweet spotRough tradeoff
DocumentBSON/JSON-like documentsAggregate-root reads, flexible attributesCross-document transactions often limited
KV / wide-columnkey → value / wide rowsMassive scale, simple get/put by keyAd-hoc relational queries painful
Graphnodes & edgesTraversal workloadsOps & query discipline at scale
Time-series(time, tags) → metricsRetention, downsamplingNot general OLTP

Misleading shortcut: “NoSQL = infinitely scalable SQL.” Scale depends on partition key, hot keys, replication, and accepting consistency windows—not the label.


Document databases (NoSQL) — how teams use them

A document bundles fields and often nested arrays/objects (e.g. order + line items).

PatternMeaningWhen
EmbeddedChild array/objects inside parent documentBounded children; read with parent in one fetch
ReferencedParent stores id of other docs; app or DB resolvesLarge / growing child sets; shared entities across parents
Aggregate root (DDD)Document boundary = one consistency unitUpdates stay inside one doc when possible

Use cases document stores fit well:

  • Catalog / product blobs with varying attributes (facets in search layer).
  • User profiles with moderate nested settings (watch document size).
  • Content / CMS with revision metadata.
  • Session / cart style state with TTL (still mind durability rules).

Use cases that hurt if you only have documents:

  • Ledger, balances, inventory needing serializable-ish invariants → relational or event-sourced core + projections.
  • Many-to-many with heavy join-like fan-out across keys → relational or graph + careful caching.
  • Warehouse-style analytics → copy to OLAP / columnar—do not brute-force on OLTP.
Loading diagram…

OLTP vs data warehouse

DimensionOLTP (operational)Data warehouse / OLAP
Queriespoint lookups, short joins, writesScans, aggregations, window logic
Latencyms–low hundreds ms for user pathsseconds to minutes (or precomputed)
Freshnessnear-real-time source of truthMinutes–hours behind is normal (ETL/ELT)
Schemanormalized operational schemaStar/snowflake, wide fact tables, denormalized dims
Computerow stores, indexes, transactionsColumnar, parallel query, partitions

Typical architecture: OLTP system of recordCDC / batch export → warehouse → BI / ML features. Do not run giant aggregations on the primary without guardrails (replica, limits, or move to warehouse).

Loading diagram…

When to choose which database (decision checklist)

Answer in order:

  1. Consistency: money / inventory / uniqueness → need transactions + constraints? Default relational for that core.
  2. Query shape: heavy joins & ad-hoc filters → relational; get by id + bounded embed → document viable.
  3. Scale pattern: predictable partition key and tolerable eventual reads? Wide-column/KV class; hot key plan mandatory.
  4. Search / ranking: primary store still authoritative; Elasticsearch/OpenSearch etc. as secondary index with sync lag understood.
  5. Analytics: warehouse for historical aggregates, not the OLTP primary.
  6. Team & ops: migration story, backup, observability—boring tech you operate well beats exotic tech you cannot recover.
Loading diagram…

Use cases at a glance

ScenarioOften fitsWhy
E-commerce orders + paymentsRelational (core)balances, inventory, idempotency keys
Product catalog with facetsDocument + search indexflexible attrs + ranking
Session / feature flagsKV / Redis classTTL, high churn
Metrics / IoTTime-seriesretention, compression
Social “friends of friends”Graph (or relational at moderate scale)path queries
Executive dashboardsWarehousejoins across history, pre-aggregation
Full-text searchSearch engineinverted index; not sole ledger

Interview questions & model answers

Q1: “SQL or NoSQL for our startup?”

Answer: Start from access paths and invariants. If you need foreign keys, multi-row transactions, and flexible reporting joins, a relational database is the default. Add NoSQL when a specific workload (e.g. document-shaped reads at huge scale with a clear partition story) justifies a second store—not the reverse.

Q2: “When would you use a document DB vs Postgres JSONB?”

Answer: Postgres JSONB when you still want transactions across tables, joins, and one engine to operate. A document-native store when horizontal partition by document id, embedded aggregate reads, and operational model (replication, drivers) fit the team—always check cross-document consistency needs.

Q3: “How do you keep search in sync with the database?”

Answer: Treat search as eventually consistent: dual write is risky; prefer outbox / CDC / queue with idempotent indexing, version or cursor for replays, and monitoring lag. User-facing reads that need freshness go to primary or read-after-write routing.

Q4: “OLTP vs warehouse—why both?”

Answer: OLTP optimizes short consistent reads/writes. Warehouses optimize batch analytics without destroying OLTP with full scans. You replicate or export into the warehouse for BI and historical queries while keeping operational paths fast and safe.


Diagram (access path drives the model)

Loading diagram…

See also

Last updated on

Spotted something unclear or wrong on this page?

On this page