THN Interview Prep

Transaction Isolation: From Read Uncommitted to Serializable

Definition

Isolation defines how concurrent transactions interact. SQL defines phenomena (dirty read, non-repeatable read, phantom read) and isolation levels that forbid some of them. MVCC (Multi-Version Concurrency Control) implements isolation by keeping version chains of rows; readers often see a snapshot as of a transaction start time without blocking writers.

Standard levels (ANSI/SQL, implementations vary slightly):

LevelDirty readNon-repeatable readPhantom read
Read UncommittedAllowedAllowedAllowed
Read CommittedNot allowedAllowedAllowed
Repeatable ReadNot allowedNot allowedDB-specific (often partial)
SerializableNot allowedNot allowedNot allowed
  • Dirty read — Transaction reads uncommitted data from another transaction.
  • Non-repeatable read — Same row read twice returns different committed values.
  • Phantom read — Range query returns new rows committed by another transaction (new "phantom" rows appear).

MVCC snapshot — A transaction reads from a consistent snapshot of committed versions; updates create new versions. Snapshot isolation (common in Postgres) prevents dirty/non-repeatable reads for single-row patterns but may allow write skew (not a phantom in the classic sense) unless Serializable (SSI in Postgres) closes that gap.

Why it matters in interviews

Payment and inventory stories hinge on phantoms and write skew. Saying "we use transactions" without a level is incomplete. Interviewers expect you to connect ORM defaults (often Read Committed) to bugs under concurrency.

Tradeoffs

  • Read Committed — Good throughput, fewer locks; weaker guarantees.
  • Repeatable Read — Stronger for aggregates in a transaction; still subtle anomalies without true serializable.
  • Serializable — Safest semantics; cost: more retries (serialization failures), index-range locking or SSI overhead.

Concrete examples

  1. Double-spend prevention — Two txs read balance, both decide enough funds—need Serializable or explicit row locks / compare-and-swap, not naive Read Committed.
  2. Reporting jobRepeatable Read or snapshot so totals do not shift mid-query if acceptable for analytics isolation window.
  3. Seat booking / inventoryPhantom in range locks matters when inserting new reservations in the same seat class range—serializable or careful locking on inventory rows.

How to say it in 30 seconds

"I pick isolation by anomaly: Read Committed for many OLTP paths; Serializable when range predicates and concurrent inserts cause phantoms or write skew. I know my DB: Postgres uses MVCC and SSI for serializable; I mention retries on serialization failure."

Common follow-up questions

  • Difference between Repeatable Read and Snapshot isolation? Overlap in MVCC systems; snapshot prevents certain anomalies RR aimed at; write skew can still happen under snapshot unless serializable.
  • How does MVCC avoid readers blocking writers? Readers use old versions; writers add new versions; vacuum reclaims old rows.
  • Why do ORMs make this harder? Long transactions and implicit retries hide isolation; explicit transaction boundaries and idempotency keys matter (see idempotency.md).

See also: System design curriculum overview

Last updated on

Spotted something unclear or wrong on this page?

On this page