THN Interview Prep

RESHADED — System Design Interview Framework

A 7-step framework you drive proactively in any HLD interview. Spend ~45 min total; budget per step shown.


R — Requirements (5 min)

  • Functional: ask "what should the user be able to do?". Get 3-5 clear use cases.
  • Non-Functional: scale (DAU, QPS, storage), latency targets, availability, consistency.
  • Out of scope: explicitly drop auth/billing/analytics if not asked — saves time.
  • Outcome: a written list both you and the interviewer agree on.

Phrase: "Before I jump in, let me confirm scope."


E — Estimations (3 min)

Back-of-envelope. Show the math.

  • DAU -> QPS = DAU * actions / 86400. Multiply by peak factor 2-3x.
  • Storage = daily writes * payload size * retention.
  • Bandwidth = QPS * payload size.
  • Cache = 80/20 of working set.

Phrase: "Let me sanity-check the scale."


S — Storage / Schema (5 min)

  • Pick SQL vs NoSQL with why (relations? transactions? horizontal scale? access pattern?).
  • Define entities, primary key, indexes, partition key.
  • Show one sample row.

H — High-level diagram (8 min)

Draw: Client -> CDN -> LB -> API GW -> Services -> Cache -> DB / Queue / Blob.

Talk while drawing. Don't over-detail components yet — just place them.


A — APIs (3 min)

REST or gRPC. List the 3-5 critical endpoints with method, path, request, response.


D — Deep-Dives (15 min)

The interviewer steers here. Be ready to deep-dive any of:

  • ID generation (Snowflake, UUID, ticket server).
  • Write path / read path.
  • Sharding key choice + rebalancing.
  • Cache strategy (write-through/back/around) + invalidation.
  • Hot key / celebrity problem.
  • Consistency: read-your-writes, monotonic, causal.
  • Fan-out: push vs pull vs hybrid.
  • Idempotency keys.
  • Backpressure & retries (exponential backoff, jitter).

E — Edge cases & failures (3 min)

  • Node crash mid-write -> WAL, replay.
  • Partial network partition -> quorum behavior.
  • Hot partition -> resharding.
  • Thundering herd on cold cache -> request coalescing, jittered TTL.
  • Data corruption -> checksums, gossip.

D — Done: tradeoffs & followups (3 min)

Summarize 2-3 key tradeoffs explicitly. Surface 1-2 things you'd do "with more time" (multi-region, ML ranking, cost optimization).

Phrase: "To summarize, the key tradeoff was X over Y because Z. With more time I'd add ABC."


Anti-patterns to avoid

  • Jumping straight to a diagram without scope/estimations.
  • Over-engineering early (don't put Kafka + Spark + ML in a URL shortener).
  • Naming techs without justifying why over alternatives.
  • Ignoring hot-key / failure modes — interviewers love asking.
  • Silence. Keep narrating your reasoning.

Checklist (print and pin)

[ ] R: 3-5 functional, NFRs (scale, latency, availability, consistency)
[ ] E: QPS, storage, bandwidth, cache - with math
[ ] S: schema + index + partition key, SQL vs NoSQL with why
[ ] H: end-to-end diagram, all hops named
[ ] A: 3-5 API endpoints
[ ] D: deep-dive on 2-3 components (driven by interviewer)
[ ] E: failure modes + edge cases
[ ] D: tradeoffs summary + followups

Last updated on

Spotted something unclear or wrong on this page?

On this page