RESHADED — System Design Interview Framework

An 8-step framework (R-E-S-H-A-D-E-D) you drive proactively in any high-level design (HLD) interview. Budget about 45 minutes in total; the per-step budget is shown in each heading.

R — Requirements (5 min)

Functional: ask "what should the user be able to do?". Get 3-5 clear use cases.
Non-Functional: scale (DAU, QPS, storage), latency targets, availability, consistency.
Out of scope: explicitly drop auth/billing/analytics if not asked — saves time.
Outcome: a written list both you and the interviewer agree on.

Phrase: "Before I jump in, let me confirm scope."

E — Estimations (3 min)

Back-of-envelope. Show the math.

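DAU -> QPS: average QPS = DAU * actions per user per day / 86,400 (seconds in a day). Multiply by a peak factor of 2-3x to get peak QPS.
Storage = daily writes * payload size * retention period (in days).
Bandwidth = QPS * payload size.
Cache = size it for the hottest ~20% of the working set, on the 80/20 assumption that it serves ~80% of reads.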
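
As a worked example of the formulas above, here is a minimal sketch in Python. Every input (10M DAU, 18 reads and 2 writes per user per day, 1,000-byte payload, 365-day retention, peak factor 3) is an assumption chosen for illustration, and the working set is approximated as one day of read traffic, which is itself an assumption you would state aloud.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, reads_per_user, writes_per_user, payload_bytes,
             retention_days, peak_factor=3, cache_fraction=0.2):
    """Back-of-envelope numbers; every input is an assumption, not a measurement."""
    daily_reads = dau * reads_per_user
    daily_writes = dau * writes_per_user
    avg_qps = (daily_reads + daily_writes) / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "storage_bytes": daily_writes * payload_bytes * retention_days,
        "bandwidth_bytes_per_s": avg_qps * payload_bytes,
        # Working set approximated as one day of read traffic (an assumption).
        "cache_bytes": cache_fraction * daily_reads * payload_bytes,
    }


numbers = estimate(dau=10_000_000, reads_per_user=18, writes_per_user=2,
                   payload_bytes=1_000, retention_days=365)
for name, value in numbers.items():
    print(f"{name}: {value:,.0f}")
```

With these inputs the sketch gives roughly 2.3k average QPS, 6.9k peak QPS, 7.3 TB of storage for a year of retention, and 36 GB of cache. In the interview, round aggressively and say the rounding out loud.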

Phrase: "Let me sanity-check the scale."

S — Storage / Schema (5 min)

Pick SQL vs NoSQL with why (relations? transactions? horizontal scale? access pattern?).
Define entities, primary key, indexes, partition key.
Show one sample row.
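
To make "entities, primary key, indexes, sample row" concrete, here is a small sketch for a hypothetical URL shortener. The table name `short_urls`, its columns, and the sample values are all illustrative assumptions, and in-memory SQLite only stands in for whichever store you actually pick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One entity. short_code is the primary key; in a sharded store it would
# also be the natural partition key (SQLite itself has no partitioning).
conn.execute("""
    CREATE TABLE short_urls (
        short_code TEXT PRIMARY KEY,
        long_url   TEXT NOT NULL,
        owner_id   INTEGER NOT NULL,
        created_at TEXT NOT NULL
    )
""")

# Secondary index for the "list my links, newest first" access pattern.
conn.execute(
    "CREATE INDEX idx_short_urls_owner ON short_urls (owner_id, created_at)"
)

# One sample row, as you would sketch it on the whiteboard.
conn.execute(
    "INSERT INTO short_urls VALUES (?, ?, ?, ?)",
    ("a1B2c3", "https://example.com/some/long/path", 42, "2024-01-01T00:00:00Z"),
)

sample_row = conn.execute(
    "SELECT * FROM short_urls WHERE short_code = ?", ("a1B2c3",)
).fetchone()
print(sample_row)
```

The point to narrate is the link between access pattern and key choice: lookups by `short_code` dominate, so that is what the primary key and the partition key serve.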

H — High-level diagram (8 min)

Draw: Client -> CDN -> LB -> API GW -> Services -> Cache -> DB / Queue / Blob.

Talk while drawing. Don't over-detail components yet — just place them.

A — APIs (3 min)

REST or gRPC. List the 3-5 critical endpoints with method, path, request, response.
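
One way to write the endpoint list down is as a small table of method, path, request, and response. The sketch below does that for a hypothetical URL shortener; the paths, payload shapes, and status codes are illustrative assumptions, not a prescribed API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str
    request: str
    response: str


# Hypothetical endpoints for a URL shortener (assumed, for illustration only).
ENDPOINTS = [
    Endpoint("POST", "/v1/urls", '{"long_url": str}', '201 {"short_code": str}'),
    Endpoint("GET", "/v1/urls/{short_code}", "-", "302 redirect to long_url"),
    Endpoint("DELETE", "/v1/urls/{short_code}", "-", "204 no content"),
]

for e in ENDPOINTS:
    print(f"{e.method:<6} {e.path:<24} req={e.request} -> {e.response}")
```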

D — Deep-Dives (15 min)

The interviewer steers here. Be ready to deep-dive any of:

ID generation (Snowflake, UUID, ticket server).
Write path / read path.
Sharding key choice + rebalancing.
Cache strategy (write-through/back/around) + invalidation.
Hot key / celebrity problem.
Consistency: read-your-writes, monotonic, causal.
Fan-out: push vs pull vs hybrid.
Idempotency keys.
Backpressure & retries (exponential backoff, jitter).
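
For the retries item, a minimal sketch of capped exponential backoff with "full jitter" (each delay is drawn uniformly between 0 and the current backoff ceiling). The helper name `call_with_retries`, its defaults, and the `flaky` demo function are assumptions for illustration. Retrying like this is only safe when the operation is idempotent, which is where idempotency keys come in.

```python
import random
import time


def call_with_retries(fn, max_attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call fn, retrying on failure with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # real code would catch only transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: uniform in [0, min(cap, base * 2**attempt)].
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))


# Demo: a hypothetical call that fails twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky, sleep=delays.append)
print(result, attempts["n"], len(delays))
```

The jitter matters because clients that fail together otherwise retry together, which recreates the spike that caused the failure.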

E — Edge cases & failures (3 min)

Node crash mid-write -> WAL, replay.
Partial network partition -> quorum behavior.
Hot partition -> resharding.
Thundering herd on cold cache -> request coalescing, jittered TTL.
Data corruption -> checksums to detect it, anti-entropy repair between replicas (often gossip-based) to fix it.
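
The thundering-herd mitigations can be sketched as follows: request coalescing lets one caller load a missing key while concurrent callers wait for its result, and a jittered TTL keeps entries from expiring at the same instant. The class name `SingleFlight`, the `jittered_ttl` helper, and the demo loader are hypothetical names for this sketch, which is thread-based and single-process only.

```python
import random
import threading
import time


class SingleFlight:
    """Coalesce concurrent loads of the same key into one call to the loader."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared entry for the load in progress

    def do(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = entry
        if leader:
            try:
                entry["value"] = loader()
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                entry["done"].set()  # wake every waiting follower
        else:
            entry["done"].wait()
        if entry["error"] is not None:
            raise entry["error"]
        return entry["value"]


def jittered_ttl(base_seconds, spread=0.1):
    """Spread expirations by +/- spread so keys do not all expire together."""
    return base_seconds * random.uniform(1 - spread, 1 + spread)


# Demo: five concurrent cache misses on the same key.
calls = {"n": 0}


def slow_load():
    calls["n"] += 1
    time.sleep(0.2)  # simulate a slow database read
    return "value-from-db"


flight = SingleFlight()
results = []
barrier = threading.Barrier(5)


def worker():
    barrier.wait()  # release all callers at the same moment
    results.append(flight.do("user:42", slow_load))


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(calls["n"], results, round(jittered_ttl(300)))
```

In the demo all five callers receive the value while the loader is expected to run only once, because the other four arrive while the first load is still in flight. Across multiple processes or hosts the same idea needs a shared lock or lease in the cache tier.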

D — Done: tradeoffs & followups (3 min)

Summarize 2-3 key tradeoffs explicitly. Surface 1-2 things you'd do "with more time" (multi-region, ML ranking, cost optimization).

Phrase: "To summarize, the key tradeoff was X over Y because Z. With more time I'd add ABC."

Anti-patterns to avoid

Jumping straight to a diagram without scope/estimations.
Over-engineering early (don't put Kafka + Spark + ML in a URL shortener).
Naming techs without justifying why over alternatives.
Ignoring hot-key / failure modes — interviewers love asking.
Silence. Keep narrating your reasoning.

Checklist (print and pin)

[ ] R: 3-5 functional, NFRs (scale, latency, availability, consistency)
[ ] E: QPS, storage, bandwidth, cache - with math
[ ] S: schema + index + partition key, SQL vs NoSQL with why
[ ] H: end-to-end diagram, all hops named
[ ] A: 3-5 API endpoints
[ ] D: deep-dive on 2-3 components (driven by interviewer)
[ ] E: failure modes + edge cases
[ ] D: tradeoffs summary + followups
