RESHADED — System Design Interview Framework
A 7-step framework you drive proactively in any HLD interview. Spend ~45 min total; budget per step shown.
R — Requirements (5 min)
- Functional: ask "what should the user be able to do?". Get 3-5 clear use cases.
- Non-Functional: scale (DAU, QPS, storage), latency targets, availability, consistency.
- Out of scope: explicitly drop auth/billing/analytics if not asked — saves time.
- Outcome: a written list both you and the interviewer agree on.
Phrase: "Before I jump in, let me confirm scope."
E — Estimations (3 min)
Back-of-envelope. Show the math.
- DAU -> QPS =
DAU * actions / 86400. Multiply by peak factor 2-3x. - Storage =
daily writes * payload size * retention. - Bandwidth =
QPS * payload size. - Cache = 80/20 of working set.
Phrase: "Let me sanity-check the scale."
S — Storage / Schema (5 min)
- Pick SQL vs NoSQL with why (relations? transactions? horizontal scale? access pattern?).
- Define entities, primary key, indexes, partition key.
- Show one sample row.
H — High-level diagram (8 min)
Draw: Client -> CDN -> LB -> API GW -> Services -> Cache -> DB / Queue / Blob.
Talk while drawing. Don't over-detail components yet — just place them.
A — APIs (3 min)
REST or gRPC. List the 3-5 critical endpoints with method, path, request, response.
D — Deep-Dives (15 min)
The interviewer steers here. Be ready to deep-dive any of:
- ID generation (Snowflake, UUID, ticket server).
- Write path / read path.
- Sharding key choice + rebalancing.
- Cache strategy (write-through/back/around) + invalidation.
- Hot key / celebrity problem.
- Consistency: read-your-writes, monotonic, causal.
- Fan-out: push vs pull vs hybrid.
- Idempotency keys.
- Backpressure & retries (exponential backoff, jitter).
E — Edge cases & failures (3 min)
- Node crash mid-write -> WAL, replay.
- Partial network partition -> quorum behavior.
- Hot partition -> resharding.
- Thundering herd on cold cache -> request coalescing, jittered TTL.
- Data corruption -> checksums, gossip.
D — Done: tradeoffs & followups (3 min)
Summarize 2-3 key tradeoffs explicitly. Surface 1-2 things you'd do "with more time" (multi-region, ML ranking, cost optimization).
Phrase: "To summarize, the key tradeoff was X over Y because Z. With more time I'd add ABC."
Anti-patterns to avoid
- Jumping straight to a diagram without scope/estimations.
- Over-engineering early (don't put Kafka + Spark + ML in a URL shortener).
- Naming techs without justifying why over alternatives.
- Ignoring hot-key / failure modes — interviewers love asking.
- Silence. Keep narrating your reasoning.
Checklist (print and pin)
[ ] R: 3-5 functional, NFRs (scale, latency, availability, consistency)
[ ] E: QPS, storage, bandwidth, cache - with math
[ ] S: schema + index + partition key, SQL vs NoSQL with why
[ ] H: end-to-end diagram, all hops named
[ ] A: 3-5 API endpoints
[ ] D: deep-dive on 2-3 components (driven by interviewer)
[ ] E: failure modes + edge cases
[ ] D: tradeoffs summary + followupsLast updated on
Spotted something unclear or wrong on this page?