# HLD Template

Every file in `system-design/hld/` MUST follow this 9-section shape.

Visibility: section 5 (High-Level Architecture) must include at least one `mermaid` diagram (flowchart or similar) so the design is scannable without reading prose first.
# Design <System Name>
## 1. Requirements
### Functional
- Bullet list. What can users do? Be explicit about scope.
### Non-Functional
- Scale (DAU, QPS, storage).
- Latency targets (p50 / p99 read / write).
- Availability (e.g., 99.99%).
- Consistency (strong / eventual / read-your-writes).
- Durability.
### Out of Scope
- Things you intentionally won't design (auth, billing, ...).
## 2. Back-of-Envelope Estimations
- DAU -> QPS (read & write).
- Storage / day, year, 5yr.
- Bandwidth in & out.
- Cache size (working set rule of thumb: 80/20).
- Show the math, not just the number.
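To illustrate "show the math", here is a minimal sketch that derives the figures above from a handful of inputs. Every input (10M DAU, 20 reads and 2 writes per user per day, 1 KB per record, a 3x peak factor) is a hypothetical placeholder, and `estimate` is an illustrative helper, not part of the template. It assumes each read returns one record of the same size as a write, and it reads the 80/20 rule as "cache 20% of one day's written data", which is only one common interpretation.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, reads_per_user, writes_per_user, bytes_per_record, peak_factor=3):
    """Turn daily-usage assumptions into rough capacity numbers."""
    read_qps = dau * reads_per_user / SECONDS_PER_DAY
    write_qps = dau * writes_per_user / SECONDS_PER_DAY
    storage_per_day = dau * writes_per_user * bytes_per_record  # bytes written per day
    return {
        "read_qps_avg": read_qps,
        "write_qps_avg": write_qps,
        "read_qps_peak": read_qps * peak_factor,
        "write_qps_peak": write_qps * peak_factor,
        "storage_per_day_bytes": storage_per_day,
        "storage_per_year_bytes": storage_per_day * 365,
        "storage_5yr_bytes": storage_per_day * 365 * 5,
        # Assumption: one record in per write, one record out per read.
        "ingress_bytes_per_sec": write_qps * bytes_per_record,
        "egress_bytes_per_sec": read_qps * bytes_per_record,
        # Assumption: the hot 20% of one day's data is the working set.
        "cache_bytes": storage_per_day // 5,
    }


numbers = estimate(
    dau=10_000_000,
    reads_per_user=20,
    writes_per_user=2,
    bytes_per_record=1_000,
)
for name, value in numbers.items():
    print(f"{name}: {value:,.0f}")
```

With these placeholder inputs the daily storage works out to 10,000,000 x 2 x 1,000 B = 20 GB, so about 7.3 TB per year and a 4 GB cache; in a real HLD, write out each multiplication like this rather than quoting only the result.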
## 3. API Design
REST / gRPC. List endpoints with method, path, request, response, error codes.
```http
POST /v1/<resource>
Body: { ... }
-> 201 { id, ... }
```
## 4. Data Model
- Entities + fields + types.
- SQL vs NoSQL choice with why.
- Indexes, partition key, sort key.
- Sample row.
## 5. High-Level Architecture
Mermaid diagram. Show: clients -> CDN/LB -> API gateway -> services -> caches -> DBs / queues / blob store.
## 6. Component Deep-Dives
For each non-trivial component:
- Responsibilities.
- Data structures / algorithms inside.
- Why this tech (Kafka vs SQS, Redis vs Memcached, Postgres vs DynamoDB).
- Failure modes + handling.
Common ones to deep-dive: ID generation, write path, read path, fan-out (push/pull/hybrid), sharding strategy, cache strategy, hot-key handling.
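As an example of the depth a deep-dive might reach, here is a minimal sketch of one possible ID-generation scheme: a Snowflake-style 64-bit ID. The bit layout (41 bits of milliseconds, 10 bits of worker ID, 12 bits of sequence), the default epoch, and the class name `SnowflakeLikeIdGenerator` are all assumptions chosen for illustration, not something the template prescribes. Note how the code names its failure modes (clock moving backwards, sequence exhaustion) and states how each is handled.

```python
import threading
import time


class SnowflakeLikeIdGenerator:
    """Roughly time-ordered 64-bit IDs: timestamp | worker id | sequence."""

    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id, epoch_ms=1_600_000_000_000, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms  # hypothetical custom epoch
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Failure mode: clock moved backwards. Refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Failure mode: 4096 IDs already issued this millisecond.
                    # Busy-wait until the clock advances.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.epoch_ms
            return (
                (timestamp << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self.worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeLikeIdGenerator(worker_id=1)
print(generator.next_id(), generator.next_id())
```

The tradeoff to call out in a real deep-dive: IDs are sortable by creation time and need no coordination per request, but worker IDs must be assigned uniquely and the scheme depends on reasonably well-behaved clocks.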
## 7. Bottlenecks & Mitigations
- Where will it break first as scale grows 10x?
- Hot keys, celebrity problem, thundering herd, cache stampede, head-of-line blocking.
- Mitigations: consistent hashing, replication, sharding, request coalescing, backpressure.
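To make one of the listed mitigations concrete, here is a minimal sketch of consistent hashing with virtual nodes. The class name `ConsistentHashRing`, the choice of SHA-256 truncated to 64 bits, the default of 100 virtual nodes, and the node names are all assumptions for illustration. The property worth stating in an HLD is that removing a node only remaps the keys that node owned; all other keys stay where they were.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            position = self._hash(f"{node}#{i}")
            if position in self._owners:
                continue  # extremely unlikely collision: keep the first owner
            self._owners[position] = node
            bisect.insort(self._positions, position)

    def remove_node(self, node):
        for i in range(self.vnodes):
            position = self._hash(f"{node}#{i}")
            if self._owners.get(position) == node:
                del self._owners[position]
                self._positions.remove(position)

    def get_node(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # First position clockwise from the key's hash, wrapping around the ring.
        index = bisect.bisect_right(self._positions, self._hash(key))
        return self._owners[self._positions[index % len(self._positions)]]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])  # hypothetical nodes
print(ring.get_node("user:42"))
```

Virtual nodes spread each physical node across many ring positions, which evens out load; they do not by themselves solve a single hot key, which still needs replication or request coalescing.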
## 8. Tradeoffs
Table: choice -> alternative -> why we picked this.
| Decision | Alternative | Why we picked |
|---|---|---|
| Push fan-out | Pull | Latency for active users |
| DynamoDB | Postgres | Horizontal scale, predictable p99 |
## 9. Follow-ups (interviewer drill-downs)
- "What if traffic grows 100x?"
- "How do you ensure exactly-once?"
- "How do you migrate the data model?"
- "Multi-region active-active?"
- "Cost optimization?"