THN Interview Prep

HLD Template

Every file in system-design/hld/ MUST follow this 9-section shape.

Visibility: section "5. High-Level Architecture" must include at least one `mermaid` diagram (flowchart or similar) so the design is scannable without reading prose first.


# Design <System Name>

## 1. Requirements

### Functional
- Bullet list. What can users do? Be explicit about scope.

### Non-Functional
- Scale (DAU, QPS, storage).
- Latency targets (p50 / p99 read / write).
- Availability (e.g., 99.99%).
- Consistency (strong / eventual / read-your-writes).
- Durability.

### Out of Scope
- Things you intentionally won't design (auth, billing, ...).

## 2. Back-of-Envelope Estimations
- DAU -> QPS (read & write).
- Storage / day, year, 5yr.
- Bandwidth in & out.
- Cache size (working set rule of thumb: 80/20).
- Show the math, not just the number.
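As an illustration of "show the math", the sketch below turns a handful of stated inputs into QPS, storage, bandwidth, and cache figures. The function name `estimate`, its parameters, the peak factor of 2, and the input numbers are all hypothetical placeholders, and sizing the cache as 20% of one day's data is only one common reading of the 80/20 rule.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, reads_per_user, writes_per_user, bytes_per_item,
             peak_factor=2.0, hot_fraction=0.2):
    """Rough capacity numbers. Every input is an assumption to state out loud."""
    read_qps = dau * reads_per_user / SECONDS_PER_DAY
    write_qps = dau * writes_per_user / SECONDS_PER_DAY
    storage_per_day = dau * writes_per_user * bytes_per_item
    return {
        "read_qps_avg": read_qps,
        "write_qps_avg": write_qps,
        "read_qps_peak": read_qps * peak_factor,
        "write_qps_peak": write_qps * peak_factor,
        "storage_per_day_bytes": storage_per_day,
        "storage_per_year_bytes": storage_per_day * 365,
        "storage_5yr_bytes": storage_per_day * 365 * 5,
        "ingress_bytes_per_sec": write_qps * bytes_per_item,
        "egress_bytes_per_sec": read_qps * bytes_per_item,
        # 80/20 rule of thumb: cache the hottest 20% of one day's data.
        "cache_bytes": storage_per_day * hot_fraction,
    }


# Made-up inputs: 10M DAU, 10 reads and 1 write per user per day, 1 KB items.
numbers = estimate(dau=10_000_000, reads_per_user=10,
                   writes_per_user=1, bytes_per_item=1_000)
for name, value in numbers.items():
    print(f"{name}: {value:,.1f}")
```

Keeping each intermediate value named makes it easy to revisit a single assumption (for example the read/write ratio) when the interviewer pushes back.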

## 3. API Design
REST / gRPC. List endpoints with method, path, request, response, error codes.

```http
POST /v1/<resource>
Body: { ... }
-> 201 { id, ... }
```

## 4. Data Model

- Entities + fields + types.
- SQL vs NoSQL choice with why.
- Indexes, partition key, sort key.
- Sample row.
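One way to fill in the items above is to write the entity down as a typed record and state its keys explicitly. The `Message` entity, its field names, and the choice of `conversation_id` as partition key with `created_at_ms` as sort key are hypothetical, chosen only to show the shape.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Message:
    conversation_id: str  # partition key (hypothetical): groups one chat's rows together
    created_at_ms: int    # sort key (hypothetical): orders rows inside a partition
    sender_id: str
    body: str

    def primary_key(self):
        """Composite key: (partition key, sort key)."""
        return (self.conversation_id, self.created_at_ms)


# Sample row, as the template asks for.
sample_row = Message(
    conversation_id="conv_42",
    created_at_ms=1_700_000_000_000,
    sender_id="user_7",
    body="hello",
)
print(asdict(sample_row))
print(sample_row.primary_key())
```

Stating the key next to the fields makes the access pattern visible: here, "fetch the latest messages in a conversation" becomes a single-partition range read.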

## 5. High-Level Architecture

Mermaid diagram. Show: clients -> CDN/LB -> API gateway -> services -> caches -> DBs / queues / blob store.


## 6. Component Deep-Dives

For each non-trivial component:

- Responsibilities.
- Data structures / algorithms inside.
- Why this tech (Kafka vs SQS, Redis vs Memcached, Postgres vs DynamoDB).
- Failure modes + handling.

Common ones to deep-dive: ID generation, write path, read path, fan-out (push/pull/hybrid), sharding strategy, cache strategy, hot-key handling.
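As an example of the level of detail a deep-dive might reach, here is a minimal sketch of ID generation using a Snowflake-style bit layout (timestamp, worker ID, per-millisecond sequence). The class name `IdGenerator`, the custom epoch, and the 41/10/12 bit split are assumptions for illustration, not a prescribed design.

```python
import threading
import time

EPOCH_MS = 1_700_000_000_000  # hypothetical custom epoch
WORKER_BITS = 10              # up to 1024 workers
SEQ_BITS = 12                 # up to 4096 IDs per worker per millisecond
MAX_SEQ = (1 << SEQ_BITS) - 1


class IdGenerator:
    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock  # injectable so the failure modes can be tested
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                # Failure mode: clock skew. Refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & MAX_SEQ
                if self.seq == 0:
                    # Sequence exhausted: busy-wait for the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.seq = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return ((timestamp << (WORKER_BITS + SEQ_BITS))
                    | (self.worker_id << SEQ_BITS)
                    | self.seq)


gen = IdGenerator(worker_id=3)
print(gen.next_id())
```

IDs from one worker are strictly increasing, and across workers they are roughly time-ordered, which is usually the property worth calling out in the deep-dive along with the clock-skew failure mode.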

## 7. Bottlenecks & Mitigations

- Where will it break first as scale grows 10x?
- Hot keys, celebrity problem, thundering herd, cache stampede, head-of-line blocking.
- Mitigations: consistent hashing, replication, sharding, request coalescing, backpressure.
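Of the mitigations listed, consistent hashing is the one most often asked to be sketched. The following is a minimal hash ring with virtual nodes; the class name `HashRing`, the choice of MD5 as the hash, and the default of 100 virtual nodes per server are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a point on the ring (first 8 bytes of MD5)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node appears `vnodes` times to smooth the distribution.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b", "c", "d"])
keys = [f"user:{i}" for i in range(1000)]
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding node d")
```

The point to make in an interview is that adding a node only relocates keys onto the new node, whereas `hash(key) % n` would reshuffle most keys.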

## 8. Tradeoffs

Table: choice -> alternative -> why we picked this.

| Decision | Alternative | Why we picked |
|---|---|---|
| Push fan-out | Pull | Latency for active users |
| DynamoDB | Postgres | Horizontal scale, predictable p99 |

## 9. Follow-ups (interviewer drill-downs)

- "What if traffic grows 100x?"
- "How do you ensure exactly-once?"
- "How do you migrate the data model?"
- "Multi-region active-active?"
- "Cost optimization?"
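For the exactly-once question, a common answer is at-least-once delivery combined with idempotent processing, so that retries have no extra effect. The sketch below shows that idea with a deduplication table keyed by an idempotency key; `IdempotentProcessor` is a hypothetical name, and the in-memory dict stands in for what would need to be a durable store.

```python
class IdempotentProcessor:
    def __init__(self, handler):
        self._handler = handler
        self._results = {}  # in-memory stand-in for a durable dedup store

    def process(self, idempotency_key, payload):
        # A redelivered message returns the stored result without re-running.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._handler(payload)
        self._results[idempotency_key] = result
        return result


calls = []


def charge(payload):
    """Hypothetical side-effecting handler; records each real execution."""
    calls.append(payload)
    return f"charged {payload['amount']}"


processor = IdempotentProcessor(charge)
first = processor.process("req-1", {"amount": 5})
retry = processor.process("req-1", {"amount": 5})  # simulated redelivery
print(first, retry, len(calls))
```

In a real system the handler's side effect and the dedup record would have to be committed atomically, which is usually the follow-up worth discussing.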
