Design Netflix (Streaming Video Service)
1. Requirements
Functional
- Browse catalog; personalized home page; search titles and talent.
- Stream with adaptive bitrate; resume playback; multiple profiles per household.
- Download for offline viewing on supported clients.
- Recommendations row-based UI; continue watching; because-you-watched rails.
- Studio/partner ingest for licensed content with DRM packaging.
Non-Functional
- Scale: 200M+ global subscribers; peak concurrent streams in the tens of millions; regionalized metadata for millions of title and episode records.
- Latency: playback start time is the competitive metric; target p95 of ~2–5 s on good networks, faster on CDN cache hits; API p99 ~200 ms warm.
- Availability: 99.99% playback via CDN; control plane slightly relaxed with graceful degradation.
- Consistency: eventual for recommendations and progress bars acceptable; strong for account entitlements and billing hooks.
- Durability: masters in object storage; DRM keys lifecycle tightly controlled.
Out of Scope
- Full content production and encoding facility contracts.
- Live sports low-latency end-to-end (extension point only).
- ISP settlement and Open Connect appliance deep dive (mention edge).
2. Back-of-Envelope Estimations
Assume 250M subscribers streaming ~4 hours/day per active viewing household (usage is skewed), and use concurrent streams as the capacity driver.
- Concurrent streams at evening peak: ~50M globally (order of magnitude) → nearly all bytes come from OCAs/CDN at the edge; origin serves a small fraction of requests.
- Control plane: home feed loads ~100M/hour peak → ~30k QPS average with 3–10× peaks on Fridays; cache-heavy.
- Metadata: ~20 KB per title per locale × locales × ~100k titles → single-digit GB of hot catalog per region in cache layers.
- Logs/analytics: trillions of events/year flow through Kafka to the data lake; not on the critical playback path.
- Storage: EB-scale masters plus encoded ladders replicated regionally; offline downloads use encrypted packaged files.
Relate to scalability and caching fundamentals.
Offline licensing load: Offline viewing multiplies license checks per device; model 2–3 active devices per profile when sizing token-validation QPS, not just stream starts.
Personalization training: Feature logs often exceed playback row volume because training pipelines generate negatives and impressions—budget cold storage for ML datasets separately from OLTP catalog entities (scalability discipline).
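The control-plane and cache figures above are easy to sanity-check in a few lines. A minimal sketch; the five-locales-per-region figure is an assumption, everything else comes from the estimates above:

```python
# Sanity-check the back-of-envelope numbers for the control plane and catalog cache.

def avg_qps(requests_per_hour: float) -> float:
    """Average QPS implied by an hourly request volume."""
    return requests_per_hour / 3600

def hot_catalog_gb(titles: int, kb_per_title: float, locales: int) -> float:
    """Hot catalog size in GB for titles x locales metadata blobs."""
    return titles * kb_per_title * locales / 1e6  # KB -> GB

home_feed_qps = avg_qps(100_000_000)        # ~100M home loads/hour -> ~27.8k QPS average
friday_peak_qps = home_feed_qps * 10        # upper end of the 3-10x peak factor
cache_gb = hot_catalog_gb(100_000, 20, 5)   # 5 locales per region is an assumption

print(f"avg {home_feed_qps:,.0f} QPS, peak {friday_peak_qps:,.0f} QPS, cache ~{cache_gb:.0f} GB")
```

The ~10 GB hot-catalog result is what justifies "single-digit GB per region in cache layers": it fits comfortably in memory on a handful of cache nodes.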
3. API Design
GET /v2/users/{userId}/profiles
-> 200 { profiles: [...] }
GET /v1/catalog/titles/{titleId}
-> 200 { titleId, synopsis, artwork[], maturity }
POST /v1/playback/sessions
Body: { titleId, profileId, deviceId, drmScheme }
-> 201 { sessionId, licenseUrl, manifestUrl, heartbeatSeconds }
POST /v1/playback/sessions/{sessionId}/progress
Body: { positionMs, durationMs }
-> 204
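The progress endpoint is write-heavy, so clients typically debounce it rather than POST on every tick. A client-side sketch; the interval and seek thresholds are illustrative, and `send` stands in for the actual HTTP POST:

```python
import time

class ProgressReporter:
    """Debounce calls to POST /v1/playback/sessions/{sessionId}/progress.

    Flush only when enough wall time has passed or the play head jumped
    (a seek), cutting write volume on the hot path.
    """

    def __init__(self, send, min_interval_s=10, jump_ms=30_000):
        self.send = send                      # transport callback (e.g. HTTP POST)
        self.min_interval_s = min_interval_s  # normal playback flush cadence
        self.jump_ms = jump_ms                # position delta treated as a seek
        self.last_sent_at = 0.0
        self.last_position_ms = 0

    def report(self, position_ms: int, now=None) -> bool:
        """Record a play-head position; returns True if a write was sent."""
        now = time.monotonic() if now is None else now
        is_jump = abs(position_ms - self.last_position_ms) >= self.jump_ms
        is_due = now - self.last_sent_at >= self.min_interval_s
        self.last_position_ms = position_ms
        if is_jump or is_due:
            self.send({"positionMs": position_ms})
            self.last_sent_at = now
            return True
        return False
```

Seeks flush immediately so "continue watching" stays accurate across devices, while steady playback coalesces into one write per interval.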
GET /v1/home?profileId=
-> 200 { rows: [{ railType, items: [...] }] }
Errors: 403 geo-restriction or concurrent-device limit, 404 title unavailable in region, 401 expired or invalid auth token.
DELETE /v1/profiles/{profileId}/continue/{titleId}
-> 204
GET /v1/catalog/search?q=&cursor=
-> 200 { titles: [...], nextCursor }
4. Data Model
- Title: titleId, type (show/movie), regional availability map, rating metadata.
- Episode: episodeId, seasonId, number, durationMs.
- PlaybackAsset: packaged manifests per drm, audioLang, subtitle.
- Profile: profileId, maturity, continueWatching pointer (cached).
- Catalog: document store or SQL with heavy caching; a graph store feeds offline recommendation feature generation.
- Viewing progress: Cassandra/Dynamo keyed by (profileId, titleId) for write-heavy upserts.
- Entitlements: held in the auth service; tie to the consistency requirements for subscription state.
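The viewing-progress row design can be sketched as a last-write-wins upsert keyed on (profileId, titleId). An in-memory stand-in for the Cassandra/Dynamo table, with timestamps supplied by the caller:

```python
class ProgressStore:
    """Last-write-wins upsert keyed (profileId, titleId).

    In-memory stand-in for a wide-column store: each key holds
    (positionMs, writeTimestamp), and stale replica writes lose.
    """

    def __init__(self):
        self.rows: dict = {}  # (profileId, titleId) -> (positionMs, ts)

    def upsert(self, profile_id: str, title_id: str, position_ms: int, ts: int) -> None:
        key = (profile_id, title_id)
        current = self.rows.get(key)
        if current is None or ts >= current[1]:  # newer (or equal) timestamp wins
            self.rows[key] = (position_ms, ts)

    def get(self, profile_id: str, title_id: str):
        row = self.rows.get((profile_id, title_id))
        return row[0] if row else None
```

This matches why eventual consistency is acceptable here: a lost or reordered progress write costs a few seconds of "continue watching" accuracy, not correctness.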
5. High-Level Architecture
Open Connect (or a commercial CDN) serves the video bytes and is the centerpiece of the design. The playback orchestrator returns signed manifest URLs and Widevine/FairPlay license endpoints. Recommendations blend offline-trained models with online exploration.
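The "signed URLs" the orchestrator hands out are typically expiring HMAC signatures that the edge can verify without calling back to the origin. A sketch of the pattern; the scheme, parameter names, and secret handling are illustrative, not a specific CDN's API:

```python
import hashlib, hmac, time
from urllib.parse import urlencode

SECRET = b"edge-shared-secret"  # assumed; rotated and distributed out of band in practice

def sign_manifest_url(path: str, ttl_s: int = 300, now=None) -> str:
    """Return path with an expiry and an HMAC covering (path, expiry)."""
    expires = (int(time.time()) if now is None else now) + ttl_s
    msg = f"{path}?expires={expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"

def verify(url: str, now: int) -> bool:
    """Edge-side check: signature must match and the URL must not be expired."""
    path, _, query = url.partition("?")
    params = dict(p.split("=") for p in query.split("&"))
    expires = int(params["expires"])
    msg = f"{path}?expires={expires}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, params["sig"]) and now < expires
```

Because the signature binds the expiry into the message, a client cannot extend its own window by editing the query string.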
6. Component Deep-Dives
- Encoding ladder: Per-title optimization (complexity-aware); Per-Shot encodes for premium tier optional; VP9/AV1 rollout tradeoffs vs device support.
- DRM: License requests short-lived; token binds device + session; revocation lists for leaked keys.
- Personalization: Candidate generation (similarity, trending) + ranking model; A/B infrastructure for ranking layers.
- Continue watching: Debounced writes; conflict resolution is last-timestamp-wins per profile.
- Failure: CDN miss storm → stale-while-revalidate manifests; license service down → grace period policy decision (business).
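The stale-while-revalidate manifest behavior in the failure bullet can be sketched directly. A single-threaded stand-in (a real edge would refresh asynchronously); TTLs are illustrative:

```python
class SwrCache:
    """Stale-while-revalidate sketch for manifests.

    Within the stale window, serve the stale entry immediately and
    attempt a refresh for the next caller; if the origin is down,
    keep serving stale instead of failing playback starts.
    """

    def __init__(self, fetch, ttl_s=5, stale_s=60):
        self.fetch = fetch            # origin fetch callback
        self.ttl_s = ttl_s            # fresh window
        self.stale_s = stale_s        # serve-stale window
        self.entries = {}             # key -> (value, fetched_at)

    def get(self, key, now: float):
        entry = self.entries.get(key)
        if entry and now - entry[1] < self.ttl_s:
            return entry[0]                                 # fresh hit
        if entry and now - entry[1] < self.stale_s:
            stale_value = entry[0]
            try:
                self.entries[key] = (self.fetch(key), now)  # refresh for next caller
            except Exception:
                pass                                        # origin down: keep serving stale
            return stale_value
        value = self.fetch(key)                             # cold miss or too stale
        self.entries[key] = (value, now)
        return value
```

This is what turns a CDN miss storm into stale-but-playable manifests instead of an origin dogpile.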
7. Bottlenecks & Mitigations
- Title launch spikes: Pre-warm CDN; carousel cache at API; circuit-break non-critical rails.
- Recommendation fan-out: Batch feature fetches; approximate-nearest-neighbor graphs (LSH-class methods) at scale.
- DRM load: Regional license clusters; rate-limit abnormal devices.
- Metadata dogpiling: ETags; delta feeds to apps.
- Kids profiles: Maturity filter propagates to search, artwork, and row composition; per-cohort feature flags avoid recomputing the entire home feed for unrelated users.
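"Rate-limit abnormal devices" on the license path is usually a per-device token bucket. A minimal sketch; the rate and burst values are illustrative, and real limits would come from abuse-detection policy:

```python
class TokenBucket:
    """Per-device token bucket for DRM license requests.

    Tokens refill continuously at `rate_per_s` up to `burst`; each
    license request spends one token, so a leaked credential replaying
    requests gets throttled without affecting normal playback starts.
    """

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.burst = float(burst)
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In production the bucket state would live in the regional license cluster (keyed by deviceId), not in process memory.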
8. Tradeoffs
| Decision | Alternative | Why we picked it |
|---|---|---|
| Edge-heavy streaming | Central origin only | Latency and egress cost |
| Eventual progress sync | Lock per play head | UX vs write volume |
| Personalized home | Static catalog | Engagement |
| Multi-codec ladder | Single MP4 | Device and bandwidth diversity |
9. Follow-ups (interviewer drill-downs)
- 100× Friday spike? Scale stateless API horizontally; shed thumbnails; freeze experiments.
- Exactly-once billing for watch minutes? Idempotent session heartbeats (idempotency).
- Catalog migration? Versioned APIs; dual-read new schema per microservice.
- Multi-region active-active? Playback local; catalog replicated with eventual-consistency labels in UI for edits.
- Cost? Codec efficiency; chunk-size tuning; tiered storage for cold encodes; download window limits.
- A/B at scale? Feature gates in client and server; assign cohorts at the edge with sticky sessions; metric pipelines must not double-count streams on app restarts, so use session dedupe keys.
- Studio ingest? Masters arrive on Aspera-class paths; separate SLO from consumer upload; QC lint before transcode to fail fast on corrupt essence.
- Regional catalog gaps? Licensing differs per country; edge routing must fetch entitlements before calling playback to avoid accidental geo leaks when DNS anycast misroutes.
- Partner CDN interconnect? When mixing first-party Open Connect with commercial CDNs, align token TTLs and key rotation so failover does not orphan sessions mid-movie.
- Accessibility? Audio description and SDH tracks multiply storage and transcode jobs; budget parallel pipelines so accessibility assets are not second-class citizens in queue priority.
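The exactly-once billing follow-up comes down to a dedupe key on each heartbeat. A minimal sketch, assuming heartbeats carry a (sessionId, sequence) pair; names and the in-memory store are illustrative:

```python
class WatchMinutesLedger:
    """Idempotent heartbeat accounting for billed watch minutes.

    Each heartbeat carries a dedupe key (sessionId, seq), so network
    retries and app-restart replays never double-count minutes.
    """

    def __init__(self):
        self.seen = set()     # {(sessionId, seq)} already-applied heartbeats
        self.minutes = {}     # profileId -> billed minutes

    def record(self, profile_id: str, session_id: str, seq: int, minutes: float) -> bool:
        key = (session_id, seq)
        if key in self.seen:
            return False      # duplicate delivery: no-op
        self.seen.add(key)
        self.minutes[profile_id] = self.minutes.get(profile_id, 0.0) + minutes
        return True
```

The same key doubles as the stream-dedupe key the A/B metrics pipeline needs, so one identifier serves both billing and experimentation.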