THN Interview Prep

Design Netflix (Streaming Video Service)

1. Requirements

Functional

  • Browse catalog; personalized home page; search titles and talent.
  • Stream with adaptive bitrate; resume playback; multiple profiles per household.
  • Download for offline viewing on supported clients.
  • Row-based recommendations UI; continue watching; because-you-watched rails.
  • Studio/partner ingest for licensed content with DRM packaging.

Non-Functional

  • Scale: 200M+ global subscribers; peak concurrent streams in the tens of millions; catalog metadata for ~100k titles, regionalized across many locales.
  • Latency: competitive playback start, target p95 ~2–5 s on good networks (faster on CDN hits); API p99 ~200 ms warm.
  • Availability: 99.99% for playback via CDN; the control plane can be slightly relaxed, with graceful degradation.
  • Consistency: eventual is acceptable for recommendations and progress bars; strong for account entitlements and billing hooks.
  • Durability: masters in object storage; the DRM key lifecycle is tightly controlled.

Out of Scope

  • Full content production and encoding facility contracts.
  • Low-latency live sports end-to-end (extension point only).
  • ISP settlement and Open Connect appliance deep dive (mention edge).

2. Back-of-Envelope Estimations

Assume 250M subscribers and ~4 hours/day streamed per viewing household, skewed toward evening peaks; use concurrent streams as the capacity driver. A quick arithmetic sanity check follows the list below.

  • Concurrent streams at evening peak: ~50M globally as an order-of-magnitude ballpark → nearly all bytes served from OCAs/CDN at the edge; the origin sees only a small fraction of requests.

  • Control plane: ~100M home-feed loads per peak hour → ~30k QPS average, with 3–10× spikes on launch Fridays; cache-heavy.

  • Metadata: ~20 KB per title per locale × ~100k titles ≈ 2 GB per locale; a few locales per region keeps the hot catalog at single-digit GB per region in cache layers.

  • Logs/analytics: trillions of events/year flow through Kafka into the data lake; none of it sits on the critical playback path.

  • Storage: exabyte-scale masters plus encoded ladders, replicated regionally; offline downloads use encrypted packaged files.
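
A minimal arithmetic sketch of these estimates in Python; every input is an assumption stated in this section, not a measured figure:

# Back-of-envelope sanity check for the numbers above.
SUBSCRIBERS = 250_000_000
PEAK_CONCURRENT_STREAMS = 50_000_000   # evening peak, order of magnitude
AVG_BITRATE_MBPS = 5                   # assumed mid-ladder HD rung

# Nearly all of this egress is served from OCAs/CDN at the edge.
peak_egress_tbps = PEAK_CONCURRENT_STREAMS * AVG_BITRATE_MBPS / 1_000_000
print(f"peak edge egress ~{peak_egress_tbps:.0f} Tbps")          # ~250 Tbps

# Control plane: ~100M home-feed loads per peak hour.
avg_qps = 100_000_000 / 3600
print(f"home feed ~{avg_qps:,.0f} QPS average, 3-10x on launch Fridays")

# Hot catalog: ~20 KB of metadata per title per locale.
titles, kb_per_title = 100_000, 20
print(f"hot catalog ~{titles * kb_per_title / 1_000_000:.0f} GB per locale")

# Offline licensing: 2-3 active devices per profile multiply token checks.
device_multiplier = 2.5                # assumed active devices per profile
profiles = SUBSCRIBERS * 2             # assumed ~2 profiles per account
print(f"~{profiles * device_multiplier / 1e9:.2f}B device-license bindings")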

These estimates tie back to scalability and caching fundamentals.

Offline licensing: offline viewing multiplies license checks per device; model 2–3 active devices per profile when sizing token-validation QPS, not just stream starts.

Personalization training: feature logs often exceed playback event volume because training pipelines generate negatives and impressions; budget cold storage for ML datasets separately from OLTP catalog entities (a scalability discipline).

3. API Design

GET /v2/users/{userId}/profiles
-> 200 { profiles: [...] }

GET /v1/catalog/titles/{titleId}
-> 200 { titleId, synopsis, artwork[], maturity }

POST /v1/playback/sessions
Body: { titleId, profileId, deviceId, drmScheme }
-> 201 { sessionId, licenseUrl, manifestUrl, heartbeatSeconds }

POST /v1/playback/sessions/{sessionId}/progress
Body: { positionMs, durationMs }
-> 204

GET /v1/home?profileId=
-> 200 { rows: [{ railType, items: [...] }] }

Errors: 401 invalid or expired token, 403 geo-restriction or device limit exceeded, 404 title unavailable in region.

DELETE /v1/profiles/{profileId}/continue/{titleId}
-> 204

GET /v1/catalog/search?q=&cursor=
-> 200 { titles: [...], nextCursor }
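
A minimal client-side sketch of the playback flow these endpoints imply, assuming the `requests` library and a hypothetical host; the field names come from the sketches above, everything else is illustrative:

import requests  # third-party HTTP client, assumed available

BASE = "https://api.example-streaming.test"    # hypothetical host
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder credential

def start_playback(title_id: str, profile_id: str, device_id: str) -> dict:
    """POST /v1/playback/sessions -> { sessionId, licenseUrl, manifestUrl, ... }."""
    resp = requests.post(
        f"{BASE}/v1/playback/sessions",
        json={"titleId": title_id, "profileId": profile_id,
              "deviceId": device_id, "drmScheme": "widevine"},
        headers=HEADERS, timeout=5,
    )
    resp.raise_for_status()
    return resp.json()

def report_progress(session_id: str, position_ms: int, duration_ms: int) -> None:
    """POST .../progress -> 204. Heartbeat-style; safe to retry."""
    requests.post(
        f"{BASE}/v1/playback/sessions/{session_id}/progress",
        json={"positionMs": position_ms, "durationMs": duration_ms},
        headers=HEADERS, timeout=5,
    ).raise_for_status()

# Typical flow: create the session, fetch session["manifestUrl"], acquire a
# DRM license from session["licenseUrl"], then heartbeat progress:
# session = start_playback("tt-123", "profile-1", "device-abc")
# report_progress(session["sessionId"], 30_000, 5_400_000)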

4. Data Model

  • Title: titleId, type (show/movie), regional availability map, rating metadata.
  • Episode: episodeId, seasonId, number, durationMs.
  • PlaybackAsset: packaged manifests per DRM scheme, audio language, and subtitle track.
  • Profile: profileId, maturity, continueWatching pointer (cached).

Catalog: document store or SQL with heavy caching; a graph store feeds offline feature generation for recommendations. Viewing progress: Cassandra/DynamoDB keyed by (profileId, titleId) for write-heavy upserts (see the sketch below). Entitlements live in the auth service; tie them to strong consistency for subscription state.
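
A sketch of the write-heavy progress upsert under last-write-wins semantics, using an in-memory dict as a stand-in for the Cassandra/Dynamo row; the function and key names are assumptions:

import time

# In-memory stand-in for the wide-column row: partition key profileId,
# clustering key titleId, value (positionMs, write timestamp).
progress: dict[tuple[str, str], tuple[int, float]] = {}

def upsert_progress(profile_id: str, title_id: str, position_ms: int,
                    ts: float | None = None) -> None:
    """Last-write-wins upsert: a late-arriving older write never
    regresses a newer position."""
    ts = time.time() if ts is None else ts
    key = (profile_id, title_id)
    current = progress.get(key)
    if current is None or ts >= current[1]:
        progress[key] = (position_ms, ts)

# Out-of-order delivery from two replicas: the older write loses.
upsert_progress("p1", "t1", 120_000, ts=100.0)
upsert_progress("p1", "t1", 90_000, ts=99.0)   # stale, ignored
assert progress[("p1", "t1")][0] == 120_000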

5. High-Level Architecture

Open Connect (or a commercial CDN) serves the bytes. The playback orchestrator returns signed manifest URLs and Widevine/FairPlay license endpoints; a signed-URL sketch follows below. Recommendations blend offline-trained models with online exploration. The CDN is the centerpiece: it carries nearly all playback traffic while the control plane only orchestrates.
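
A sketch of the kind of signed manifest URL the orchestrator might return; the HMAC scheme, parameter names, and host are assumptions, not any specific CDN's token format:

import hashlib, hmac, time
from urllib.parse import urlencode

SIGNING_KEY = b"shared-secret-with-edge"   # assumed rotated shared key

def sign_manifest_url(path: str, session_id: str, ttl_s: int = 300) -> str:
    """Bind a manifest URL to a session with an expiry; the edge recomputes
    the HMAC with the same key and rejects tampered or expired URLs."""
    expires = int(time.time()) + ttl_s
    payload = f"{path}|{session_id}|{expires}".encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    query = urlencode({"sid": session_id, "exp": expires, "sig": sig})
    return f"https://edge.example.test{path}?{query}"

print(sign_manifest_url("/t/tt-123/main.m3u8", "sess-42"))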

6. Component Deep-Dives

  • Encoding ladder: per-title optimization (complexity-aware); per-shot encodes as an optional premium-tier refinement; VP9/AV1 rollout tradeoffs vs. device support.
  • DRM: license requests are short-lived; the token binds device + session; revocation lists handle leaked keys.
  • Personalization: candidate generation (similarity, trending) plus a ranking model; A/B infrastructure for the ranking layers.
  • Continue watching: debounced writes (see the sketch after this list); conflicts resolve last-timestamp-wins per profile.
  • Failure modes: CDN miss storm → stale-while-revalidate manifests; license service down → grace-period policy (a business decision).
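
A sketch of the debounced continue-watching writes from the list above; the flush interval and class shape are assumptions:

import time

FLUSH_INTERVAL_S = 10.0   # assumption: one durable write per ~10 s of playback

class DebouncedProgressWriter:
    """Coalesce per-second player ticks into periodic upserts so the
    progress store sees one write per interval instead of one per tick."""

    def __init__(self, flush):
        self._flush = flush        # e.g. the upsert_progress sketch above
        self._pending = None
        self._last_flush = 0.0

    def on_tick(self, profile_id: str, title_id: str, position_ms: int) -> None:
        self._pending = (profile_id, title_id, position_ms)
        if time.time() - self._last_flush >= FLUSH_INTERVAL_S:
            self.flush()

    def flush(self) -> None:       # also call on pause, stop, or app background
        if self._pending is not None:
            self._flush(*self._pending)
            self._pending = None
            self._last_flush = time.time()

writer = DebouncedProgressWriter(lambda *row: print("persist", row))
writer.on_tick("p1", "t1", 5_000)  # first tick flushes immediately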

7. Bottlenecks & Mitigations

  • Title launch spikes: pre-warm the CDN; cache carousels at the API layer; circuit-break non-critical rails.

  • Recommendation fan-out: batch feature fetches; approximate nearest-neighbor graphs with LSH-class methods at scale.

  • DRM load: regional license clusters; rate-limit abnormally behaving devices.

  • Metadata dogpiling: ETags and delta feeds to apps; a conditional-GET sketch follows this list.

  • Kids profiles: the maturity filter propagates to search, artwork, and row composition; per-cohort feature flags avoid recomputing the entire home page for unrelated users.
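
A sketch of the ETag discipline from the metadata bullet above, assuming the `requests` library and a hypothetical catalog endpoint:

import requests  # third-party HTTP client, assumed available

# Hypothetical catalog endpoint; names are illustrative.
CATALOG_URL = "https://api.example-streaming.test/v1/catalog/titles/tt-123"

def fetch_title(cached_etag: str | None, cached_body: dict | None):
    """Conditional GET: a 304 costs headers only, so clients refreshing
    after a launch do not dogpile the catalog origin."""
    headers = {"If-None-Match": cached_etag} if cached_etag else {}
    resp = requests.get(CATALOG_URL, headers=headers, timeout=5)
    if resp.status_code == 304:        # unchanged: reuse the cached copy
        return cached_etag, cached_body
    resp.raise_for_status()
    return resp.headers.get("ETag"), resp.json()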

8. Tradeoffs

Decision               | Alternative          | Why we picked it
Edge-heavy streaming   | Central origin only  | Latency and egress cost
Eventual progress sync | Lock per play head   | UX vs. write volume
Personalized home      | Static catalog       | Engagement
Multi-codec ladder     | Single MP4           | Device and bandwidth diversity

9. Follow-ups (interviewer drill-downs)

  • 100× Friday spike? Scale the stateless API tier horizontally; shed thumbnail loads; freeze experiments.

  • Exactly-once billing for watch minutes? Idempotent session heartbeats (idempotency); see the dedupe sketch after this list.

  • Catalog migration? Versioned APIs; dual-read new schema per microservice.

  • Multi-region active-active? Keep playback local; replicate the catalog with eventual consistency and label possibly stale edits in the UI.

  • Cost? Codec efficiency; chunk size tuning; tiered storage for cold encodes; download window limits.

  • A/B at scale? Feature gates in client and server; assign cohorts at the edge with sticky sessions; metric pipelines must not double-count streams on app restarts; use session dedupe keys.

  • Studio ingest? Masters arrive on Aspera-class paths; separate SLO from consumer upload; QC lint before transcode to fail fast on corrupt essence.

  • Regional catalog gaps? Licensing differs per country; edge routing must check entitlements before starting playback to avoid accidental geo leaks when anycast DNS misroutes.

  • Partner CDN interconnect? When mixing first-party Open Connect with commercial CDNs, align token TTLs and key rotation so failover does not orphan sessions mid-movie.

  • Accessibility? Audio description and SDH tracks multiply storage and transcode jobs—budget parallel pipelines so accessibility assets are not second-class citizens in queue priority.
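
For the exactly-once billing question above, a minimal dedupe sketch assuming each heartbeat carries a client-generated idempotency key:

seen_keys: set[str] = set()       # in production: a store with TTL, not a set
billed_ms: dict[str, int] = {}    # sessionId -> billed watch time

def record_heartbeat(session_id: str, idempotency_key: str, delta_ms: int) -> None:
    """A retried or duplicated heartbeat carries the same key and becomes a
    no-op, so watch minutes are counted at most once per interval."""
    if idempotency_key in seen_keys:
        return
    seen_keys.add(idempotency_key)
    billed_ms[session_id] = billed_ms.get(session_id, 0) + delta_ms

record_heartbeat("sess-42", "sess-42:tick-7", 10_000)
record_heartbeat("sess-42", "sess-42:tick-7", 10_000)  # client retry, ignored
assert billed_ms["sess-42"] == 10_000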
