THN Interview Prep

Design YouTube (Video Sharing Platform)

1. Requirements

Functional

  • Upload videos (resumable), transcode to multiple renditions (resolution/bitrate).
  • Stream with adaptive bitrate (ABR); thumbnails and previews; captions.
  • Search and recommendations feed; channels, subscriptions, comments (shallow comment threads only).
  • Studio analytics for creators (aggregate counts).
  • Content policy and copyright hooks via fingerprinting pipeline (referenced, not fully designed).

Non-Functional

  • Scale: billions of users; upload and view QPS among largest on the internet; storage exabytes of video.
  • Latency: playback start p99 in low single-digit seconds even on poor networks (optimize delivery of the first segment); API p99 ~100–300 ms when cached.
  • Availability: 99.99% for watch path; upload may retry gracefully.
  • Consistency: eventual for view counts; strong for ownership and monetization metadata where money flows.
  • Durability: no silent loss of uploaded masters; multi-AZ replication.

Out of Scope

  • Full ad auction and Google Ads integration depth.
  • Legal content moderation policy specifics.
  • Live streaming ultra-low-latency full design (mention extension).

2. Back-of-Envelope Estimations

Assume 2B MAU and ~1B hours watched/day (industry-scale ballpark), with ~5 MB/min of effective delivered bytes (varies widely by ABR rendition; treat all numbers as orders of magnitude).

  • Views: ~10B views/day → ~120k views/s average; peaks of 1M+/s during popular events. Metadata is hotter than bytes; most bytes are served from the CDN.

  • Uploads: ~500 hours of video uploaded per minute (widely cited industry figure) → roughly 50–100 new videos/s at a 5–10 min average length; chunk-level upload requests run 10³–10⁴/s aggregate globally, with huge variance by region and time of day.

  • Storage: masters plus the transcoded ladder accumulate to exabytes; thumbnails and metadata are far smaller but non-trivial.

  • Transcoding compute: CPU/GPU fleet sized for ingest spikes; a queue backlog of minutes to hours is acceptable for the long tail.

  • Cache: CDN edge dominates; origin shield in front; popular videos approach 100% edge hit. Origin egress still runs at Tbps internally.

Sanity-check the CDN hit ratio with back-of-envelope math: at this scale, even a few percent of misses means Tbps of origin egress, as the sketch below shows.
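
A quick sanity check, treating every constant as the ballpark assumption above rather than a measured number:

# Back-of-envelope: average delivered bandwidth and origin egress at a given CDN hit ratio.
HOURS_WATCHED_PER_DAY = 1e9     # assumed industry-scale watch time
MB_PER_MIN = 5                  # assumed effective delivered rate across the ABR ladder
CDN_HIT_RATIO = 0.95            # assumed; varies by region and content popularity

delivered_b_per_day = HOURS_WATCHED_PER_DAY * 60 * MB_PER_MIN * 1e6    # ≈ 3e17 bytes ≈ 300 PB/day
delivered_bps = delivered_b_per_day * 8 / 86_400                       # ≈ 28 Tbps average
origin_bps = delivered_bps * (1 - CDN_HIT_RATIO)                       # ≈ 1.4 Tbps reaching origin
print(f"delivered ≈ {delivered_bps / 1e12:.0f} Tbps avg; origin ≈ {origin_bps / 1e12:.1f} Tbps at {CDN_HIT_RATIO:.0%} hit")

Even at a 95% hit ratio, origin egress stays Tbps-scale, which is why the origin shield tier is non-negotiable.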

Creator uploads: ~500 hours/min of video uploaded (order-of-magnitude industry stat) implies sizing a continuous transcode fleet by the codec × resolution matrix, not by concurrent viewers alone; a rough calc follows.
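
A rough fleet-sizing calc; every constant here is an illustrative assumption, not a measured figure:

# Transcode fleet sizing driven by the codec × resolution matrix.
SOURCE_MIN_PER_MIN = 500 * 60   # 500 hours of source video ingested per wall-clock minute
RENDITIONS = 6                  # assumed ladder, e.g. 144p..1080p
CODECS = 2                      # assumed, e.g. H.264 everywhere plus AV1 for popular content
ENCODE_SPEED = 0.5              # assumed: one core produces 0.5 min of output per minute

output_min_per_min = SOURCE_MIN_PER_MIN * RENDITIONS * CODECS   # 360,000 output-min per minute
cores = output_min_per_min / ENCODE_SPEED                       # ≈ 720,000 cores steady-state
print(f"~{cores:,.0f} encode cores steady-state, before spike headroom")

Doubling the ladder or adding a codec doubles this number, which is why per-codec rollout (e.g., AV1 only for popular videos) is a real cost lever.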

Comment moderation: Comment writes are orders of magnitude fewer than views but spiky during controversy; plan moderation queues separately from view-event ingestion so safety tooling cannot starve playback telemetry (pub/sub isolation).

3. API Design

POST /v1/uploads:init
Body: { title, mimeType, sizeBytes }
-> 200 { uploadId, chunkSizeBytes }

PUT /v1/uploads/{uploadId}
Headers: Content-Range
Body: <binary>
-> 308 Resume Incomplete (per chunk; 200 once the final chunk lands)

POST /v1/videos/{videoId}:finalize
Body: { visibility, categoryIds }
-> 201 { videoId, processingJobId }

GET /v1/watch/{videoId}/manifest.mpd
-> 200 application/dash+xml

GET /v1/feed/home?cursor=
-> 200 { items: [...], nextCursor }

Errors: 400 invalid Content-Range, 403 region blocked, 429 upload quota exceeded.

GET /v1/channels/{channelId}/videos?cursor=
-> 200 { items: [...], nextCursor }

POST /v1/videos/{videoId}/captions
Body: { language, format, uploadUrl }
-> 202 { captionJobId }
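
A minimal client-side sketch of the resumable upload flow above (hypothetical host; auth, retries, and resume-after-crash omitted for brevity):

import os
import requests

BASE = "https://api.example.com"   # hypothetical host

def upload_video(path: str, title: str) -> str:
    size = os.path.getsize(path)
    # Step 1: init returns the server-chosen chunk size and an uploadId.
    init = requests.post(f"{BASE}/v1/uploads:init",
                         json={"title": title, "mimeType": "video/mp4", "sizeBytes": size}).json()
    upload_id, chunk_size = init["uploadId"], init["chunkSizeBytes"]

    # Step 2: PUT chunks with Content-Range; 308 means "keep going".
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            data = f.read(chunk_size)
            end = offset + len(data) - 1
            resp = requests.put(f"{BASE}/v1/uploads/{upload_id}", data=data,
                                headers={"Content-Range": f"bytes {offset}-{end}/{size}"})
            if resp.status_code not in (200, 308):
                resp.raise_for_status()
            offset = end + 1   # a robust client re-reads the committed offset from the 308
    return upload_id

The resumability comes from the server reporting its committed byte offset on each 308; after a dropped connection the client asks for that offset instead of trusting its local counter.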

4. Data Model

  • Video: videoId, channelId, status (processing|live|failed), durationMs, visibility.
  • Asset: videoId, rendition (1080p, 720p, ...), blobId, codec, bitrate.
  • Channel: channelId, ownerUserId, subscriberCount (materialized/cached).
  • ViewEvent: append-only for analytics; aggregated counters separate.

Metadata: sharded SQL or Spanner-like for core entities; Cassandra for high-write counters with compaction; object storage for blobs. View aggregates often batch + approximate (consistency tradeoff).

Indexes: Video(channelId, publishedAt DESC); search inverted index on title/tags.
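
The cursor-based feeds in the API map directly onto that index; a keyset-pagination sketch (table and column names are illustrative):

# Keyset pagination over Video(channelId, publishedAt DESC); O(page) at any depth, unlike OFFSET.
# The cursor is the (published_at, video_id) of the last row, opaquely encoded for the client;
# the first page simply omits the cursor predicate.
CHANNEL_PAGE_SQL = """
SELECT video_id, title, published_at
FROM   videos
WHERE  channel_id = %(channel_id)s
  AND  (published_at, video_id) < (%(cursor_published_at)s, %(cursor_video_id)s)
ORDER  BY published_at DESC, video_id DESC
LIMIT  50
"""

The videoId tiebreaker keeps cursors stable when publish timestamps collide, which matters for bulk-imported channels.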

5. High-Level Architecture

Upload lands in a raw bucket; a queue schedules transcoding; outputs go to a packaged bucket (CMAF/DASH/HLS). Playback is almost entirely CDN-served. Recommendations separate training from serving with a feature store. See the CDN and message queue deep-dives.
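
A worker-loop sketch for the transcode queue, showing the retry and dead-letter behavior detailed in the deep-dives below (queue and dead_letters stand in for any broker client such as SQS or Pub/Sub; names are illustrative):

class TransientError(Exception):
    """Retryable failure, e.g. an encoder node dying mid-job."""

MAX_ATTEMPTS = 5

def transcode(video_id: str, rendition: str) -> None:
    # Stand-in for the real encoder invocation; idempotent by (video_id, rendition)
    # so redeliveries and retries are safe to repeat.
    ...

def handle(job: dict, queue, dead_letters) -> None:
    try:
        transcode(job["videoId"], job["rendition"])
        queue.ack(job)
    except TransientError:
        attempt = job.get("attempt", 0) + 1
        if attempt >= MAX_ATTEMPTS:
            dead_letters.publish({**job, "attempt": attempt})   # poison job: alert, keep evidence
        else:
            delay = min(2 ** attempt, 300)                      # capped exponential backoff
            queue.publish({**job, "attempt": attempt}, delay_s=delay)
        queue.ack(job)   # ack either way; the job's state now lives elsewhere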

6. Component Deep-Dives

  • Transcoding pipeline: Normalize codecs; generate ladder; keyframe alignment for ABR switching; parallel segment encode; priority queues for popular creators (policy).
  • Manifest generation: DASH/HLS manifests reference segment URLs with CDN tokens; per-region edge URL signing (see the signing sketch after this list).
  • View counting: Ingest high-volume events to Kafka; aggregate to batch stores; sharded counters with debounce; spam detection upstream.
  • Search/indexing: Asynchronous indexer on video metadata; Content-ID sidecar for rights.
  • Failure: Transcode failure → retry with backoff; poison jobs → dead-letter with alerts; partial ladder degrade gracefully.
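
A minimal token-signing sketch for the edge URLs above; the query-parameter scheme and key handling are illustrative, not any specific CDN's API:

import hashlib
import hmac
import time
from urllib.parse import urlencode

SIGNING_KEY = b"per-region-secret"   # assumed: rotated and distributed to edge PoPs out of band

def sign_segment_url(path: str, ttl_s: int = 300) -> str:
    expires = int(time.time()) + ttl_s
    msg = f"{path}?expires={expires}".encode()
    token = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'token': token})}"

print(sign_segment_url("/videos/abc123/1080p/seg_0001.m4s"))

The edge recomputes the HMAC and rejects expired or tampered URLs, so manifests can point at cacheable segment URLs without an auth round-trip per request.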

7. Bottlenecks & Mitigations

  • Viral video: Origin hot spot → origin shield, request collapsing, multiple CDNs.

  • Upload spikes: Chunked resumable; per-user concurrency caps; regional ingest buckets.

  • Counter inflation: Anomaly detection; cap increment rates per IP/account (see the dedupe sketch after this list).

  • Recommendations cold start: Content-based fallback; explore/exploit balance.

  • Live extension: An LL-HLS path adds a packager emitting sub-second partial segments plus server-side ad cue injection; a separate scale story from VOD, but it shares the CDN and auth edge.
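
A dedupe sketch for idempotent view counting (in production the seen-set would be a TTL'd key in Redis or similar, not an in-process dict; the session-token scheme is an assumption):

import time

class ViewDeduper:
    """First-write-wins dedupe keyed by (videoId, sessionToken)."""

    def __init__(self, ttl_s: int = 86_400):
        self.ttl_s = ttl_s
        self._seen: dict[str, float] = {}

    def should_count(self, video_id: str, session_token: str) -> bool:
        now = time.time()
        key = f"{video_id}:{session_token}"
        if self._seen.get(key, 0) > now:
            return False                        # replayed event: drop the increment
        self._seen[key] = now + self.ttl_s      # remember for the TTL window
        return True

deduper = ViewDeduper()
assert deduper.should_count("abc123", "tok-1")          # first view counts
assert not deduper.should_count("abc123", "tok-1")      # duplicate within TTL is ignored

The same idea answers the "exactly-once view billing" follow-up in section 9: count on first token sighting, then reconcile aggregates in batch.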

8. Tradeoffs

Decision                | Alternative         | Why we picked it
ABR streaming           | Single MP4 file     | Mobile network variance
Async transcoding       | Serve raw upload    | Compatibility and bandwidth
Approximate view counts | Strongly consistent | Cost at planet scale
Third-party + multi-CDN | Single vendor       | Resilience and peering

9. Follow-ups (interviewer drill-downs)

  • 100× view spikes? More edge; isolate recommendation reads; shed non-critical APIs.

  • Exactly-once view billing? Idempotent session tokens; reconcile aggregates (idempotency).

  • Migration? Dual-write manifests v1/v2; player feature flags.

  • Multi-region? Encode region-local; replicate packaged assets; metadata global with caches.

  • Cost? Tiered storage for cold content; codec efficiency (AV1) rollout; limit free transcoding.

  • Copyright? Content-ID fingerprint ingest on upload; match policy may block or monetize; dispute flow is legal- and ops-heavy; the system only routes state transitions and retains evidence blobs.

  • Shorts vertical? Vertical encode ladder and a separate recommendation index; shares the graph with VOD but uses different engagement features in the ranking mixer.
