Key Architectures (Popular Systems)

Analyzing real-world systems provides proven architectural blueprints and strategies for dealing with extreme scale, hot spots, and geographic distribution.

🐦 1. Twitter / X: Timeline Generation

Twitter's core challenge is handling the massive write load of incoming tweets combined with the read load of generating custom timelines for active users.

The Challenge: Fan-Out Bottlenecks

Write Volume: ~5,000 tweets per second average (peaks at 12,000+).
Read Volume: ~300,000 timeline lookups per second.
Scale Inequity: A typical user has about 100 followers; a celebrity has 100 million. A naive "write to every follower's feed" (push) approach turns a single celebrity tweet into 100 million feed writes. A naive "query every account I follow on read" (pull) approach is too slow at 300k timeline reads per second.
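To make the imbalance concrete, here is a back-of-the-envelope calculation using only the figures quoted above. It assumes, purely for illustration, that every author in the steady state is a "normal" user and that one follower equals one feed write.

```python
# Figures quoted in the text; everything derived from them is a rough estimate.
AVG_TWEETS_PER_SEC = 5_000
NORMAL_FOLLOWERS = 100
CELEBRITY_FOLLOWERS = 100_000_000


def push_writes_per_tweet(followers: int) -> int:
    """Feed writes caused by one tweet under pure push fan-out."""
    return followers


# Steady-state feed writes per second if every author were a normal user.
baseline_writes_per_sec = AVG_TWEETS_PER_SEC * push_writes_per_tweet(NORMAL_FOLLOWERS)

# One celebrity tweet, expressed in seconds of that baseline write load.
celebrity_burst_seconds = (
    push_writes_per_tweet(CELEBRITY_FOLLOWERS) / baseline_writes_per_sec
)

print(baseline_writes_per_sec)   # 500000
print(celebrity_burst_seconds)   # 200.0
```

Under these assumptions, a single celebrity tweet costs as many feed writes as 200 seconds of ordinary traffic, which is why pure push does not hold up.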

The Solution: Hybrid Fan-Out Architecture

Incoming Tweet ──► User Follower Check
                         │
                         ├─► Normal User Follower ──► PUSH (Write to Redis Memory Feeds)
                         │
                         └─► Celebrity (>10k followers) ──► PUSH SKIPPED (Write to Celebrity Tweet DB)
                                                                     ▲
                                                                     │ (Blended on read)
User Reads Home Timeline ──► Fetch cached feed from Redis ───────────┘

Push Model (Write Fan-out) - For Normal Users:
- When a normal user tweets, the system queries their follower list.
- The tweet ID is injected directly into the active memory cache (Redis) of each follower's home timeline.
- Result: A home timeline lookup is a single Redis read, $O(1)$ in the number of accounts the user follows.
Pull Model (Read Fan-out) - For Celebrities:
- When a celebrity (an account above the follower threshold, >10k in the diagram above) tweets, the write fan-out is skipped.
- The celebrity's tweet is written only to a relational/document store.
Timeline Blending (Hybrid Read):
- When a user requests their home timeline, the system fetches their cached push timeline from Redis.
- In parallel, it fetches recent tweets from each celebrity the user follows and merges them with the cached entries in chronological order.
- Benefit: Avoids write amplification on celebrity tweets while keeping read latency low.
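The push, pull, and blend steps above can be sketched in a few lines. This is a minimal in-memory illustration, not Twitter's implementation: plain dictionaries stand in for Redis and the celebrity tweet store, the class name `HybridTimeline` and the `FEED_LENGTH` cap are hypothetical, and tweets are assumed to arrive in timestamp order.

```python
import heapq
from collections import defaultdict, deque
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # follower cutoff taken from the diagram
FEED_LENGTH = 800             # assumed cap on cached entries per user


class HybridTimeline:
    def __init__(self, threshold=CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)  # author -> follower ids
        self.following = defaultdict(set)  # user -> author ids
        # Stand-in for the per-user Redis feeds (newest entry first).
        self.feeds = defaultdict(lambda: deque(maxlen=FEED_LENGTH))
        # Stand-in for the celebrity tweet store (oldest entry first).
        self.celebrity_tweets = defaultdict(list)

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.threshold

    def post(self, author, tweet_id, ts):
        entry = (ts, tweet_id)
        if self.is_celebrity(author):
            # Pull model: skip fan-out, store the tweet once.
            self.celebrity_tweets[author].append(entry)
        else:
            # Push model: write the entry into every follower's cached feed.
            for follower in self.followers[author]:
                self.feeds[follower].appendleft(entry)

    def home_timeline(self, user, limit=20):
        # Blend the cached feed with tweets pulled from followed celebrities.
        sources = [self.feeds[user]]
        for author in self.following[user]:
            if self.is_celebrity(author):
                sources.append(reversed(self.celebrity_tweets[author]))
        merged = heapq.merge(*sources, reverse=True)  # newest first
        return [tweet_id for _, tweet_id in islice(merged, limit)]


tl = HybridTimeline(threshold=2)
for fan in ("a", "b", "c"):
    tl.follow(fan, "star")   # 3 followers > 2, so "star" counts as a celebrity
tl.follow("a", "bob")        # "bob" has 1 follower, so he is a normal user
tl.post("bob", "t1", ts=1)
tl.post("star", "t2", ts=2)
tl.post("bob", "t3", ts=3)
print(tl.home_timeline("a"))  # ['t3', 't2', 't1']
```

The celebrity's tweet never touches any follower's cached feed; it only appears when `home_timeline` merges it in at read time.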

🎬 2. Netflix: Video Streaming

Netflix operates a split architecture: the control plane (running on AWS) handles everything before you press "Play", and the data plane (Open Connect) delivers the video streams.

[ AWS Control Plane ] ──► Sign up, recommendations, metadata
                                    │
                                    ▼
[ Open Connect CDN Appliance ] ──► Placed inside local ISPs ──► Direct stream to Client

1. Control Plane (Microservices on AWS)

Handles user registration, billing, searching, machine-learning-driven recommendations, and movie metadata.
Utilizes Cassandra for high-availability user metadata storage, and DynamoDB for transactional metadata.

2. Data Plane (Open Connect CDN)

Netflix bypasses third-party CDNs (like Akamai or Cloudflare) for video streaming and uses its own CDN, Open Connect.
Hardware Appliances: Netflix builds custom storage servers (Open Connect Appliances) and places them directly inside local Internet Service Provider (ISP) facilities at no charge to the ISP.
Off-Peak Sync: During off-peak hours (middle of the night), these local appliances download popular movies directly from Netflix's AWS storage.
Low-Latency Delivery: When a user streams a movie, it is served directly from the appliance inside their ISP, which avoids long-distance backbone bandwidth costs and reduces startup delay and buffering.
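The off-peak sync and local delivery described above can be modeled roughly as follows. Everything here is a hypothetical sketch: the `Appliance` class, the 02:00 to 06:00 fill window, the title-count capacity, and the "upstream" fallback for titles that are not cached locally are assumptions for illustration, not details of Open Connect.

```python
from dataclasses import dataclass, field


@dataclass
class Appliance:
    """Hypothetical model of one cache server placed inside an ISP."""
    isp: str
    capacity: int                          # number of titles it can hold (simplified)
    titles: set = field(default_factory=set)

    def off_peak_fill(self, popularity: dict, hour: int, window=range(2, 6)) -> None:
        """Inside the assumed off-peak window, cache the most popular titles."""
        if hour not in window:
            return
        ranked = sorted(popularity, key=popularity.get, reverse=True)
        self.titles = set(ranked[: self.capacity])


def pick_source(title: str, client_isp: str, appliances: list) -> str:
    """Prefer an appliance in the client's own ISP; otherwise fall back upstream."""
    for appliance in appliances:
        if appliance.isp == client_isp and title in appliance.titles:
            return f"appliance:{appliance.isp}"
    return "upstream"


oca = Appliance(isp="isp-a", capacity=2)
oca.off_peak_fill({"drama": 90, "docu": 10, "comedy": 70}, hour=3)
print(pick_source("drama", "isp-a", [oca]))  # appliance:isp-a
print(pick_source("docu", "isp-a", [oca]))   # upstream
```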

3. Video Transcoding Pipeline

A single movie can take several terabytes of raw storage.
Netflix breaks down raw files into small chunks, parallelizes processing across AWS nodes, and encodes each chunk into thousands of output formats (resolutions, devices, audio tracks, and codecs).
Adaptive bitrate streaming then picks among the encoded bitrates during playback, based on the client's current network conditions.
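As an illustration of bitrate selection on the client side, the sketch below picks the highest rendition that fits the measured throughput. The ladder values, the 0.8 safety factor, and the function name `pick_rendition` are assumptions made for this example, not Netflix's actual encoding ladder or player logic.

```python
# Hypothetical encoding ladder: (label, bitrate in kbit/s), sorted ascending.
LADDER = [
    ("240p", 300),
    ("480p", 1_000),
    ("720p", 3_000),
    ("1080p", 5_800),
    ("2160p", 16_000),
]


def pick_rendition(throughput_kbps: float, ladder=LADDER, safety=0.8):
    """Return the highest rendition whose bitrate fits within a safety margin
    of the measured throughput; fall back to the lowest rendition otherwise."""
    budget = throughput_kbps * safety
    best = ladder[0]
    for rendition in ladder:
        if rendition[1] <= budget:
            best = rendition
    return best


print(pick_rendition(4_000))  # ('720p', 3000)
print(pick_rendition(200))    # ('240p', 300)
```

Because each chunk is encoded at every bitrate, a player can re-run this choice at chunk boundaries and switch quality mid-stream.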

🚗 3. Uber: Real-time Dispatch

Uber's core challenge is matching riders to nearby drivers in real-time, requiring rapid processing of location coordinates.

1. Geospatial Indexing (H3 & S2)

Traditional databases are not optimized for fast proximity searches (e.g., "find all drivers within 2 miles of latitude X, longitude Y").
Uber uses H3 (a hexagonal spatial index system). H3 divides the entire Earth's surface into a grid of nested hexagons.
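Hexagonal Advantage: The distance from a hexagon's center to the center of each of its six neighbors is the same (approximately so on a sphere), whereas a square grid has two different neighbor distances (edge and diagonal). This makes radial searches simple and fast.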

     /\     /\
    |  |___|  |
    |  |   |  |
   / \/     \/ \
  |   |  ●  |   |  <-- Rider at center hexagon (Search neighbors)
  |   |_____|   |
   \ /       \ /
    |  |___|  |
    |  |   |  |
     \/     \/
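The equal-distance property can be checked numerically on a flat plane. The sketch below uses axial coordinates for a pointy-top hexagonal grid, a common textbook layout, and does not call the H3 library; on H3's spherical grid the property holds only approximately.

```python
import math

# Axial-coordinate offsets of the six neighbors of any hexagon.
HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_center(q, r, size=1.0):
    """Planar center of a pointy-top hexagon at axial coordinates (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def neighbor_distances(q, r):
    """Center-to-center distance from hexagon (q, r) to each of its neighbors."""
    center = hex_center(q, r)
    return [math.dist(center, hex_center(q + dq, r + dr)) for dq, dr in HEX_DIRS]


hex_d = neighbor_distances(0, 0)

# For comparison: the eight neighbors of a cell in a unit square grid.
square_d = [
    math.hypot(dx, dy)
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
    if (dx, dy) != (0, 0)
]

print(len({round(d, 6) for d in hex_d}))     # 1 distinct distance
print(len({round(d, 6) for d in square_d}))  # 2 distinct distances
```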

2. Dispatch Architecture (DISCO)

DISCO (Dispatch Core): Written in Go, it manages the locations of all active drivers and handles rider booking requests.
State Tracker: Drivers report their GPS locations every 4 seconds. The location update goes through an API Gateway to a memory-cached State Store (e.g., Ringpop/Redis).
Hexagon Lookup: When a rider requests a pickup, DISCO:
1. Resolves the rider's GPS coordinate to a specific H3 hexagon.
2. Fetches all active drivers located in that hexagon and its six adjacent hexagons.
3. Filters out drivers who are offline or busy.
4. Runs a matching algorithm to dispatch the ride.
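The four lookup steps above can be sketched as follows. This is an illustrative toy, not DISCO: a flat axial hexagon grid (`to_cell`) stands in for H3, a dictionary stands in for the state store, and "nearest available driver" is an assumed placeholder for the real matching algorithm.

```python
import math
from collections import defaultdict

SIZE = 1.0  # hexagon size in arbitrary planar units (assumption)
HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def to_cell(x, y, size=SIZE):
    """Planar stand-in for an H3 lookup: point -> axial hex cell (q, r)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Cube rounding: recompute the coordinate with the largest rounding error.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)


class Dispatcher:
    def __init__(self):
        self.cells = defaultdict(set)  # cell -> driver ids currently inside it
        self.drivers = {}              # driver id -> (x, y, status)

    def report(self, driver_id, x, y, status="available"):
        """Handle a periodic location update from a driver."""
        old = self.drivers.get(driver_id)
        if old:
            self.cells[to_cell(old[0], old[1])].discard(driver_id)
        self.drivers[driver_id] = (x, y, status)
        self.cells[to_cell(x, y)].add(driver_id)

    def match(self, x, y):
        q, r = to_cell(x, y)                                          # step 1
        ring = [(q, r)] + [(q + dq, r + dr) for dq, dr in HEX_DIRS]
        nearby = [d for c in ring for d in self.cells.get(c, ())]     # step 2
        free = [d for d in nearby if self.drivers[d][2] == "available"]  # step 3
        if not free:
            return None
        # Step 4 (placeholder matching rule): nearest available driver.
        return min(free, key=lambda d: math.dist((x, y), self.drivers[d][:2]))


disco = Dispatcher()
disco.report("d1", 0.2, 0.1)
disco.report("d2", 1.5, 0.0, status="busy")
disco.report("d3", 1.6, 0.3)
disco.report("d4", 9.0, 9.0)    # far outside the search ring
print(disco.match(0.0, 0.0))    # d1
```

Only seven cells are inspected per request, so the cost of a lookup depends on local driver density rather than on the total number of drivers.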

3. Dynamic Surge Pricing

Surge pricing calculations are done using real-time geospatial demand tracking.
If booking requests (demand) within an H3 hexagon exceed the number of available drivers (supply), a surge multiplier is calculated, stored in memory, and applied to all requests originating from that hexagon.
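A per-hexagon surge calculation might look like the sketch below. The pricing rule (demand divided by supply, capped at 3.0) and the names `surge_multiplier` and `update_surge` are hypothetical; the text does not specify the actual formula.

```python
def surge_multiplier(requests: int, drivers: int, cap: float = 3.0) -> float:
    """Hypothetical rule: scale price with the demand/supply ratio.
    Never drops below 1.0 and never exceeds the cap."""
    if requests <= drivers:
        return 1.0
    if drivers == 0:
        return cap
    return min(cap, round(requests / drivers, 2))


# In-memory map: hexagon id -> current multiplier.
surge_by_cell = {}


def update_surge(cell: str, requests: int, drivers: int) -> float:
    """Recompute and store the multiplier for one hexagon."""
    surge_by_cell[cell] = surge_multiplier(requests, drivers)
    return surge_by_cell[cell]


print(update_surge("cell-a", requests=30, drivers=20))  # 1.5
print(update_surge("cell-b", requests=5, drivers=10))   # 1.0
print(update_surge("cell-c", requests=4, drivers=0))    # 3.0
```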
