THN Interview Prep

Design Twitter (Microblogging)

1. Requirements

Functional

  • Users register a profile and post short messages (“tweets”) with optional media attachments stored externally.
  • Follow other users; build a personalized home timeline mixing tweets from followed accounts with ranking signals (the interview may scope this to a chronological MVP).
  • Like, retweet, and reply; replies form conversation threads.
  • Search tweets (often delegated to a separate Lucene/Elasticsearch cluster and scoped out).
  • Notifications for mentions and new followers (ties into the notification system design).

Non-Functional

  • Scale: a 500M+ user class problem; thousands of tweets per second globally; hundreds of thousands of timeline reads per second in aggregate.
  • Latency: a post is visible to its author immediately; fan-out to followers completes within a few seconds for normal users; strict read-your-writes for the author.
  • Availability: 99.95% for reads; brief inconsistency is tolerable for rare edge cases in a distributed system.
  • Consistency: hybrid timeline model — eventual consistency for fan-out, with synchronous paths where the UX demands it (e.g. the author seeing their own tweet).

Out of Scope

  • Full ads marketplace and billing.
  • Spaces/audio rooms engineering depth.
  • Full-text indexing internals beyond inverted index mention.
  • Federation / ActivityPub.

2. Back-of-Envelope Estimations

Assume 250M DAU and 150M tweets/day — a historical ballpark, used here for teaching.

  • Write: 150M / 86,400 ≈ 1,736/s average; peaks during sports/news events can reach ~20k/s.

  • Read: if each user refreshes the home timeline 30×/day → 7.5B timeline requests/day ≈ 86,800/s worldwide; many are served from cache, so the origin tier sees a fraction.

  • Storage: tweet metadata ~200 B plus text up to 280 chars of UTF-8, ~1 KB per row on average → 150 GB/day raw, i.e. ~55 TB/year before replication and indexes.

  • Fan-out writes (push model): the median follower count is low (hundreds at most for the mass of users); celebrity outliers dominate, so the arithmetic mean is misleading — plan a hybrid push/pull (overlaps with the news feed design).

  • Media: assume object storage such as S3; not counted in tweet row size beyond the URL.
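The estimates above are straightforward arithmetic; a quick sketch to sanity-check them (all inputs are the assumed figures from this section, not measured data):

```python
SECONDS_PER_DAY = 86_400

# Write path: 150M tweets/day average.
tweets_per_day = 150_000_000
avg_write_qps = tweets_per_day / SECONDS_PER_DAY   # ≈ 1,736/s

# Read path: 250M DAU × 30 home-timeline refreshes/day.
dau = 250_000_000
reads_per_day = dau * 30                           # 7.5B/day
avg_read_qps = reads_per_day / SECONDS_PER_DAY     # ≈ 86,800/s

# Storage: ~1 KB per tweet row on average.
bytes_per_day = tweets_per_day * 1_000             # 150 GB/day raw
```

Note the ~50× read/write ratio — the reason the design optimizes the read path (materialized timelines, caching) rather than the write path.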


3. API Design

POST /v1/tweets
Authorization: Bearer <token>
Body: { "text": "hello world", "replyToTweetId": null, "mediaIds": ["m1"] }
-> 201 { "tweetId": "t_abc", "createdAt": "..." }

GET /v1/timeline/home?cursor=...
-> 200 { "tweets": [ { "tweetId": "...", "author": {...} } ], "nextCursor": "..." }

POST /v1/users/{id}/follow
-> 204

GET /v1/users/{userId}/tweets?cursor=...
-> 200 { "tweets": [...] }

POST /v1/tweets/{tweetId}/retweet
-> 201 { "retweetId": "t_rt" }

Internal gRPC: FanOutTweet, MergeTimelines, FetchTweetEntities.


4. Data Model

Tweet

  • tweet_id (Snowflake/ULID), author_id, text, created_at, reply_to, retweet_of, deleted_at nullable.

User

  • Profile fields, followers_count denormalized updated async.

Timeline (materialized)

  • Keyed by viewer_id → ordered list of tweet_id; stored in Redis sorted sets or a Cassandra wide partition.
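A minimal in-memory sketch of the materialized-timeline semantics (in production this would be a Redis sorted set via ZADD/ZREMRANGEBYRANK, or a Cassandra wide partition; the dict of sorted lists here is a stand-in, and TIMELINE_CAP is an assumed tuning value):

```python
from collections import defaultdict

TIMELINE_CAP = 800  # assumed: keep only the newest N tweet ids per viewer

# viewer_id -> descending list of tweet ids (stand-in for a sorted set).
timelines: dict[int, list[int]] = defaultdict(list)

def push_to_timeline(viewer_id: int, tweet_id: int) -> None:
    tl = timelines[viewer_id]
    tl.append(tweet_id)
    tl.sort(reverse=True)   # Snowflake-style ids sort by creation time
    del tl[TIMELINE_CAP:]   # trim so one viewer's partition cannot grow unbounded

def read_timeline(viewer_id: int, limit: int = 50) -> list[int]:
    return timelines[viewer_id][:limit]
```

The cap implements the "trim older" mitigation from the bottlenecks section: bounded storage per viewer, with cold history served from the author's profile timeline instead.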

Graph

  • follows edges (follower, followee, ts) in sharded PostgreSQL, or JanusGraph at extreme scale — interviews often stop at sharded SQL.

Why Cassandra for timelines

  • Write-heavy, append-only within a partition; TTL-based trimming of old entries — DynamoDB offers a similar trade-off (parallels the news feed section).

Indexes

  • (author_id, created_at DESC) for profile timeline.
  • Search offloaded to Elasticsearch with tweet_id as document id.

Sample tweet row

tweet_id | author_id | text | created_at
189272… | u42 | hello | 2026-04-29T12:00:00Z

5. High-Level Architecture

(Diagram placeholder.) Post path: client → API → tweet store → Kafka NewTweet events → fan-out workers → Cassandra timelines. Read path: merge the materialized timeline with celebrity pull lists behind a Redis cache overlay; search is indexed asynchronously via Kafka → Elasticsearch.

6. Component Deep-Dives

ID generation

  • Snowflake-style 64-bit IDs from a dedicated cluster vs UUIDs — Snowflake IDs are time-sortable and shorter in URLs; they need worker-id coordination (see ID generation).
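A sketch of the Snowflake-style composition, using Twitter's published layout (41 timestamp bits since a custom epoch, 10 worker-id bits, 12 sequence bits); the class name and wrap-handling are illustrative, not a production generator:

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657  # Twitter's published custom epoch (ms)

class SnowflakeGenerator:
    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024  # 10 bits
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                # Same millisecond: bump the 12-bit sequence (a real generator
                # must wait for the next ms on overflow, elided here).
                self.sequence = (self.sequence + 1) & 0xFFF
            else:
                self.sequence = 0
                self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence
```

Because the timestamp occupies the high bits, sorting ids numerically sorts tweets by creation time — which is what lets the timeline merge below sort by id alone.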

Fan-out worker

  • Consumes NewTweet events; if follower_count < threshold, fetch follower ids in batches from the graph shard and write timeline rows asynchronously.
  • Above the threshold: mark the tweet as a celebrity tweet; skip the push, or push only to currently active followers.
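The hybrid decision can be sketched as follows — the `graph`, `timeline_store`, and `celebrity_set` interfaces and the threshold value are all hypothetical placeholders for the real services:

```python
CELEBRITY_THRESHOLD = 10_000  # assumed cutoff; tuned per infrastructure

def handle_new_tweet(event, graph, timeline_store, celebrity_set):
    """Consume one NewTweet event and fan out using the hybrid push/pull rule."""
    author = event["author_id"]
    if graph.follower_count(author) >= CELEBRITY_THRESHOLD:
        # Pull path: readers merge this author's recent tweets at read time,
        # avoiding millions of timeline writes for one tweet.
        celebrity_set.add(author)
        return
    # Push path: batch follower ids from the graph shard, append timeline rows.
    for batch in graph.follower_batches(author, batch_size=1000):
        for follower_id in batch:
            timeline_store.append(follower_id, event["tweet_id"])
```

Batching the follower-id fetch keeps the graph shard round-trips bounded; the timeline appends themselves are fire-and-forget from the poster's perspective.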

Timeline read

  • Merge home-timeline ids from Cassandra with the separately queried celebrity pull list — union, then sort by the timestamp portion of the tweet_id (free with Snowflake IDs).
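A read-path merge sketch, assuming each input list is already sorted descending by Snowflake id (so id order is time order); `recent_by_author` stands in for the per-celebrity recent-tweet cache:

```python
import heapq

def read_home_timeline(materialized_ids, celebrity_authors, recent_by_author, limit=50):
    """Merge the precomputed timeline with per-celebrity pull lists.

    All inputs are lists of Snowflake ids sorted descending; the id's
    timestamp bits give a global time order with no extra metadata.
    """
    lists = [materialized_ids] + [recent_by_author[a] for a in celebrity_authors]
    # heapq.merge expects ascending inputs; negate ids to merge descending.
    merged = heapq.merge(*([-i for i in lst] for lst in lists))
    return [-i for _, i in zip(range(limit), merged)]
```

`heapq.merge` is lazy, so only the first `limit` entries of each source are actually consumed — important when a celebrity's recent list is long.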

Retweets

  • Either store as a new tweet with retweet_of set, or as a separate edge table; timeline insertion works like a new tweet, often with a smaller fan-out.

Caching

  • Redis overlay in front of the timeline store for hot viewers and hydrated tweet entities; sorted sets support the read-time merge, and eviction bounds memory (see Tradeoffs).

Search

  • Async indexing pipeline Kafka → Elasticsearch; not on critical post path.

Why Kafka over RabbitMQ for fan-out

  • Throughput and replay; operational complexity acknowledged (message queues).

7. Bottlenecks & Mitigations

Bottleneck | Effect | Mitigation
Celebrity fan-out | Millions of timeline writes per tweet | Hybrid pull; fan-out to active devices via a separate channel
Cassandra partition hot spot | One viewer's timeline grows huge | Cap stored ids; trim older entries; archive cold data
Slow graph fan-out query | Post latency | Precomputed follower buckets; cache follower-list shards
Search index lag | Tweet missing from search | Accept eventual consistency; tweet still shows on profile
Rate abuse | Spam tweets | Distributed rate limiter per user/IP
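The rate-limiter mitigation is commonly a token bucket; a single-process sketch (production versions keep the state in Redis behind an atomic Lua script so all API nodes share it — the rate and capacity numbers here are illustrative):

```python
import time

class TokenBucket:
    """Per-user token bucket: capacity bounds bursts, rate bounds sustained load."""

    def __init__(self, rate_per_s: float, capacity: float):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per (user, action) pair — posting, following, and DMs typically get different rates.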

8. Tradeoffs

Decision | Alternative | Why we picked it
Cassandra timelines | Pure Postgres | Horizontal write scale under follower fan-out
Kafka fan-out pipeline | Synchronous SQL triggers | Decouples peak spikes from the post path
Snowflake IDs | ULID | Millisecond precision + embedded worker id is the common pattern
Elasticsearch | Postgres FTS | Scale and relevance tuning
Celebrity hybrid | Push only | Bounded write amplification
Redis overlay | Memcached | Eviction policies + sorted structures for read-time merge

9. Follow-ups (interviewer drill-downs)

  • Delete tweet / moderation: Tombstone in tweet store; propagate delete events to timelines via compact topic (expensive) vs lazy filter at read (cheaper CPU per read).
  • Reply threading: Store adjacency list tweet_id -> parent; depth queries bounded.
  • Quote tweets: New tweet + pointer similar to retweet model.
  • Multi-region: with each user pinned to a home region, active-active timeline write conflicts are rare — use CRDTs only if necessary.
  • Trending topics: separate counting cluster (Storm/Flink) — compare with the typeahead popularity pipeline.
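The "lazy filter at read" option for deletes can be sketched as a hydration step — `tweet_store` is any id → row mapping with the `deleted_at` tombstone field from the data model; the function name is illustrative:

```python
def hydrate_timeline(tweet_ids, tweet_store):
    """Drop tombstoned or missing ids while hydrating a timeline page.

    Cheaper than eagerly propagating delete events into every follower's
    materialized timeline; stale ids cost a little CPU per read instead.
    """
    out = []
    for tid in tweet_ids:
        t = tweet_store.get(tid)
        if t is not None and t.get("deleted_at") is None:
            out.append(t)
    return out
```

A page can come back slightly short after filtering, so the API layer typically over-fetches ids (e.g. 60 for a page of 50) before hydrating.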
