Design Instagram (Photo Sharing Feed)
1. Requirements
Functional
- Users upload images/videos; server generates multiple renditions (thumbnail, standard, HD) and serves via CDN.
- Follow graph drives a personalized home feed; Stories (ephemeral 24-hour content) is a parallel product surface sharing the same infrastructure.
- Engagement: like, comment, save, share to stories.
- Direct messaging is a separate real-time path (touch on it lightly or reference the WhatsApp design patterns).
- Explore tab recommendations beyond social graph (ML ranking).
Non-Functional
- Scale: 1B+ MAU class; tens of millions media uploads/day; billions of feed impressions/day.
- Latency: fast upload acknowledgement; processing is async; feed first paint p99 under 400 ms with warm caches.
- Availability: 99.99% for reads; uploads tolerate retries with resumable protocol.
- Consistency: eventual fan-out to followers at seconds-level delay is acceptable; like counts are eventually consistent, with the UI tolerating a small +/- delta.
- Durability: no media loss; object storage provides eleven nines of durability; metadata writes are transactional.
Out of Scope
- Full Reels recommendation ML stack depth.
- Payment and shopping checkout flows.
- Training content-moderation CV models; assume hooks into human review tooling only.
- IGTV long-form separate product complexity.
2. Back-of-Envelope Estimations
Uploads: 50M media items/day (a blended photo+video teaching number) ≈ 600/s average; 2x–5x peaks during events → roughly 1,000–3,000/s sustained, with higher bursts.
Feed reads: 500M DAU * ~80 impressions/day ≈ 40B impressions/day, but impressions are not API calls; estimate instead ~10 feed API calls per session across 500M DAU and several sessions/day → roughly 100k–500k RPS at origin globally after CDN offload. Still enormous, and edge caching helps little for personalized feed responses (only for shared fragments and media).
Storage: a processed photo averages ~3 MB and video is much larger; assume an effective 15 MB stored per item including transcoded ladders → 50M * 15 MB ≈ 750 TB/day of new media. Dedupe and compression vary in reality; for the interview, the order of magnitude is hundreds of petabytes of new media per year.
CDN egress: the dominant cost driver. Streaming multi-megabyte video segments adds up to terabits per second aggregate during peaks, which requires tier-1 CDN contracts.
Metadata: 50M posts/day * ~1 KB per row ≈ 50 GB/day, small compared to media.
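A quick arithmetic sanity check of these figures, as a minimal sketch in Python; every input is one of the teaching assumptions above, expressed in decimal units:

```python
# Back-of-envelope sanity check using the teaching numbers from this section.
SECONDS_PER_DAY = 86_400

uploads_per_day = 50_000_000
avg_upload_rps = uploads_per_day / SECONDS_PER_DAY            # ~580/s average
peak_upload_rps = (avg_upload_rps * 2, avg_upload_rps * 5)    # ~1.2k-2.9k/s during events

bytes_per_item = 15_000_000                                   # 15 MB incl. transcoded ladders
new_media_tb_per_day = uploads_per_day * bytes_per_item / 1e12    # 750 TB/day
new_media_pb_per_year = new_media_tb_per_day * 365 / 1_000        # ~274 PB/year

metadata_gb_per_day = uploads_per_day * 1_000 / 1e9           # ~1 KB rows -> 50 GB/day

print(f"uploads: {avg_upload_rps:.0f}/s avg, {peak_upload_rps[0]:.0f}-{peak_upload_rps[1]:.0f}/s peak")
print(f"media: {new_media_tb_per_day:.0f} TB/day, {new_media_pb_per_year:.0f} PB/year")
print(f"metadata: {metadata_gb_per_day:.0f} GB/day")
```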
3. API Design
POST /v1/media/upload-session
-> 201 { "uploadId": "up_123", "uploadUrls": [ { "rendition": "original", "url": "https://s3..." } ] }

POST /v1/posts
Body: { "mediaIds": ["m1"], "caption": "sunset", "locationId": null }
-> 201 { "postId": "p_456" }

GET /v1/feed/home?cursor=...
-> 200 { "items": [ { "postId": "...", "media": [...], "author": {...} } ] }

POST /v1/media/{mediaId}/likes
-> 204

GET /v1/users/{id}/feed
-> 200 { "items": [...] }

Resumable uploads: tus protocol or S3 multipart presigned URLs for resilience on large video uploads.
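A minimal sketch of how the upload-session endpoint could mint S3 multipart presigned URLs, assuming boto3; the bucket name, key layout, and part size are illustrative, not the real API:

```python
import uuid
import boto3  # assumed dependency

s3 = boto3.client("s3")
BUCKET = "media-uploads"          # hypothetical bucket
PART_SIZE = 8 * 1024 * 1024       # client uploads in 8 MB parts

def create_upload_session(user_id: str, content_length: int) -> dict:
    """Start an S3 multipart upload and hand the client presigned per-part URLs."""
    key = f"original/{user_id}/{uuid.uuid4()}"
    mpu = s3.create_multipart_upload(Bucket=BUCKET, Key=key)
    num_parts = -(-content_length // PART_SIZE)  # ceiling division
    part_urls = [
        s3.generate_presigned_url(
            "upload_part",
            Params={"Bucket": BUCKET, "Key": key,
                    "UploadId": mpu["UploadId"], "PartNumber": n},
            ExpiresIn=3600,
        )
        for n in range(1, num_parts + 1)
    ]
    return {"uploadId": mpu["UploadId"], "key": key, "partUrls": part_urls}
```

The client PUTs each part to its URL, collects the returned ETags, and the API tier finishes with complete_multipart_upload; a failed part is retried individually, which is what makes large video uploads resumable.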
4. Data Model
Post
post_id, author_id, caption, created_at, media_ids[], location, visibility.
Media
media_id, owner_id, s3_keys (per rendition), width, height, duration, codec.
Engagement
Likes table or sharded counter column; Redis for hot counters with periodic flush to Cassandra/Scylla.
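A sketch of the hot-counter pattern under stated assumptions: a redis-py client, illustrative key names, and a stubbed-out Cassandra flush:

```python
import redis  # assumed dependency (redis-py)

r = redis.Redis()

def persist_delta_to_cassandra(post_id: str, delta: int) -> None:
    """Stub: in production this is a counter-column UPDATE in Cassandra/Scylla."""
    print(f"flush {delta} likes for {post_id}")

def record_like(post_id: str, user_id: str) -> None:
    """Idempotent like: the set dedupes repeat taps, the counter tracks the hot delta."""
    if r.sadd(f"post:{post_id}:likers", user_id):   # returns 1 only on first like
        r.incr(f"post:{post_id}:like_delta")

def flush_like_deltas(post_ids: list[str]) -> None:
    """Periodic job: move accumulated deltas into the durable store."""
    for post_id in post_ids:
        delta = int(r.getset(f"post:{post_id}:like_delta", 0) or 0)
        if delta:
            persist_delta_to_cassandra(post_id, delta)
```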
Feed timeline
- Same pattern as Twitter: a Cassandra partition per viewer holding ordered post_id entries.
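One possible shape for that table, issued through the Python Cassandra driver; the keyspace, table, and column names are assumptions, not Instagram's actual schema:

```python
from cassandra.cluster import Cluster  # assumed dependency (cassandra-driver)

session = Cluster(["127.0.0.1"]).connect("feeds")  # keyspace name is illustrative

# One partition per viewer; newest posts first within the partition.
session.execute("""
    CREATE TABLE IF NOT EXISTS home_timeline (
        viewer_id  bigint,
        created_at timeuuid,
        post_id    bigint,
        author_id  bigint,
        PRIMARY KEY ((viewer_id), created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
""")

def read_feed_page(viewer_id: int, limit: int = 20):
    """A feed read is a single-partition slice: one node hop, cheap to page with a cursor."""
    return session.execute(
        "SELECT post_id, author_id FROM home_timeline WHERE viewer_id = %s LIMIT %s",
        (viewer_id, limit),
    )
```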
Why not a single Postgres
- Write fan-out and sheer row volume. Relational storage is kept for accounts, billing hooks, and some OLTP; media metadata may live in sharded MySQL (as Instagram did historically) or DynamoDB; acknowledge this evolution in the interview.
Indexes
- (author_id, created_at) for profiles.
- Geo indexes: optional Elasticsearch for location discovery.
Sample media row
| media_id | owner_id | renditions | created_at |
|---|---|---|---|
| m789 | u22 | { thumb: "s3://...", std: "s3://..." } | 2026-04-29 |
5. High-Level Architecture
6. Component Deep-Dives
Upload & processing
- Client uploads to S3 via presigned URL, which removes API servers from the data path. Versus proxying the upload: saves API-tier bandwidth and scales horizontally without the load balancer becoming a bottleneck.
- SQS/Lambda or Kafka plus FFmpeg workers transcode HLS ladders for video; photos get libvips thumbnails (a worker sketch follows this list).
- Magic-byte checks and ClamAV virus scanning run in the async pipeline rather than blocking the first byte, if the product accepts removing the rare bad item after the fact.
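One shape an image worker could take, consuming S3 upload events from SQS; a sketch only, with boto3 and pyvips as assumed dependencies and the queue URL, bucket layout, and rendition widths as placeholders:

```python
import json
import boto3   # assumed dependency
import pyvips  # assumed dependency (libvips bindings)

sqs, s3 = boto3.client("sqs"), boto3.client("s3")
QUEUE_URL = "https://sqs.example/media-ingest"   # placeholder
RENDITIONS = {"thumb": 320, "std": 1080}         # output widths in px

def handle_message(body: str) -> None:
    """Download the original, cut each rendition, and upload it next to the source."""
    for record in json.loads(body).get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.download_file(bucket, key, "/tmp/original")
        for name, width in RENDITIONS.items():
            # libvips thumbnailing streams the image, so memory stays flat on large photos.
            pyvips.Image.thumbnail("/tmp/original", width).write_to_file(f"/tmp/{name}.jpg")
            s3.upload_file(f"/tmp/{name}.jpg", bucket, f"renditions/{name}/{key}.jpg")

def poll_forever() -> None:
    while True:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            handle_message(msg["Body"])
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

Video follows the same loop but shells out to FFmpeg for the HLS ladder, which is why queue depth is the autoscaling signal in the bottlenecks table below.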
Feed generation
- Same fan-out-on-write vs pull hybrid as the Twitter (news feed) design; Stories use partitions with a shorter TTL. A hybrid fan-out sketch follows.
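A dependency-free sketch of the hybrid decision; the threshold and the injected helpers (get_followers, timeline_insert) are illustrative:

```python
CELEBRITY_THRESHOLD = 100_000  # above this, skip fan-out and merge at read time

def fan_out_post(author_id, post_id, created_at, get_followers, timeline_insert) -> str:
    """Hybrid fan-out: push into follower timelines unless the author is too popular."""
    followers = get_followers(author_id)
    if len(followers) > CELEBRITY_THRESHOLD:
        return "pull"   # readers merge this author's recent posts at query time
    for viewer_id in followers:
        timeline_insert(viewer_id, created_at, post_id, author_id)
    return "push"
```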
Ranking
- Explore uses ML features in a Galaxy-style feature store (conceptual): Redis online, Hive offline; TensorFlow Serving for inference. Bespoke versus managed (e.g., Vertex AI) trades on org maturity.
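A sketch of the online scoring path under stated assumptions: redis and requests clients, an illustrative TensorFlow Serving model name, and feature keys and prediction shapes that depend entirely on the real model signature:

```python
import json
import redis     # assumed dependency
import requests  # assumed dependency

r = redis.Redis()
TF_SERVING_URL = "http://ranker:8501/v1/models/explore_ranker:predict"  # illustrative

def score_candidates(viewer_id: int, candidate_post_ids: list[int]) -> list[tuple[int, float]]:
    """Join online features per (viewer, post) pair and rank candidates by model score."""
    viewer_feats = json.loads(r.get(f"feat:user:{viewer_id}") or "{}")
    instances = [
        {**viewer_feats, **json.loads(r.get(f"feat:post:{pid}") or "{}")}
        for pid in candidate_post_ids
    ]
    resp = requests.post(TF_SERVING_URL, json={"instances": instances}, timeout=0.15)
    preds = resp.json()["predictions"]
    # Prediction shape depends on the model's serving signature; assume one score each.
    scores = [float(p[0]) if isinstance(p, (list, tuple)) else float(p) for p in preds]
    return sorted(zip(candidate_post_ids, scores), key=lambda x: x[1], reverse=True)
```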
CDN
- CloudFront/Akamai with signed URLs for private accounts versus long-cache public content from influencers; take care if the cache key includes a per-user auth variant.
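A sketch of minting a short-lived signed URL for private media on CloudFront, assuming botocore plus the cryptography package; the key pair id and key file path are placeholders:

```python
from datetime import datetime, timedelta
from botocore.signers import CloudFrontSigner               # assumed dependency
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

KEY_PAIR_ID = "KXXXXXXXXXX"                                  # placeholder CloudFront key id

with open("cf_private_key.pem", "rb") as f:                  # placeholder key path
    _key = serialization.load_pem_private_key(f.read(), password=None)

def _rsa_signer(message: bytes) -> bytes:
    # CloudFront signed URLs use an RSA SHA-1 signature over the policy.
    return _key.sign(message, padding.PKCS1v15(), hashes.SHA1())

signer = CloudFrontSigner(KEY_PAIR_ID, _rsa_signer)

def media_url(rendition_url: str, is_private: bool) -> str:
    """Private accounts get short-lived signed URLs; public media stays long-cacheable."""
    if not is_private:
        return rendition_url
    return signer.generate_presigned_url(
        rendition_url, date_less_than=datetime.utcnow() + timedelta(minutes=10)
    )
```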
Caching
- Memcached has historically been popular for hot object metadata at Meta scale; Redis cluster for sessions and graph fragments.
Stories
- Separate Cassandra table, or the same table with a row-level TTL (ttl_seconds); Redis expiring keys alone are insufficient for durability.
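A sketch of writing a story with a row-level TTL through the Python Cassandra driver; the keyspace and table names are assumptions:

```python
from cassandra.cluster import Cluster  # assumed dependency (cassandra-driver)

session = Cluster(["127.0.0.1"]).connect("stories")  # keyspace name is illustrative
STORY_TTL_SECONDS = 24 * 3600

def post_story(author_id: int, story_id: int, media_id: int) -> None:
    """Cassandra expires the row itself after 24h, so durability never depends on Redis."""
    session.execute(
        "INSERT INTO stories_by_author (author_id, story_id, media_id) "
        f"VALUES (%s, %s, %s) USING TTL {STORY_TTL_SECONDS}",
        (author_id, story_id, media_id),
    )
```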
7. Bottlenecks & Mitigations
| Bottleneck | Scenario | Mitigation |
|---|---|---|
| Transcode backlog | Viral video spikes | Autoscale workers on queue depth; shed load by delaying non-critical renditions |
| Feed cold start | New user | Onboarding suggestions from popular graph |
| Hot influencer post | Fan-out storm | Hybrid pull + dynamic insertion |
| CDN origin overload | Cache miss storm | Origin shield; internal tier caching |
| Counter inconsistency | Like mashing | Idempotent like API; CRDT-style optional overkill — settle for tolerance |
8. Tradeoffs
| Decision | Alternative | Why we picked |
|---|---|---|
| S3 + async processing | Disk on API box | Elastic capacity and durability |
| Kafka event backbone | RabbitMQ | Durability at millions msgs/min |
| Cassandra feed | DynamoDB | Similar; pick based on org cloud |
| Presigned direct upload | Proxied multipart | API tier CPU and NIC preservation |
| HLS streaming | Single MP4 | Adaptive bitrate mobile networks |
| Separate Explore rank | Chronological only | Engagement product requirement |
9. Follow-ups (interviewer drill-downs)
- Live video? A separate low-latency path: RTMP ingest → packaging → CDN, with a WebRTC vs HLS latency trade-off.
- Copyright detection? Async perceptual hashing (pHash) pipeline plus legal tooling (a sketch follows this list).
- DM E2E? Signal Protocol style — massive scope; reference WhatsApp doc.
- Deletion: remove from S3 (or let a lifecycle policy do it) and purge the CDN, staying aware of wildcard invalidation costs.
- Cost optimization: Serve lower bitrate in constrained markets — device capability hints header.
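A minimal perceptual-hash comparison for the copyright/duplicate question, assuming the Pillow and imagehash packages; the distance threshold is illustrative and would be tuned against labeled pairs:

```python
from PIL import Image   # assumed dependency (Pillow)
import imagehash        # assumed dependency

NEAR_DUPLICATE_THRESHOLD = 8   # Hamming distance on 64-bit hashes; tune empirically

def perceptual_hash(path: str) -> imagehash.ImageHash:
    """pHash survives re-encoding, resizing, and mild edits, unlike a byte-level checksum."""
    return imagehash.phash(Image.open(path))

def is_near_duplicate(path_a: str, path_b: str) -> bool:
    """Hamming distance between pHashes approximates visual similarity."""
    return (perceptual_hash(path_a) - perceptual_hash(path_b)) <= NEAR_DUPLICATE_THRESHOLD
```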