Design Instagram (Photo Sharing Feed)
1. Requirements
Functional
- Users upload images/videos; server generates multiple renditions (thumbnail, standard, HD) and serves via CDN.
- Follow graph drives a personalized home feed; Stories (ephemeral 24-hour content) is a parallel product surface sharing the same infrastructure.
- Engagement: like, comment, save, share to stories.
- Direct messaging is a separate real-time path (touch on it lightly or reference the WhatsApp design patterns).
- Explore tab recommendations beyond social graph (ML ranking).
Non-Functional
- Scale: 1B+ MAU class; tens of millions media uploads/day; billions of feed impressions/day.
- Latency: fast upload acknowledgement; processing is async; feed first paint p99 under 400 ms with warm caches.
- Availability: 99.99% for reads; uploads tolerate retries with resumable protocol.
- Consistency: eventual fan-out to followers at seconds-level delay is acceptable; like counts are eventually consistent, with the UI tolerating a small +/- delta.
- Durability: no media loss; object storage provides eleven nines of durability; metadata writes are transactional.
Out of Scope
- Full Reels recommendation ML stack depth.
- Payment and shopping checkout flows.
- Training content-moderation CV models; assume hooks into human review tooling only.
- IGTV long-form separate product complexity.
2. Back-of-Envelope Estimations
Uploads: 50M media items/day (a blended photo+video teaching number) ≈ 600/s average; 2x–5x peaks during events → roughly 1,000–3,000/s sustained, with higher bursts.
Feed reads: 500M DAU * ~80 impressions/day ≈ 40B impressions/day, but impressions are not API calls; estimate instead ~10 feed API calls per session across 500M DAU and several sessions/day → roughly 100k–500k RPS at origin globally after CDN offload. Still enormous, and edge caching helps little for personalized feed responses (only for shared fragments and media).
Storage: a processed photo averages ~3 MB and video is much larger; assume an effective 15 MB stored per item including transcoded ladders → 50M * 15 MB ≈ 750 TB/day of new media. Dedupe and compression vary in reality; for the interview, the order of magnitude is hundreds of petabytes of new media per year.
CDN egress: the dominant cost driver. Streaming multi-megabyte video segments adds up to terabits per second aggregate during peaks, which requires tier-1 CDN contracts.
Metadata: 50M posts/day * ~1 KB per row ≈ 50 GB/day, small compared to media.
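A quick arithmetic sanity check of these figures, as a minimal sketch in Python; every input is one of the teaching assumptions above, expressed in decimal units:

```python
# Back-of-envelope sanity check using the teaching numbers from this section.
SECONDS_PER_DAY = 86_400

uploads_per_day = 50_000_000
avg_upload_rps = uploads_per_day / SECONDS_PER_DAY            # ~580/s average
peak_upload_rps = (avg_upload_rps * 2, avg_upload_rps * 5)    # ~1.2k-2.9k/s during events

bytes_per_item = 15_000_000                                   # 15 MB incl. transcoded ladders
new_media_tb_per_day = uploads_per_day * bytes_per_item / 1e12    # 750 TB/day
new_media_pb_per_year = new_media_tb_per_day * 365 / 1_000        # ~274 PB/year

metadata_gb_per_day = uploads_per_day * 1_000 / 1e9           # ~1 KB rows -> 50 GB/day

print(f"uploads: {avg_upload_rps:.0f}/s avg, {peak_upload_rps[0]:.0f}-{peak_upload_rps[1]:.0f}/s peak")
print(f"media: {new_media_tb_per_day:.0f} TB/day, {new_media_pb_per_year:.0f} PB/year")
print(f"metadata: {metadata_gb_per_day:.0f} GB/day")
```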
3. API Design
POST /v1/media/upload-session
-> 201 { "uploadId": "up_123", "uploadUrls": [ { "rendition": "original", "url": "https://s3..." } ] }

POST /v1/posts
Body: { "mediaIds": ["m1"], "caption": "sunset", "locationId": null }
-> 201 { "postId": "p_456" }

GET /v1/feed/home?cursor=...
-> 200 { "items": [ { "postId": "...", "media": [...], "author": {...} } ] }

POST /v1/media/{mediaId}/likes
-> 204

GET /v1/users/{id}/feed
-> 200 { "items": [...] }

Resumable uploads: tus protocol or S3 multipart presigned URLs for resilience on large video uploads.
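A minimal sketch of how the upload-session endpoint could mint S3 multipart presigned URLs, assuming boto3; the bucket name, key layout, and part size are illustrative, not the real API:

```python
import uuid
import boto3  # assumed dependency

s3 = boto3.client("s3")
BUCKET = "media-uploads"          # hypothetical bucket
PART_SIZE = 8 * 1024 * 1024       # client uploads in 8 MB parts

def create_upload_session(user_id: str, content_length: int) -> dict:
    """Start an S3 multipart upload and hand the client presigned per-part URLs."""
    key = f"original/{user_id}/{uuid.uuid4()}"
    mpu = s3.create_multipart_upload(Bucket=BUCKET, Key=key)
    num_parts = -(-content_length // PART_SIZE)  # ceiling division
    part_urls = [
        s3.generate_presigned_url(
            "upload_part",
            Params={"Bucket": BUCKET, "Key": key,
                    "UploadId": mpu["UploadId"], "PartNumber": n},
            ExpiresIn=3600,
        )
        for n in range(1, num_parts + 1)
    ]
    return {"uploadId": mpu["UploadId"], "key": key, "partUrls": part_urls}
```

The client PUTs each part to its URL, collects the returned ETags, and the API tier finishes with complete_multipart_upload; a failed part is retried individually, which is what makes large video uploads resumable.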
4. Data Model
Post
post_id, author_id, caption, created_at, media_ids[], location, visibility.
Media
media_id, owner_id, s3_keys (per rendition), width, height, duration, codec.
Engagement
Likes table or sharded counter column; Redis for hot counters with periodic flush to Cassandra/Scylla.
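A sketch of the hot-counter pattern under stated assumptions: a redis-py client, illustrative key names, and a stubbed-out Cassandra flush:

```python
import redis  # assumed dependency (redis-py)

r = redis.Redis()

def persist_delta_to_cassandra(post_id: str, delta: int) -> None:
    """Stub: in production this is a counter-column UPDATE in Cassandra/Scylla."""
    print(f"flush {delta} likes for {post_id}")

def record_like(post_id: str, user_id: str) -> None:
    """Idempotent like: the set dedupes repeat taps, the counter tracks the hot delta."""
    if r.sadd(f"post:{post_id}:likers", user_id):   # returns 1 only on first like
        r.incr(f"post:{post_id}:like_delta")

def flush_like_deltas(post_ids: list[str]) -> None:
    """Periodic job: move accumulated deltas into the durable store."""
    for post_id in post_ids:
        delta = int(r.getset(f"post:{post_id}:like_delta", 0) or 0)
        if delta:
            persist_delta_to_cassandra(post_id, delta)
```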
Feed timeline
- Same pattern as Twitter: a Cassandra partition per viewer holding ordered post_id entries.
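One possible shape for that table, issued through the Python Cassandra driver; the keyspace, table, and column names are assumptions, not Instagram's actual schema:

```python
from cassandra.cluster import Cluster  # assumed dependency (cassandra-driver)

session = Cluster(["127.0.0.1"]).connect("feeds")  # keyspace name is illustrative

# One partition per viewer; newest posts first within the partition.
session.execute("""
    CREATE TABLE IF NOT EXISTS home_timeline (
        viewer_id  bigint,
        created_at timeuuid,
        post_id    bigint,
        author_id  bigint,
        PRIMARY KEY ((viewer_id), created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
""")

def read_feed_page(viewer_id: int, limit: int = 20):
    """A feed read is a single-partition slice: one node hop, cheap to page with a cursor."""
    return session.execute(
        "SELECT post_id, author_id FROM home_timeline WHERE viewer_id = %s LIMIT %s",
        (viewer_id, limit),
    )
```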
Why not a single Postgres
- Write fan-out and sheer row volume. Relational storage is kept for accounts, billing hooks, and some OLTP; media metadata may live in sharded MySQL (as Instagram did historically) or DynamoDB; acknowledge this evolution in the interview.
Indexes
- (author_id, created_at) for profiles.
- Geo indexes: optional Elasticsearch for location discovery.
Sample media row
| media_id | owner_id | renditions | created_at |
|---|---|---|---|
| m789 | u22 | { thumb: "s3://...", std: "s3://..." } | 2026-04-29 |
5. High-Level Architecture
6. Component Deep-Dives
Upload & processing
- Client uploads to S3 via presigned URL, which removes API servers from the data path. Versus proxying the upload: saves API-tier bandwidth and scales horizontally without the load balancer becoming a bottleneck.
- SQS/Lambda or Kafka plus FFmpeg workers transcode HLS ladders for video; photos get libvips thumbnails (a worker sketch follows this list).
- Magic-byte checks and ClamAV virus scanning run in the async pipeline rather than blocking the first byte, if the product accepts removing the rare bad item after the fact.
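One shape an image worker could take, consuming S3 upload events from SQS; a sketch only, with boto3 and pyvips as assumed dependencies and the queue URL, bucket layout, and rendition widths as placeholders:

```python
import json
import boto3   # assumed dependency
import pyvips  # assumed dependency (libvips bindings)

sqs, s3 = boto3.client("sqs"), boto3.client("s3")
QUEUE_URL = "https://sqs.example/media-ingest"   # placeholder
RENDITIONS = {"thumb": 320, "std": 1080}         # output widths in px

def handle_message(body: str) -> None:
    """Download the original, cut each rendition, and upload it next to the source."""
    for record in json.loads(body).get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.download_file(bucket, key, "/tmp/original")
        for name, width in RENDITIONS.items():
            # libvips thumbnailing streams the image, so memory stays flat on large photos.
            pyvips.Image.thumbnail("/tmp/original", width).write_to_file(f"/tmp/{name}.jpg")
            s3.upload_file(f"/tmp/{name}.jpg", bucket, f"renditions/{name}/{key}.jpg")

def poll_forever() -> None:
    while True:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            handle_message(msg["Body"])
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

Video follows the same loop but shells out to FFmpeg for the HLS ladder, which is why queue depth is the autoscaling signal in the bottlenecks table below.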
Feed generation
- Same fan-out-on-write vs pull hybrid as the Twitter (news feed) design; Stories use partitions with a shorter TTL. A hybrid fan-out sketch follows.
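A dependency-free sketch of the hybrid decision; the threshold and the injected helpers (get_followers, timeline_insert) are illustrative:

```python
CELEBRITY_THRESHOLD = 100_000  # above this, skip fan-out and merge at read time

def fan_out_post(author_id, post_id, created_at, get_followers, timeline_insert) -> str:
    """Hybrid fan-out: push into follower timelines unless the author is too popular."""
    followers = get_followers(author_id)
    if len(followers) > CELEBRITY_THRESHOLD:
        return "pull"   # readers merge this author's recent posts at query time
    for viewer_id in followers:
        timeline_insert(viewer_id, created_at, post_id, author_id)
    return "push"
```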
Ranking
- Explore uses ML features in a Galaxy-style feature store (conceptual): Redis online, Hive offline; TensorFlow Serving for inference. Bespoke versus managed (e.g., Vertex AI) trades on org maturity.
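A sketch of the online scoring path under stated assumptions: redis and requests clients, an illustrative TensorFlow Serving model name, and feature keys and prediction shapes that depend entirely on the real model signature:

```python
import json
import redis     # assumed dependency
import requests  # assumed dependency

r = redis.Redis()
TF_SERVING_URL = "http://ranker:8501/v1/models/explore_ranker:predict"  # illustrative

def score_candidates(viewer_id: int, candidate_post_ids: list[int]) -> list[tuple[int, float]]:
    """Join online features per (viewer, post) pair and rank candidates by model score."""
    viewer_feats = json.loads(r.get(f"feat:user:{viewer_id}") or "{}")
    instances = [
        {**viewer_feats, **json.loads(r.get(f"feat:post:{pid}") or "{}")}
        for pid in candidate_post_ids
    ]
    resp = requests.post(TF_SERVING_URL, json={"instances": instances}, timeout=0.15)
    preds = resp.json()["predictions"]
    # Prediction shape depends on the model's serving signature; assume one score each.
    scores = [float(p[0]) if isinstance(p, (list, tuple)) else float(p) for p in preds]
    return sorted(zip(candidate_post_ids, scores), key=lambda x: x[1], reverse=True)
```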
CDN
- CloudFront/Akamai with signed URLs for private accounts versus long-cache public content from influencers; take care if the cache key includes a per-user auth variant.
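A sketch of minting a short-lived signed URL for private media on CloudFront, assuming botocore plus the cryptography package; the key pair id and key file path are placeholders:

```python
from datetime import datetime, timedelta
from botocore.signers import CloudFrontSigner               # assumed dependency
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

KEY_PAIR_ID = "KXXXXXXXXXX"                                  # placeholder CloudFront key id

with open("cf_private_key.pem", "rb") as f:                  # placeholder key path
    _key = serialization.load_pem_private_key(f.read(), password=None)

def _rsa_signer(message: bytes) -> bytes:
    # CloudFront signed URLs use an RSA SHA-1 signature over the policy.
    return _key.sign(message, padding.PKCS1v15(), hashes.SHA1())

signer = CloudFrontSigner(KEY_PAIR_ID, _rsa_signer)

def media_url(rendition_url: str, is_private: bool) -> str:
    """Private accounts get short-lived signed URLs; public media stays long-cacheable."""
    if not is_private:
        return rendition_url
    return signer.generate_presigned_url(
        rendition_url, date_less_than=datetime.utcnow() + timedelta(minutes=10)
    )
```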
Caching
- Memcached has historically been popular for hot object metadata at Meta scale; Redis cluster for sessions and graph fragments.
Stories
- Separate Cassandra table, or the same table with a row-level TTL (ttl_seconds); Redis expiring keys alone are insufficient for durability.
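A sketch of writing a story with a row-level TTL through the Python Cassandra driver; the keyspace and table names are assumptions:

```python
from cassandra.cluster import Cluster  # assumed dependency (cassandra-driver)

session = Cluster(["127.0.0.1"]).connect("stories")  # keyspace name is illustrative
STORY_TTL_SECONDS = 24 * 3600

def post_story(author_id: int, story_id: int, media_id: int) -> None:
    """Cassandra expires the row itself after 24h, so durability never depends on Redis."""
    session.execute(
        "INSERT INTO stories_by_author (author_id, story_id, media_id) "
        f"VALUES (%s, %s, %s) USING TTL {STORY_TTL_SECONDS}",
        (author_id, story_id, media_id),
    )
```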
7. Bottlenecks & Mitigations
| Bottleneck | Scenario | Mitigation |
|---|---|---|
| Transcode backlog | Viral video spikes | Autoscale workers on queue depth; shed load by delaying non-critical renditions |
| Feed cold start | New user | Onboarding suggestions from popular graph |
| Hot influencer post | Fan-out storm | Hybrid pull + dynamic insertion |
| CDN origin overload | Cache miss storm | Origin shield; internal tier caching |
| Counter inconsistency | Like mashing | Idempotent like API; CRDT-style optional overkill — settle for tolerance |
8. Tradeoffs
| Decision | Alternative | Why we picked |
|---|---|---|
| S3 + async processing | Disk on API box | Elastic capacity and durability |
| Kafka event backbone | RabbitMQ | Durability at millions msgs/min |
| Cassandra feed | DynamoDB | Similar; pick based on org cloud |
| Presigned direct upload | Proxied multipart | API tier CPU and NIC preservation |
| HLS streaming | Single MP4 | Adaptive bitrate mobile networks |
| Separate Explore rank | Chronological only | Engagement product requirement |
9. Follow-ups (interviewer drill-downs)
- Live video? A separate low-latency path: RTMP ingest → packaging → CDN, with a WebRTC vs HLS latency trade-off.
- Copyright detection? Async perceptual hashing (pHash) pipeline plus legal tooling (a sketch follows this list).
- DM E2E? Signal Protocol style — massive scope; reference WhatsApp doc.
- Deletion: remove from S3 (or let a lifecycle policy do it) and purge the CDN, staying aware of wildcard invalidation costs.
- Cost optimization: Serve lower bitrate in constrained markets — device capability hints header.
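A minimal perceptual-hash comparison for the copyright/duplicate question, assuming the Pillow and imagehash packages; the distance threshold is illustrative and would be tuned against labeled pairs:

```python
from PIL import Image   # assumed dependency (Pillow)
import imagehash        # assumed dependency

NEAR_DUPLICATE_THRESHOLD = 8   # Hamming distance on 64-bit hashes; tune empirically

def perceptual_hash(path: str) -> imagehash.ImageHash:
    """pHash survives re-encoding, resizing, and mild edits, unlike a byte-level checksum."""
    return imagehash.phash(Image.open(path))

def is_near_duplicate(path_a: str, path_b: str) -> bool:
    """Hamming distance between pHashes approximates visual similarity."""
    return (perceptual_hash(path_a) - perceptual_hash(path_b)) <= NEAR_DUPLICATE_THRESHOLD
```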