Design URL Shortener
1. Requirements
Functional
- Create a short alias for a long HTTPS URL; returned alias is unique and resolves to the original target.
- Resolve short codes via HTTP redirect (301 for SEO-stable links, 302 when analytics need every hit).
- Optional custom aliases when allowed by policy (length, charset, reserved words).
- Optional expiration time and soft-delete for abuse or user-initiated removal.
- Optional click analytics: aggregate counts per short code, coarse geo/device bucketing.
- Admin-style APIs for lookup-by-owner and revocation (out of scope for public MVP unless product requires it).
Non-Functional
- Scale: 100M DAU for global consumer product scale interview framing; 10:1 read:write ratio typical for shorteners.
- Writes: ~500 short URLs/s sustained at that scale (with bursts on campaigns); reads dominated by popular links.
- Latency: resolve p99 under 50 ms at edge excluding client RTT; create p99 under 200 ms including DB round trips.
- Availability: 99.99% for reads; 99.9% for creates acceptable if rare failures retry.
- Consistency: strong uniqueness on create (no two codes for same policy collision); redirect reads tolerate eventual consistency if CDN caches 301; analytics eventual.
- Durability: long-term persistence of mapping; audit trail optional.
Out of Scope
- Full malware/phishing classification pipeline (assume integration with third-party safe-browsing APIs only at edge).
- Billing, subscriptions, and organization SSO beyond a stub identity layer.
- Arbitrary URL shortening for non-HTTP(S) schemes beyond validation rules.
- Graph analytics across redirect graphs.
2. Back-of-Envelope Estimations
Assume 100M DAU, each user creates 0.1 short URLs per day on average (10M new shorts/day) and follows 20 redirects/day average (2B reads/day).
- Write QPS: 10M / 86,400 s ~ 116/s sustained; peak 5–10x ~ 600–1,200/s.
- Read QPS: 2B / 86,400 ~ 23,200/s sustained; peak ~ 100k/s with viral links.
Storage for mappings (code, long URL, metadata ~500 bytes per row): 10M * 500 B ~ 5 GB/day; ~1.8 TB/year; 5 years ~ 9 TB before compaction of expired rows.
Bandwidth: average redirect response ~300 B body + headers ~ 1 KB; 2B * 1 KB ~ 2 TB/day egress from origin before CDN (CDN absorbs most).
Cache working set: top 1% of codes might drive 80% of reads (Pareto). If 1B distinct codes exist over time but hot set is 10M codes * 500 B ~ 5 GB in a regional Redis tier plus CDN edge keys.
Show math explicitly in interviews: always relate DAU to actions, then to QPS, then to fan-out and storage.
3. API Design
REST over HTTPS; JSON bodies; idempotency key header on create for safe retries.
POST /v1/short-urls
Headers: Idempotency-Key: <uuid>
Body: { "targetUrl": "https://example.com/a/b?c=1", "ttlSeconds": 2592000, "customAlias": "spring-sale" }
-> 201 { "shortCode": "aB3dEf", "shortUrl": "https://go.example/aB3dEf", "expiresAt": "2026-05-30T12:00:00Z" }
-> 409 { "error": "alias_taken" }
-> 422 { "error": "invalid_target_url" }GET /v1/short-urls/{shortCode}
-> 302 Location: <targetUrl> (or 301 if configured immutable)
-> 404 { "error": "not_found" }
-> 410 { "error": "expired" }GET /v1/short-urls/{shortCode}/stats?granularity=day
-> 200 { "clicks": [ { "date": "2026-04-29", "count": 12034 } ] }Internal gRPC (optional) for high-volume create from trusted services: CreateShortUrl, ResolveShortUrl with protobuf for lower overhead than JSON.
4. Data Model
Entities
short_url:short_code(PK, 6–8 char base62),target_url(text),owner_id(nullable),created_at,expires_at,redirect_type(301/302),idempotency_keyunique.click_aggregate:short_code,bucket_start,count— for analytics rollups.
SQL vs NoSQL
- PostgreSQL (with Citus or manual sharding) fits strong uniqueness, transactional creates, and secondary indexes on
owner_idandexpires_atfor cleanup jobs. Good when team expertise is relational and traffic fits vertically scaled writer with read replicas. - Amazon DynamoDB (or Cassandra) fits truly horizontal write scaling and predictable partition-level performance; use
short_codeas partition key for O(1) point lookups; GSI onidempotency_keyfor deduplication. Pick DynamoDB when managed ops and multi-region active-active matter more than rich SQL.
Indexes
- Primary:
short_code. - Unique:
idempotency_keywhen present. - Optional GSI/BTREE on
expires_atfor TTL sweeper workers.
Sample row (DynamoDB item)
| short_code | target_url | expires_at | redirect_type |
|---|---|---|---|
| k8mP2x | https://shop.example/p/abc | 2026-12-01T00:00:00Z | 301 |
5. High-Level Architecture
Clients hit a global anycast edge (CDN) for GET redirects; CDN misses go to regional API + cache. Creates go through API gateway to write path. Analytics buffered to a stream. See also caching and load balancing patterns.
6. Component Deep-Dives
Short code generation (counter vs random)
- Segmented counter in a dedicated service (Snowflake-style epoch + worker id + sequence) encoded base62 gives dense codes and no collision retries; requires coordinated workers (ID generation discipline).
- Crypto random 80-bit token base62 needs uniqueness checks and occasional retry; simpler ops but variable length and harder human typing.
Pick counter-backed encoding when you want fixed length and monotonic debuggability; pick random when decentralizing generation at edge without coordination.
Resolve read path
- CDN checks edge cache key
/{code}; on hit return 301/302 without origin. - On miss, resolve service checks Redis (caching) for
code -> target_url + flags. - On Redis miss, load from DynamoDB/Postgres, backfill Redis with TTL, return redirect.
- Async publish click event to Kafka with short code, timestamp, anonymized IP bucket — never block redirect on analytics.
Create write path
- Validate URL scheme/host policy.
- Idempotency: if
Idempotency-Keyseen, return prior result from store. - Allocate code via counter service or insert with random code + retry on conflict.
- Write to primary store; replicate async to second region if multi-master pattern used.
Hot link mitigation
- Extreme viral URL becomes hot key in Redis; use consistent hashing across Redis shards so load spreads; request coalescing for identical keys during spikes.
7. Bottlenecks & Mitigations
| Risk | Symptom | Mitigation |
|---|---|---|
| Hot short code | Single Redis shard saturated | Local micro-cache in resolve pods; replicate read-heavy keys to multiple cache replicas; CDN 301 long TTL |
| Write leader limit | Postgres primary CPU bound | Shard by hash(short_code) or move writes to DynamoDB partitions |
| Thundering herd on expiry | Many clients refresh huge CDN population | Staggered TTL; soft expiry with background refresh |
| Analytics backlog | Kafka lag grows | Scale consumers; sample clicks 1/N for tail codes |
| Code exhaustion | 6-char base62 ~ 56B space but operational bugs reuse | Monitor collision rate; lengthen code automatically |
8. Tradeoffs
| Decision | Alternative | Why we picked |
|---|---|---|
| DynamoDB for mapping at mega scale | Postgres + Citus | Partition scalability and cross-region replication without heavy DBA |
| Redis in front of DB | Memcached only | Rich TTL + Lua for atomic touch + optional geo replication (caching) |
| Kafka for click stream | SQS per region | Replay and unified analytics pipeline; higher ops cost acceptable |
| 301 default for stable campaigns | 302 only | SEO and CDN efficiency; product chooses per link |
| Server-side counter IDs | Random base62 | Predictable length and collision-free allocation under load |
| Edge-first redirects | Every request to origin | Latency and cost; mandatory for this problem class |
9. Follow-ups (interviewer drill-downs)
- What if read traffic grows 100x overnight on one code? Discuss CDN-only tier, anycast, and optional static export of top URLs to edge KV.
- How do you migrate from Postgres to DynamoDB without downtime? Dual-write, shadow reads, cutover with feature flag.
- Multi-region active-active for creates? Conflict resolution on idempotency keys; vector clocks rarely needed if codes globally unique via allocator.
- Exactly-once analytics? At-least-once into Kafka + idempotent aggregates downstream; dedupe by (code, click_id).
- Cost optimization? Aggressive CDN TTL for 301, cold storage for old analytics, right-size Redis by evicting cold codes from cache but not DB.
- Abuse and phishing? Rate limit creates per account/IP using a distributed rate limiter design.
Last updated on
Spotted something unclear or wrong on this page?