Design URL Shortener

1. Requirements

Functional

Create a short alias for a long HTTPS URL; returned alias is unique and resolves to the original target.
Resolve short codes via HTTP redirect (301 for SEO-stable links, 302 when analytics need every hit).
Optional custom aliases when allowed by policy (length, charset, reserved words).
Optional expiration time and soft-delete for abuse or user-initiated removal.
Optional click analytics: aggregate counts per short code, coarse geo/device bucketing.
Admin-style APIs for lookup-by-owner and revocation (out of scope for public MVP unless product requires it).

Non-Functional

Scale: 100M DAU for global consumer product scale interview framing; 10:1 read:write ratio typical for shorteners.
Writes: ~500 short URLs/s sustained at that scale (with bursts on campaigns); reads dominated by popular links.
Latency: resolve p99 under 50 ms at edge excluding client RTT; create p99 under 200 ms including DB round trips.
Availability: 99.99% for reads; 99.9% for creates acceptable if rare failures retry.
Consistency: strong uniqueness on create (no two codes for same policy collision); redirect reads tolerate eventual consistency if CDN caches 301; analytics eventual.
Durability: long-term persistence of mapping; audit trail optional.

Out of Scope

Full malware/phishing classification pipeline (assume integration with third-party safe-browsing APIs only at edge).
Billing, subscriptions, and organization SSO beyond a stub identity layer.
Arbitrary URL shortening for non-HTTP(S) schemes beyond validation rules.
Graph analytics across redirect graphs.

2. Back-of-Envelope Estimations

Assume 100M DAU, each user creates 0.1 short URLs per day on average (10M new shorts/day) and follows 20 redirects/day average (2B reads/day).

Write QPS: 10M / 86,400 s ~ 116/s sustained; peak 5–10x ~ 600–1,200/s.
Read QPS: 2B / 86,400 ~ 23,200/s sustained; peak ~ 100k/s with viral links.

Storage for mappings (code, long URL, metadata ~500 bytes per row): 10M * 500 B ~ 5 GB/day; ~1.8 TB/year; 5 years ~ 9 TB before compaction of expired rows.

Bandwidth: average redirect response ~300 B body + headers ~ 1 KB; 2B * 1 KB ~ 2 TB/day egress from origin before CDN (CDN absorbs most).

Cache working set: top 1% of codes might drive 80% of reads (Pareto). If 1B distinct codes exist over time but hot set is 10M codes * 500 B ~ 5 GB in a regional Redis tier plus CDN edge keys.

Show math explicitly in interviews: always relate DAU to actions, then to QPS, then to fan-out and storage.

3. API Design

REST over HTTPS; JSON bodies; idempotency key header on create for safe retries.

POST /v1/short-urls
Headers: Idempotency-Key: <uuid>
Body: { "targetUrl": "https://example.com/a/b?c=1", "ttlSeconds": 2592000, "customAlias": "spring-sale" }
-> 201 { "shortCode": "aB3dEf", "shortUrl": "https://go.example/aB3dEf", "expiresAt": "2026-05-30T12:00:00Z" }
-> 409 { "error": "alias_taken" }
-> 422 { "error": "invalid_target_url" }

GET /v1/short-urls/{shortCode}
-> 302 Location: <targetUrl>  (or 301 if configured immutable)
-> 404 { "error": "not_found" }
-> 410 { "error": "expired" }

GET /v1/short-urls/{shortCode}/stats?granularity=day
-> 200 { "clicks": [ { "date": "2026-04-29", "count": 12034 } ] }

Internal gRPC (optional) for high-volume create from trusted services: CreateShortUrl, ResolveShortUrl with protobuf for lower overhead than JSON.

4. Data Model

Entities

short_url: short_code (PK, 6–8 char base62), target_url (text), owner_id (nullable), created_at, expires_at, redirect_type (301/302), idempotency_key unique.
click_aggregate: short_code, bucket_start, count — for analytics rollups.

SQL vs NoSQL

PostgreSQL (with Citus or manual sharding) fits strong uniqueness, transactional creates, and secondary indexes on owner_id and expires_at for cleanup jobs. Good when team expertise is relational and traffic fits vertically scaled writer with read replicas.
Amazon DynamoDB (or Cassandra) fits truly horizontal write scaling and predictable partition-level performance; use short_code as partition key for O(1) point lookups; GSI on idempotency_key for deduplication. Pick DynamoDB when managed ops and multi-region active-active matter more than rich SQL.

Indexes

Primary: short_code.
Unique: idempotency_key when present.
Optional GSI/BTREE on expires_at for TTL sweeper workers.

Sample row (DynamoDB item)

short_code	target_url	expires_at	redirect_type
k8mP2x	https://shop.example/p/abc	2026-12-01T00:00:00Z	301

Clients hit a global anycast edge (CDN) for GET redirects; CDN misses go to regional API + cache. Creates go through API gateway to write path. Analytics buffered to a stream. See also caching and load balancing patterns.

URL shortener edge resolve path showing CDN cache hits, regional API, Redis lookup, primary store fallback, and analytics stream.

Loading diagram…

6. Component Deep-Dives

Short code generation (counter vs random)

Segmented counter in a dedicated service (Snowflake-style epoch + worker id + sequence) encoded base62 gives dense codes and no collision retries; requires coordinated workers (ID generation discipline).
Crypto random 80-bit token base62 needs uniqueness checks and occasional retry; simpler ops but variable length and harder human typing.

Pick counter-backed encoding when you want fixed length and monotonic debuggability; pick random when decentralizing generation at edge without coordination.

Resolve read path

CDN checks edge cache key /{code}; on hit return 301/302 without origin.
On miss, resolve service checks Redis (caching) for code -> target_url + flags.
On Redis miss, load from DynamoDB/Postgres, backfill Redis with TTL, return redirect.
Async publish click event to Kafka with short code, timestamp, anonymized IP bucket — never block redirect on analytics.

Create write path

Validate URL scheme/host policy.
Idempotency: if Idempotency-Key seen, return prior result from store.
Allocate code via counter service or insert with random code + retry on conflict.
Write to primary store; replicate async to second region if multi-master pattern used.

Hot link mitigation

Extreme viral URL becomes hot key in Redis; use consistent hashing across Redis shards so load spreads; request coalescing for identical keys during spikes.

7. Bottlenecks & Mitigations

Risk	Symptom	Mitigation
Hot short code	Single Redis shard saturated	Local micro-cache in resolve pods; replicate read-heavy keys to multiple cache replicas; CDN 301 long TTL
Write leader limit	Postgres primary CPU bound	Shard by hash(short_code) or move writes to DynamoDB partitions
Thundering herd on expiry	Many clients refresh huge CDN population	Staggered TTL; soft expiry with background refresh
Analytics backlog	Kafka lag grows	Scale consumers; sample clicks 1/N for tail codes
Code exhaustion	6-char base62 ~ 56B space but operational bugs reuse	Monitor collision rate; lengthen code automatically

8. Tradeoffs

Decision	Alternative	Why we picked
DynamoDB for mapping at mega scale	Postgres + Citus	Partition scalability and cross-region replication without heavy DBA
Redis in front of DB	Memcached only	Rich TTL + Lua for atomic touch + optional geo replication (caching)
Kafka for click stream	SQS per region	Replay and unified analytics pipeline; higher ops cost acceptable
301 default for stable campaigns	302 only	SEO and CDN efficiency; product chooses per link
Server-side counter IDs	Random base62	Predictable length and collision-free allocation under load
Edge-first redirects	Every request to origin	Latency and cost; mandatory for this problem class

9. Follow-ups (interviewer drill-downs)

What if read traffic grows 100x overnight on one code? Discuss CDN-only tier, anycast, and optional static export of top URLs to edge KV.
How do you migrate from Postgres to DynamoDB without downtime? Dual-write, shadow reads, cutover with feature flag.
Multi-region active-active for creates? Conflict resolution on idempotency keys; vector clocks rarely needed if codes globally unique via allocator.
Exactly-once analytics? At-least-once into Kafka + idempotent aggregates downstream; dedupe by (code, click_id).
Cost optimization? Aggressive CDN TTL for 301, cold storage for old analytics, right-size Redis by evicting cold codes from cache but not DB.
Abuse and phishing? Rate limit creates per account/IP using a distributed rate limiter design.