THN Interview Prep

Design URL Shortener

1. Requirements

Functional

  • Create a short alias for a long HTTPS URL; returned alias is unique and resolves to the original target.
  • Resolve short codes via HTTP redirect (301 for SEO-stable links, 302 when analytics need every hit).
  • Optional custom aliases when allowed by policy (length, charset, reserved words).
  • Optional expiration time and soft-delete for abuse or user-initiated removal.
  • Optional click analytics: aggregate counts per short code, coarse geo/device bucketing.
  • Admin-style APIs for lookup-by-owner and revocation (out of scope for public MVP unless product requires it).

Non-Functional

  • Scale: 100M DAU for global consumer product scale interview framing; 10:1 read:write ratio typical for shorteners.
  • Writes: ~500 short URLs/s sustained at that scale (with bursts on campaigns); reads dominated by popular links.
  • Latency: resolve p99 under 50 ms at edge excluding client RTT; create p99 under 200 ms including DB round trips.
  • Availability: 99.99% for reads; 99.9% for creates acceptable if rare failures retry.
  • Consistency: strong uniqueness on create (no two codes for same policy collision); redirect reads tolerate eventual consistency if CDN caches 301; analytics eventual.
  • Durability: long-term persistence of mapping; audit trail optional.

Out of Scope

  • Full malware/phishing classification pipeline (assume integration with third-party safe-browsing APIs only at edge).
  • Billing, subscriptions, and organization SSO beyond a stub identity layer.
  • Arbitrary URL shortening for non-HTTP(S) schemes beyond validation rules.
  • Graph analytics across redirect graphs.

2. Back-of-Envelope Estimations

Assume 100M DAU, each user creates 0.1 short URLs per day on average (10M new shorts/day) and follows 20 redirects/day average (2B reads/day).

  • Write QPS: 10M / 86,400 s ~ 116/s sustained; peak 5–10x ~ 600–1,200/s.
  • Read QPS: 2B / 86,400 ~ 23,200/s sustained; peak ~ 100k/s with viral links.

Storage for mappings (code, long URL, metadata ~500 bytes per row): 10M * 500 B ~ 5 GB/day; ~1.8 TB/year; 5 years ~ 9 TB before compaction of expired rows.

Bandwidth: average redirect response ~300 B body + headers ~ 1 KB; 2B * 1 KB ~ 2 TB/day egress from origin before CDN (CDN absorbs most).

Cache working set: top 1% of codes might drive 80% of reads (Pareto). If 1B distinct codes exist over time but hot set is 10M codes * 500 B ~ 5 GB in a regional Redis tier plus CDN edge keys.

Show math explicitly in interviews: always relate DAU to actions, then to QPS, then to fan-out and storage.


3. API Design

REST over HTTPS; JSON bodies; idempotency key header on create for safe retries.

POST /v1/short-urls
Headers: Idempotency-Key: <uuid>
Body: { "targetUrl": "https://example.com/a/b?c=1", "ttlSeconds": 2592000, "customAlias": "spring-sale" }
-> 201 { "shortCode": "aB3dEf", "shortUrl": "https://go.example/aB3dEf", "expiresAt": "2026-05-30T12:00:00Z" }
-> 409 { "error": "alias_taken" }
-> 422 { "error": "invalid_target_url" }
GET /v1/short-urls/{shortCode}
-> 302 Location: <targetUrl>  (or 301 if configured immutable)
-> 404 { "error": "not_found" }
-> 410 { "error": "expired" }
GET /v1/short-urls/{shortCode}/stats?granularity=day
-> 200 { "clicks": [ { "date": "2026-04-29", "count": 12034 } ] }

Internal gRPC (optional) for high-volume create from trusted services: CreateShortUrl, ResolveShortUrl with protobuf for lower overhead than JSON.


4. Data Model

Entities

  • short_url: short_code (PK, 6–8 char base62), target_url (text), owner_id (nullable), created_at, expires_at, redirect_type (301/302), idempotency_key unique.
  • click_aggregate: short_code, bucket_start, count — for analytics rollups.

SQL vs NoSQL

  • PostgreSQL (with Citus or manual sharding) fits strong uniqueness, transactional creates, and secondary indexes on owner_id and expires_at for cleanup jobs. Good when team expertise is relational and traffic fits vertically scaled writer with read replicas.
  • Amazon DynamoDB (or Cassandra) fits truly horizontal write scaling and predictable partition-level performance; use short_code as partition key for O(1) point lookups; GSI on idempotency_key for deduplication. Pick DynamoDB when managed ops and multi-region active-active matter more than rich SQL.

Indexes

  • Primary: short_code.
  • Unique: idempotency_key when present.
  • Optional GSI/BTREE on expires_at for TTL sweeper workers.

Sample row (DynamoDB item)

short_codetarget_urlexpires_atredirect_type
k8mP2xhttps://shop.example/p/abc2026-12-01T00:00:00Z301

5. High-Level Architecture

Clients hit a global anycast edge (CDN) for GET redirects; CDN misses go to regional API + cache. Creates go through API gateway to write path. Analytics buffered to a stream. See also caching and load balancing patterns.

URL shortener edge resolve path showing CDN cache hits, regional API, Redis lookup, primary store fallback, and analytics stream.

Loading diagram…

6. Component Deep-Dives

Short code generation (counter vs random)

  • Segmented counter in a dedicated service (Snowflake-style epoch + worker id + sequence) encoded base62 gives dense codes and no collision retries; requires coordinated workers (ID generation discipline).
  • Crypto random 80-bit token base62 needs uniqueness checks and occasional retry; simpler ops but variable length and harder human typing.

Pick counter-backed encoding when you want fixed length and monotonic debuggability; pick random when decentralizing generation at edge without coordination.

Resolve read path

  1. CDN checks edge cache key /{code}; on hit return 301/302 without origin.
  2. On miss, resolve service checks Redis (caching) for code -> target_url + flags.
  3. On Redis miss, load from DynamoDB/Postgres, backfill Redis with TTL, return redirect.
  4. Async publish click event to Kafka with short code, timestamp, anonymized IP bucket — never block redirect on analytics.

Create write path

  1. Validate URL scheme/host policy.
  2. Idempotency: if Idempotency-Key seen, return prior result from store.
  3. Allocate code via counter service or insert with random code + retry on conflict.
  4. Write to primary store; replicate async to second region if multi-master pattern used.

Hot link mitigation

  • Extreme viral URL becomes hot key in Redis; use consistent hashing across Redis shards so load spreads; request coalescing for identical keys during spikes.

7. Bottlenecks & Mitigations

RiskSymptomMitigation
Hot short codeSingle Redis shard saturatedLocal micro-cache in resolve pods; replicate read-heavy keys to multiple cache replicas; CDN 301 long TTL
Write leader limitPostgres primary CPU boundShard by hash(short_code) or move writes to DynamoDB partitions
Thundering herd on expiryMany clients refresh huge CDN populationStaggered TTL; soft expiry with background refresh
Analytics backlogKafka lag growsScale consumers; sample clicks 1/N for tail codes
Code exhaustion6-char base62 ~ 56B space but operational bugs reuseMonitor collision rate; lengthen code automatically

8. Tradeoffs

DecisionAlternativeWhy we picked
DynamoDB for mapping at mega scalePostgres + CitusPartition scalability and cross-region replication without heavy DBA
Redis in front of DBMemcached onlyRich TTL + Lua for atomic touch + optional geo replication (caching)
Kafka for click streamSQS per regionReplay and unified analytics pipeline; higher ops cost acceptable
301 default for stable campaigns302 onlySEO and CDN efficiency; product chooses per link
Server-side counter IDsRandom base62Predictable length and collision-free allocation under load
Edge-first redirectsEvery request to originLatency and cost; mandatory for this problem class

9. Follow-ups (interviewer drill-downs)

  • What if read traffic grows 100x overnight on one code? Discuss CDN-only tier, anycast, and optional static export of top URLs to edge KV.
  • How do you migrate from Postgres to DynamoDB without downtime? Dual-write, shadow reads, cutover with feature flag.
  • Multi-region active-active for creates? Conflict resolution on idempotency keys; vector clocks rarely needed if codes globally unique via allocator.
  • Exactly-once analytics? At-least-once into Kafka + idempotent aggregates downstream; dedupe by (code, click_id).
  • Cost optimization? Aggressive CDN TTL for 301, cold storage for old analytics, right-size Redis by evicting cold codes from cache but not DB.
  • Abuse and phishing? Rate limit creates per account/IP using a distributed rate limiter design.

Mark this page when you finish learning it.

Last updated on

Spotted something unclear or wrong on this page?

On this page