THN Interview Prep

Design Rate Limiter (LLD)

1. Requirements

  • Functional

    • For each client key, permit up to N requests per fixed window or token refill rate.
    • Answer allow or deny with optional retry-after metadata.
    • Share the limiter across cooperative workers in one process, or coordinate externally for a cluster.
  • Non-Functional

    • O(1) hot path for decision with bounded memory per key.
    • Correct under concurrency without double-spending tokens.
  • Assumptions / Out of Scope

    • A distributed sliding window over Redis is an HLD concern; this design covers in-process abstractions.

2. Core Entities

| Entity | Responsibility | Key Attributes |
| --- | --- | --- |
| RateLimiter | Public API | algorithm ref |
| ClientKey | Identifier | string or tenant id |
| TokenBucket | Refill model | capacity, tokens, refillRate, lastRefill |
| FixedWindow | Counter model | windowStart, count, limit |
| Decision | Result | allowed, retryAfter |
| Clock | Time source | now function (injectable) |

3. Class Diagram


4. State / Sequence Diagram (where relevant)


5. Design Patterns Applied

  • Strategy — swappable LimiterAlgorithm implementations (token bucket, fixed window) behind one interface.
  • Singleton (per scope) — one limiter instance per service process when a shared global limit is needed (use sparingly).

6. Implementation

Go

package ratelimit

import "time"

type ClientKey string

type Decision struct {
    Allowed    bool
    RetryAfter time.Duration
}

type LimiterAlgorithm interface {
    TryAcquire(key ClientKey) Decision
}

type RateLimiter struct {
    Algorithm LimiterAlgorithm
}

func (limiter *RateLimiter) Allow(key ClientKey) Decision {
    return limiter.Algorithm.TryAcquire(key)
}

JavaScript

class TokenBucketLimiter {
  constructor({ capacity, refillPerSecond, clock }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.clock = clock; // () => epoch ms, injectable for tests
    this.buckets = new Map();
  }
  tryAcquire(clientKey) {
    const now = this.clock();
    const b = this.buckets.get(clientKey) ?? { tokens: this.capacity, last: now };
    // Refill proportionally to elapsed time, capped at capacity.
    b.tokens = Math.min(this.capacity, b.tokens + ((now - b.last) / 1000) * this.refillPerSecond);
    b.last = now;
    this.buckets.set(clientKey, b);
    if (b.tokens >= 1) { b.tokens -= 1; return { allowed: true, retryAfter: 0 }; }
    return { allowed: false, retryAfter: (1 - b.tokens) / this.refillPerSecond };
  }
}

class RateLimiter {
  constructor({ algorithm }) {
    this.algorithm = algorithm;
  }
  allow(clientKey) {
    return this.algorithm.tryAcquire(clientKey);
  }
}

7. Concurrency / Thread Safety

  • Collisions: parallel Allow calls on the same key race on the token decrement, risking double-spend.
  • Granularity: a mutex per key (sharded by hash), or atomic operations on token fields with careful refill math.
  • Go: a map of sync.Mutex keyed by ClientKey, or sync.Map for bucket storage.
  • JavaScript: the event loop is single-threaded, so there are no in-process races; coordination across worker processes needs a Redis Lua script or compare-and-swap.
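The "mutex per key sharded by hash" idea can be sketched as follows. shardedLocks and its shard count are illustrative names, not from the original: keys hash into a fixed pool of mutexes, bounding lock memory regardless of how many keys appear.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 16

// shardedLocks maps each key to one of a fixed pool of mutexes,
// trading a little false sharing for bounded memory and low contention.
type shardedLocks struct {
	shards [shardCount]sync.Mutex
}

func (s *shardedLocks) lockFor(key string) *sync.Mutex {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &s.shards[h.Sum32()%shardCount]
}

func main() {
	var locks shardedLocks
	counts := map[string]int{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu := locks.lockFor("tenant-a") // same key -> same shard, so these serialize
			mu.Lock()
			counts["tenant-a"]++ // stands in for the token decrement
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counts["tenant-a"]) // 100: no lost updates
}
```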

8. Extensibility & Followups

  • Sliding window log with deque per key for smoother limiting.
  • Hierarchical quotas (tenant then user).
  • Edge cases: clock skew, burst after long idle, memory growth from many keys (eviction).
