Design Rate Limiter (LLD)
1. Requirements
-
Functional
- For each client key, permit up to N requests per fixed window or token refill rate.
- Answer allow or deny with optional retry-after metadata.
- Share limiter across cooperative workers in one process or coordinate externally for cluster.
-
Non-Functional
- O(1) hot path for decision with bounded memory per key.
- Correct under concurrency without double-spending tokens.
-
Assumptions / Out of Scope
- Distributed Redis sliding window is HLD; here in-process abstractions.
2. Core Entities
| Entity | Responsibility | Key Attributes |
|---|---|---|
| RateLimiter | Public API | algorithm ref |
| ClientKey | Identifier | string or tenant id |
| TokenBucket | Refill model | capacity, tokens, refillRate, lastRefill |
| FixedWindow | Counter model | windowStart, count, limit |
| Decision | Result | allowed, retryAfter |
| Clock | Time source | now function injectable |
3. Class Diagram
Loading diagram…
4. State / Sequence Diagram (where relevant)
Loading diagram…
5. Design Patterns Applied
- Strategy — Swappable
LimiterAlgorithm. Strategy pattern. - Singleton (per scope) — One limiter instance per service process if shared global (use sparingly). Singleton pattern.
6. Implementation
Go
package ratelimit
type ClientKey string
type Decision struct {
Allowed bool
RetryAfter time.Duration
}
type LimiterAlgorithm interface {
TryAcquire(key ClientKey) Decision
}
type RateLimiter struct {
Algorithm LimiterAlgorithm
}
func (limiter *RateLimiter) Allow(key ClientKey) Decision {
return limiter.Algorithm.TryAcquire(key)
}JavaScript
class TokenBucketLimiter {
constructor({ capacity, refillPerSecond, clock }) {
this.capacity = capacity;
this.refillPerSecond = refillPerSecond;
this.clock = clock;
this.buckets = new Map();
}
tryAcquire(clientKey) { /* ... */ }
}
class RateLimiter {
constructor({ algorithm }) {
this.algorithm = algorithm;
}
allow(clientKey) {
return this.algorithm.tryAcquire(clientKey);
}
}7. Concurrency / Thread Safety
- Collisions: Parallel
Allowon same key racing token decrement. - Granularity: Mutex per key sharded by hash; or atomic operations on token fields with careful refill math.
- Go:
sync.Mutexmap keyed byClientKeyor sync.Map for bucket storage. - JavaScript: Single-threaded; distributed workers need Redis Lua or compare-and-swap.
8. Extensibility & Followups
- Sliding window log with deque per key for smoother limiting.
- Hierarchical quotas (tenant then user).
- Edge cases: clock skew, burst after long idle, memory growth from many keys (eviction).
Mark this page when you finish learning it.
Last updated on
Spotted something unclear or wrong on this page?