Design Rate Limiter (LLD)
1. Requirements
Functional
- For each client key, permit up to N requests per fixed window, or refill tokens at a configured rate.
- Answer allow or deny, with optional retry-after metadata.
- Share the limiter across cooperative workers in one process, or coordinate externally for a cluster.
Non-Functional
- O(1) hot path for each decision, with bounded memory per key.
- Correct under concurrency: no double-spending of tokens.
Assumptions / Out of Scope
- A distributed Redis sliding-window limiter belongs to the HLD; this document covers in-process abstractions.
2. Core Entities
| Entity | Responsibility | Key Attributes |
|---|---|---|
| RateLimiter | Public API | algorithm ref |
| ClientKey | Identifier | string or tenant id |
| TokenBucket | Refill model | capacity, tokens, refillRate, lastRefill |
| FixedWindow | Counter model | windowStart, count, limit |
| Decision | Result | allowed, retryAfter |
| Clock | Time source | now function injectable |
3. Class Diagram
RateLimiter delegates to a LimiterAlgorithm strategy; TokenBucket and FixedWindow implement that interface, each returning a Decision, with time supplied by an injectable Clock.
4. State / Sequence Diagram (where relevant)
Typical sequence: the caller invokes RateLimiter.Allow(key); the algorithm refills or resets state for that key, then returns an allow/deny Decision with optional retry-after.
5. Design Patterns Applied
- Strategy: swappable LimiterAlgorithm implementations (token bucket, fixed window).
- Singleton (per scope): one limiter instance per service process when shared globally (use sparingly).
6. Implementation
Go

```go
package ratelimit

import "time"

type ClientKey string

type Decision struct {
	Allowed    bool
	RetryAfter time.Duration
}

type LimiterAlgorithm interface {
	TryAcquire(key ClientKey) Decision
}

type RateLimiter struct {
	Algorithm LimiterAlgorithm
}

func (limiter *RateLimiter) Allow(key ClientKey) Decision {
	return limiter.Algorithm.TryAcquire(key)
}
```

JavaScript

```javascript
class TokenBucketLimiter {
  constructor({ capacity, refillPerSecond, clock }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.clock = clock;
    this.buckets = new Map();
  }

  tryAcquire(clientKey) { /* ... */ }
}

class RateLimiter {
  constructor({ algorithm }) {
    this.algorithm = algorithm;
  }

  allow(clientKey) {
    return this.algorithm.tryAcquire(clientKey);
  }
}
```

7. Concurrency / Thread Safety
- Collisions: parallel Allow calls on the same key race on the token decrement.
- Granularity: a mutex per key (sharded by hash), or atomic operations on the token fields with careful refill math.
- Go: a map of per-key sync.Mutex keyed by ClientKey, or sync.Map for bucket storage.
- JavaScript: single-threaded, so no in-process races; distributed workers need Redis Lua scripts or compare-and-swap.
8. Extensibility & Follow-ups
- Sliding window log with deque per key for smoother limiting.
- Hierarchical quotas (tenant then user).
- Edge cases: clock skew, burst after long idle, memory growth from many keys (eviction).