Caching & retry amplification
Core details
Thundering herd: many clients miss cache (or key expires) simultaneously → spike on origin DB/API.
Mitigations:
| Technique | Role |
|---|---|
| TTL jitter | spread expirations |
| Single-flight / request coalescing | one fetch per key |
| Probabilistic early refresh | refresh before hard expiry |
| Circuit breaker | fail fast when origin unhealthy |
Retry amplification: clients retry on timeout together → harmonic load spike. Use capped retries, exponential backoff + jitter, per-hop budgets, Retry-After respect.
Hedging (second duplicate request after delay): can help tails or double load—use sparingly with idempotency.
Understanding
Caches trade freshness for cost. Retries trade availability perception for downstream load. Staff design names who loses when both fire at once (usually the database).
Senior understanding
Connect to product: money reads need fresh routing or short TTL + honest UX—not “cache for speed” alone. Link Application caching & consistency.
Diagram
See also
Last updated on
Spotted something unclear or wrong on this page?