Caching & consistency
Core details
Layers you can stack in one answer: browser/CDN → app-local (in-process) → shared remote (Redis/Memcached) → DB buffer pool. Each layer trades a different latency win against a different staleness risk.
Cache stampede: many clients miss the same key at once and all hit the origin. Mitigate with jittered TTLs, probabilistic early refresh, or request coalescing (single-flight fetch).
Patterns: cache-aside vs. read-through vs. write-through. For each, state who owns the invalidation trigger (event bus, version bump, explicit delete).
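Two of the stampede mitigations above, jittered TTL and single-flight, can be sketched in a few lines. This is a minimal in-process illustration, not a production cache: `jittered_ttl`, `SingleFlightCache`, and the `loader` callback are hypothetical names, and a shared cache such as Redis would need a distributed lock or lease instead of a thread lock.

```python
import random
import threading
import time


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Shift the base TTL by up to +/- `spread` so keys do not expire together."""
    return base_seconds * (1 + random.uniform(-spread, spread))


class SingleFlightCache:
    """In-process cache where only one caller per key loads from the origin."""

    def __init__(self, loader, base_ttl: float = 60.0):
        self._loader = loader      # hypothetical origin fetch: key -> value
        self._base_ttl = base_ttl
        self._entries = {}         # key -> (value, expires_at)
        self._inflight = {}        # key -> Event set when the load finishes
        self._lock = threading.Lock()

    def get(self, key):
        while True:
            with self._lock:
                entry = self._entries.get(key)
                if entry is not None and entry[1] > time.monotonic():
                    return entry[0]                 # fresh hit
                event = self._inflight.get(key)
                leader = event is None
                if leader:                          # first misser does the load
                    event = threading.Event()
                    self._inflight[key] = event
            if not leader:
                event.wait()                        # followers wait, then re-check
                continue
            try:
                value = self._loader(key)
                expires_at = time.monotonic() + jittered_ttl(self._base_ttl)
                with self._lock:
                    self._entries[key] = (value, expires_at)
                return value
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()                         # wake followers even on failure
```

If the leader's load raises, followers wake up, find no entry, and one of them becomes the new leader, so a single failure does not wedge the key.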
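As a rough illustration of cache-aside with an explicit-delete invalidation trigger, the sketch below uses plain dicts as stand-ins for the database and the cache. `CacheAsideRepo` and its fields are hypothetical names chosen for the example.

```python
class CacheAsideRepo:
    """Cache-aside: the application owns both the fill and the invalidation."""

    def __init__(self, db: dict, cache: dict):
        self.db = db            # stand-in for the source of truth
        self.cache = cache      # stand-in for Redis or a local cache
        self.db_reads = 0       # counts origin reads, for illustration

    def read(self, key):
        if key in self.cache:
            return self.cache[key]      # hit: origin is not touched
        self.db_reads += 1
        value = self.db[key]            # miss: application fetches from origin
        self.cache[key] = value         # and fills the cache itself
        return value

    def write(self, key, value):
        self.db[key] = value            # update the source of truth first
        self.cache.pop(key, None)       # explicit delete; next read refills
```

Deleting rather than overwriting the cached value on write keeps the fill logic in one place (`read`), at the cost of one extra miss after every write.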
Problem this solves: repeated reads can overload expensive sources of truth, but stale or leaked data can be worse than slow data.
Simple mental model: a cache is a copy with a contract. The contract must say who fills it, who invalidates it, how stale it may be, and who is allowed to see it.
Understanding
Caches trade freshness for cost and latency. The wrong layer choice can leak sensitive personalized HTML at the CDN, or serve money-moving reads from stale snapshots with no disclosure in the UI. That is a product and compliance failure, not “just cache config.”
Distributed invalidation is eventually consistent. When correctness demands it, either tell users honestly that data may be stale or route those reads through a stronger read path.
Pattern choice
| Pattern | Use when | Main risk |
|---|---|---|
| Cache-aside | application owns read path and misses | inconsistent invalidation spread across code |
| Read-through | shared data access layer can centralize fills | cache layer becomes too smart/opaque |
| Write-through | writes must update cache with source | write latency and partial failure handling |
| Write-behind | throughput matters more than immediate durability | data loss or reordering if queue fails |
For money, permissions, inventory, and legal state, be explicit about whether stale reads are acceptable. If not, bypass the cache or use a version/lease/read-your-write strategy.
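One way to picture the version/read-your-write strategy: every write returns a version, the client carries the version of its last write, and a read refuses any cached copy older than that. The sketch below assumes an in-memory store; `VersionedStore`, `read_with_min_version`, and `min_version` are hypothetical names, not part of any particular library.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: object
    version: int


class VersionedStore:
    """Stand-in for the source of truth; every write bumps the row version."""

    def __init__(self):
        self._rows = {}

    def write(self, key, value) -> int:
        current = self._rows.get(key)
        version = (current.version if current else 0) + 1
        self._rows[key] = Versioned(value, version)
        return version              # caller keeps this as its read floor

    def read(self, key) -> Versioned:
        return self._rows[key]


def read_with_min_version(key, cache: dict, store: VersionedStore, min_version: int = 0):
    """Serve from cache only if the cached copy is at least `min_version`."""
    cached = cache.get(key)
    if cached is not None and cached.version >= min_version:
        return cached.value
    fresh = store.read(key)         # cache too old or missing: go to the source
    cache[key] = fresh
    return fresh.value
```

A reader with no recent write passes `min_version=0` and may see a stale value until invalidation arrives; the user who just wrote passes the version returned by `write` and always sees their own change.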
Senior understanding
| Tension | Staff response |
|---|---|
| Thundering herd metrics | track miss-rate spikes alongside origin QPS so a synchronized miss storm is visible |
| Multi-tenant poisoning | namespace keys per tenant and store ACL metadata with each entry; never search or scan keys globally |
| Debugging ghost states | log correlation IDs and cache key versions, sampled to keep cardinality under control |
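The key-namespacing response can be made concrete with a small key builder. This is a sketch under assumed conventions: the `v{schema_version}:tenant:scope:resource:id` layout and the name `tenant_cache_key` are illustrative, not a standard.

```python
def tenant_cache_key(tenant_id: str, viewer_scope: str, resource: str,
                     resource_id: str, schema_version: int = 1) -> str:
    """Build a key that cannot collide across tenants or permission scopes."""
    parts = (tenant_id, viewer_scope, resource, resource_id)
    for part in parts:
        # Reject empty parts and the separator itself, so one tenant cannot
        # craft an id that lands inside another tenant's namespace.
        if not part or ":" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join((f"v{schema_version}",) + parts)
```

Putting the schema version first means a version bump invalidates every old entry without a global scan, and logging the full key gives the key-version trail mentioned in the table.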
Also mention feature flags that can disable the cache path during an incident; having that switch ready shows operational muscle.
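A cache kill switch of this kind might look like the sketch below. The flag name `cache_reads_enabled`, the `flags` dict, and `CacheWithKillSwitch` are assumptions for illustration; a real system would read the flag from its feature-flag service.

```python
class CacheWithKillSwitch:
    """Read path that an operator can force to bypass the cache."""

    def __init__(self, cache: dict, load_from_origin, flags: dict):
        self._cache = cache
        self._load = load_from_origin   # hypothetical origin fetch: key -> value
        self._flags = flags             # stand-in for a feature-flag client

    def read(self, key):
        # Flag off: skip the cache entirely, neither reading nor filling it.
        if not self._flags.get("cache_reads_enabled", True):
            return self._load(key)
        if key in self._cache:
            return self._cache[key]
        value = self._load(key)
        self._cache[key] = value
        return value
```

Flipping the flag trades latency and origin load for correctness, so it is worth checking beforehand that the origin can absorb uncached traffic.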
Interview answer structure
“I classify the data first: public, personalized, permissioned, or money-moving. Then I choose the cache layer and freshness contract. I prevent stampedes with jitter or single-flight, prevent leaks with key namespacing and auth-aware cache keys, and prove the cache is healthy with hit rate, origin QPS, stale-read indicators, and eviction pressure.”
Follow-ups to expect:
- What data must never be cached at the CDN?
- How do users see their own write immediately?
- How do you debug one user seeing stale state?
- What do you do when Redis is down?