THN Interview Prep

Redis Internals (Interview Deep Dive)

Redis is an in-memory data structure server with optional persistence and a rich type system. Interviewers expect you to connect its single-threaded command execution (per process/shard) with latency predictability, data structure choice, and operational trade-offs (RDB vs AOF, eviction, clustering).

At a high level: a client issues commands over RESP; the server maps keys to one of many encodings (ziplist, intset, listpack, skiplist + hash table, etc.—implementation details evolve by version) behind the logical types you know from the API.
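
The wire format is simple enough to sketch by hand. A minimal encoder for the RESP request side (a command is an array of bulk strings; real clients also parse the reply types, which this omits):

```python
def encode_command(*parts) -> bytes:
    """Encode a command as a RESP array of bulk strings,
    e.g. SET greeting hello -> *3\r\n$3\r\nSET\r\n..."""
    out = [b"*%d\r\n" % len(parts)]
    for p in parts:
        b = p if isinstance(p, bytes) else str(p).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(b), b))
    return b"".join(out)

print(encode_command("SET", "greeting", "hello"))
```

Because every request is length-prefixed, the server can parse it without lookahead, which is part of why per-command overhead stays low.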


Data structures you actually design around

  • String: byte sequences; can hold integers efficiently; bitmap and bitfield operations for compact counters/flags; use for caching HTML/JSON, locks, and simple values.
  • List: ordered sequences; LPUSH/RPOP queues, capped lists, recent-items feeds (mind big list issues).
  • Set: unique members; set algebra, tag membership, random sampling.
  • Sorted set: score + member unique; powers leaderboards and time-series windows when scores encode time; range by rank or score in O(log N) style work (implementation uses skiplist + map for classic ZSET).
  • Hash: field-value maps; natural for object-shaped cache rows; watch big hashes (consider splitting or compression at the app layer).
  • Stream (Redis 5+): append-only log with consumer groups—a rough analog to lightweight event logging inside Redis; not a drop-in for Kafka at large scale, but great for low-latency, co-located workflows.
  • HyperLogLog: approximate distinct counts in a few KB per counter. GEO: geospatial indexing (a sorted set under the hood with geohash-style score encoding).
  • JSON module (if enabled): document-style access—treat it as an add-on, not the portable core.
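
The sorted-set/leaderboard pairing above can be sketched in plain Python to make the semantics concrete (a real deployment would use ZADD and ZRANGE ... REV WITHSCORES; this in-process stand-in only mirrors the ordering):

```python
scores = {}  # member -> score, mimicking one sorted set

def zadd(member: str, score: float) -> None:
    # ZADD: insert or update a member's score
    scores[member] = score

def top(n: int):
    # Highest score first; Redis breaks score ties by member
    # lexicographic order (simplified here)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

zadd("alice", 300); zadd("bob", 150); zadd("carol", 450)
print(top(2))  # [('carol', 450), ('alice', 300)]
```

The point to make in an interview: Redis maintains this ordering incrementally (skiplist), so "top N" is cheap on every read instead of a full sort per query.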

When mapping to system design, pair Redis with caching and think about staleness and consistency models for read-through vs write-through.


Single-threaded command loop and why it matters

The core command processing is single-threaded per shard in common deployments. Long-running commands (KEYS, unbounded SORT, huge SMEMBERS, Lua scripts that touch too much) block the world for that node. Use SCAN, pipelining, and sharding to keep p99 healthy.
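
The SCAN loop shape is worth being able to write. A toy stand-in (real Redis cursors are reverse-binary bucket positions and page sizes are approximate, but the bounded-work-per-call structure is the same):

```python
def scan(keys: list, cursor: int, count: int = 100):
    """Toy stand-in for SCAN: return (next_cursor, page).
    Cursor 0 on return means iteration is complete."""
    page = keys[cursor:cursor + count]
    nxt = cursor + count
    return (0 if nxt >= len(keys) else nxt), page

keys = [f"user:{i}" for i in range(1050)]
seen, cursor = [], 0
while True:
    cursor, page = scan(keys, cursor, count=100)
    seen.extend(page)  # each call does a small, bounded slice of work
    if cursor == 0:
        break
print(len(seen))  # 1050
```

Contrast with KEYS, which walks the entire keyspace in one blocking call—exactly the long-running command the single-threaded loop cannot afford.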

I/O threads in recent versions help with network I/O, but the execution model still shapes your design: keep hot values small, avoid megabyte blobs unless you accept latency risk, and use connection pooling with care (many tiny connections can still add overhead).


Persistence: RDB snapshots

RDB writes point-in-time snapshots to disk: the server forks, and the child writes the snapshot while copy-on-write shares memory pages with the parent. Properties:

  • Compact files, good for backup/restore and disaster recovery shipping.
  • Bounded full-sync/restore cost, but between snapshots a crash loses everything since the last snapshot—often minutes of writes—unless combined with AOF.
  • Fork latency on huge memory footprints can stall the parent—watch LATENCY DOCTOR, memory fragmentation, and transparent huge pages settings per the production guides.

Use RDB when you can tolerate snapshot granularity loss or when datasets are mostly reconstructible from upstream sources.


Persistence: AOF append-only file

AOF logs mutating commands (with rewrite/compaction passes). Properties:

  • Finer durability windows (appendfsync always/everysec/no)—everysec is a common compromise.
  • Rewrite (BGREWRITEAOF) compacts command streams into shorter equivalents.

Combine RDB + AOF when you want faster restarts with stronger tail-of-log durability—Redis supports mixed persistence modes.
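
A hedged redis.conf sketch of the combined setup (directive names are from recent Redis versions—verify against the redis.conf shipped with your version, since defaults and thresholds vary):

```conf
# RDB: snapshot if >=1 change in 900s, >=10 in 300s, >=10000 in 60s
save 900 1
save 300 10
save 60 10000

# AOF: enable the append-only log
appendonly yes
appendfsync everysec       # fsync once per second: roughly <=1s of writes at risk
aof-use-rdb-preamble yes   # rewrites start with an RDB preamble for fast reload
```

The RDB preamble is what gives "faster restarts with stronger tail-of-log durability": the bulk loads as a snapshot, then only the recent command tail is replayed.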

Link mentally to write-ahead logging in databases (Postgres deep dive); the spirit is similar though mechanisms differ.


Eviction policies when memory is full

Redis can cap memory (maxmemory). When breached, eviction policies decide what dies:

  • volatile-ttl, volatile-lru/lfu, allkeys-lru/lfu/random, noeviction (writes fail).

LRU/LFU are approximations—Redis samples a handful of keys per eviction rather than maintaining exact order—which is good enough for caches. Choose based on whether you only want to evict keys that carry TTLs (volatile-*) or any key (allkeys-*).

Eviction interacts with hot key problems: if your working set exceeds RAM, thrashing occurs—fix by sharding, compressing values, or moving cold data to disk-backed stores.
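
The sampled-LRU idea is small enough to sketch (an illustrative model, not the actual implementation—Redis keeps an approximate access clock per object rather than exact timestamps):

```python
import random

def evict_approx_lru(last_used: dict, sample_size: int = 5) -> str:
    """Evict the least-recently-used key among a small random sample,
    mimicking Redis's sampled (approximate) LRU under maxmemory pressure."""
    sample = random.sample(list(last_used), min(sample_size, len(last_used)))
    victim = min(sample, key=lambda k: last_used[k])
    del last_used[victim]
    return victim

last_used = {f"k{i}": i for i in range(100)}  # higher value = more recently used
victim = evict_approx_lru(last_used)
print(victim)
```

Sampling trades exactness for O(sample) eviction cost, which is why it stays cheap even with millions of keys; maxmemory-samples tunes the trade-off.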


Clustering and hash slots

Redis Cluster partitions the keyspace into 16384 hash slots. Each master owns a subset of slots; replicas are promoted if a master dies (subject to quorum and replica-migration rules). Clients must be cluster-aware (follow MOVED/ASK redirects).

Multi-key operations (MGET, Lua touching multiple keys) require keys to hash to the same slot—use hash tags {user}:123:profile / {user}:123:orders patterns so related keys co-locate.
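
The slot computation itself is simple and follows the published cluster spec: CRC-16/XMODEM of the key (or just the {hash tag} substring, if one is present) mod 16384:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM, the checksum Redis Cluster uses for key -> slot mapping."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """HASH_SLOT = CRC16(key) mod 16384; if the key contains a non-empty
    {...} hash tag, only the tag is hashed so related keys co-locate."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

print(key_slot("{user}:123:profile") == key_slot("{user}:123:orders"))  # True
```

Being able to derive why {user}:123:* keys land on one shard—and therefore support multi-key Lua—is a strong interview signal.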

This is the Redis-flavored version of sharding: deterministic routing without a central coordinator for each read.


Replication and reads

Primary/replica replication streams writes asynchronously by default, so replica reads can be stale—fine for read scaling of cache-like data if your app tolerates lag. Do not assume linearizable reads from replicas; even WAIT only bounds replica acknowledgment, it does not make Redis linearizable. Tie back to consistency models.


Interview phrase

“I pick Redis types by access pattern—sorted sets for leaderboards, hashes for object rows, streams for lightweight queues—and I treat replicas as possibly stale. I pair RDB for checkpoints with AOF for tighter tails, tune eviction for LRU/LFU on cache workloads, and cluster with hash tags when I need multi-key Lua.”


Pub/sub, blocking lists, and reliable-ish queues

PUBLISH/SUBSCRIBE is fire-and-forget—subscribers that are offline miss messages entirely. For small work queues, BRPOP on lists provides blocking-pop semantics; for stricter ack/retry, use Streams with consumer groups or an external broker (Kafka).
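
In between those two extremes sits the LMOVE (formerly RPOPLPUSH) reliable-queue pattern: atomically move a job into a per-worker processing list, and remove it only on ack. A toy in-process model of the shape (deques standing in for the two Redis lists):

```python
from collections import deque

queue, processing = deque(), deque()

def enqueue(job):
    queue.appendleft(job)            # like LPUSH queue job

def fetch():
    """Mimic LMOVE queue processing RIGHT LEFT: the job moves to a
    processing list, so a crashed worker's job can be re-queued, not lost."""
    if not queue:
        return None
    job = queue.pop()
    processing.appendleft(job)
    return job

def ack(job):
    processing.remove(job)           # like LREM processing 1 job

enqueue("job-1"); enqueue("job-2")
job = fetch()
ack(job)
print(job, list(processing))  # job-1 []
```

A recovery sweep re-enqueues anything stuck in processing lists past a deadline—poor man's retry, without consumer-group bookkeeping.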

When comparing to message queue vs stream, position Redis as ultra-low-latency colocated infrastructure, not the global source of legal record for all events.


Lua scripting and server-side composition

EVAL runs a Lua script atomically on the server—great for compare-and-swap-style updates or rate limiting (token buckets). Keep scripts short; a long script blocks that shard's event loop.
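
The token-bucket decision a rate-limiting script makes can be sketched in plain Python (in Redis this body would live in Lua so that refill, check, and decrement happen as one atomic step):

```python
def try_take(bucket: dict, now: float, rate: float, burst: float) -> bool:
    """One token-bucket decision: refill by elapsed time, cap at burst,
    then spend one token if available."""
    tokens = min(burst, bucket["tokens"] + (now - bucket["ts"]) * rate)
    bucket["ts"] = now
    if tokens >= 1:
        bucket["tokens"] = tokens - 1
        return True
    bucket["tokens"] = tokens
    return False

b = {"tokens": 2.0, "ts": 0.0}
print([try_take(b, t, rate=1.0, burst=2.0) for t in (0.0, 0.1, 0.2, 1.5)])
# [True, True, False, True]
```

Doing this client-side with GET/SET instead would race under concurrency—which is exactly the argument for server-side composition.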


Memory planning: fragmentation and large values

jemalloc and memory policies matter for fragmentation; huge values cause eviction surprises and long persistence pauses. Prefer smaller keys and chunking for big blobs, or object storage for bulk (blob storage mental model).
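
The chunking idea can be sketched as follows (the key-naming scheme here is illustrative, not a Redis convention—pick whatever suits your app):

```python
def chunk_blob(key: str, blob: bytes, chunk_size: int = 512 * 1024):
    """Split a large value into bounded-size pieces under derived keys,
    keeping per-command latency and replication buffers predictable."""
    return [
        (f"{key}:chunk:{i}", blob[off:off + chunk_size])
        for i, off in enumerate(range(0, len(blob), chunk_size))
    ]

parts = chunk_blob("report:2024", b"x" * (1024 * 1024 + 1), chunk_size=512 * 1024)
print([(k, len(v)) for k, v in parts])
```

Readers fetch and reassemble chunks with a pipeline of GETs; each individual command stays small, so the event loop never stalls on one megabyte-sized value.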


High availability and Sentinel

Redis Sentinel provides failover for primary/replica topologies: monitoring, quorum-based promotion of a replica to primary, and client discovery of the current primary. Cluster adds sharding on top—pick based on data size and the throughput ceiling of a single primary.


Security surface

  • ACLs (extended as "ACLv2" in Redis 7+) for per-user command permissions and key patterns.
  • TLS and requirepass/ACL for authn; disable dangerous commands in shared environments.

Common interview traps

  • Claiming multi-key ACID—Redis is not a general SQL database; MULTI/EXEC gives atomic command batches, but with no rollback and no isolation levels.
  • Ignoring hot keys in cluster mode—single slot saturation still happens.
  • Using KEYS in production—use SCAN.

How Redis pairs with Postgres or DynamoDB

Use Redis as an ephemeral or soft-state layer—session caches, idempotency token sets with TTL, feature flags, or throttle counters—while the source of truth remains in a durable store. Align mentally with consistency models for what readers may miss after failures.
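
A toy in-process model of the cache-aside pattern with TTL semantics (class and method names here are hypothetical; a real deployment would use SET key value EX ttl and GET against Redis, falling back to the durable store on a miss):

```python
import time

class TTLCache:
    """In-process stand-in for a Redis-backed cache with per-key expiry."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (value, now + ttl)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        item = self._store.get(key)
        if item is None or item[1] <= now:
            self._store.pop(key, None)
            return None  # miss: caller rereads from the durable store
        return item[0]

cache = TTLCache()
cache.set("session:42", {"user": "ada"}, ttl=30, now=0.0)
print(cache.get("session:42", now=10.0))  # hit
print(cache.get("session:42", now=31.0))  # None: expired, fall back to the DB
```

The TTL is your staleness bound: after a failed invalidation, readers can be wrong for at most ttl seconds—say that out loud in the interview.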


Modules and search (Redis Stack)

Redis Stack bundles RediSearch, RedisJSON, time-series modules—great for rapid prototypes; validate licensing and operational footprint before committing. Interviews sometimes mention vector similarity via Redis—position next to dedicated vector DB trade-offs.


Redis vs Memcached (soundbite)

Redis wins when you need structures, persistence options, replication, and programmability (Lua). Memcached remains valid for pure slab-based key/value caching with brutal simplicity—pick based on features vs ops burden.


Failure modes to cite

  • OOM under eviction pressure when maxmemory misconfigured.
  • Primary failover flips connections—clients must handle MOVED/topology refresh in cluster mode.
  • AOF rewrite spikes IO—monitor during maintenance windows.
