Load Balancer
What it is
A load balancer sits in front of a pool of servers (or services) and distributes incoming connections or requests so no single instance is overwhelmed. It can operate at Layer 4 (transport: TCP/UDP) or Layer 7 (application: HTTP, gRPC, etc.).
When to use
- Horizontal scaling: more than one instance of an app or API.
- High availability: health checks remove bad nodes; traffic fails over.
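- TLS (SSL) termination (often at L7 or a dedicated tier): decrypt at the edge and send plain HTTP inside the VPC. This gives up end-to-end encryption in exchange for simpler backends, so it is only appropriate when the internal network is a trusted boundary.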
- Geographic or multi-AZ routing: send users to the nearest healthy region or availability zone.
L4 vs L7
| Aspect | L4 (network) | L7 (application) |
|---|---|---|
| Decides on | IP, port, sometimes TLS SNI | URL path, host header, cookies, method |
| Cost / latency | Usually lower | Higher (parsing, rules) |
| Use when | TCP stickiness, raw protocols, max throughput | Routing by path, A/B, canary, WAF-style rules |
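To make the "Decides on" row concrete, here is a minimal sketch of the kind of rule matching an L7 balancer can do and an L4 balancer cannot, because only L7 sees the host header and URL path. The `Rule` class, the `pick_pool` function, the pool names, and the `example.com` hosts are all hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical L7 routing rule."""
    host: Optional[str]  # None matches any Host header
    path_prefix: str     # request path must start with this
    pool: str            # name of the backend pool to route to


def pick_pool(rules: List[Rule], host: str, path: str, default: str = "web") -> str:
    """Return the pool of the first matching rule, else the default pool."""
    for rule in rules:
        host_ok = rule.host is None or rule.host == host
        if host_ok and path.startswith(rule.path_prefix):
            return rule.pool
    return default


# Illustrative rules only; order matters because the first match wins.
RULES = [
    Rule(host="api.example.com", path_prefix="/", pool="api"),
    Rule(host=None, path_prefix="/static/", pool="static"),
    Rule(host=None, path_prefix="/checkout", pool="checkout-canary"),
]
```

An L4 balancer would have to make the same decision from the destination IP and port alone, which is why path-based routing, A/B tests, and canaries sit in the L7 column.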
Algorithms
- Round robin (RR): cycle through backends in order. Simple; ignores load and connection count.
- Least connections: send the next request to the backend with the fewest active connections. Good for long-lived or uneven work.
- Consistent hashing: map a key (e.g. user id, session id) to a ring; same key usually hits the same server. Minimizes reshuffle when nodes are added/removed. Common for stateful caches or session affinity without sticky cookies at L7.
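The first two strategies above can be sketched in a few lines. This is an illustrative, single-threaded sketch rather than how any specific balancer implements them; the class and method names (`RoundRobin`, `LeastConnections`, `acquire`, `release`) are assumptions made for the example.

```python
import itertools


class RoundRobin:
    """Cycle through backends in order, ignoring their current load."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest active connections."""

    def __init__(self, backends):
        # Active connection count per backend, all starting at zero.
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # On a tie, min() returns the first backend in insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the connection closes so the count stays accurate.
        self.active[backend] -= 1
```

The difference shows up with long-lived connections: round robin keeps handing work to a backend that is still busy, while least connections steers new work away from it until `release` is called.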
Alternatives to a dedicated balancer: DNS round robin (coarse; cached records keep sending clients to removed or failed hosts until the TTL expires); client-side discovery with a random pick; service mesh sidecars that load balance inside the cluster.
Failure modes
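- Misconfigured health checks: checks that are too sensitive make backends flap in and out of rotation; checks that are too lax or too shallow keep sending traffic to dead nodes.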
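- Hot partitions: consistent hashing still produces skewed load when the key distribution itself is skewed. Virtual nodes even out how much of the ring each server owns, but a single very hot key still has to be split or salted across several keys.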
- Connection storms: SYN floods or reconnect loops overwhelm the balancer or backends.
- State at L7: wrong sticky/session rules break auth or cart flows.
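The virtual-node mitigation mentioned under hot partitions can be sketched as a small consistent hash ring. This is a minimal illustration under stated assumptions: the `HashRing` class, the `vnodes` parameter, and the `node#i` labelling scheme are hypothetical, and SHA-256 is used only to spread points around the ring, not for security.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes  # virtual points per physical node
        self._points = []     # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed at many positions so its share is smoother.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point at or after the key's position, wrapping around.
        idx = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]
```

Removing a node only remaps the keys that node owned; every other key keeps its server. Note that this evens out ownership of the ring but does nothing for one extremely popular key, which always lands on a single node.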
Interview talking points
- Start with capacity and back-of-envelope math; mention latency vs throughput when arguing L4 vs L7.
- Clarify stateless vs stateful backends and whether you need affinity or shared storage/cache.
- Mention health checks, draining, and graceful shutdown for deploys.
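When discussing health checks and draining, it can help to show how a balancer might avoid flapping. The sketch below assumes a simple scheme in which a backend changes state only after several consecutive check results disagree with its current state, and a draining backend finishes existing work but takes no new connections. The names `BackendHealth`, `rise`, `fall`, and `draining` are assumptions chosen for this example.

```python
class BackendHealth:
    """Track one backend's health with consecutive-result thresholds."""

    def __init__(self, rise=2, fall=3):
        self.rise = rise        # consecutive successes needed to mark healthy
        self.fall = fall        # consecutive failures needed to mark unhealthy
        self.healthy = True
        self.draining = False   # set to True before a deploy or shutdown
        self._streak = 0        # consecutive results contradicting current state

    def record(self, ok: bool) -> None:
        """Record the result of one health check probe."""
        if ok == self.healthy:
            # Result agrees with the current state, so reset the streak.
            self._streak = 0
            return
        self._streak += 1
        threshold = self.rise if ok else self.fall
        if self._streak >= threshold:
            self.healthy = ok
            self._streak = 0

    def accepts_new_connections(self) -> bool:
        # Draining backends keep serving in-flight work but get no new traffic.
        return self.healthy and not self.draining
```

A single failed probe does not eject the backend, which addresses flapping, while the `fall` threshold bounds how long traffic can keep going to a dead node.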