Request lifecycle, timeouts & deadlines
Core details
Typical control-plane path: client → edge (TLS, L7 rules) → authentication → authorization → validation → handler → dependencies (DB/cache/queue) → response mapping.
Assign explicit timeouts per hop: client, gateway, service, each downstream. Propagate deadline/budget metadata so child calls don’t overspend parent SLAs.
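One way to picture budget propagation is a small deadline object that every hop consults before calling downstream. The sketch below is illustrative only: the `Deadline` class, its `reserve` parameter, and the `X-Request-Budget-Ms` header name are assumptions made for this example, not part of any particular framework. It forwards the remaining budget as a relative value because clocks on different hosts are not guaranteed to agree.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Deadline:
    """Absolute deadline on a monotonic clock (hypothetical helper)."""
    expires_at: float
    clock: Callable[[], float] = time.monotonic

    @classmethod
    def after(cls, seconds: float, clock: Callable[[], float] = time.monotonic) -> "Deadline":
        # Fix the deadline once, at the edge of the request.
        return cls(clock() + seconds, clock)

    def remaining(self) -> float:
        # Never negative: an expired deadline has zero budget left.
        return max(0.0, self.expires_at - self.clock())

    def child_timeout(self, hop_cap: float, reserve: float = 0.0) -> float:
        """Timeout for one downstream call.

        The child gets the smaller of its own per-hop cap and whatever the
        parent has left, minus a reserve kept for response mapping.
        """
        budget = self.remaining() - reserve
        if budget <= 0:
            raise TimeoutError("deadline exhausted before downstream call")
        return min(hop_cap, budget)

    def to_header(self) -> dict[str, str]:
        # Assumed header name; carries the remaining budget in milliseconds.
        return {"X-Request-Budget-Ms": str(int(self.remaining() * 1000))}
```

A handler would create the deadline once (for example `Deadline.after(2.0)`) and pass `deadline.child_timeout(hop_cap=0.5)` as the timeout of each database, cache, or HTTP call, so a slow first call automatically shrinks what later calls may spend.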
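Cancellation: when a client disconnects or the deadline elapses, cooperative cancellation stops work whose result nobody will read. Cancellation does not undo side effects that were already committed mid-flight, so make those side effects idempotent or compensate for them explicitly.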
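The interaction between cancellation and committed side effects can be sketched with `asyncio`. This is a minimal illustration under stated assumptions: `handler`, `serve`, the in-memory `store`, and the idempotency key are hypothetical stand-ins for a real handler and a durable database. The write commits before the deadline fires, the slow follow-up step is cancelled, and a retry with the same key does not duplicate the write.

```python
import asyncio


async def handler(key: str, store: dict, log: list) -> None:
    try:
        log.append("start")
        await asyncio.sleep(0)           # fast step, finishes within budget
        store.setdefault(key, "order")   # side effect commits; keyed, so a retry is a no-op
        log.append("committed")
        await asyncio.sleep(10)          # slow follow-up work that overruns the budget
        log.append("enriched")
    except asyncio.CancelledError:
        # Cooperative cancellation: release resources, then re-raise.
        log.append("cancelled")
        raise


async def serve(key: str, store: dict, budget_s: float):
    log: list = []
    try:
        # wait_for cancels the handler task when the budget elapses.
        await asyncio.wait_for(handler(key, store, log), timeout=budget_s)
        status = 200
    except asyncio.TimeoutError:
        status = 504
    return status, log
```

The caller sees a 504 even though the write happened, which is why the write is keyed: the client can safely retry and the server can recognise the repeat.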
Understanding
Partial failure is normal: networks stall, threads wait on connection pools, cold caches miss. Without per-hop budgets, tail latency explodes and retry storms amplify outages. A timeout is an engineering decision about acceptable incompleteness, not a statement that failure is bad: for each scenario, state what the user sees (degraded result, queued request, or error code).
Unbounded synchronous fan-out turns a nominally O(1) handler into an unbounded coordination graph under load, even if each remote call is usually fast. The handler waits on the slowest of its N calls, so the chance of hitting at least one slow dependency grows with N.
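A bounded alternative caps both the number of calls in flight and the total time the handler will wait. The sketch below uses only standard `asyncio` primitives. The function name `bounded_fan_out` and its parameters are assumptions for illustration. Calls that miss the shared budget are cancelled and reported as `None`, so the handler can return a degraded response instead of waiting on the slowest dependency.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def bounded_fan_out(
    calls: List[Callable[[], Awaitable]],
    max_in_flight: int,
    budget_s: float,
) -> List[Optional[object]]:
    """Run calls with a concurrency cap and one shared time budget."""
    if not calls:
        return []
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(call):
        # At most max_in_flight calls hold the semaphore at once.
        async with sem:
            return await call()

    tasks = [asyncio.ensure_future(guarded(c)) for c in calls]
    done, pending = await asyncio.wait(tasks, timeout=budget_s)

    # Stop work that missed the budget and let the cancellations settle.
    for t in pending:
        t.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    results: List[Optional[object]] = []
    for t in tasks:
        ok = t in done and not t.cancelled() and t.exception() is None
        results.append(t.result() if ok else None)
    return results
```

Results come back in the same order as `calls`, so the caller can tell which dependency was dropped and decide what the user sees for that slot.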
Senior understanding
| Interviewer probe | Strong response |
|---|---|
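| “Why 429 vs 503?” | 429 signals overload shedding aimed at the caller; 503 signals that the service or a dependency is unavailable. Tie each to the client retry strategy. |
| “How do you prove the fix?” | Compare latency percentiles from traces and pool-wait metrics before and after the change. |
| “Mobile flaky networks?” | State an explicit policy: retry idempotent requests freely; never blindly retry non-idempotent writes. |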
Link metrics to the customer journey (checkout, login), not to vanity metrics such as service CPU alone.
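The retry rows of the table above can be made concrete with a small client-side policy. This is a sketch under assumptions, not a prescribed implementation: the function names, the choice of 429/503/504 as retryable statuses, the backoff constants, and the use of an idempotency key to make writes safe to repeat are all illustrative choices. A server-supplied `Retry-After` value, when present, takes precedence over the client's own backoff.

```python
import random

# Statuses treated as transient in this sketch (an assumption, tune per API).
RETRYABLE_STATUSES = {429, 503, 504}
# Methods defined as idempotent by HTTP semantics.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def should_retry(method: str, status: int, has_idempotency_key: bool = False) -> bool:
    """Retry only transient failures, and only when repeating is safe."""
    if status not in RETRYABLE_STATUSES:
        return False
    return method.upper() in IDEMPOTENT_METHODS or has_idempotency_key


def retry_delay_s(attempt, retry_after_s=None, base_s=0.2, cap_s=5.0, rng=random.uniform):
    """Delay before the next attempt (attempt counts from 0)."""
    if retry_after_s is not None:
        # The server said when to come back; honour it.
        return float(retry_after_s)
    # Exponential backoff with full jitter, capped to avoid unbounded waits.
    return rng(0.0, min(cap_s, base_s * (2 ** attempt)))
```

Under this policy a plain `POST` that returns 503 is not retried, while the same `POST` sent with an idempotency key is, which keeps flaky mobile connections from creating duplicate writes.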