Request lifecycle, timeouts & deadlines
Core details
Typical control-plane path: client → edge (TLS, L7 rules) → authentication → authorization → validation → handler → dependencies (DB/cache/queue) → response mapping.
Assign explicit timeouts per hop: client, gateway, service, each downstream. Propagate deadline/budget metadata so child calls don’t overspend parent SLAs.
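One way to picture deadline propagation is a small object that stores an absolute expiry time and hands each downstream call a timeout no larger than what the parent has left. The sketch below is illustrative only: the `Deadline` class, its method names, and the `X-Request-Budget-Ms` header are assumptions made for this example, not part of any particular framework.

```python
import time


class Deadline:
    """An absolute deadline on the monotonic clock, shared by every hop of one request."""

    def __init__(self, budget_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        """Seconds left in the parent budget (never negative)."""
        return max(0.0, self._expires_at - self._clock())

    def child_timeout(self, cap_s: float, reserve_s: float = 0.0) -> float:
        """Timeout for one downstream call: its own cap, but never more than
        what the parent has left after reserving time for later work."""
        return max(0.0, min(cap_s, self.remaining() - reserve_s))

    def to_header(self) -> dict:
        """Hypothetical header carrying the remaining budget to the next service."""
        return {"X-Request-Budget-Ms": str(int(self.remaining() * 1000))}
```

A handler would create one `Deadline` at the edge and call `child_timeout(...)` immediately before each downstream request, so a slow early hop automatically shrinks the time available to later hops.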
Cancellation: when a client disconnects or the deadline elapses, cooperative cancellation stops useless work. Take care with side effects that were already committed mid-flight: cancellation does not roll them back, so those operations need to be idempotent.
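Cooperative cancellation can be sketched with Python's `asyncio`, where `asyncio.wait_for` cancels the awaited call once the timeout elapses. The function names and the 1-second sleep standing in for a slow dependency are assumptions for illustration.

```python
import asyncio


async def slow_dependency(log: list) -> str:
    try:
        await asyncio.sleep(1.0)  # stands in for a slow downstream call
        return "ok"
    except asyncio.CancelledError:
        log.append("dependency cancelled")  # release connections, locks, etc. here
        raise  # re-raise so the cancellation keeps propagating


async def handle(timeout_s: float, log: list) -> str:
    try:
        return await asyncio.wait_for(slow_dependency(log), timeout=timeout_s)
    except asyncio.TimeoutError:
        log.append("deadline exceeded")
        return "timeout"


def run_demo() -> tuple:
    log = []
    result = asyncio.run(handle(0.05, log))
    return result, log
```

Cancellation is only delivered at an `await` point, so anything the dependency committed before that point stays committed. That is why the cleanup branch releases resources rather than assuming the work never happened.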
Problem this solves: one slow dependency should not turn a bounded user request into an unbounded thread, connection, or event-loop occupancy problem.
Simple mental model: every request carries a stopwatch. Each hop spends part of the budget; no child call gets to pretend the parent has infinite time.
Understanding
Partial failure is normal: networks stall, threads wait on pools, cold caches miss. Without per-hop budgets, tail latency explodes and retry storms amplify outages. Timeouts are engineering decisions about acceptable incompleteness, not “failure is bad” absolutism: for each scenario, state what the user sees (a degraded result, queued work, or an error code).
Unbounded synchronous fan-out turns O(1) handlers into unbounded coordination graphs under load, even if each remote call is “usually fast.”
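One common way to bound fan-out is to cap the number of in-flight calls and put a single deadline over the whole group. The following is a minimal sketch using `asyncio`; the helper name `bounded_fan_out` and its parameters are assumptions for this example.

```python
import asyncio


async def bounded_fan_out(calls, max_in_flight: int, timeout_s: float) -> list:
    """Run zero-argument coroutine functions with a concurrency cap and one shared deadline.

    Returns one entry per call, in order: its result, or None if it failed
    or did not finish before the deadline.
    """
    if not calls:
        return []
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(call):
        async with gate:  # at most max_in_flight calls run at once
            return await call()

    tasks = [asyncio.create_task(guarded(c)) for c in calls]
    done, pending = await asyncio.wait(tasks, timeout=timeout_s)
    for task in pending:
        task.cancel()  # stop work that nobody will wait for
    await asyncio.gather(*pending, return_exceptions=True)
    return [t.result() if t in done and t.exception() is None else None
            for t in tasks]
```

Returning `None` for late or failed calls forces the caller to decide explicitly what a partial result means, instead of waiting indefinitely for the slowest dependency.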
The visual model below shows the request path as a budget-spending chain: every hop consumes part of the parent deadline, cancellation stops wasted work, and the response class tells clients whether to retry, degrade, or back off.

Senior understanding
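| Interviewer probe | Strong response |
|---|---|
| “Why 429 vs 503?” | 429 signals overload shedding; 503 signals an unavailable dependency. Each implies a different client retry strategy. |
| “How do you prove the fix?” | Percentile traces plus pool-wait metrics, compared before and after the change. |
| “Mobile flaky networks?” | Make the retry policy explicit: idempotent requests may be retried; non-idempotent writes are retried only behind an idempotency key. |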
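The 429 vs 503 distinction can be made concrete as a small mapping from failure class to status code and retry guidance. This is a sketch under assumptions: the `Failure` enum, the `ErrorResponse` type, and the choice of 400 for validation errors are illustrative, and `retry_after_s` stands for the value a service would send in the standard `Retry-After` header.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Failure(Enum):
    INVALID_INPUT = "invalid_input"
    OVERLOADED = "overloaded"            # this service is shedding load
    DEPENDENCY_DOWN = "dependency_down"  # a downstream is unavailable


@dataclass(frozen=True)
class ErrorResponse:
    status: int
    retry_after_s: Optional[int]  # sent as the Retry-After header when set


def classify(failure: Failure, backoff_s: int = 1) -> ErrorResponse:
    if failure is Failure.INVALID_INPUT:
        return ErrorResponse(400, None)  # retrying the same payload cannot succeed
    if failure is Failure.OVERLOADED:
        return ErrorResponse(429, backoff_s)  # client should slow down, then retry
    if failure is Failure.DEPENDENCY_DOWN:
        return ErrorResponse(503, backoff_s)  # client may retry once the dependency recovers
    raise ValueError(f"unclassified failure: {failure}")
```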
Link metrics to the customer journey (checkout, login), not to vanity metrics such as service CPU alone.
Worked budget example
For a checkout API with a 900 ms server-side SLO:
| Hop | Budget | Guard |
|---|---|---|
| Edge + auth + validation | 80 ms | reject bad credentials/payloads early |
| Inventory read | 180 ms | fallback to “verify later” only if product policy allows |
| Payment authorization | 450 ms | no retry inside handler unless idempotency key is present |
| Order write + outbox | 120 ms | single transaction, explicit commit boundary |
| Response mapping | 70 ms | return committed status, not downstream wishful thinking |
The table is a plan, not a set of fixed timeouts. If the payment call consumes 430 ms, the order write does not receive 120 ms by default: its timeout is derived from the remaining parent budget minus the response margin.
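The arithmetic can be sketched as a small function. The function name and the optional `cap_ms` parameter (limiting the write to its planned allotment) are assumptions for this example; the 900 ms total and 70 ms response margin come from the table above.

```python
from typing import Optional


def write_timeout_ms(elapsed_ms: int,
                     total_ms: int = 900,
                     response_margin_ms: int = 70,
                     cap_ms: Optional[int] = None) -> int:
    """Timeout for the order write: remaining parent budget minus the response
    margin, never negative, and optionally capped at the planned allotment."""
    available = max(0, total_ms - elapsed_ms - response_margin_ms)
    return available if cap_ms is None else min(available, cap_ms)


# Earlier hops on plan (80 + 180) and payment took 430 ms: 690 ms elapsed.
on_plan = write_timeout_ms(80 + 180 + 430)                  # 900 - 690 - 70 = 140
capped = write_timeout_ms(80 + 180 + 430, cap_ms=120)       # min(140, 120) = 120

# Hypothetical overrun: inventory took 260 ms instead of 180 ms.
after_overrun = write_timeout_ms(80 + 260 + 430, cap_ms=120)  # 900 - 770 - 70 = 60
```

Whether to cap the write at its planned 120 ms or let it use all of the remaining budget is a policy choice. Either way, an overrun earlier in the chain shrinks the write's timeout instead of pushing the request past the 900 ms SLO.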
Interview answer structure
“I start with a deadline at the edge, propagate it through the service, and give every downstream a child timeout smaller than the remaining parent budget. I classify failures: validation gets 4xx, overload gets 429 or 503 with retry guidance, and writes are retried only behind an idempotency key. I prove the design with p95/p99 traces, pool wait metrics, and an alert tied to the user journey.”
Follow-ups to expect:
- What changes for mobile clients with poor networks?
- Which calls are safe to retry?
- How do you cancel work that already wrote to the database?
- What status code should the client see during overload?
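For the question of which calls are safe to retry, one illustrative answer is a retry wrapper that retries only when an idempotency key is present. `PaymentGateway` below is a hypothetical in-memory stand-in for a downstream that deduplicates by key; it is not a real client library, and backoff between attempts is omitted for brevity.

```python
class PaymentGateway:
    """Hypothetical downstream that replays the stored result for a repeated key."""

    def __init__(self, fail_first: int = 0):
        self._results = {}
        self._fail_first = fail_first  # how many responses to "lose" after committing
        self.charges = 0

    def authorize(self, amount_cents: int, idempotency_key=None) -> str:
        if idempotency_key is not None and idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no second charge
        self.charges += 1
        result = f"auth-{self.charges}"
        if idempotency_key is not None:
            self._results[idempotency_key] = result
        if self._fail_first > 0:
            self._fail_first -= 1
            raise TimeoutError("response lost after commit")  # side effect already happened
        return result


def authorize_with_retry(gateway, amount_cents, idempotency_key=None, attempts=3):
    # Without a key a retry could double-charge, so try exactly once.
    tries = max(1, attempts) if idempotency_key else 1
    last_error = None
    for _ in range(tries):
        try:
            return gateway.authorize(amount_cents, idempotency_key)
        except TimeoutError as exc:
            last_error = exc
    raise last_error
```

The simulated timeout fires after the charge is recorded, which mirrors the awkward case from the cancellation discussion: the caller sees a failure even though the side effect was committed. With a key, the retry returns the original result; without one, the wrapper surfaces the error rather than risk a duplicate charge.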
Diagram