Request lifecycle, timeouts & deadlines

Core details

Typical control-plane path: client → edge (TLS, L7 rules) → authentication → authorization → validation → handler → dependencies (DB/cache/queue) → response mapping.

Assign explicit timeouts per hop: client, gateway, service, and each downstream dependency. Propagate deadline or budget metadata so child calls don’t overspend the parent’s SLA.
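One way to picture deadline propagation is a small value object that travels with the request. The sketch below is illustrative only: the `Deadline` class, its method names, and the `margin` parameter are hypothetical and not taken from any particular framework. It derives each child timeout from the time the parent has left, so a downstream call can never be granted more than the remaining budget.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deadline:
    """Absolute deadline on the monotonic clock (hypothetical helper)."""

    expires_at: float

    @classmethod
    def after(cls, seconds: float, now: Optional[float] = None) -> "Deadline":
        # `now` is injectable so the arithmetic can be checked without sleeping.
        start = time.monotonic() if now is None else now
        return cls(expires_at=start + seconds)

    def remaining(self, now: Optional[float] = None) -> float:
        current = time.monotonic() if now is None else now
        return max(0.0, self.expires_at - current)

    def child_timeout(
        self, cap: float, margin: float = 0.0, now: Optional[float] = None
    ) -> float:
        # Never more than the hop's own cap, and never more than what the
        # parent has left after reserving `margin` for later work.
        return max(0.0, min(cap, self.remaining(now) - margin))
```

A handler would create the deadline once at the edge, then call `child_timeout(cap=...)` immediately before each downstream call and pass the result as that call's timeout. A result of 0.0 means the call should be skipped rather than attempted.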

Cancellation: when a client disconnects or the deadline elapses, cooperative cancellation stops useless work. Take care with side effects already committed mid-flight: cancellation does not roll them back, so those writes need to be idempotent.
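Cooperative cancellation can be sketched with Python's `asyncio`, where `asyncio.wait_for` cancels the awaited work once its timeout expires. The function names `slow_dependency` and `handle`, and the list used as a log, are hypothetical stand-ins for a real downstream call and handler.

```python
import asyncio
from typing import List


async def slow_dependency(log: List[str]) -> str:
    try:
        await asyncio.sleep(1.0)  # stands in for a slow downstream call
        return "done"
    except asyncio.CancelledError:
        # Cooperative part: release resources, then let the cancellation
        # continue upward instead of swallowing it.
        log.append("cancelled: released resources")
        raise


async def handle(timeout_s: float, log: List[str]) -> str:
    try:
        return await asyncio.wait_for(slow_dependency(log), timeout=timeout_s)
    except asyncio.TimeoutError:
        log.append("deadline exceeded")
        return "timeout"
```

Cancellation here only stops work that has not happened yet. Anything the dependency committed before the cancellation point stays committed, which is why the handler still needs an explicit policy for writes.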

Problem this solves: one slow dependency should not turn a bounded user request into an unbounded thread, connection, or event-loop occupancy problem.

Simple mental model: every request carries a stopwatch. Each hop spends part of the budget; no child call gets to pretend the parent has infinite time.

Partial failure is normal: networks stall, threads wait for pools, cold caches miss. Without per-hop budgets, tail latency explodes and retry storms amplify outages. Timeouts are engineering decisions about acceptable incompleteness, not “failure is bad” absolutism. For each scenario, state what the user sees: a degraded response, a queued request, or an error code.

Unbounded synchronous fan-out converts O(1) handlers into unbounded coordination graphs under load, even if each remote call is “usually fast.”
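A minimal sketch of the alternative, bounded fan-out, assuming an `asyncio` service: a semaphore caps how many calls are in flight, and one overall timeout caps how long the handler waits for all of them. The function name `bounded_fan_out` and its parameters are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional


async def bounded_fan_out(
    calls: List[Callable[[], Awaitable[Any]]],
    max_in_flight: int,
    overall_timeout_s: float,
) -> List[Optional[Any]]:
    semaphore = asyncio.Semaphore(max_in_flight)

    async def run_one(call: Callable[[], Awaitable[Any]]) -> Any:
        async with semaphore:  # at most max_in_flight calls run at once
            return await call()

    tasks = [asyncio.create_task(run_one(c)) for c in calls]
    done, pending = await asyncio.wait(tasks, timeout=overall_timeout_s)

    # Anything still running at the deadline is cancelled, not awaited forever.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # Results keep input order; None marks a call that failed or ran out of time.
    return [
        t.result()
        if t in done and not t.cancelled() and t.exception() is None
        else None
        for t in tasks
    ]
```

Returning `None` for missing results is one possible design choice. It forces the caller to decide per scenario whether a partial answer is acceptable or the whole request should fail.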

The visual model below shows the request path as a budget-spending chain: every hop consumes part of the parent deadline, cancellation stops wasted work, and the response class tells clients whether to retry, degrade, or back off.

Backend request lifecycle showing client, edge, auth, handler, dependencies, deadline budgets, cancellation, response mapping, and verification loop.

Senior understanding

Interviewer probe	Strong response
“Why 429 vs 503?”	429 for overload shedding, 503 for an unavailable dependency; the choice sets the client’s retry strategy
“How do you prove the fix?”	Compare percentile traces and pool-wait metrics before and after the change
“Mobile flaky networks?”	State the policy explicitly: idempotent requests may be retried, non-idempotent writes may not
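The 429 versus 503 distinction can be made concrete as a small mapping from failure class to response. This is a sketch under assumptions: the `Failure` enum and `to_response` function are hypothetical, and using 400 for validation failures is one choice among the 4xx codes. `Retry-After` is the standard HTTP header for retry guidance.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Failure(Enum):
    VALIDATION = "validation"
    OVERLOAD_SHED = "overload_shed"
    DEPENDENCY_UNAVAILABLE = "dependency_unavailable"


# Assumed mapping: 400 is one possible 4xx choice for validation failures.
_STATUS = {
    Failure.VALIDATION: 400,
    Failure.OVERLOAD_SHED: 429,
    Failure.DEPENDENCY_UNAVAILABLE: 503,
}


def to_response(
    failure: Failure, retry_after_s: Optional[int] = None
) -> Tuple[int, Dict[str, str]]:
    status = _STATUS[failure]
    headers: Dict[str, str] = {}
    # Only retryable classes carry retry guidance; a 400 will fail again.
    if status in (429, 503) and retry_after_s is not None:
        headers["Retry-After"] = str(retry_after_s)
    return status, headers
```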

Link metrics to the customer journey (checkout, login), not to vanity numbers such as service CPU alone.

Worked budget example

For a checkout API with a 900 ms server-side SLO:

Hop	Budget	Guard
Edge + auth + validation	80 ms	reject bad credentials/payloads early
Inventory read	180 ms	fallback to “verify later” only if product policy allows
Payment authorization	450 ms	no retry inside handler unless idempotency key is present
Order write + outbox	120 ms	single transaction, explicit commit boundary
Response mapping	70 ms	return committed status, not downstream wishful thinking

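If the payment call consumes 430 ms, the order write should not simply be handed its nominal 120 ms by default. Its timeout is the remaining parent budget minus the response margin, capped at 120 ms. With the earlier hops on budget, that is min(120, 900 − 80 − 180 − 430 − 70) = 120 ms, and it shrinks as soon as the earlier hops together overrun.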
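The arithmetic behind that rule can be sketched as a single function, using the numbers from the table. The function name and parameters are hypothetical. The point is that the order-write timeout is computed from what is actually left, not read from the table.

```python
def order_write_timeout_ms(
    total_ms: int, spent_ms: int, response_margin_ms: int, nominal_ms: int
) -> int:
    """Remaining parent budget minus the response margin, capped at nominal."""
    remaining_ms = total_ms - spent_ms
    return max(0, min(nominal_ms, remaining_ms - response_margin_ms))


# Earlier hops on budget (80 + 180), payment took 430 of its 450 ms:
on_budget = order_write_timeout_ms(900, 80 + 180 + 430, 70, 120)

# Same payment latency, but the inventory read overran to 250 ms:
after_overrun = order_write_timeout_ms(900, 80 + 250 + 430, 70, 120)
```

In the first case 140 ms remain after the response margin, so the 120 ms cap applies. In the second case only 70 ms remain, so the order write gets 70 ms rather than its nominal 120 ms.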

Interview answer structure

“I start with a deadline at the edge, propagate it through the service, and give every downstream a child timeout smaller than the remaining parent budget. I classify failures: validation gets 4xx, overload gets 429 or 503 with retry guidance, and writes are retried only behind an idempotency key. I prove the design with p95/p99 traces, pool wait metrics, and an alert tied to the user journey.”
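The retry rule in that answer can be sketched as a small guard. Everything here is an assumption made for illustration: the `may_retry` name, treating only GET and HEAD as reads, and the extra check that a retry must fit in the remaining budget.

```python
from typing import Optional

# Assumption for this sketch: only these methods count as side-effect-free reads.
READ_METHODS = {"GET", "HEAD"}


def may_retry(
    method: str,
    idempotency_key: Optional[str],
    remaining_s: float,
    attempt_cost_s: float,
) -> bool:
    # A retry that cannot finish inside the parent deadline is wasted work.
    if remaining_s < attempt_cost_s:
        return False
    if method.upper() in READ_METHODS:
        return True
    # Writes are retried only when the server can deduplicate them.
    return bool(idempotency_key)
```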

Follow-ups to expect:

What changes for mobile clients with poor networks?
Which calls are safe to retry?
How do you cancel work that already wrote to the database?
What status code should the client see during overload?
