API Gateway

What it is

An API gateway is a single ingress point for clients (web, mobile, partners) that handles cross-cutting concerns: authentication, authorization, rate limiting, routing to internal services, request/response shaping, and sometimes TLS termination and observability.

Figure: API gateway request path, showing client ingress, edge policy checks, route selection, backend services, direct object upload bypass, and observability.

  Mobile app --> API Gateway --> service A
                             --> service B
                             --> service C

Concerns

Auth: validate JWT/OAuth tokens, API keys, mTLS for partners; optional token exchange at edge.
Rate limit: protect backends from abuse (see rate-limiter); per-user and per-API-key quotas.
Routing: path-based or host-based route tables; A/B and canary traffic splits.
Request validation / WAF: reject malformed payloads, oversized requests, suspicious patterns, and disallowed origins before backend fanout.
BFF (Backend for Frontend): a per-client-type gateway or thin service that shapes aggregated responses for mobile vs web, so one generic API does not have to serve every client; the tradeoff is more gateway logic to maintain.
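
The per-user and per-API-key quotas listed under Rate limit are often implemented as token buckets. The following is a minimal in-memory sketch, not a production limiter: the class names, the key format, and the explicit `now` argument are assumptions made for illustration, and a gateway running several instances would normally keep this state in a shared rate-limit store.

```python
class TokenBucket:
    """Refills at `rate` tokens per second, holding at most `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new caller can burst
        self.updated = now

    def allow(self, now, cost=1.0):
        # Credit tokens for the time elapsed since the last check.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the gateway would typically answer 429 here


class QuotaTable:
    """One bucket per quota key, e.g. "user:42" or "apikey:abc" (hypothetical)."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, key, now):
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[key] = bucket
        return bucket.allow(now)
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would pass something like `time.monotonic()`.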

Concrete example

For GET /v1/users/42/orders, the gateway terminates TLS, checks CORS, validates the JWT with cached JWKS keys, applies a per-user + endpoint quota, matches /v1/users/:id/orders to the orders service, injects a correlation id and user claims, and enforces an upstream timeout budget.

The service still owns domain authorization such as "can this user see these orders?" The gateway should reject obviously invalid traffic, not become the place where business rules drift away from services.
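
Two steps of the example above, route matching and header injection, can be sketched as follows. This is an illustration under stated assumptions: the route table, the upstream names, and the `X-User-Id` / `X-Correlation-Id` header names are hypothetical, and JWT validation is assumed to have already produced the `claims` dictionary.

```python
import re
import uuid

# Hypothetical route table: (method, path pattern, upstream name).
ROUTES = [
    ("GET", "/v1/users/:id/orders", "orders-service"),
    ("GET", "/v1/users/:id", "users-service"),
]


def compile_route(pattern):
    """Turn "/v1/users/:id/orders" into a regex with one named group per :param."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append(f"(?P<{segment[1:]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


COMPILED = [(method, compile_route(path), upstream) for method, path, upstream in ROUTES]


def match_route(method, path):
    """Return (upstream, path_params) for the first matching route, else None."""
    for route_method, regex, upstream in COMPILED:
        if route_method != method:
            continue
        found = regex.match(path)
        if found:
            return upstream, found.groupdict()
    return None


def build_upstream_headers(claims, incoming_headers):
    """Copy headers, overwrite identity from validated claims, add a correlation id."""
    headers = dict(incoming_headers)
    # Never trust a client-supplied identity header; always set it from the token.
    headers["X-User-Id"] = str(claims["sub"])
    # Keep an existing correlation id if one was supplied, otherwise mint one.
    headers.setdefault("X-Correlation-Id", str(uuid.uuid4()))
    return headers
```

The gateway forwards the user id as context only. Whether user 42 may read these orders is still decided by the orders service.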

When to use

Many public clients and microservices behind the gateway.
Central place for policy (auth, limits, CORS, IP allowlists).
Single DNS and certificate management for consumers.

Alternatives

Service mesh (mTLS, retries) inside the cluster: covers service-to-service traffic but does not replace edge auth; gateway and mesh are often used together.
Direct to services with each service doing auth: duplication and drift.
CDN + Lambda@Edge for very edge-heavy auth/limiting: a different operational model.

Component	Main job	Use it when	Watch out for
Load balancer	Spread traffic across healthy targets	You need availability and capacity distribution	Usually too coarse for per-user policy
Reverse proxy / CDN	Terminate HTTP, cache responses, shield origins, route to origins	You need static delivery, public GET caching, or origin protection	Personalized API traffic often bypasses cache
WAF	Block malicious HTTP patterns and known attack signatures	You need generic edge security before API processing	It complements policy; it does not understand every product invariant
API gateway	Edge policy, routing, auth checks, quotas, request shaping	Many clients call many backend APIs	Can become a bottleneck or business-logic dumping ground
Service mesh	Service-to-service mTLS, retries, traffic policy, telemetry	You need internal east-west control	Does not replace public API auth or product-specific routing
BFF	Client-specific API composition	Mobile and web need different response shapes	More surface area and ownership to maintain

Dependency and security policy

Auth provider unavailable: choose per route whether cached keys and recently validated sessions can continue briefly; fail closed for high-risk writes and admin actions.
JWKS rotation: cache public keys with TTL, keep old keys during rotation, and alert on refresh failures so a key rollout does not become an outage.
Partner APIs: combine API keys, mTLS, per-partner quotas, scoped permissions, and audit logs; avoid one shared global partner credential.
Request validation: reject oversized bodies, invalid content types, malformed schemas, and disallowed origins before expensive backend work.
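
The JWKS rotation item above (cache with TTL, keep old keys during rotation, alert on refresh failures) can be sketched as a small cache. This is a hedged illustration, not a real library API: `JwksCache`, the `fetch_keys` callable, and all parameter names and defaults are assumptions, and keys are opaque values here rather than parsed JWKs.

```python
class JwksCache:
    """Caches {kid: key}; serves stale keys when a refresh fails."""

    def __init__(self, fetch_keys, ttl_seconds=300.0, retry_seconds=10.0,
                 grace_seconds=3600.0):
        self._fetch = fetch_keys          # callable returning {kid: public_key}
        self._ttl = ttl_seconds           # normal refresh interval
        self._retry = retry_seconds       # minimum gap between fetch attempts
        self._grace = grace_seconds       # how long an unpublished key is kept
        self._keys = {}                   # kid -> (key, last_seen_in_a_fetch)
        self._next_refresh = float("-inf")
        self._last_attempt = float("-inf")
        self.refresh_failures = 0         # export as a metric and alert on it

    def _refresh(self, now):
        self._last_attempt = now
        try:
            fresh = self._fetch()
        except Exception:
            # Keep serving cached keys; try again after the retry gap.
            self.refresh_failures += 1
            self._next_refresh = now + self._retry
            return
        for kid, key in fresh.items():
            self._keys[kid] = (key, now)
        # Old keys survive rotation until they have been absent for the grace period.
        self._keys = {kid: entry for kid, entry in self._keys.items()
                      if now - entry[1] <= self._grace}
        self._next_refresh = now + self._ttl

    def get(self, kid, now):
        due = now >= self._next_refresh
        # An unknown kid may mean a rotation happened; refetch, but not too often.
        unknown = kid not in self._keys and now - self._last_attempt >= self._retry
        if due or unknown:
            self._refresh(now)
        entry = self._keys.get(kid)
        return entry[0] if entry else None
```

Throttling the unknown-`kid` refresh matters because otherwise a client sending random key ids could force a fetch on every request.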

Failure modes

Gateway as bottleneck: every request passes through it, so it must scale horizontally and cache JWKS public keys instead of fetching them per request.
Misrouting: bad config sends traffic to wrong cluster/version.
Large payloads through the gateway: they strain memory and timeout limits; consider direct upload to blob-storage with signed URLs.
Retry amplification: automatic retries at the edge can multiply load during partial outages; use bounded retries, jitter, and circuit breakers.
Policy drift: gateway auth rules and service auth rules disagree; test both and keep ownership clear.
Dependency failures: auth provider, rate-limit store, or config service outage forces a clear fail-open vs fail-closed policy per route.
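
The retry amplification item above names bounded retries, jitter, and circuit breakers. A minimal sketch of the three together follows; the class and function names, thresholds, and delays are illustrative assumptions, and the half-open behavior is simplified (after the cooldown it admits every request until one fails, where real breakers usually admit a single trial).

```python
import random


class CircuitOpen(Exception):
    """Raised when the breaker refuses to call the upstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_after=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means closed

    def allow(self, now):
        if self.opened_at is None:
            return True
        # After the cooldown, let traffic through again as a trial.
        return now - self.opened_at >= self.reset_after

    def record(self, ok, now):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now


def backoff_delays(max_retries, base=0.1, cap=2.0, rng=random.random):
    """Full jitter: delay i is uniform in [0, min(cap, base * 2**i))."""
    return [rng() * min(cap, base * 2 ** i) for i in range(max_retries)]


def call_with_retries(call, breaker, clock, sleep, max_retries=2):
    """At most max_retries + 1 attempts; fails fast while the breaker is open."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        if not breaker.allow(clock()):
            raise CircuitOpen("upstream circuit is open")
        try:
            result = call()
        except Exception:
            breaker.record(False, clock())
            if attempt == max_retries:
                raise
            sleep(delays[attempt])
        else:
            breaker.record(True, clock())
            return result
```

In production `clock` and `sleep` would be `time.monotonic` and `time.sleep`; injecting them keeps the sketch testable. Retries should also be limited to idempotent requests.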

Common mistakes

Putting orchestration-heavy business workflows in the gateway instead of a BFF or application service.
Treating gateway authentication as the only authorization layer.
Rolling out route or policy config without validation, canaries, rollback, and audit trails.
Using the gateway for large uploads/downloads when signed object-storage URLs would remove app-server byte handling.

Operational signals

Gateway request rate, p95/p99 latency, 4xx/5xx/429 rates, upstream timeout rate, retry count, and saturation.
Per-route error budgets and backend dependency latency.
Config version, rollout status, route-table changes, certificate expiry, JWKS refresh failures, and auth provider failures.

Interview talking points

Separate edge security from service-level authorization (the gateway establishes who the caller is; each microservice enforces what that caller may do under its domain rules).
Mention observability: propagate correlation ids from the gateway through all services.
Tie limits and timeouts to latency-throughput SLOs and back-of-envelope capacity.
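
A back-of-envelope capacity estimate can lean on Little's law (requests in flight = arrival rate x mean latency). The sketch below is only an illustration: the function names are hypothetical and the traffic figures in the usage note are made-up inputs, not measurements.

```python
import math


def in_flight_requests(requests_per_second, mean_latency_seconds):
    """Little's law: average concurrency = arrival rate * mean time in system."""
    return requests_per_second * mean_latency_seconds


def instances_needed(requests_per_second, mean_latency_seconds,
                     per_instance_concurrency, target_utilization=0.6):
    """Gateway instances required while keeping headroom below full utilization."""
    usable = per_instance_concurrency * target_utilization
    concurrency = in_flight_requests(requests_per_second, mean_latency_seconds)
    return math.ceil(concurrency / usable)
```

For example, assuming 5,000 requests per second at 80 ms mean latency, about 400 requests are in flight. If one instance comfortably holds 200 concurrent requests and the target is 60% utilization, `instances_needed(5000, 0.08, 200, 0.6)` gives 4, before adding redundancy for instance or zone failures.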

Interview answer shape

Start with clients and traffic: web/mobile/partners, public APIs, request volume, payload size, and security needs.
Place the gateway at the edge for TLS, auth checks, quotas, CORS, request validation, routing, and observability.
Keep domain authorization and business invariants in backend services; the gateway passes identity and context.
Define route config, canary rollout, timeout budgets, retries, and direct-upload bypass for large payloads.
Close with operations: horizontal scaling, rate-limit stores, config rollback, traces, and failure policy when auth or routing dependencies fail.
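
The canary rollout step above usually needs a traffic split that is sticky per user, so one person does not bounce between versions. One common approach, sketched here under assumptions (the upstream names and the choice of user id as the hash key are hypothetical), is to hash the key into 100 buckets and send the lowest buckets to the canary.

```python
import hashlib


def bucket_for(key, buckets=100):
    """Map a string key to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def choose_upstream(user_id, canary_percent, stable="orders-v1", canary="orders-v2"):
    """Send roughly canary_percent of users to the canary, deterministically."""
    return canary if bucket_for(user_id) < canary_percent else stable
```

Because the bucket depends only on the user id, raising `canary_percent` from 10 to 50 keeps every user who was already on the canary there, and setting it to 0 is an immediate rollback.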

Common follow-ups: API gateway vs load balancer, gateway vs service mesh, BFF vs generic gateway, WAF placement, canary routing, signed URLs for uploads, and how to avoid retry storms.

Follow-up answers

API gateway vs load balancer: the load balancer distributes traffic across healthy targets; the gateway adds API-aware policy such as identity checks, quotas, request validation, route-level timeouts, and response shaping. Many systems use both.
Gateway vs service mesh: the gateway protects north-south client ingress. The mesh controls east-west service traffic, mTLS, retries, and telemetry after traffic is already inside the system.
Where WAF/CDN fit: CDN and WAF often sit before the gateway to absorb static traffic, block obvious attacks, and reduce origin load. The gateway still owns API identity, routing, and quotas.
Large uploads: authenticate at the gateway or app API, issue a short-lived signed object-storage URL, and let the client upload bytes directly to storage so gateway memory and timeout budgets are not consumed.
Retry storms: cap retries, add jitter, honor upstream deadlines, use circuit breakers, and avoid every layer retrying the same failed dependency independently.
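
The large-uploads answer above depends on a short-lived signed URL. Real object stores define their own signing schemes, so the following is a generic HMAC sketch of the idea only: the URL layout, the `expires` and `signature` parameter names, and both function names are assumptions, and object keys are assumed to be URL-safe.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(secret, object_key, expires_at):
    # Bind the signature to the method, the object, and the expiry time.
    message = f"PUT\n{object_key}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_upload_url(secret, base_url, object_key, expires_at):
    """Issued by the API after it has authenticated and authorized the caller."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(secret, object_key, expires_at),
    })
    return f"{base_url}/{object_key}?{query}"


def verify_upload_url(secret, url, now):
    """Run by the storage side: reject expired or tampered URLs."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    object_key = parsed.path.lstrip("/")
    expected = _signature(secret, object_key, expires_at)
    return hmac.compare_digest(signature, expected)
```

The client then sends the bytes straight to storage with this URL, so the gateway only handles the small request that issues it.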

Interview questions

Where would you place authentication, authorization, rate limiting, and request validation in a microservice system?
How do you prevent an API gateway from becoming a single bottleneck or single point of failure?
When would you use an API gateway, a load balancer, a service mesh, and a BFF together?

Memory hooks

Gateway = public edge policy + route selection + observability.
Services still own domain authorization and business invariants.
Large bytes should usually bypass the gateway with signed object-storage URLs.
