THN Interview Prep

API Gateway

What it is

An API gateway is a single ingress for clients (web, mobile, partners) that handles cross-cutting concerns: authentication, authorization, rate limiting, routing to internal services, request/response shaping, sometimes TLS termination and observability.

Figure: API gateway request path showing client ingress, edge policy checks, route selection, backend services, direct object upload bypass, and observability.

  Mobile app --> API Gateway --> service A
                             --> service B
                             --> service C

Concerns

  • Auth: validate JWT/OAuth tokens, API keys, mTLS for partners; optional token exchange at edge.
  • Rate limit: protect backends from abuse (see rate-limiter); per-user and per-API-key quotas.
  • Routing: path-based or host-based route tables; A/B and canary traffic splits.
  • Request validation / WAF: reject malformed payloads, oversized requests, suspicious patterns, and disallowed origins before backend fanout.
  • BFF (Backend for Frontend): sometimes a per-client-type gateway or thin service shapes aggregated responses for mobile vs web. This avoids forcing one generic API on all clients; the tradeoff is more gateway logic to maintain.
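The per-user quota idea from the rate-limit bullet can be sketched with a token bucket. This is a minimal illustration, not how any particular gateway implements it: the class names `TokenBucket` and `QuotaLimiter`, the `(user_id, route)` key, and the in-memory dictionary are all assumptions. A real deployment with several gateway instances would likely keep counters in a shared rate-limit store.

```python
class TokenBucket:
    """Allows bursts up to `capacity`; refills `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec, now):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.tokens = float(capacity)  # start full so a new caller can burst
        self.updated = now

    def allow(self, now):
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller would typically answer 429 Too Many Requests


class QuotaLimiter:
    """Hypothetical per-user, per-route limiter held in process memory."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}

    def allow(self, user_id, route, now):
        key = (user_id, route)
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.capacity, self.refill_per_sec, now)
        return self.buckets[key].allow(now)
```

Passing `now` explicitly keeps the sketch deterministic for testing; production code would read a monotonic clock instead.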

Concrete example

For GET /v1/users/42/orders, the gateway terminates TLS, checks CORS, validates the JWT with cached JWKS keys, applies a per-user + endpoint quota, matches /v1/users/:id/orders to the orders service, injects a correlation id and user claims, and enforces an upstream timeout budget.

The service still owns domain authorization such as "can this user see these orders?" The gateway should reject obviously invalid traffic, not become the place where business rules drift away from services.
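The route-matching and context-injection steps of the example above can be sketched as follows. The route table, the upstream names, and the header names `X-Correlation-Id` and `X-User-Id` are illustrative assumptions; TLS, CORS, JWT validation, quotas, and timeouts are deliberately left out, and `claims` stands for the payload of a token that has already been validated.

```python
import re
import uuid

# Hypothetical route table: (method, path pattern, upstream name).
ROUTES = [
    ("GET", "/v1/users/:id/orders", "orders-service"),
    ("GET", "/v1/users/:id", "users-service"),
]


def compile_pattern(pattern):
    """Turn '/v1/users/:id/orders' into a regex with one named group per ':param'."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append(f"(?P<{segment[1:]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


COMPILED = [(method, compile_pattern(path), upstream) for method, path, upstream in ROUTES]


def route(method, path, claims, headers=None):
    """Return the upstream, path params, and forwarded headers, or None if no route matches."""
    for route_method, regex, upstream in COMPILED:
        match = regex.match(path) if route_method == method else None
        if match:
            forwarded = dict(headers or {})
            # Reuse an incoming correlation id if present, otherwise mint one.
            forwarded.setdefault("X-Correlation-Id", str(uuid.uuid4()))
            # Always overwrite identity headers so clients cannot spoof them.
            forwarded["X-User-Id"] = str(claims["sub"])
            return {"upstream": upstream, "params": match.groupdict(), "headers": forwarded}
    return None  # a gateway would answer 404 here
```

Note that the sketch only forwards identity. Whether user `u-42` may read the orders of user `42` is still decided by the orders service.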

When to use

  • Many public clients and microservices behind the gateway.
  • Central place for policy (auth, limits, CORS, IP allowlists).
  • Single DNS and certificate management for consumers.

Alternatives

  • Service mesh (mTLS, retries) inside the cluster: it secures service-to-service traffic but does not replace edge auth, so gateway and mesh are often deployed together.
  • Direct to services with each service doing auth: duplication and drift.
  • CDN + Lambda@Edge for very edge-heavy auth/limiting: a different operational model.

Component | Main job | Use it when | Watch out for
Load balancer | Spread traffic across healthy targets | You need availability and capacity distribution | Usually too coarse for per-user policy
Reverse proxy / CDN | Terminate HTTP, cache responses, shield origins, route to origins | You need static delivery, public GET caching, or origin protection | Personalized API traffic often bypasses cache
WAF | Block malicious HTTP patterns and known attack signatures | You need generic edge security before API processing | It complements policy; it does not understand every product invariant
API gateway | Edge policy, routing, auth checks, quotas, request shaping | Many clients call many backend APIs | Can become a bottleneck or business-logic dumping ground
Service mesh | Service-to-service mTLS, retries, traffic policy, telemetry | You need internal east-west control | Does not replace public API auth or product-specific routing
BFF | Client-specific API composition | Mobile and web need different response shapes | More surface area and ownership to maintain

Dependency and security policy

  • Auth provider unavailable: choose per route whether cached keys and recently validated sessions can continue briefly; fail closed for high-risk writes and admin actions.
  • JWKS rotation: cache public keys with TTL, keep old keys during rotation, and alert on refresh failures so a key rollout does not become an outage.
  • Partner APIs: combine API keys, mTLS, per-partner quotas, scoped permissions, and audit logs; avoid one shared global partner credential.
  • Request validation: reject oversized bodies, invalid content types, malformed schemas, and disallowed origins before expensive backend work.
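The JWKS rotation bullet can be illustrated with a small key cache. This is a sketch under stated assumptions: `fetch_keys` is a hypothetical callable returning a `{kid: public_key}` mapping, the TTL, grace, and minimum-gap values are placeholders, and `refresh_failures` stands in for whatever metric would drive an alert.

```python
class JwksCache:
    """Caches signing keys with a TTL and keeps recently seen keys through a rotation."""

    def __init__(self, fetch_keys, ttl_seconds=300.0, grace_seconds=3600.0, min_gap_seconds=10.0):
        self.fetch_keys = fetch_keys      # hypothetical: returns {kid: public_key}
        self.ttl = ttl_seconds            # how long a successful fetch stays fresh
        self.grace = grace_seconds        # how long an unpublished key is still honored
        self.min_gap = min_gap_seconds    # throttle so unknown kids cannot force constant refetches
        self.keys = {}                    # kid -> (public_key, last time it was published)
        self.fetched_at = None
        self.last_attempt = None
        self.refresh_failures = 0         # export and alert on this counter

    def _refresh(self, now):
        self.last_attempt = now
        try:
            fresh = self.fetch_keys()
        except Exception:
            self.refresh_failures += 1    # keep serving cached keys on failure
            return
        for kid, key in fresh.items():
            self.keys[kid] = (key, now)
        # Drop keys that have been absent from the published set longer than the grace period.
        self.keys = {
            kid: (key, seen) for kid, (key, seen) in self.keys.items() if now - seen <= self.grace
        }
        self.fetched_at = now

    def get(self, kid, now):
        expired = self.fetched_at is None or now - self.fetched_at >= self.ttl
        can_retry = self.last_attempt is None or now - self.last_attempt >= self.min_gap
        if (expired or kid not in self.keys) and can_retry:
            self._refresh(now)
        entry = self.keys.get(kid)
        return entry[0] if entry else None
```

The design choice worth calling out is that a failed refresh does not evict anything, so a brief auth-provider outage degrades into stale-but-valid keys instead of rejecting every request.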

Failure modes

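  • Gateway as bottleneck: keep gateway instances stateless so they scale horizontally, and cache JWKS public keys so token validation does not call the auth provider on every request.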
  • Misrouting: bad config sends traffic to wrong cluster/version.
  • Large payloads through gateway: memory and timeout limits; sometimes direct upload to blob-storage with signed URLs.
  • Retry amplification: automatic retries at the edge can multiply load during partial outages; use bounded retries, jitter, and circuit breakers.
  • Policy drift: gateway auth rules and service auth rules disagree; test both and keep ownership clear.
  • Dependency failures: auth provider, rate-limit store, or config service outage forces a clear fail-open vs fail-closed policy per route.
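The retry-amplification bullet mentions bounded retries and jitter. A minimal sketch of that combination follows, assuming `ConnectionError` is the only error treated as transient and that the attempt cap, delays, and deadline are placeholder values. A circuit breaker is not shown here.

```python
import random
import time


def call_with_retries(call, max_attempts=3, base_delay=0.05, max_delay=1.0,
                      deadline=2.0, sleep=time.sleep, clock=time.monotonic,
                      rng=random.random):
    """Call `call()` with a capped number of attempts, full jitter, and a total time budget."""
    start = clock()
    last_error = None
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError as exc:  # assumption: only this error is worth retrying
            last_error = exc
        if attempt == max_attempts - 1:
            break  # attempt budget exhausted
        # Full jitter: wait a random time up to the exponential backoff cap.
        delay = rng() * min(max_delay, base_delay * 2 ** attempt)
        if clock() - start + delay > deadline:
            break  # honoring the overall deadline matters more than one more try
        sleep(delay)
    raise last_error
```

The `sleep`, `clock`, and `rng` parameters exist only so the behavior can be tested without real waiting. If both the gateway and the services retry, the attempt counts multiply, which is why only one layer should usually own retries for a given dependency.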

Common mistakes

  • Putting orchestration-heavy business workflows in the gateway instead of a BFF or application service.
  • Treating gateway authentication as the only authorization layer.
  • Rolling out route or policy config without validation, canaries, rollback, and audit trails.
  • Using the gateway for large uploads/downloads when signed object-storage URLs would remove app-server byte handling.
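For the point about rolling out route config without validation or canaries, here is one possible sketch: validate a traffic split before it is accepted, then assign each user to a version deterministically so the same user keeps seeing the same backend. The split format (`{version: integer percent}`), the function names, and hashing on a user id are assumptions for illustration.

```python
import hashlib


def validate_split(split):
    """Reject a traffic split unless it is non-empty, non-negative, and sums to 100."""
    if not split:
        raise ValueError("traffic split is empty")
    for version, weight in split.items():
        if not isinstance(weight, int) or weight < 0:
            raise ValueError(f"weight for {version!r} must be a non-negative integer")
    if sum(split.values()) != 100:
        raise ValueError("weights must sum to 100")


def pick_version(split, user_id):
    """Map a user to a version by hashing the id into one of 100 buckets."""
    validate_split(split)
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    cumulative = 0
    for version, weight in sorted(split.items()):  # sorted for a stable bucket layout
        cumulative += weight
        if bucket < cumulative:
            return version
    raise AssertionError("unreachable: weights sum to 100")
```

Rollback in this model is just publishing the previous split again, which is one reason to keep config versions and an audit trail.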

Operational signals

  • Gateway request rate, p95/p99 latency, 4xx/5xx/429 rates, upstream timeout rate, retry count, and saturation.
  • Per-route error budgets and backend dependency latency.
  • Config version, rollout status, route-table changes, certificate expiry, JWKS refresh failures, and auth provider failures.

Interview talking points

  • Separate edge security from service-level authorization: the gateway establishes who the caller is and rejects obviously invalid traffic; each microservice decides what that caller may do under its own domain rules.
  • Mention observability: correlation ids through gateway to all services.
  • Tie limits and timeouts to latency-throughput SLOs and back-of-envelope capacity.
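A back-of-envelope capacity estimate of the kind mentioned above can be written down in a few lines. The sketch uses Little's law (requests in flight = arrival rate × mean latency) and a simple headroom factor; every number in the usage note is hypothetical, and the function names are made up for this example.

```python
import math


def in_flight_requests(requests_per_sec, mean_latency_sec):
    """Little's law: average concurrent requests = arrival rate * mean time in system."""
    return requests_per_sec * mean_latency_sec


def instances_needed(peak_rps, per_instance_rps, target_utilization=0.6):
    """Gateway instances required if each one should run at or below the target utilization."""
    return math.ceil(peak_rps / (per_instance_rps * target_utilization))
```

With assumed figures of 20,000 requests per second at peak, 50 ms mean latency, and 5,000 requests per second per instance, this gives roughly 1,000 requests in flight and 7 instances at 60% utilization. The in-flight number also bounds connection pools and helps sanity-check upstream timeout budgets.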

Interview answer shape

  1. Start with clients and traffic: web/mobile/partners, public APIs, request volume, payload size, and security needs.
  2. Place the gateway at the edge for TLS, auth checks, quotas, CORS, request validation, routing, and observability.
  3. Keep domain authorization and business invariants in backend services; the gateway passes identity and context.
  4. Define route config, canary rollout, timeout budgets, retries, and direct-upload bypass for large payloads.
  5. Close with operations: horizontal scaling, rate-limit stores, config rollback, traces, and failure policy when auth or routing dependencies fail.

Common follow-ups: API gateway vs load balancer, gateway vs service mesh, BFF vs generic gateway, WAF placement, canary routing, signed URLs for uploads, and how to avoid retry storms.

Follow-up answers

  • API gateway vs load balancer: the load balancer distributes traffic across healthy targets; the gateway adds API-aware policy such as identity checks, quotas, request validation, route-level timeouts, and response shaping. Many systems use both.
  • Gateway vs service mesh: the gateway protects north-south client ingress. The mesh controls east-west service traffic, mTLS, retries, and telemetry after traffic is already inside the system.
  • Where WAF/CDN fit: CDN and WAF often sit before the gateway to absorb static traffic, block obvious attacks, and reduce origin load. The gateway still owns API identity, routing, and quotas.
  • Large uploads: authenticate at the gateway or app API, issue a short-lived signed object-storage URL, and let the client upload bytes directly to storage so gateway memory and timeout budgets are not consumed.
  • Retry storms: cap retries, add jitter, honor upstream deadlines, use circuit breakers, and avoid every layer retrying the same failed dependency independently.
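The large-upload answer relies on short-lived signed URLs. Real object stores ship their own signing schemes, so the following is only a generic HMAC illustration of the idea: the URL layout, the `user`, `expires`, and `sig` query parameters, and both function names are assumptions, not any vendor's API.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def _signature(object_key, user_id, expires_at, secret):
    # Sign everything the storage side must trust: method, object, user, and expiry.
    payload = f"PUT\n{object_key}\n{user_id}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_upload_url(base_url, object_key, user_id, expires_at, secret):
    """Issued by the gateway or app API after the caller has been authenticated."""
    query = urlencode({
        "user": user_id,
        "expires": expires_at,
        "sig": _signature(object_key, user_id, expires_at, secret),
    })
    return f"{base_url}/{object_key}?{query}"


def verify_upload(object_key, user_id, expires_at, signature, secret, now):
    """Checked by the storage side; no call back to the gateway is needed."""
    if now > expires_at:
        return False  # link has expired
    expected = _signature(object_key, user_id, expires_at, secret)
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

Because the expiry and object key are part of the signed payload, a client cannot extend the link or redirect the upload to a different object without invalidating the signature.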

Interview questions

  1. Where would you place authentication, authorization, rate limiting, and request validation in a microservice system?
  2. How do you prevent an API gateway from becoming a single bottleneck or single point of failure?
  3. When would you use an API gateway, a load balancer, a service mesh, and a BFF together?

Memory hooks

  • Gateway = public edge policy + route selection + observability.
  • Services still own domain authorization and business invariants.
  • Large bytes should usually bypass the gateway with signed object-storage URLs.
