THN Interview Prep

Performance & Optimization

Performance engineering is measurement → hypothesis → single change → proof, tied to a user- or revenue-facing metric. Without that anchor, “optimization” becomes guesswork and resume bullet theater.

How to use this page

  • Use Core basics as a vocabulary and mental model checklist.
  • Use Profiling playbooks as literal tool menus (what you open first).
  • Use Recognition cues to avoid mis-classifying a queueing problem as “CPU.”
  • Run Study sessions with real traces or realistic invented waterfalls.

Topic study plan (deep pages)

Topic notes: /performance/topics/... Page structure: Core details → Understanding → Senior understanding → Diagram.

| Topic | Focus |
| --- | --- |
| Tail latency & SLOs | Percentiles, error budgets, queueing intuition |
| Profiling the browser | Long tasks, layout, network waterfall |
| Profiling services & async | CPU vs wait, pools, traces |
| Database query path | Plans, stats, locks, replicas |
| Caching & retry amplification | Stampede, jitter, hedging cautions |

Authoring template (not in sidebar): `content/core-docs/performance/topics/topic-page-template.mdx` (`publishDocs: false`).


Core basics — decompose wall time

For any hop (browser frame, service request, DB query), break latency into:

| Bucket | What it usually is | First instruments |
| --- | --- | --- |
| Queueing | waiting for thread, pool, partition lock, event-loop turn | pool depth, sched latency, queue depth |
| Compute | hot code, regex, (de)serialization | CPU flame / sample |
| IO | remote calls, disk, replication | span waterfall, iostat class tools |
| Serialization | JSON/protobuf size, encoding | payload metrics, alloc profile |
| Coordination | locks, barriers, chatty fan-out | contention profiles, trace spans |

Interview line: “I’d prove which bucket dominates before tuning—otherwise you ‘speed up’ the wrong stage.”
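A minimal sketch of that bucket accounting, assuming each span has already been labelled with one of the five buckets and that spans do not overlap. The span names and durations are invented for illustration; real values would come from your tracer.

```python
from collections import defaultdict

# Hypothetical spans for one request: (name, bucket, duration in ms).
# Names and numbers are invented; real values come from your tracer.
SPANS = [
    ("wait_for_db_connection", "queueing", 180.0),
    ("parse_request_json", "serialization", 6.0),
    ("price_rules", "compute", 22.0),
    ("db_query", "io", 35.0),
    ("cache_lock", "coordination", 7.0),
]


def bucket_shares(spans):
    """Return {bucket: fraction of total time}, largest first."""
    totals = defaultdict(float)
    for _name, bucket, duration_ms in spans:
        totals[bucket] += duration_ms
    wall_ms = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {bucket: ms / wall_ms for bucket, ms in ranked}


shares = bucket_shares(SPANS)
dominant = next(iter(shares))
print(f"dominant bucket: {dominant} ({shares[dominant]:.0%} of wall time)")
```

In this invented trace the request spends most of its time queueing, so tuning `price_rules` would speed up the wrong stage.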

Little’s Law (qualitative)

In stable systems: average items in flight ≈ arrival rate × average time in system.
Sudden backlog growth often means saturation before CPU hits 100%.
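A small worked example of the relation above. The arrival rate, latencies, and `POOL_SIZE` are invented numbers chosen only to show how in-flight work can exceed a concurrency limit while CPU stays idle.

```python
def items_in_flight(arrival_rate_per_s: float, avg_time_in_system_s: float) -> float:
    """Little's Law: L = lambda * W (averages, stable system)."""
    return arrival_rate_per_s * avg_time_in_system_s


# Invented numbers: 200 req/s at 50 ms average latency.
healthy = items_in_flight(200, 0.050)   # about 10 requests in flight
# Same arrival rate, but a slow dependency pushes latency to 400 ms.
degraded = items_in_flight(200, 0.400)  # about 80 requests in flight

POOL_SIZE = 32  # hypothetical worker/connection limit
print(healthy, degraded, degraded > POOL_SIZE)
```

When the required in-flight count exceeds the limit, the excess waits in a queue, which is why backlog can grow while CPU utilization looks modest.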

Percentiles matter more than averages

| Stat | Good for |
| --- | --- |
| p50 / median | typical feel, capacity planning shorthand |
| p90–p95 | UX + many SLOs |
| high tails (p99+) | catastrophic retries / user abandonment |

Optimize the percentile your product SLO names, and say explicitly when you intentionally sacrificed the tail for the median (rare, but it must be deliberate).
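To see why the mean misleads, here is a sketch using a nearest-rank percentile on an invented latency sample. The `percentile` helper is illustrative; production systems usually rely on histogram-based estimates instead.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Invented latencies (ms): 95 fast requests, 5 that hit a slow path.
latencies_ms = [20] * 95 + [2000] * 5

mean = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(f"mean={mean} p50={p50} p95={p95} p99={p99}")
```

The mean (119 ms) describes no request that actually happened, and only the p99 exposes the 2-second slow path.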


Frontend performance (what to open first)

Chrome / Chromium Performance workflow

  1. Start recording → reproduce interaction once cleanly.
  2. Look for long tasks (main-thread blocks).
  3. Bottom-up: sort JS functions by self time.
  4. Layout markers: for flagged reflow/layout events, map each to the offending geometry read/write pattern.
  5. Network: check critical request priority and when the LCP candidate is discovered.

Heap memory investigations:

| Symptom | Tooling move |
| --- | --- |
| Detached DOM after route changes | Memory snapshot diff, dominator paths |
| Steady climb | Allocation timeline / sampling profiler |

Synthetic vs field:

  • Lighthouse (lab): reproducible, CI-friendly regression detection.
  • CrUX (field): validates behavior on low-end phones and flaky networks. You need both narratives.

Synergy: revisit /frontend for rendering & UX contracts.


Backend / services performance playbooks

CPU vs wall time split

Wall ≫ CPU ⇒ waiting (locks, pools, downstream). Profile async traces before micro-optimizing functions.
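A rough way to check the split in a single Python process, as a sketch: `time.perf_counter()` measures wall time and `time.process_time()` measures CPU time for the whole process (all threads), so sleeping or blocking does not count toward the latter. The two workload functions are stand-ins.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) for one call of fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def waiting_work():
    time.sleep(0.2)  # stands in for a lock, pool, or downstream wait


def computing_work():
    sum(i * i for i in range(300_000))  # stands in for hot code


for fn in (waiting_work, computing_work):
    wall, cpu = measure(fn)
    label = "mostly waiting" if cpu < 0.5 * wall else "mostly computing"
    print(f"{fn.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s -> {label}")
```

The 0.5 threshold is arbitrary; the point is only to decide which profiler to open next.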

Classic mistakes:

| Mistake | What it looks like | Fix direction |
| --- | --- | --- |
| Blocking the event loop | sporadic latency spikes | offload / non-blocking libs |
| Unbounded parallelism | downstream brownout | semaphores, bulkheads |
| Retry storm | cascading 429/503 | backoff + jitter + budgets |
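One way to bound parallelism is a semaphore around the downstream call. The sketch below uses `asyncio.Semaphore`; `MAX_IN_FLIGHT`, `call_downstream`, and the sleep standing in for a network call are all assumptions for illustration.

```python
import asyncio

MAX_IN_FLIGHT = 4  # hypothetical limit; size it from downstream capacity


async def call_downstream(item, limiter, stats):
    async with limiter:  # waits here instead of piling onto the dependency
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stands in for the real network call
        stats["in_flight"] -= 1
        return item * 2


async def fan_out(items):
    limiter = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(
        *(call_downstream(i, limiter, stats) for i in items)
    )
    return results, stats["peak"]


results, peak = asyncio.run(fan_out(range(20)))
print(f"{len(results)} calls done, peak concurrency {peak}")
```

All 20 calls still complete, but the dependency never sees more than `MAX_IN_FLIGHT` at once; the excess waits locally, where it is visible as queueing.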

Pools & saturation signals

Monitor the time spent waiting to acquire DB connections, not only query duration. An exhausted pool looks like a slow database even when the queries themselves are fast.
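A toy illustration of that signal, assuming a queue-backed pool. `TimedPool` is a hypothetical class written for this page, not a real driver API; most real pools expose an equivalent acquire-wait metric.

```python
import queue
import threading
import time


class TimedPool:
    """Toy connection pool that records how long callers wait to acquire."""

    def __init__(self, size):
        self._free = queue.Queue()
        for n in range(size):
            self._free.put(f"conn-{n}")  # placeholder for a real connection
        self.acquire_waits_s = []
        self._lock = threading.Lock()

    def acquire(self, timeout_s=5.0):
        start = time.perf_counter()
        conn = self._free.get(timeout=timeout_s)  # raises queue.Empty on timeout
        waited = time.perf_counter() - start
        with self._lock:
            self.acquire_waits_s.append(waited)
        return conn

    def release(self, conn):
        self._free.put(conn)


def handle_request(pool):
    conn = pool.acquire()
    try:
        time.sleep(0.05)  # the "query" itself always takes about 50 ms
    finally:
        pool.release(conn)


pool = TimedPool(size=2)
workers = [threading.Thread(target=handle_request, args=(pool,)) for _ in range(6)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(f"max acquire wait: {max(pool.acquire_waits_s) * 1000:.0f} ms")
```

Every query takes the same 50 ms, yet the last callers see much higher end-to-end latency purely from waiting on the pool.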

Tracing must carry a budget: propagate the parent deadline to child calls so they can abandon work that can no longer finish in time.
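A single-process sketch of deadline propagation. The stage names, costs, and `DeadlineExceeded` are hypothetical; across real service boundaries you would typically send the remaining budget as a timeout rather than an absolute timestamp, because clocks differ between hosts.

```python
import time


class DeadlineExceeded(Exception):
    pass


def remaining_s(deadline):
    """Seconds left before an absolute time.monotonic() deadline."""
    return deadline - time.monotonic()


def call_child(name, deadline, cost_s):
    """Hypothetical downstream call that refuses work it cannot finish in time."""
    if remaining_s(deadline) < cost_s:
        raise DeadlineExceeded(f"{name}: skipped, budget too small")
    time.sleep(cost_s)  # stands in for the real call
    return f"{name}: ok"


def handle_request(budget_s):
    deadline = time.monotonic() + budget_s  # set once at the edge
    outcomes = []
    for name, cost_s in (("auth", 0.02), ("search", 0.05), ("ranking", 0.10)):
        try:
            outcomes.append(call_child(name, deadline, cost_s))
        except DeadlineExceeded as exc:
            outcomes.append(str(exc))
            break  # no point starting later stages either
    return outcomes


print(handle_request(budget_s=0.1))
```

The stage that cannot fit in the remaining budget is skipped up front instead of running to completion after the caller has already given up.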

Synergy: /backend, /databases.


Database performance fundamentals

Read path

Steps an interviewer expects you to articulate:

| Step | Investigation |
| --- | --- |
| Find query text | pg_stat_statements class metrics / slow log |
| Get plan shape | EXPLAIN ANALYZE (vendor equivalent) |
| Check row estimates | cardinality / stats freshness story |
| Check access path | sequential vs index scan vs bitmap, and why it was chosen |
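The plan-shape and access-path steps can be practiced locally. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` only because SQLite ships with Python; it is a stand-in for `EXPLAIN ANALYZE` on your production database, and the `orders` table and index name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql):
    """Return SQLite's plan rows as one string (last column is the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return "; ".join(row[-1] for row in rows)


before = plan(QUERY)  # expect a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.execute("ANALYZE")  # refresh planner statistics
after = plan(QUERY)  # expect a search using the new index

print("before:", before)
print("after: ", after)
```

Comparing the two plan strings is the habit to build: state the access path you expect, then confirm the planner agrees.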

Staff nuance: sometimes the “wrong index” diagnosis is itself wrong. The fix is a query rewrite (covering projection, narrower SELECT, lateral batching patterns), not index soup.

Write path & amplification

Every additional index raises write cost; hot updates on wide indexes cause bloat and contention. State that trade-off explicitly.

Replication & consistency illusion

Stale read replicas surface as intermittent “bugs” unless user-visible freshness cues are engineered. Misclassifying this consistency problem as a raw performance issue has historically been common.


Caching layers & invalidation realism

| Cache | Typical failure mode | Interview mitigation story |
| --- | --- | --- |
| Browser HTTP | leaking auth’d HTML/CDN mishap | keyed URLs, surrogate keys |
| App local | stampedes after TTL | jitter + coalesce |
| Shared remote | stale business decisions | versioning + negative caching caution |
| ORM/session | phantom staleness layering | TTL + explicit invalidation events |

Thundering herd: explain probabilistic early refresh conceptually—even if vendor-specific implementation deferred.
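A conceptual sketch of probabilistic early refresh, modeled on the commonly cited exponential early-expiration rule: each caller draws a random head start scaled by the recompute cost and refreshes if that head start reaches past the expiry. The function name, the `beta` tuning knob, and the scenario numbers are assumptions for illustration.

```python
import math
import random


def should_refresh_early(now, expiry, recompute_cost_s, beta=1.0, rng=random.random):
    """Decide whether this caller should refresh before the entry expires.

    The closer `now` is to `expiry` (and the costlier the recompute),
    the more likely a single caller volunteers, so refreshes spread out
    instead of every caller missing at the same instant.
    """
    u = 1.0 - rng()  # in (0, 1], avoids log(0)
    head_start = -recompute_cost_s * beta * math.log(u)
    return now + head_start >= expiry


# Invented scenario: entry expires at t=100 s, recompute takes 2 s.
random.seed(7)
for now in (90.0, 97.0, 99.5):
    trials = 10_000
    hits = sum(should_refresh_early(now, 100.0, 2.0) for _ in range(trials))
    print(f"t={now}: {hits / trials:.1%} of callers would refresh early")
```

Far from expiry almost nobody refreshes; close to expiry most callers would, so in practice one of them repopulates the entry before the hard miss occurs.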


Load shapes & amplification

| Pattern | Hazard | Controls |
| --- | --- | --- |
| Retries without jitter | harmonic spikes | capped retries + backoff |
| Global fan-out timeouts | herd release | concurrency limits |
| Periodic cron alignment | spikes | jitter scheduling / rate smoothing |
| Autoscale lag | cold queue growth | provisioning / queue absorb strategy |
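For the first row, a sketch of capped exponential backoff with full jitter and a per-call retry cap. The defaults (`base_s`, `cap_s`, `max_retries`) and the use of `ConnectionError` as the retryable failure are assumptions; a fleet-wide retry budget would need shared state that this sketch does not model.

```python
import random


def backoff_delays(max_retries=5, base_s=0.1, cap_s=5.0, rng=random.uniform):
    """Yield one sleep duration per retry: capped exponential with full jitter."""
    for attempt in range(max_retries):
        ceiling = min(cap_s, base_s * 2 ** attempt)
        yield rng(0.0, ceiling)  # full jitter: anywhere in [0, ceiling]


def call_with_retries(operation, sleep, max_retries=5):
    """Run operation(); retry on ConnectionError up to max_retries times."""
    delays = backoff_delays(max_retries=max_retries)
    while True:
        try:
            return operation()
        except ConnectionError:
            delay = next(delays, None)
            if delay is None:
                raise  # retry allowance for this call is spent
            sleep(delay)


# Usage with a fake flaky operation (fails twice, then succeeds).
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("simulated 503")
    return "ok"


slept = []
result = call_with_retries(flaky, sleep=slept.append)
print(result, "after", attempts["n"], "attempts")
```

Passing `sleep` in as a parameter keeps the example fast and testable; in real code it would be `time.sleep` or an async equivalent.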

Cold starts (serverless / JVM warmup): quantify the cold-start penalty's effect on tail latency, and tie it to the cost of provisioned concurrency.


Reliability interplay (error budgets)

Shrinking tails often costs money, complexity, or correctness windows. Reconcile the trade explicitly with the reliability error budget when relevant; not every perf win is unconditionally “good ops.”


Recognition cues (symptoms → drills)

| Symptom | Split first | Drill |
| --- | --- | --- |
| High CPU but low QPS | user vs syscall vs GC | segregated profiler views |
| Memory climb steady | leak vs caching | snapshot diff timelines |
| API slow intermittent | tails vs saturation | percentile trace overlays |
| DB CPU low but latency high | waiting / locks / replicas | waits / locks / replication dashboard |
| Hot key / skew | uneven shard load | resharding narrative + cache pad |

Staff follow-ups: “What dashboard would pre-detect recurrence?” “What synthetic probe fails CI next time?”


Memory hooks

  • USE per bottlenecked resource (Utilization, Saturation, Errors).
  • One change per hypothesis: no multi-variable mystery PRs pretending to offer clarity.
  • Little’s Law backlog intuition: queue depth follows from arrival rate × time in system.

Study sessions (timed)

Session R — Incident replay (35 min)

Reconstruct chronologically: symptom metrics → narrowing experiment → causal commit → remediation → added preventative guard and instrumentation.

Session P — Flame/waterfall reading (25 min)

Use a sanitized internal capture or invent a plausible shape; annotate three hypotheses, each with a validation step.

Session S — Constraint swap (12 min verbal)

Alternate the characterization (CPU-bound ↔ IO-bound ↔ memory-bound) and, for each, restate the diagnostic ordering and mitigations verbally, without notes.


Diagrams

[Diagram: Bottleneck narrowing]

[Diagram: Retry amplification cartoon]

Pitfalls

  • Reporting mean latency only while p95/p99 regress—retries hide the user story until production screams.
  • Index theater: adding indexes without measuring write amplification, bloat, and maintenance windows.
  • Tuning micro-benchmark laptops while production traffic is skewed, bursty, and multi-tenant.
  • Shipping caches without staleness UX, especially for money, quotas, and entitlement counts.
  • Fixing CPU while latency is dominated by queueing—you win flame graphs but lose users.

  • /frontend — rendering + main-thread discipline.
  • /backend — timeouts, retries, pools, saturation.
  • /databases (if present in nav) — plans, isolation, replication lag.
  • /dsa — algorithmic hotspots when profiling points to asymptotics.
