THN Interview Prep

ID Generation

Distributed systems need unique, often sortable identifiers without a single global mutex. Interviewers probe whether you understand collision probability, clock skew, embeddability in URLs, and ordering guarantees.


Common strategies

UUID (v4 random)

  • Pros: trivial, no coordination, huge space.
  • Cons: not time-sortable; long as string primary keys in some DBs; index locality worse than monotonic ids.

Snowflake-style 64-bit integer

  • Structure (conceptual): timestamp | datacenter/worker | sequence within millisecond.
  • Pros: roughly time-ordered, compact, fits in BIGINT, good for indexes and logs.
  • Cons: needs worker id assignment; clock skew handling (NTP); sequence exhaustion if QPS per worker > sequence bits allow.

Database sequence / auto-increment

  • Pros: strong ordering, transactional simplicity.
  • Cons: single-writer bottleneck at extreme scale; not ideal for multi-region active-active without ranges.

Ticket / segment servers

  • Pre-allocate ranges of ids to services so generation stays local until range exhaustion — hybrid of DB safety and distributed throughput.

Interview talking points

  1. Ordering: Do we need strict total order or “good enough” time order for UX?
  2. Collision: Birthday paradox irrelevant for 122-bit UUID; matters if you shorten IDs (base62).
  3. Hot spots: Monotonic ids can create write hot shards in some DBs — mitigate with hash prefix sharding or ULID/tradeoffs.
  4. Clock skew: Snowflake nodes must reject generating ids when clock moves backward beyond tolerance.

See also: System design curriculum overview

Last updated on

Spotted something unclear or wrong on this page?

On this page