13. Union-Find (DSU)
TL;DR
A disjoint-set union structure maintains a partition of items into components and supports Find (which set?) and Union (merge two sets). With path compression on Find and union by rank (or size), each operation is amortized inverse-Ackermann α(n), effectively constant. Use DSU for dynamic connectivity, cycle detection in undirected graphs, Kruskal’s MST, and redundant edge problems.
Recognition Cues
- Phrases: “connected components,” “union edges,” “redundant connection,” “are
aandbin same group?”, “minimum spanning tree” with sorted edges. - Input:
nnodes, undirected edges or union queries. - Output: component count, whether graph stays acyclic, MST weight, or the redundant edge.
Diagram
Pattern control flow (customize nodes for this pattern). camelCase node IDs.
Mental Model
Each set is a tree whose root is the “representative.” Find walks to the root while flattening parents (path compression). Union attaches the shallower tree under the deeper root (by rank). Amortized α(n) grows so slowly it is < 5 for all realistic n.
Generic Recipe
- Initialize
parent[index] = index,rank[index] = 0. - Find: while
parent[x] != x, setparent[x] = parent[parent[x]](or full path compression at end). - Union:
rootA = Find(a),rootB = Find(b); if equal, skip (already connected — often cycle if adding edge). Else link smaller rank under larger; if ranks tie, increment new root’s rank. - For Kruskal: sort edges by weight;
Unionif roots differ; accumulate weight. - For components count: after unions, count roots where
parent[i] == i.
Complexity
- Time: O(α(n)) amortized per
Find/Union; Kruskal O(E log E) from sorting edges. - Space: O(n) for parent and rank arrays.
Generic Code Template
Go
type UnionFind struct {
parent []int
rank []int
}
func NewUnionFind(size int) *UnionFind {
parent := make([]int, size)
rank := make([]int, size)
for index := 0; index < size; index++ {
parent[index] = index
}
return &UnionFind{parent: parent, rank: rank}
}
func (unionFind *UnionFind) Find(node int) int {
if unionFind.parent[node] != node {
unionFind.parent[node] = unionFind.Find(unionFind.parent[node])
}
return unionFind.parent[node]
}
func (unionFind *UnionFind) Union(first int, second int) bool {
rootFirst := unionFind.Find(first)
rootSecond := unionFind.Find(second)
if rootFirst == rootSecond {
return false
}
if unionFind.rank[rootFirst] < unionFind.rank[rootSecond] {
unionFind.parent[rootFirst] = rootSecond
} else if unionFind.rank[rootFirst] > unionFind.rank[rootSecond] {
unionFind.parent[rootSecond] = rootFirst
} else {
unionFind.parent[rootSecond] = rootFirst
unionFind.rank[rootFirst]++
}
return true
}
func (unionFind *UnionFind) Connected(first int, second int) bool {
return unionFind.Find(first) == unionFind.Find(second)
}JavaScript
class UnionFind {
constructor(size) {
this.parent = Array.from({ length: size }, (_, index) => index);
this.rank = Array(size).fill(0);
}
find(node) {
if (this.parent[node] !== node) {
this.parent[node] = this.find(this.parent[node]);
}
return this.parent[node];
}
union(first, second) {
let rootFirst = this.find(first);
let rootSecond = this.find(second);
if (rootFirst === rootSecond) return false;
if (this.rank[rootFirst] < this.rank[rootSecond]) {
this.parent[rootFirst] = rootSecond;
} else if (this.rank[rootFirst] > this.rank[rootSecond]) {
this.parent[rootSecond] = rootFirst;
} else {
this.parent[rootSecond] = rootFirst;
this.rank[rootFirst] += 1;
}
return true;
}
connected(first, second) {
return this.find(first) === this.find(second);
}
}Variants
- Connected components count: start with
n, decrementUnionwhen merging distinct roots. - Detect cycle (undirected): on each new edge,
Unionreturnsfalse⇒ cycle closed. - Kruskal MST: DSU on vertices; process edges ascending weight.
- Redundant connection: last edge that joins already-connected endpoints (or variant II with directed roots).
Representative Problems
- 323. Number of Connected Components in an Undirected Graph — count components
- 547. Number of Provinces — adjacency matrix + DSU or DFS
- 684. Redundant Connection — cycle detection
- 1584. Min Cost to Connect All Points — Kruskal with implicit edges
Anti-patterns
- Union without checking roots first — wrong merges or double counting.
- Using DSU for directed cycle without adapting (use topo sort or specialized algorithms).
- Path compression without union by rank/size — still works but weaker guarantees in theory.
- Kruskal without sorting edges — breaks MST optimality.
Last updated on
Spotted something unclear or wrong on this page?