Union-Find (Disjoint Set Union)¶
Union-Find maintains a collection of disjoint sets with near-constant-time operations for merging sets and finding which set an element belongs to. Essential for Kruskal's MST, connected components, and cycle detection.
Key Facts¶
- find(x): Find root/representative of set containing x
- union(x, y): Merge sets containing x and y
- With path compression + union by rank: O(alpha(n)) amortized per operation
- alpha(n) = inverse Ackermann function, effectively O(1) for all practical n (alpha(n) <= 4 for n < 10^80)
- Space: O(n)
Patterns¶
Implementation¶
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py:
return False # already same set
# Union by rank
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
return True
Iterative Find with Path Halving¶
def find(self, x):
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]] # path halving
x = self.parent[x]
return x
Path halving (iterative) is simpler and has same asymptotic performance as full path compression (recursive).
Cycle Detection in Undirected Graph¶
def has_cycle(vertices, edges):
uf = UnionFind(len(vertices))
for u, v in edges:
if not uf.union(u, v):
return True # u and v already connected = cycle
return False
Count Connected Components¶
def count_components(n, edges):
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return len(set(uf.find(i) for i in range(n)))
Optimization Techniques¶
| Technique | Effect |
|---|---|
| Path compression | Flattens tree during find - all nodes point to root |
| Union by rank | Attaches shorter tree under taller - limits height to O(log n) |
| Both combined | Amortized O(alpha(n)) per operation |
| Neither | O(n) worst case per operation (degenerate chain) |
Gotchas¶
- Without path compression, find can be O(n) on a degenerate chain
- Union by rank alone gives O(log n) per find
unionreturns False when elements are already in same set - useful for cycle detection- The rank is an upper bound on height, not exact height (path compression doesn't update rank)
- For sparse graphs, Union-Find is more efficient than DFS/BFS for connectivity queries on dynamic graphs (edges added over time)
See Also¶
- [[minimum-spanning-trees]] - Kruskal's algorithm uses Union-Find
- [[graph-representation]] - Union-Find as alternative for dynamic connectivity
- [[graph-traversal-bfs-dfs]] - alternative for static connectivity queries