Union-Find Algorithm: Two Arrays That Make Connectivity Nearly Free

- Union-Find manages disjoint sets with two flat arrays (
parentandrank), enabling near-constant-time connectivity queries on millions of nodes. - Path compression rewrites every node on a find path to point directly to the root, flattening trees on each lookup so future finds are O(1).
- Union by rank always attaches the shorter tree under the taller one, bounding tree height at O(log n) without compression.
- Both optimizations together achieve O(α(n)) amortized time per operation, where α(n) ≤ 4 for any input you will ever feed a real program.
- Cycle detection: if
find(u) == find(v)before adding edge (u, v), the edge creates a cycle. This is exactly how Kruskal's MST algorithm works. - Dynamic connectivity and connected components are the primary interview use cases. DSU does not support edge deletion or set splits.
- Path halving is often the fastest variant in practice: fewer writes per find, trivially iterative, and the same asymptotic bound as full path compression.
Union-Find Algorithm: Two Arrays That Make Connectivity Nearly Free
You have a million nodes. You need to answer thousands of questions like "are these two nodes in the same group?" and "what happens when I merge these two groups?" And you need all of that to be, essentially, instantaneous.
That's the union find algorithm, also called Disjoint Set Union (DSU). You're managing a collection of non-overlapping groups with two operations: find which group something belongs to, and union two groups together. Both run in what is effectively constant time, backed by nothing more than two flat integer arrays. No heap allocation. No struct nodes. No drama.
Reach for it whenever you're tracking connectivity, grouping, or need to detect if adding an edge creates a cycle.
The Surprisingly Low-Tech Interior
The internal representation is a forest of trees, but you never allocate tree nodes. Every element from 0 to n-1 gets an index, and two arrays drive everything.
parent[i] : the parent of node i. If parent[i] == i, then i is a root (the representative of its group).
rank[i] : an upper bound on the height of the subtree rooted at i.
Initially, every node is its own root:
parent = [0, 1, 2, 3, 4]
rank = [0, 0, 0, 0, 0]
After union(0, 1) and union(1, 2):
parent = [0, 0, 0, 3, 4]
rank = [1, 0, 0, 0, 0]
Node 0 became the root of a group containing 0, 1, and 2. Nodes 3 and 4 are still their own roots.
Gold border = root. Blue border = non-root. Parent array values highlighted in amber show the changed pointers.
The key structural invariant: to find which group a node belongs to, you follow parent pointers up until you hit a node that points to itself. That root is the group's representative.
Why flat arrays beat pointer trees
You could represent this forest with struct Node { int parent; int rank; Node* parent_ptr; } and heap-allocate nodes. Don't. The array approach keeps parent and rank data in contiguous memory. When you chase parent pointers during a find, adjacent array slots are likely already in cache. Pointer-chasing through heap-allocated nodes causes a cache miss at every hop. The CPU is not happy. The arrays also eliminate allocation overhead and pointer storage entirely.
Core Operations
| Operation | Naive | Union by Rank only | Path Compression only | Both |
|---|---|---|---|---|
makeSet(v) | O(1) | O(1) | O(1) | O(1) |
find(v) | O(n) worst | O(log n) worst | O(log n) amortized | O(α(n)) amortized |
union(a, b) | O(n) worst | O(log n) worst | O(log n) amortized | O(α(n)) amortized |
| Space | O(n) | O(n) | O(n) | O(n) |
α(n) is the inverse Ackermann function. For every input size you will ever encounter in practice, α(n) ≤ 4. This is, for all practical purposes, constant time. More on that shortly.
makeSet
def make_set(v): parent[v] = v rank[v] = 0
Trivially O(1). Initialize parent to self, rank to zero.
find with path compression
The naive find just follows parent pointers to the root:
def find_naive(v): while parent[v] != v: v = parent[v] return v
This is O(n) worst case. If you union in a line, 0→1→2→3→...→n, finding 0 walks the whole chain.
Path compression fixes this. After you find the root, rewrite every node you visited to point directly to the root:
def find(v): if parent[v] != v: parent[v] = find(parent[v]) # recursive: repoints on the way back up return parent[v]
The first call is expensive. Every call after that is O(1). The chain collapses into a star on first contact.
The first call to find(0) on a deep chain is expensive. But it flattens the path. Every subsequent call is O(1). This amortizes to O(log n) per operation on its own.
union by rank
The naive union just attaches one root to the other:
def union_naive(a, b): parent[find(a)] = find(b)
Naive union without rank lets trees grow into chains. Union by rank prevents this by always attaching the shorter tree under the taller one:
def union(a, b): root_a = find(a) root_b = find(b) if root_a == root_b: return # already in the same set if rank[root_a] < rank[root_b]: root_a, root_b = root_b, root_a parent[root_b] = root_a if rank[root_a] == rank[root_b]: rank[root_a] += 1
Left: equal ranks, rank increments. Right: unequal ranks, smaller joins under larger, rank stays the same.
The rank invariant that makes this work: a tree of rank r contains at least 2^r nodes. Proof by induction: a rank-0 tree has one node (base case). A rank-r tree forms only when two rank-(r-1) trees merge, so it has at least 2^(r-1) + 2^(r-1) = 2^r nodes. Since a tree can have at most n nodes, its rank is at most log₂(n). Height is bounded by rank, so find without path compression is O(log n) worst case.
Why Both Together Gives α(n)
Path compression alone gives O(log n) amortized. Union by rank alone gives O(log n) worst case. Combining them gives O(α(n)) amortized, proven by Robert Tarjan in 1975 and refined with Jan van Leeuwen in 1984.
The formal proof appears in §21.4 of CLRS and is genuinely involved. The intuition is this: path compression continually flattens trees, and union by rank ensures the trees never got deep enough to make compression expensive in the first place. Each operation performs a small amount of "compression work," but that work is borrowed from the future savings it creates. The accounting works out to α(n) amortized per operation across any sequence of m operations on n elements.
The combined complexity is also optimal. Every disjoint-set data structure must use Ω(α(n)) time per operation.
The Inverse Ackermann Function (and Why α(n) ≤ 4 Is Funny)
The Ackermann function A(m, n) grows absurdly fast:
A(0, n) = n + 1
A(1, n) = n + 2
A(2, n) = 2n + 3
A(3, n) = 2^(n+3) − 3
A(4, n) ≈ a tower of 2s of height (n + 3)
A(4, 4) is a power tower 2^(2^(2^(65536))). It has more digits than atoms in the observable universe. Your laptop will be dead. The sun will be dead. The heat death of the universe will have come and gone. And that number is still just a number that a function can produce.
α(n) is defined as the minimum k such that A(k, k) ≥ n. So α(n) = 4 requires n > A(4, 4). For any n you would ever feed a computer program, α(n) ≤ 4. Full stop.
This is why practitioners call Union-Find "essentially constant time." It's not a hand wave. It's a proven bound on the inverse of a function that dwarfs anything physically realizable. The universe did not cooperate with making this algorithm slow.
Path Compression Variants
Full compression (shown above) is recursive and rewrites all pointers on the way back up the call stack. Two iterative alternatives achieve the same asymptotic bound and avoid stack overhead.
Path splitting: every node on the path gets its parent set to its grandparent.
def find_path_splitting(v): while parent[v] != v: parent[v], v = parent[parent[v]], parent[v] return v
Path halving: same idea, but only every other node is updated.
def find_path_halving(v): while parent[v] != v: parent[v] = parent[parent[v]] v = parent[v] return v
Both still achieve O(α(n)) amortized when combined with union by rank. Path halving is often the fastest in practice because it does fewer writes per find and is trivially iterative.
Rank vs Size
Union by rank tracks an upper bound on tree height. Union by size tracks the actual count of elements in a set. Both strategies prevent chain formation, and both give O(log n) worst case for find without compression and O(α(n)) amortized with compression.
Size is slightly simpler to reason about and more useful if you need to query group sizes. Rank is marginally cheaper in some implementations because you only increment when two equal-rank roots merge, while size always updates.
One subtle point: once you add path compression, rank is no longer the exact height of the tree. It's an upper bound. Nodes get repointed by compression, but rank never decreases. So "rank" is a bit of a misnomer after path compression has run. The name lies, but the math holds.
Implementations
class UnionFind: def __init__(self, n: int): self.parent = list(range(n)) self.rank = [0] * n self.size = [1] * n def find(self, v: int) -> int: if self.parent[v] != v: self.parent[v] = self.find(self.parent[v]) return self.parent[v] def union(self, a: int, b: int) -> bool: root_a = self.find(a) root_b = self.find(b) if root_a == root_b: return False if self.rank[root_a] < self.rank[root_b]: root_a, root_b = root_b, root_a self.parent[root_b] = root_a self.size[root_a] += self.size[root_b] if self.rank[root_a] == self.rank[root_b]: self.rank[root_a] += 1 return True def connected(self, a: int, b: int) -> bool: return self.find(a) == self.find(b) def get_size(self, v: int) -> int: return self.size[self.find(v)]
What Problems It Solves
Dynamic connectivity. You have a graph where edges arrive over time. After each edge, answer "are these two nodes connected?" DSU handles this in α(n) per query. It does not handle edge deletions, that requires link-cut trees, which are a whole other conversation.
Connected components. After processing all edges, the number of distinct roots in your parent array equals the number of connected components. Counting how many nodes satisfy parent[i] == i gives you the count.
Cycle detection in undirected graphs. Before adding edge (u, v): if find(u) == find(v), adding this edge creates a cycle. This is exactly how Kruskal's MST algorithm works. Process edges in order of weight, add each edge only if its endpoints are in different components.
Grouping and equivalence classes. Account merging (if two accounts share an email, they belong to the same person), pixel grouping in image segmentation, network clustering. Any problem where "same group" is transitive.
Offline Lowest Common Ancestor. Tarjan's offline LCA algorithm uses DSU to find the LCA of query pairs in a single DFS pass over the tree, achieving O(α(n)) per query.
When to Use the Union-Find Algorithm
The signals that a problem calls for Union-Find:
- You're asked about "connected components," "groups," or "clusters" that grow over time.
- Edges or relationships arrive incrementally and you need connectivity queries after each one.
- You need to detect whether adding a connection creates a cycle.
- Two elements are "equivalent" if there is any path connecting them through some transitivity rule.
- You see the word "merge" or "union" applied to sets where you never need to split them apart.
That last bullet is the one people miss. DSU supports union but not split. It can merge groups all day. Ask it to remove an element from a group and it just stares at you. If your problem involves undoing a merge or splitting a component, DSU is not your tool.
Worked example: Number of Islands II (LeetCode 305)
Problem: You have an m x n grid of water. You receive a sequence of operations, each turning one cell to land. After each operation, report the number of islands.
Why DSU fits: You're adding land cells incrementally (union operations) and querying connectivity after each addition. You never remove land. The "group" is an island, defined by 4-directional adjacency.
Approach: Map each cell (r, c) to an index r * n + c. Maintain a DSU over all m*n indices. Keep a count of current islands.
When a new land cell is added at (r, c):
- Increment island count (this cell starts as its own island).
- For each of the 4 neighbors: if the neighbor is land and in a different component, call
union. Each successful union decrements island count by 1 (two islands merged into one).
def num_islands_ii(m: int, n: int, positions: list[list[int]]) -> list[int]: uf = UnionFind(m * n) land = set() island_count = 0 results = [] directions = [(0, 1), (0, -1), (1, 0), (-1, 0)] for row, col in positions: index = row * n + col if index in land: results.append(island_count) continue land.add(index) island_count += 1 for dr, dc in directions: neighbor_row, neighbor_col = row + dr, col + dc neighbor_index = neighbor_row * n + neighbor_col if (0 <= neighbor_row < m and 0 <= neighbor_col < n and neighbor_index in land): if uf.union(index, neighbor_index): island_count -= 1 results.append(island_count) return results
Incremental additions map to union, connectivity queries map to find, and the island count updates as a side effect of whether union actually merged two different components.
Worked example: Redundant Connection (LeetCode 684)
Problem: A tree of n nodes has one extra edge added, creating exactly one cycle. Find and return that extra edge.
Why DSU fits: You're building a graph edge by edge and need to detect the first edge that would create a cycle. That's precisely the "already in the same component" check.
Approach: Process edges in order. For each edge (u, v): if find(u) == find(v), this edge connects two nodes already in the same component. It's the redundant edge. Otherwise, call union(u, v) and continue.
def find_redundant_connection(edges: list[list[int]]) -> list[int]: uf = UnionFind(len(edges) + 1) for u, v in edges: if not uf.union(u, v): return [u, v] return []
Eight lines. The DSU encoding of "are these already connected?" is exactly the cycle detection condition. No drama.
Recap
- Union-Find manages a forest of disjoint sets using two flat arrays:
parentandrank. find(v)follows parent pointers to the root. Path compression rewrites all visited nodes to point directly to the root.union(a, b)merges two trees by attaching the smaller-ranked root under the larger-ranked one.- Union by rank alone: O(log n) worst case per operation.
- Path compression alone: O(log n) amortized per operation.
- Both together: O(α(n)) amortized, where α(n) ≤ 4 for every practically realizable input.
- The inverse Ackermann bound is proven optimal. You cannot do better than α(n) amortized.
- Use it for: dynamic connectivity, connected components, cycle detection, grouping by equivalence. Not for: problems requiring set splits or edge deletions.
- The
unionreturn value (whether two different components were merged) is often the key signal in problems like cycle detection and island counting.
If you want to practice recognizing when Union-Find fits versus when a plain BFS/DFS will do, SpaceComplexity runs you through voice-based mock interviews where a real interviewer prompt (not a multiple choice quiz) forces you to justify your data structure choice under pressure. That's where pattern recognition actually sticks.
For more on graph algorithms that pair naturally with DSU, see our deep dives on dynamic programming patterns and recognizing the right approach from problem signals.
Further reading
- Disjoint-set data structure, Wikipedia
- Disjoint Set Union, CP-Algorithms
- Union-Find Algorithm, GeeksforGeeks
- Tarjan, R.E. (1975). "Efficiency of a Good But Not Linear Set Union Algorithm." Journal of the ACM, 22(2), 215-225.
- Cormen et al., Introduction to Algorithms (CLRS), §21: Data Structures for Disjoint Sets.