Union-Find: Two Arrays That Make Connectivity Nearly Free

Union-Find Algorithm: Two Arrays That Make Connectivity Nearly Free

You have a million nodes. You need to answer thousands of questions like "are these two nodes in the same group?" and "what happens when I merge these two groups?" And you need all of that to be, essentially, instantaneous.

That's the union find algorithm, also called Disjoint Set Union (DSU). You're managing a collection of non-overlapping groups with two operations: find which group something belongs to, and union two groups together. Both run in what is effectively constant time, backed by nothing more than two flat integer arrays. No heap allocation. No struct nodes. No drama.

Reach for it whenever you're tracking connectivity, grouping, or need to detect if adding an edge creates a cycle.

The Surprisingly Low-Tech Interior

The internal representation is a forest of trees, but you never allocate tree nodes. Every element from 0 to n-1 gets an index, and two arrays drive everything.

parent[i]  : the parent of node i. If parent[i] == i, then i is a root (the representative of its group).
rank[i]    : an upper bound on the height of the subtree rooted at i.

Initially, every node is its own root:

parent = [0, 1, 2, 3, 4]
rank   = [0, 0, 0, 0, 0]

After union(0, 1) and union(1, 2):

parent = [0, 0, 0, 3, 4]
rank   = [1, 0, 0, 0, 0]

Node 0 became the root of a group containing 0, 1, and 2. Nodes 3 and 4 are still their own roots.

Forest state: five isolated nodes before any unions, then a tree rooted at 0 after union(0,1) and union(1,2) Gold border = root. Blue border = non-root. Parent array values highlighted in amber show the changed pointers.

The key structural invariant: to find which group a node belongs to, you follow parent pointers up until you hit a node that points to itself. That root is the group's representative.

Why flat arrays beat pointer trees

You could represent this forest with struct Node { int parent; int rank; Node* parent_ptr; } and heap-allocate nodes. Don't. The array approach keeps parent and rank data in contiguous memory. When you chase parent pointers during a find, adjacent array slots are likely already in cache. Pointer-chasing through heap-allocated nodes causes a cache miss at every hop. The CPU is not happy. The arrays also eliminate allocation overhead and pointer storage entirely.

Core Operations

Operation	Naive	Union by Rank only	Path Compression only	Both
`makeSet(v)`	O(1)	O(1)	O(1)	O(1)
`find(v)`	O(n) worst	O(log n) worst	O(log n) amortized	O(α(n)) amortized
`union(a, b)`	O(n) worst	O(log n) worst	O(log n) amortized	O(α(n)) amortized
Space	O(n)	O(n)	O(n)	O(n)

α(n) is the inverse Ackermann function. For every input size you will ever encounter in practice, α(n) ≤ 4. This is, for all practical purposes, constant time. More on that shortly.

makeSet

def make_set(v):
    parent[v] = v
    rank[v] = 0

Trivially O(1). Initialize parent to self, rank to zero.

find with path compression

The naive find just follows parent pointers to the root:

def find_naive(v):
    while parent[v] != v:
        v = parent[v]
    return v

This is O(n) worst case. If you union in a line, 0→1→2→3→...→n, finding 0 walks the whole chain.

Path compression fixes this. After you find the root, rewrite every node you visited to point directly to the root:

def find(v):
    if parent[v] != v:
        parent[v] = find(parent[v])  # recursive: repoints on the way back up
    return parent[v]

Path compression: a chain of five nodes before find(0), then all four non-root nodes pointing directly to the root after The first call is expensive. Every call after that is O(1). The chain collapses into a star on first contact.

The first call to find(0) on a deep chain is expensive. But it flattens the path. Every subsequent call is O(1). This amortizes to O(log n) per operation on its own.

union by rank

The naive union just attaches one root to the other:

def union_naive(a, b):
    parent[find(a)] = find(b)

Naive union without rank lets trees grow into chains. Union by rank prevents this by always attaching the shorter tree under the taller one:

def union(a, b):
    root_a = find(a)
    root_b = find(b)
    if root_a == root_b:
        return  # already in the same set
    if rank[root_a] < rank[root_b]:
        root_a, root_b = root_b, root_a
    parent[root_b] = root_a
    if rank[root_a] == rank[root_b]:
        rank[root_a] += 1

Union by rank: two cases, equal-rank trees merge and increment the root's rank, different-rank trees merge with the shorter tree attaching under the taller one, rank unchanged Left: equal ranks, rank increments. Right: unequal ranks, smaller joins under larger, rank stays the same.

The rank invariant that makes this work: a tree of rank r contains at least 2^r nodes. Proof by induction: a rank-0 tree has one node (base case). A rank-r tree forms only when two rank-(r-1) trees merge, so it has at least 2^(r-1) + 2^(r-1) = 2^r nodes. Since a tree can have at most n nodes, its rank is at most log₂(n). Height is bounded by rank, so find without path compression is O(log n) worst case.

Why Both Together Gives α(n)

Path compression alone gives O(log n) amortized. Union by rank alone gives O(log n) worst case. Combining them gives O(α(n)) amortized, proven by Robert Tarjan in 1975 and refined with Jan van Leeuwen in 1984.

The formal proof appears in §21.4 of CLRS and is genuinely involved. The intuition is this: path compression continually flattens trees, and union by rank ensures the trees never got deep enough to make compression expensive in the first place. Each operation performs a small amount of "compression work," but that work is borrowed from the future savings it creates. The accounting works out to α(n) amortized per operation across any sequence of m operations on n elements.

The combined complexity is also optimal. Every disjoint-set data structure must use Ω(α(n)) time per operation.

The Inverse Ackermann Function (and Why α(n) ≤ 4 Is Funny)

The Ackermann function A(m, n) grows absurdly fast:

A(0, n) = n + 1
A(1, n) = n + 2
A(2, n) = 2n + 3
A(3, n) = 2^(n+3) − 3
A(4, n) ≈ a tower of 2s of height (n + 3)

A(4, 4) is a power tower 2^(2^(2^(65536))). It has more digits than atoms in the observable universe. Your laptop will be dead. The sun will be dead. The heat death of the universe will have come and gone. And that number is still just a number that a function can produce.

α(n) is defined as the minimum k such that A(k, k) ≥ n. So α(n) = 4 requires n > A(4, 4). For any n you would ever feed a computer program, α(n) ≤ 4. Full stop.

This is why practitioners call Union-Find "essentially constant time." It's not a hand wave. It's a proven bound on the inverse of a function that dwarfs anything physically realizable. The universe did not cooperate with making this algorithm slow.

Path Compression Variants

Full compression (shown above) is recursive and rewrites all pointers on the way back up the call stack. Two iterative alternatives achieve the same asymptotic bound and avoid stack overhead.

Path splitting: every node on the path gets its parent set to its grandparent.

def find_path_splitting(v):
    while parent[v] != v:
        parent[v], v = parent[parent[v]], parent[v]
    return v

Path halving: same idea, but only every other node is updated.

def find_path_halving(v):
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

Both still achieve O(α(n)) amortized when combined with union by rank. Path halving is often the fastest in practice because it does fewer writes per find and is trivially iterative.

Rank vs Size

Union by rank tracks an upper bound on tree height. Union by size tracks the actual count of elements in a set. Both strategies prevent chain formation, and both give O(log n) worst case for find without compression and O(α(n)) amortized with compression.

Size is slightly simpler to reason about and more useful if you need to query group sizes. Rank is marginally cheaper in some implementations because you only increment when two equal-rank roots merge, while size always updates.

One subtle point: once you add path compression, rank is no longer the exact height of the tree. It's an upper bound. Nodes get repointed by compression, but rank never decreases. So "rank" is a bit of a misnomer after path compression has run. The name lies, but the math holds.

Implementations

class UnionFind:
    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.size = [1] * n

    def find(self, v: int) -> int:
        if self.parent[v] != v:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

    def union(self, a: int, b: int) -> bool:
        root_a = self.find(a)
        root_b = self.find(b)
        if root_a == root_b:
            return False
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1
        return True

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def get_size(self, v: int) -> int:
        return self.size[self.find(v)]

What Problems It Solves

Dynamic connectivity. You have a graph where edges arrive over time. After each edge, answer "are these two nodes connected?" DSU handles this in α(n) per query. It does not handle edge deletions, that requires link-cut trees, which are a whole other conversation.

Connected components. After processing all edges, the number of distinct roots in your parent array equals the number of connected components. Counting how many nodes satisfy parent[i] == i gives you the count.

Cycle detection in undirected graphs. Before adding edge (u, v): if find(u) == find(v), adding this edge creates a cycle. This is exactly how Kruskal's MST algorithm works. Process edges in order of weight, add each edge only if its endpoints are in different components.

Grouping and equivalence classes. Account merging (if two accounts share an email, they belong to the same person), pixel grouping in image segmentation, network clustering. Any problem where "same group" is transitive.

Offline Lowest Common Ancestor. Tarjan's offline LCA algorithm uses DSU to find the LCA of query pairs in a single DFS pass over the tree, achieving O(α(n)) per query.

When to Use the Union-Find Algorithm

The signals that a problem calls for Union-Find:

You're asked about "connected components," "groups," or "clusters" that grow over time.
Edges or relationships arrive incrementally and you need connectivity queries after each one.
You need to detect whether adding a connection creates a cycle.
Two elements are "equivalent" if there is any path connecting them through some transitivity rule.
You see the word "merge" or "union" applied to sets where you never need to split them apart.

That last bullet is the one people miss. DSU supports union but not split. It can merge groups all day. Ask it to remove an element from a group and it just stares at you. If your problem involves undoing a merge or splitting a component, DSU is not your tool.

Worked example: Number of Islands II (LeetCode 305)

Problem: You have an m x n grid of water. You receive a sequence of operations, each turning one cell to land. After each operation, report the number of islands.

Why DSU fits: You're adding land cells incrementally (union operations) and querying connectivity after each addition. You never remove land. The "group" is an island, defined by 4-directional adjacency.

Approach: Map each cell (r, c) to an index r * n + c. Maintain a DSU over all m*n indices. Keep a count of current islands.

When a new land cell is added at (r, c):

Increment island count (this cell starts as its own island).
For each of the 4 neighbors: if the neighbor is land and in a different component, call union. Each successful union decrements island count by 1 (two islands merged into one).

def num_islands_ii(m: int, n: int, positions: list[list[int]]) -> list[int]:
    uf = UnionFind(m * n)
    land = set()
    island_count = 0
    results = []
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    for row, col in positions:
        index = row * n + col
        if index in land:
            results.append(island_count)
            continue
        land.add(index)
        island_count += 1
        for dr, dc in directions:
            neighbor_row, neighbor_col = row + dr, col + dc
            neighbor_index = neighbor_row * n + neighbor_col
            if (0 <= neighbor_row < m and 0 <= neighbor_col < n
                    and neighbor_index in land):
                if uf.union(index, neighbor_index):
                    island_count -= 1
        results.append(island_count)

    return results

Incremental additions map to union, connectivity queries map to find, and the island count updates as a side effect of whether union actually merged two different components.

Worked example: Redundant Connection (LeetCode 684)

Problem: A tree of n nodes has one extra edge added, creating exactly one cycle. Find and return that extra edge.

Why DSU fits: You're building a graph edge by edge and need to detect the first edge that would create a cycle. That's precisely the "already in the same component" check.

Approach: Process edges in order. For each edge (u, v): if find(u) == find(v), this edge connects two nodes already in the same component. It's the redundant edge. Otherwise, call union(u, v) and continue.

def find_redundant_connection(edges: list[list[int]]) -> list[int]:
    uf = UnionFind(len(edges) + 1)
    for u, v in edges:
        if not uf.union(u, v):
            return [u, v]
    return []

Eight lines. The DSU encoding of "are these already connected?" is exactly the cycle detection condition. No drama.

Recap

Union-Find manages a forest of disjoint sets using two flat arrays: parent and rank.
find(v) follows parent pointers to the root. Path compression rewrites all visited nodes to point directly to the root.
union(a, b) merges two trees by attaching the smaller-ranked root under the larger-ranked one.
Union by rank alone: O(log n) worst case per operation.
Path compression alone: O(log n) amortized per operation.
Both together: O(α(n)) amortized, where α(n) ≤ 4 for every practically realizable input.
The inverse Ackermann bound is proven optimal. You cannot do better than α(n) amortized.
Use it for: dynamic connectivity, connected components, cycle detection, grouping by equivalence. Not for: problems requiring set splits or edge deletions.
The union return value (whether two different components were merged) is often the key signal in problems like cycle detection and island counting.

If you want to practice recognizing when Union-Find fits versus when a plain BFS/DFS will do, SpaceComplexity runs you through voice-based mock interviews where a real interviewer prompt (not a multiple choice quiz) forces you to justify your data structure choice under pressure. That's where pattern recognition actually sticks.

For more on graph algorithms that pair naturally with DSU, see our deep dives on dynamic programming patterns and recognizing the right approach from problem signals.