Union-Find (Disjoint Set)
Union-Find: The Digital Bureaucrat
The Story: The Party Plannerās Dilemma
Imagine you are hosting a massive party with 1,000 guests. You want to know if two random people, Alice and Bob, know each other.
- Alice knows Carol.
- Carol knows Dave.
- Dave knows Bob.
Tracing this chain every time is exhausting ().
In 1964, Bernard Galler and Michael Fisher proposed a solution that mirrors a Political Party: Everyone belongs to a group, and every group has a āRepresentativeā (The Boss).
If you ask Alice āWho is your boss?ā, she says āCarolā. You ask Carol, she says āDaveā. You ask Dave, he says āMeā. If you ask Bob āWho is your boss?ā, he also says āDaveā. Since they have the same boss, they are in the same group. Simple.
But they added a twist called Path Compression: The next time you ask Alice, she skips the middleman and points directly to Dave. This simple trick flattens the hierarchy, making the algorithm nearly instantaneous.
Why do we need it?
Union-Find is designed for one thing: Dynamic Connectivity.
- Social Networks: āYou and X have 5 mutual friends.ā
- Image Processing: āMagic Wandā tool in Photoshop uses this to find all connected pixels of the same color.
- Kruskalās Algorithm: Building a Minimum Spanning Tree (connecting cities with the cheapest cables) requires constantly checking āAre these two cities already connected?ā
How the Algorithm āThinksā
The algorithm is a bureaucrat with two simple jobs:
Find (The Inquiry): It asks a node: āWho is your leader?ā If the node points to someone else, it follows the finger up the chain until it finds the person pointing to themselves. Crucial Optimization: As it walks back down, it whispers to everyone in the chain: āBy the way, the big boss is Dave. Next time, just point directly to him.ā (Path Compression)
Union (The Merger): It connects two groups. It takes the boss of Group A and tells them: āFrom now on, you report to the boss of Group B.ā Optimization: It always makes the smaller group report to the larger group to keep the tree flat. (Union by Rank)
The āInverse Ackermannā Miracle
Union-Find is famous for its speed. Its time complexity is , where is the Inverse Ackermann function. What does that mean? For all practical purposes (even if is the number of atoms in the universe), . It is, effectively, Constant Time .
Implementation (Python)
class UnionFind:
def __init__(self, size):
# Initially, everyone is their own boss
self.parent = list(range(size))
# Rank helps us merge smaller trees into larger ones
self.rank = [1] * size
def find(self, p):
# 1. Path Compression: The Bureaucratic Shortcut
if self.parent[p] != p:
# Recursively find the boss and update my parent directly to the boss
self.parent[p] = self.find(self.parent[p])
return self.parent[p]
def union(self, p, q):
# 2. Union by Rank: The Merger
rootP = self.find(p)
rootQ = self.find(q)
if rootP != rootQ:
# Attach the shorter tree to the taller tree
if self.rank[rootP] > self.rank[rootQ]:
self.parent[rootQ] = rootP
elif self.rank[rootP] < self.rank[rootQ]:
self.parent[rootP] = rootQ
else:
self.parent[rootQ] = rootP
self.rank[rootP] += 1
return True # Successfully merged
return False # Already connected
# Example Usage
# uf = UnionFind(10)
# uf.union(1, 2)
# uf.union(2, 5)
# print(uf.find(1) == uf.find(5)) # TrueSummary
Union-Find teaches us that the best way to manage a complex organization is to flatten the hierarchy. By empowering every member to know the āBig Bossā directly, it turns a tangled web of relationships into a lightning-fast lookup system.
