The Story That Explains Kruskal's Algorithm
He works down the pile. For each road he asks one question: "Do these two villages already have a route between them?" If they don't, he builds the road. If they already do — even through a long chain of other villages — building it would just create a wasteful loop, so he throws the quote away. He stops the moment every village is reachable.
That sort-then-merge-without-loops strategy is Kruskal's algorithm.
Kruskal's algorithm finds the Minimum Spanning Tree (MST) of a weighted, undirected graph: the cheapest set of edges connecting all vertices with no cycles. Where Prim's grows a single tree outward from one vertex, Kruskal's is edge-centric — it sorts all edges once and greedily merges separate clusters into one, using a clever data structure called Union-Find to detect cycles instantly.
Process edges from lightest to heaviest. Add an edge only if its two endpoints live in different components; doing so merges them. Skip any edge whose endpoints are already in the same component, because that would form a cycle. By the cut property, the lightest edge crossing between two clusters is always safe — so greedy is provably optimal.
The Foundation — Spanning Trees & the Cycle Rule
A spanning tree touches every one of the V vertices, stays connected, and has no cycle — so it always contains exactly V − 1 edges. The minimum spanning tree is the spanning tree whose edge weights sum to the smallest possible total. Kruskal's relies on two twin rules of MST theory.
| For any split of the vertices into two groups, the lightest edge crossing the split is safe to add — it belongs to some MST. This is why Kruskal can greedily take the cheapest qualifying edge. |
| In any cycle, the heaviest edge is never needed in an MST. So when an edge would close a loop, it is the costliest link of that loop and can be safely discarded. |
The MST minimises the total weight of the entire tree, not the distance between any two particular vertices. If you need point-to-point shortest routes, that is Dijkstra's job, not Kruskal's.
The Engine Room — Union-Find (Disjoint Set Union)
The whole algorithm hinges on answering one question fast, millions of times if needed: "are these two vertices already connected?" The Union-Find structure (also called Disjoint Set Union, or DSU) does exactly this in near-constant time.
Path compression flattens the tree during find by pointing
nodes straight at the root. Union by rank always hangs the shorter tree
under the taller one. Together they push the amortised cost of each operation down to
O(α(V)) — the inverse Ackermann function, which is effectively a
small constant for any input you will ever meet.
The Algorithm in Five Steps
Hands-On: Watch Every Line of Code Build the Tree
This is the heart of the tutorial. The graph on the left colours each vertex by
its Union-Find component, and the Python on the right is wired to it. Press
Play (or step manually) and the highlighted line of code is the task
happening on the canvas: when sorted(edges) runs, the edge strip orders itself; when
find runs, the edge under consideration pulses; a union turns the
edge green and recolours the absorbed component into the growing blue tree; a
cycle edge flashes red and is discarded.
This is the exact graph used in the Python in Section 06. Final MST cost is 17, and edge A–C (4) is skipped as a cycle.
Python Implementation
Full Version — Union-Find with Path Compression & Union by Rank
This is the production-quality implementation. The two helper functions are the Union-Find
engine; kruskal itself is just a sorted loop over the edges.
def find(parent, x):
while parent[x] != x: # walk up to the root of x's set
parent[x] = parent[parent[x]] # path compression: flatten the tree
x = parent[x]
return x
def union(parent, rank, a, b):
ra, rb = find(parent, a), find(parent, b)
if ra == rb:
return False # same set -> this edge is a cycle
if rank[ra] < rank[rb]:
ra, rb = rb, ra # attach the shorter tree...
parent[rb] = ra # ...under the taller one
if rank[ra] == rank[rb]:
rank[ra] += 1
return True
def kruskal(nodes, edges):
parent = {v: v for v in nodes} # every vertex is its own set
rank = {v: 0 for v in nodes}
mst, total = [], 0
for w, u, v in sorted(edges): # greedily scan edges, lightest first
if union(parent, rank, u, v): # keep only edges that link two sets
mst.append((u, v, w))
total += w
if len(mst) == len(nodes) - 1:
break # V-1 edges -> spanning tree done
return mst, total
Running It on Our Six-Village Graph
Note that each edge is stored as (weight, u, v) so that Python's sorted()
orders by weight first — the only thing Kruskal needs.
nodes = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [
(1, 'A', 'B'), (2, 'C', 'D'), (3, 'B', 'C'), (4, 'A', 'C'),
(5, 'D', 'E'), (6, 'E', 'F'), (7, 'B', 'E'), (8, 'A', 'F'),
]
tree, cost = kruskal(nodes, edges)
for u, v, w in tree:
print(f"{u} -- {v} (weight {w})")
print("Total MST cost:", cost)
Edge A–C (4) never appears in the output. By the time the scan reached
it, A and C were already linked through A–B–C, so union returned
False and the edge was skipped. Likewise B–E (7) and
A–F (8) were never even examined — the loop hit V−1 = 5 edges
and stopped early.
The Run, Traced Step by Step
Here is exactly what the animation does, written out. Edges are processed in sorted order, and the Union-Find check decides each one's fate.
| Edge | find(u) vs find(v) | Decision | Components After | Total |
|---|---|---|---|---|
| A–B (1) | different | Union — add | {A,B} {C} {D} {E} {F} | 1 |
| C–D (2) | different | Union — add | {A,B} {C,D} {E} {F} | 3 |
| B–C (3) | different | Union — add | {A,B,C,D} {E} {F} | 6 |
| A–C (4) | same root | Skip — cycle | unchanged | 6 |
| D–E (5) | different | Union — add | {A,B,C,D,E} {F} | 11 |
| E–F (6) | different | Union — add → STOP | {A,B,C,D,E,F} | 17 |
| B–E (7), A–F (8) | — | Never examined (5 edges reached) | — | 17 |
Time & Space Complexity
| Phase | Work Done | Cost |
|---|---|---|
| Sort the edges | One comparison sort of all E edges | O(E log E) |
| Union-Find operations | Up to 2E finds + E unions, near-constant each | O(E α(V)) ≈ O(E) |
| Overall | The sort dominates | O(E log E) = O(E log V) |
| Space | Edge list + parent/rank arrays | O(V + E) |
In any simple graph E ≤ V², so log E ≤ log V² = 2 log V —
a constant factor. That is why the two forms are interchangeable. Because the runtime is
driven by sorting edges, Kruskal's is the natural choice on
sparse graphs, where E is small. On dense graphs (E ≈ V²), Prim's
O(V²) matrix form often wins instead.
Benefits of Kruskal's Algorithm
Both return a correct MST. Kruskal sorts all edges and merges a forest with Union-Find — best on sparse graphs and edge lists. Prim grows one tree from a vertex with a priority queue — best on dense graphs. Same answer, different path to it.
Real-World Use Cases
Kruskal vs Prim — Side by Side
| Property | Kruskal's Algorithm | Prim's Algorithm |
|---|---|---|
| Strategy | Sort edges; merge a forest | Grow one tree from a vertex |
| Core data structure | Union-Find (DSU) + sorted edges | Priority queue / key array |
| Typical time | O(E log E) | O(E log V) or O(V²) |
| Best on | Sparse graphs / edge lists | Dense graphs |
| Partial result is | A forest until the final merge | Always one connected tree |
| Disconnected graph | Yes — gives a spanning forest | No — needs a connected graph |
Golden Rules
find to O(V) in the worst case.
find(u) == find(v). Compare roots, not the
vertices themselves. Add the edge only when the roots differ, then immediately union.
(weight, u, v) lets a plain
sorted() do the right thing. Sorting tuples by a later field is a common,
silent bug.