Data Structure 📂 Tree in Data Structure · 6 of 7 33 min read

Kruskal's Algorithm: A Hands-On Visual Guide with Python

A hands-on tutorial on Kruskal's algorithm for the Minimum Spanning Tree. It pairs an interactive, line-by-line animation — where each executing line of Python recolours the Union-Find components and merges the forest in real time — with a story-driven explanation of the cut and cycle properties, a dedicated Union-Find (DSU) walkthrough, a fully traced example (including a skipped cycle edge), complexity analysis, benefits, six real-world use cases, and a Kruskal-vs-Prim comparison

Section 01

The Story That Explains Kruskal's Algorithm

The Road Contractor Who Sorts the Quotes
A government wants to connect six villages with paved roads at the lowest total cost. A contractor collects a price quote for every possible road segment, then does something simple: he stacks all the quotes from cheapest to most expensive.

He works down the pile. For each road he asks one question: "Do these two villages already have a route between them?" If they don't, he builds the road. If they already do — even through a long chain of other villages — building it would just create a wasteful loop, so he throws the quote away. He stops the moment every village is reachable.

That sort-then-merge-without-loops strategy is Kruskal's algorithm.

Kruskal's algorithm finds the Minimum Spanning Tree (MST) of a weighted, undirected graph: the cheapest set of edges connecting all vertices with no cycles. Where Prim's grows a single tree outward from one vertex, Kruskal's is edge-centric — it sorts all edges once and greedily merges separate clusters into one, using a clever data structure called Union-Find to detect cycles instantly.

🌱
The Core Insight

Process edges from lightest to heaviest. Add an edge only if its two endpoints live in different components; doing so merges them. Skip any edge whose endpoints are already in the same component, because that would form a cycle. By the cut property, the lightest edge crossing between two clusters is always safe — so greedy is provably optimal.


Section 02

The Foundation — Spanning Trees & the Cycle Rule

A spanning tree touches every one of the V vertices, stays connected, and has no cycle — so it always contains exactly V − 1 edges. The minimum spanning tree is the spanning tree whose edge weights sum to the smallest possible total. Kruskal's relies on two twin rules of MST theory.

✅ Cut Property (what to keep)
For any split of the vertices into two groups, the lightest edge crossing the split is safe to add — it belongs to some MST. This is why Kruskal can greedily take the cheapest qualifying edge.
❌ Cycle Property (what to drop)
In any cycle, the heaviest edge is never needed in an MST. So when an edge would close a loop, it is the costliest link of that loop and can be safely discarded.
⚠️
MST ≠ Shortest Path

The MST minimises the total weight of the entire tree, not the distance between any two particular vertices. If you need point-to-point shortest routes, that is Dijkstra's job, not Kruskal's.


Section 03

The Engine Room — Union-Find (Disjoint Set Union)

The whole algorithm hinges on answering one question fast, millions of times if needed: "are these two vertices already connected?" The Union-Find structure (also called Disjoint Set Union, or DSU) does exactly this in near-constant time.

⚙️ The Three Union-Find Operations
make-set
Start with every vertex in its own one-element set — its own colour. With V vertices you begin with V separate components.
find(x)
Follow parent pointers up to the root (representative) of x's set. Two vertices are connected exactly when they share a root.
union(a, b)
Merge two sets by pointing one root at the other. After this, every vertex in both sets shares one representative — they are now one component.
🔑
Two Tricks Make It Nearly Free

Path compression flattens the tree during find by pointing nodes straight at the root. Union by rank always hangs the shorter tree under the taller one. Together they push the amortised cost of each operation down to O(α(V)) — the inverse Ackermann function, which is effectively a small constant for any input you will ever meet.


Section 04

The Algorithm in Five Steps

01
Put Every Vertex in Its Own Set
Initialise Union-Find so each of the V vertices is an isolated component. Nothing is connected yet.
02
Sort All Edges by Weight
Order every edge from lightest to heaviest. This single sort is what makes the greedy choice trivial — the next candidate is always the next edge in the list.
03
Scan Edges, Ask "Same Set?"
Walk the sorted list. For each edge, run find on both endpoints. If their roots match, the vertices are already connected.
04
Union if Different, Skip if Same
Different roots → add the edge to the MST and union the two sets. Same root → discard the edge (it would create a cycle).
05
Stop at V−1 Edges
The moment you have accepted V−1 edges, every vertex is connected and the tree is complete. Any remaining heavier edges are never even examined.

Section 05

Hands-On: Watch Every Line of Code Build the Tree

This is the heart of the tutorial. The graph on the left colours each vertex by its Union-Find component, and the Python on the right is wired to it. Press Play (or step manually) and the highlighted line of code is the task happening on the canvas: when sorted(edges) runs, the edge strip orders itself; when find runs, the edge under consideration pulses; a union turns the edge green and recolours the absorbed component into the growing blue tree; a cycle edge flashes red and is discarded.

Interactive — Kruskal's Algorithm, Line by Line
0
MST Cost
0 / 5
Edges
6
Components
a component considering in MST cycle → skip
Press Play to start, or use Step → to walk through one line at a time.
step 0 / 0

This is the exact graph used in the Python in Section 06. Final MST cost is 17, and edge A–C (4) is skipped as a cycle.


Section 06

Python Implementation

Full Version — Union-Find with Path Compression & Union by Rank

This is the production-quality implementation. The two helper functions are the Union-Find engine; kruskal itself is just a sorted loop over the edges.

def find(parent, x):
    while parent[x] != x:               # walk up to the root of x's set
        parent[x] = parent[parent[x]]   # path compression: flatten the tree
        x = parent[x]
    return x

def union(parent, rank, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return False                    # same set -> this edge is a cycle
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra                 # attach the shorter tree...
    parent[rb] = ra                     # ...under the taller one
    if rank[ra] == rank[rb]:
        rank[ra] += 1
    return True

def kruskal(nodes, edges):
    parent = {v: v for v in nodes}      # every vertex is its own set
    rank   = {v: 0 for v in nodes}
    mst, total = [], 0
    for w, u, v in sorted(edges):       # greedily scan edges, lightest first
        if union(parent, rank, u, v):   # keep only edges that link two sets
            mst.append((u, v, w))
            total += w
            if len(mst) == len(nodes) - 1:
                break                   # V-1 edges -> spanning tree done
    return mst, total

Running It on Our Six-Village Graph

Note that each edge is stored as (weight, u, v) so that Python's sorted() orders by weight first — the only thing Kruskal needs.

nodes = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [
    (1, 'A', 'B'), (2, 'C', 'D'), (3, 'B', 'C'), (4, 'A', 'C'),
    (5, 'D', 'E'), (6, 'E', 'F'), (7, 'B', 'E'), (8, 'A', 'F'),
]

tree, cost = kruskal(nodes, edges)
for u, v, w in tree:
    print(f"{u} -- {v}  (weight {w})")
print("Total MST cost:", cost)
OUTPUT
A -- B (weight 1) C -- D (weight 2) B -- C (weight 3) D -- E (weight 5) E -- F (weight 6) Total MST cost: 17
🎉
Where Did A–C (4) Go?

Edge A–C (4) never appears in the output. By the time the scan reached it, A and C were already linked through A–B–C, so union returned False and the edge was skipped. Likewise B–E (7) and A–F (8) were never even examined — the loop hit V−1 = 5 edges and stopped early.


Section 07

The Run, Traced Step by Step

Here is exactly what the animation does, written out. Edges are processed in sorted order, and the Union-Find check decides each one's fate.

Edgefind(u) vs find(v)DecisionComponents AfterTotal
A–B (1)differentUnion — add{A,B} {C} {D} {E} {F}1
C–D (2)differentUnion — add{A,B} {C,D} {E} {F}3
B–C (3)differentUnion — add{A,B,C,D} {E} {F}6
A–C (4)same rootSkip — cycleunchanged6
D–E (5)differentUnion — add{A,B,C,D,E} {F}11
E–F (6)differentUnion — add → STOP{A,B,C,D,E,F}17
B–E (7), A–F (8)Never examined (5 edges reached)17

Section 08

Time & Space Complexity

PhaseWork DoneCost
Sort the edgesOne comparison sort of all E edgesO(E log E)
Union-Find operationsUp to 2E finds + E unions, near-constant eachO(E α(V)) ≈ O(E)
OverallThe sort dominatesO(E log E) = O(E log V)
SpaceEdge list + parent/rank arraysO(V + E)
📐
Why O(E log E) Equals O(E log V)

In any simple graph E ≤ V², so log E ≤ log V² = 2 log V — a constant factor. That is why the two forms are interchangeable. Because the runtime is driven by sorting edges, Kruskal's is the natural choice on sparse graphs, where E is small. On dense graphs (E ≈ V²), Prim's O(V²) matrix form often wins instead.


Section 09

Benefits of Kruskal's Algorithm

Provably Optimal
cut + cycle property
Greedy, yet guaranteed to return an exact minimum spanning tree — no approximation, no backtracking.
🧮
Great on Sparse Graphs
edge-centric
When edges are few, sorting them is cheap and the whole algorithm flies. It works directly from a plain edge list — no adjacency structure required.
🪱
Handles Disconnected Graphs
spanning forest
If the graph isn't fully connected, Kruskal's naturally produces a minimum spanning forest — one MST per component. Prim's cannot do this directly.
💡
Prim vs Kruskal — the One-Liner

Both return a correct MST. Kruskal sorts all edges and merges a forest with Union-Find — best on sparse graphs and edge lists. Prim grows one tree from a vertex with a priority queue — best on dense graphs. Same answer, different path to it.


Section 10

Real-World Use Cases

🚧
Infrastructure Networks
Connecting cities with roads, rail, fibre, water, or power lines at minimum total cost — the classic MST application, and Kruskal's shines when the candidate links are listed as edges.
roads, grids, telecom
🧮
Clustering
Single-linkage clustering is literally Kruskal's stopped early: keep merging the closest points until you reach k clusters. Removing the heaviest MST edges splits data into natural groups.
single-linkage, data mining
📷
Image Segmentation
Graph-based segmentation (e.g. Felzenszwalb–Huttenlocher) sorts pixel-difference edges and merges regions Kruskal-style to find coherent areas in an image.
computer vision
🔌
Circuit & Chip Design
Wiring components together with the least total trace length maps onto an MST over the pins — reducing material, signal delay, and cost.
VLSI, PCB routing
📊
TSP Approximation
An MST gives a 2-approximation for the metric Travelling Salesman Problem — a fast, provably-bounded starting point for routing heuristics.
logistics, routing
🌐
Connectivity & Components
The same Union-Find engine answers "is the network still connected?" incrementally — used in social-graph analysis, percolation, and dynamic connectivity problems.
graph analytics

Section 11

Kruskal vs Prim — Side by Side

PropertyKruskal's AlgorithmPrim's Algorithm
StrategySort edges; merge a forestGrow one tree from a vertex
Core data structureUnion-Find (DSU) + sorted edgesPriority queue / key array
Typical timeO(E log E)O(E log V) or O(V²)
Best onSparse graphs / edge listsDense graphs
Partial result isA forest until the final mergeAlways one connected tree
Disconnected graphYes — gives a spanning forestNo — needs a connected graph

Section 12

Golden Rules

🧮 Kruskal's Algorithm — Things That Trip People Up
1
Sorting is the whole cost. Everything else is near-linear thanks to Union-Find. If your edges arrive pre-sorted (or you can use a counting/radix sort on small integer weights), Kruskal's drops to almost O(E).
2
Always use both Union-Find optimisations. Path compression and union by rank together give O(α(V)) per operation. Skipping them can degrade a find to O(V) in the worst case.
3
Cycle check is just find(u) == find(v). Compare roots, not the vertices themselves. Add the edge only when the roots differ, then immediately union.
4
Stop at V−1 edges. Once the MST has V−1 edges it is complete — break out of the loop instead of scanning the remaining heavy edges for no reason.
5
Store edges weight-first. Using (weight, u, v) lets a plain sorted() do the right thing. Sorting tuples by a later field is a common, silent bug.
6
An MST is not unique. When several edges share a weight, more than one valid minimum spanning tree can exist — but they all share the same total cost.