Kruskal's Algorithm: Visual Guide + Python Code

Section 01

The Story That Explains Kruskal's Algorithm

📖 Real World Analogy

The Road Contractor Who Sorts the Quotes

A government wants to connect six villages with paved roads at the lowest total cost. A contractor collects a price quote for every possible road segment, then does something simple: he stacks all the quotes from cheapest to most expensive.

He works down the pile. For each road he asks one question: "Do these two villages already have a route between them?" If they don't, he builds the road. If they already do — even through a long chain of other villages — building it would just create a wasteful loop, so he throws the quote away. He stops the moment every village is reachable.

That sort-then-merge-without-loops strategy is Kruskal's algorithm.

Kruskal's algorithm finds the Minimum Spanning Tree (MST) of a weighted, undirected graph: the cheapest set of edges connecting all vertices with no cycles. Where Prim's grows a single tree outward from one vertex, Kruskal's is edge-centric — it sorts all edges once and greedily merges separate clusters into one, using a clever data structure called Union-Find to detect cycles instantly.

🌱

The Core Insight

Process edges from lightest to heaviest. Add an edge only if its two endpoints live in different components; doing so merges them. Skip any edge whose endpoints are already in the same component, because that would form a cycle. By the cut property, the lightest edge crossing between two clusters is always safe — so greedy is provably optimal.

Section 02

The Foundation — Spanning Trees & the Cycle Rule

A spanning tree touches every one of the V vertices, stays connected, and has no cycle — so it always contains exactly V − 1 edges. The minimum spanning tree is the spanning tree whose edge weights sum to the smallest possible total. Kruskal's relies on two twin rules of MST theory.

✅ Cut Property (what to keep)

For any split of the vertices into two groups, the lightest edge crossing the split is safe to add — it belongs to some MST. This is why Kruskal can greedily take the cheapest qualifying edge.

❌ Cycle Property (what to drop)

In any cycle, the heaviest edge is never needed in an MST. So when an edge would close a loop, it is the costliest link of that loop and can be safely discarded.

⚠️

MST ≠ Shortest Path

The MST minimises the total weight of the entire tree, not the distance between any two particular vertices. If you need point-to-point shortest routes, that is Dijkstra's job, not Kruskal's.

Section 03

The Engine Room — Union-Find (Disjoint Set Union)

The whole algorithm hinges on answering one question fast, millions of times if needed: "are these two vertices already connected?" The Union-Find structure (also called Disjoint Set Union, or DSU) does exactly this in near-constant time.

⚙️ The Three Union-Find Operations

make-set

Start with every vertex in its own one-element set — its own colour. With V vertices you begin with V separate components.

find(x)

Follow parent pointers up to the root (representative) of x's set. Two vertices are connected exactly when they share a root.

union(a, b)

Merge two sets by pointing one root at the other. After this, every vertex in both sets shares one representative — they are now one component.

🔑

Two Tricks Make It Nearly Free

Path compression flattens the tree during find by pointing nodes straight at the root. Union by rank always hangs the shorter tree under the taller one. Together they push the amortised cost of each operation down to O(α(V)) — the inverse Ackermann function, which is effectively a small constant for any input you will ever meet.

Section 04

The Algorithm in Five Steps

Put Every Vertex in Its Own Set

Initialise Union-Find so each of the V vertices is an isolated component. Nothing is connected yet.

Sort All Edges by Weight

Order every edge from lightest to heaviest. This single sort is what makes the greedy choice trivial — the next candidate is always the next edge in the list.

Scan Edges, Ask "Same Set?"

Walk the sorted list. For each edge, run find on both endpoints. If their roots match, the vertices are already connected.

Union if Different, Skip if Same

Different roots → add the edge to the MST and union the two sets. Same root → discard the edge (it would create a cycle).

Stop at V−1 Edges

The moment you have accepted V−1 edges, every vertex is connected and the tree is complete. Any remaining heavier edges are never even examined.

Section 05

Hands-On: Watch Every Line of Code Build the Tree

This is the heart of the tutorial. The graph on the left colours each vertex by its Union-Find component, and the Python on the right is wired to it. Press Play (or step manually) and the highlighted line of code is the task happening on the canvas: when sorted(edges) runs, the edge strip orders itself; when find runs, the edge under consideration pulses; a union turns the edge green and recolours the absorbed component into the growing blue tree; a cycle edge flashes red and is discarded.

Interactive — Kruskal's Algorithm, Line by Line

MST Cost

0 / 5

Edges

Components

a component considering in MST cycle → skip

Press Play to start, or use Step → to walk through one line at a time.

step 0 / 0

This is the exact graph used in the Python in Section 06. Final MST cost is 17, and edge A–C (4) is skipped as a cycle.

Section 06

Python Implementation

Full Version — Union-Find with Path Compression & Union by Rank

This is the production-quality implementation. The two helper functions are the Union-Find engine; kruskal itself is just a sorted loop over the edges.

def find(parent, x):
    while parent[x] != x:               # walk up to the root of x's set
        parent[x] = parent[parent[x]]   # path compression: flatten the tree
        x = parent[x]
    return x

def union(parent, rank, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return False                    # same set -> this edge is a cycle
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra                 # attach the shorter tree...
    parent[rb] = ra                     # ...under the taller one
    if rank[ra] == rank[rb]:
        rank[ra] += 1
    return True

def kruskal(nodes, edges):
    parent = {v: v for v in nodes}      # every vertex is its own set
    rank   = {v: 0 for v in nodes}
    mst, total = [], 0
    for w, u, v in sorted(edges):       # greedily scan edges, lightest first
        if union(parent, rank, u, v):   # keep only edges that link two sets
            mst.append((u, v, w))
            total += w
            if len(mst) == len(nodes) - 1:
                break                   # V-1 edges -> spanning tree done
    return mst, total

Running It on Our Six-Village Graph

Note that each edge is stored as (weight, u, v) so that Python's sorted() orders by weight first — the only thing Kruskal needs.

nodes = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [
    (1, 'A', 'B'), (2, 'C', 'D'), (3, 'B', 'C'), (4, 'A', 'C'),
    (5, 'D', 'E'), (6, 'E', 'F'), (7, 'B', 'E'), (8, 'A', 'F'),
]

tree, cost = kruskal(nodes, edges)
for u, v, w in tree:
    print(f"{u} -- {v}  (weight {w})")
print("Total MST cost:", cost)

OUTPUT

A -- B (weight 1) C -- D (weight 2) B -- C (weight 3) D -- E (weight 5) E -- F (weight 6) Total MST cost: 17

🎉

Where Did A–C (4) Go?

Edge A–C (4) never appears in the output. By the time the scan reached it, A and C were already linked through A–B–C, so union returned False and the edge was skipped. Likewise B–E (7) and A–F (8) were never even examined — the loop hit V−1 = 5 edges and stopped early.

Section 07

The Run, Traced Step by Step

Here is exactly what the animation does, written out. Edges are processed in sorted order, and the Union-Find check decides each one's fate.

Edge	find(u) vs find(v)	Decision	Components After	Total
A–B (1)	different	Union — add	{A,B} {C} {D} {E} {F}	1
C–D (2)	different	Union — add	{A,B} {C,D} {E} {F}	3
B–C (3)	different	Union — add	{A,B,C,D} {E} {F}	6
A–C (4)	same root	Skip — cycle	unchanged	6
D–E (5)	different	Union — add	{A,B,C,D,E} {F}	11
E–F (6)	different	Union — add → STOP	{A,B,C,D,E,F}	17
B–E (7), A–F (8)	—	Never examined (5 edges reached)	—	17

Section 08

Time & Space Complexity

Phase	Work Done	Cost
Sort the edges	One comparison sort of all E edges	O(E log E)
Union-Find operations	Up to 2E finds + E unions, near-constant each	O(E α(V)) ≈ O(E)
Overall	The sort dominates	O(E log E) = O(E log V)
Space	Edge list + parent/rank arrays	O(V + E)

📐

Why O(E log E) Equals O(E log V)

In any simple graph E ≤ V², so log E ≤ log V² = 2 log V — a constant factor. That is why the two forms are interchangeable. Because the runtime is driven by sorting edges, Kruskal's is the natural choice on sparse graphs, where E is small. On dense graphs (E ≈ V²), Prim's O(V²) matrix form often wins instead.

Section 09

Benefits of Kruskal's Algorithm

✅

Provably Optimal

cut + cycle property

Greedy, yet guaranteed to return an exact minimum spanning tree — no approximation, no backtracking.

🧮

Great on Sparse Graphs

edge-centric

When edges are few, sorting them is cheap and the whole algorithm flies. It works directly from a plain edge list — no adjacency structure required.

🪱

Handles Disconnected Graphs

spanning forest

If the graph isn't fully connected, Kruskal's naturally produces a minimum spanning forest — one MST per component. Prim's cannot do this directly.

💡

Prim vs Kruskal — the One-Liner

Both return a correct MST. Kruskal sorts all edges and merges a forest with Union-Find — best on sparse graphs and edge lists. Prim grows one tree from a vertex with a priority queue — best on dense graphs. Same answer, different path to it.

Section 10

Real-World Use Cases

🚧

Infrastructure Networks

Connecting cities with roads, rail, fibre, water, or power lines at minimum total cost — the classic MST application, and Kruskal's shines when the candidate links are listed as edges.

roads, grids, telecom

🧮

Clustering

Single-linkage clustering is literally Kruskal's stopped early: keep merging the closest points until you reach k clusters. Removing the heaviest MST edges splits data into natural groups.

single-linkage, data mining

📷

Image Segmentation

Graph-based segmentation (e.g. Felzenszwalb–Huttenlocher) sorts pixel-difference edges and merges regions Kruskal-style to find coherent areas in an image.

computer vision

🔌

Circuit & Chip Design

Wiring components together with the least total trace length maps onto an MST over the pins — reducing material, signal delay, and cost.

VLSI, PCB routing

📊

TSP Approximation

An MST gives a 2-approximation for the metric Travelling Salesman Problem — a fast, provably-bounded starting point for routing heuristics.

logistics, routing

🌐

Connectivity & Components

The same Union-Find engine answers "is the network still connected?" incrementally — used in social-graph analysis, percolation, and dynamic connectivity problems.

graph analytics

Section 11

Kruskal vs Prim — Side by Side

Property	Kruskal's Algorithm	Prim's Algorithm
Strategy	Sort edges; merge a forest	Grow one tree from a vertex
Core data structure	Union-Find (DSU) + sorted edges	Priority queue / key array
Typical time	O(E log E)	O(E log V) or O(V²)
Best on	Sparse graphs / edge lists	Dense graphs
Partial result is	A forest until the final merge	Always one connected tree
Disconnected graph	Yes — gives a spanning forest	No — needs a connected graph

Section 12

Golden Rules

🧮 Kruskal's Algorithm — Things That Trip People Up

Sorting is the whole cost. Everything else is near-linear thanks to Union-Find. If your edges arrive pre-sorted (or you can use a counting/radix sort on small integer weights), Kruskal's drops to almost O(E).

Always use both Union-Find optimisations. Path compression and union by rank together give O(α(V)) per operation. Skipping them can degrade a find to O(V) in the worst case.

Cycle check is just find(u) == find(v). Compare roots, not the vertices themselves. Add the edge only when the roots differ, then immediately union.

Stop at V−1 edges. Once the MST has V−1 edges it is complete — break out of the loop instead of scanning the remaining heavy edges for no reason.

Store edges weight-first. Using (weight, u, v) lets a plain sorted() do the right thing. Sorting tuples by a later field is a common, silent bug.

An MST is not unique. When several edges share a weight, more than one valid minimum spanning tree can exist — but they all share the same total cost.