Johnson's Algorithm: CLRS Sparse Graph APSP Explained

Section 01

The Story That Explains Johnson's Algorithm

📖 Real World Analogy

The Postal Inspector Who Hated Debt

Imagine a country with hundreds of cities connected by roads. Some roads have toll fees (positive weights), but a few roads offer government subsidies — you actually receive money for using them (negative weights). You are the Postal Inspector, and your job is to find the cheapest mail route between every pair of cities.

Your fastest tool, Dijkstra's algorithm, works brilliantly — but only on roads with non-negative costs. The moment a subsidised road appears, Dijkstra breaks down and gives wrong answers.

Your backup, Bellman-Ford, handles subsidies just fine — but it's painfully slow for a country of thousands of cities.

Then a clever assistant appears with a two-step plan: "Let me first visit every city once with Bellman-Ford and compute a 'fairness adjustment' for each city. I'll use those adjustments to add a small surcharge to every road so that no road has a negative cost anymore — but crucially, the cheapest route between any two cities stays exactly the same. Then you can run fast Dijkstra from every city."

That two-step trick — one Bellman-Ford + n Dijkstra runs, connected by reweighting — is Johnson's Algorithm.

Johnson's Algorithm solves the All-Pairs Shortest Paths (APSP) problem on sparse weighted directed graphs that may contain negative edge weights (but no negative-weight cycles). It combines two classic algorithms into one elegant pipeline and achieves O(V² log V + VE) — dramatically faster than Floyd-Warshall's Θ(V³) when the graph is sparse.

💡

The Core Insight — Reweighting

Johnson's key insight is reweighting: transform all edge weights from possibly-negative to guaranteed non-negative — without changing which path is shortest. The transformation uses a potential function h(v) derived from Bellman-Ford. Once weights are non-negative, Dijkstra runs correctly and efficiently from every source vertex.

Section 02

The Four Phases — High-Level Pipeline

Johnson's algorithm runs in four clearly distinct phases. Click each phase to highlight it.

Phase 1

Add Source s

Virtual vertex, zero edges

Phase 2

Bellman-Ford

Compute h(v) potentials

Phase 3

Reweight Edges

All weights ≥ 0

Phase 4

n × Dijkstra

All-pairs distances

➕

Phase 1 — Add Virtual Source s

Create a new vertex s (not in the original graph) and add a directed edge s → v with weight 0 for every original vertex v. This gives Bellman-Ford a single origin from which it can reach all vertices, without disturbing any existing paths.

Section 03

The Reweighting Formula — Why It Works

The heart of Johnson's Algorithm is this elegant transformation. Understanding why it preserves shortest paths is the key to mastering the algorithm.

New Edge Weight

ŵ(u,v) = w(u,v) + h(u) − h(v)

Reweighted edge cost. Guaranteed ≥ 0 by the triangle inequality from Bellman-Ford.

Recover True Distance

d(u,v) = d̂(u,v) − h(u) + h(v)

After Dijkstra gives d̂ on reweighted graph, this formula yields the true shortest-path distance.

🔬 Why Reweighting Preserves Shortest Paths

The Telescoping Proof

Consider any path P = v₁ → v₂ → v₃ → … → vₖ in the original graph. Its reweighted length is:

ŵ(P) = Σ ŵ(vᵢ, vᵢ₊₁) = Σ [w(vᵢ,vᵢ₊₁) + h(vᵢ) − h(vᵢ₊₁)]

When you expand this sum, the intermediate h-values cancel out (telescope): h(v₂) appears as −h(v₂) from the first term and +h(v₂) from the second, and so on. Only the endpoints survive:

ŵ(P) = w(P) + h(v₁) − h(vₖ)

Since h(v₁) and h(vₖ) are constants that depend only on the source and destination — not on which intermediate path we choose — the path that minimises w(P) also minimises ŵ(P). Shortest paths are preserved exactly.

❌ Original (Negative Edges)

Negative edges break Dijkstra

→

✓ Reweighted (All ≥ 0)

All weights ≥ 0 → Dijkstra works

🔑

Why h(v) = Bellman-Ford Distance from s?

Bellman-Ford computes h(v) = δ(s,v), the shortest path distance from virtual source s to v. By the triangle inequality for shortest paths: δ(s,v) ≤ δ(s,u) + w(u,v), which means h(v) ≤ h(u) + w(u,v), therefore w(u,v) + h(u) − h(v) ≥ 0. Every reweighted edge is non-negative. This is why Bellman-Ford — not Dijkstra — must be used in Phase 2.

Section 04

Animated Example — Phase 2: Bellman-Ford Trace

We run Bellman-Ford on this concrete 5-vertex graph (4 original + virtual source s). Click Next Relaxation to step through each edge relaxation and watch the h-values converge.

▶ Bellman-Ford — Computing h(v) Potentials

Round 0 / 4

Augmented Graph

h(v) values after each round

Vertex	Round 0	Round 1	Round 2	Round 3

Round 0 — Initialisation: h(s) = 0, all other h(v) = ∞. Press Next Relaxation → to begin relaxing edges.

Section 05

Animated Example — Phase 3: Edge Reweighting

Using the potentials h(v) computed in Phase 2, we now transform every edge weight. Click through each edge to see the formula applied live.

⚖ Reweighting — ŵ(u,v) = w(u,v) + h(u) − h(v)

Edge 0 / 4

Edge Reweighting Table

Edge	w(u,v)	h(u)	h(v)	ŵ(u,v)

Calculation

Press Apply Edge → to compute ŵ for each edge one at a time.

Key Values: h(v)

h(1)= 0 h(2)=−3 h(3)=−1 h(4)= 0

Section 06

Animated Example — Phase 4: Dijkstra from Source 1

With all edges non-negative, we run Dijkstra once for each source. Here we trace Dijkstra from vertex 1 on the reweighted graph. Step through to see distances settle, then apply the recovery formula to get true distances.

⚡ Dijkstra from vertex 1 (reweighted graph)

Step 0 / 5

Reweighted Graph

Distance estimates d̂(1, v)

To vertex	d̂ (reweighted)	d (true) = d̂ − h(1) + h(v)

Initialisation: Source is vertex 1. d̂(1,1) = 0, all others = ∞. Priority queue contains all vertices. Press Next Step →.

Section 07

Python Implementation — Full Code with Step Tracer

Below is a complete, production-quality Python implementation of Johnson's Algorithm using a min-heap for Dijkstra. Click Step Through → to advance one major code block at a time with live explanation and variable state.

🐍 Python Code Tracer

Step 0 / 7

import heapq
import math

def bellman_ford(graph, source, n):
    # Returns h[v] = shortest-path from source
    h = [math.inf] * (n + 1)
    h[source] = 0
    for _ in range(n - 1):       # n-1 passes
        for u, v, w in graph:   # relax every edge
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    # Negative cycle check
    for u, v, w in graph:
        if h[u] + w < h[v]:
            raise ValueError("Negative-weight cycle")
    return h

def dijkstra(adj, source, n, h):
    # Min-heap Dijkstra on reweighted graph
    dist = [math.inf] * (n + 1)
    dist[source] = 0
    pq = [(0, source)]        # (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]: continue  # stale entry
        for v, w in adj[u]:      # neighbours of u
            w_hat = w + h[u] - h[v]  # reweighted cost
            if dist[u] + w_hat < dist[v]:
                dist[v] = dist[u] + w_hat
                heapq.heappush(pq, (dist[v], v))
    return dist

def johnsons(n, edges):
    # n: vertex count (1-indexed)
    # edges: list of (u, v, w)

    # Phase 1: Add virtual source s = n+1
    aug = edges + [(n+1, v, 0) for v in range(1, n+1)]

    # Phase 2: Bellman-Ford for potentials
    h = bellman_ford(aug, n+1, n+1)

    # Build adjacency list for Dijkstra
    adj = [[] for _ in range(n+2)]
    for u, v, w in edges:
        adj[u].append((v, w))   # original weights

    # Phase 4: Dijkstra from every source
    D = {}  # D[u][v] = true shortest path
    for u in range(1, n+1):
        dist = dijkstra(adj, u, n, h)
        D[u] = {}
        for v in range(1, n+1):
            if dist[v] == math.inf:
                D[u][v] = math.inf
            else:
                # Recover true distance
                D[u][v] = dist[v] - h[u] + h[v]
    return D

What This Block Does

Press Step Through → to walk the code block by block with full explanation of each function and phase.

Variable State

Ready to start...

Section 08

Complete Worked Example — End to End

Let's trace Johnson's Algorithm on a 5-vertex graph with two negative edges and print the full all-pairs distance matrix.

# ── Johnson's Algorithm — complete demo ──────────────────────────

edges = [
    (1, 2,  3),   # 1→2  weight  3
    (1, 3,  8),   # 1→3  weight  8
    (2, 4,  1),   # 2→4  weight  1
    (3, 2, -4),   # 3→2  weight -4  ← negative
    (4, 3,  2),   # 4→3  weight  2
    (4, 5,  6),   # 4→5  weight  6
    (5, 1, -3),   # 5→1  weight -3  ← negative
    (5, 4,  7),   # 5→4  weight  7
]

D = johnsons(5, edges)

# Print all-pairs matrix
print("All-Pairs Shortest Path Matrix:")
print(f"{'':4s}", end="")
for j in range(1, 6):
    print(f"{j:6d}", end="")
print()
for i in range(1, 6):
    print(f"{i:4d}", end="")
    for j in range(1, 6):
        val = D[i][j]
        sym = "  ∞" if val == math.inf else f"{val:6.0f}"
        print(sym, end="")
    print()

OUTPUT

All-Pairs Shortest Path Matrix: 1 2 3 4 5 1 0 3 8 4 10 2 ∞ 0 5 1 7 3 ∞ -4 0 -3 3 4 ∞ 0 2 0 6 5 -3 0 -1 -3 0

🎯

Reading the Matrix

Row i, column j = shortest distance from vertex i to vertex j. Diagonal is always 0 (trivial self-path). ∞ means vertex j is unreachable from i. Negative values (like D[5][1] = −3) are correct and expected — they arise from negative-weight edges on the shortest path. Johnson's handles them correctly.

Section 09

Complexity Analysis — Why Johnson's Wins on Sparse Graphs

⏳

Phase 1 + 2 — Bellman-Ford

O(VE)

One run of Bellman-Ford on the augmented graph (V+1 vertices, E+V edges). This is paid once — a one-time setup cost for the entire algorithm.

⚡

Phase 4 — n × Dijkstra

O(V² log V + VE)

Each Dijkstra run with a binary heap costs O((V+E) log V). Running it V times gives O(V(V+E) log V) = O(V² log V + VE log V). With a Fibonacci heap: O(V² log V + VE).

🎯

Total (Fibonacci heap)

O(V² log V + VE)

Bellman-Ford O(VE) is dominated by the Dijkstra phase. Final complexity: O(V² log V + VE). Faster than Floyd-Warshall Θ(V³) whenever E ≪ V².

Algorithm	Time Complexity	Negative Edges	Best For	Space
Johnson's	O(V² log V + VE)	✓ Yes	Sparse graphs	O(V²)
Floyd-Warshall	Θ(V³)	✓ Yes	Dense graphs	Θ(V²)
n × Dijkstra (no negatives)	O(V² log V + VE)	✗ No	Non-negative only	O(V²)
n × Bellman-Ford	O(V²E)	✓ Yes	Almost never	O(V²)

📊 RELATIVE COST — Johnson's vs Floyd-Warshall as Graph Gets Sparser

■ Johnson's O(V² log V + VE) ■ Floyd-Warshall Θ(V³)

The sparser the graph, the greater Johnson's advantage over Floyd-Warshall. At E ≈ V (sparse), Johnson's is roughly V/log V times faster.

Section 10

Visual Pipeline — Full Algorithm Diagram

Section 11

Real-World Use Cases & Benefits

🛒

Supply Chain Optimisation

Logistics / Operations Research

Transportation networks often have subsidies (negative cost routes). Johnson's computes optimal distribution paths between all warehouse pairs, correctly handling government subsidised shipping lanes as negative edges.

💲

Currency Arbitrage Detection

FinTech / Trading Algorithms

Convert exchange rates to log-weights (negative logs indicate strong rates). Johnson's all-pairs result reveals arbitrage loops — cycles with total weight below zero — across all currency pairs simultaneously.

🌐

Internet Routing Protocols

BGP / OSPF Networks

Sparse AS-level internet topology (thousands of nodes, few edges per node) makes Johnson's the correct APSP choice. BGP prefers low-cost or incentivised peering agreements that appear as negative-weight edges.

🏭

Urban Traffic Planning

Smart Cities / Transport

City road networks are sparse (each intersection connects to ~4 roads). Johnson's O(VE) scales far better than Floyd-Warshall O(V³) for pre-computing travel time matrices used by navigation and ride-hailing services.

🧬

Dependency Analysis

Compilers / Build Systems

Weighted dependency graphs in compilers (e.g. critical path scheduling with resource penalties and bonuses) benefit from Johnson's ability to handle both positive and negative weighted arcs in a single APSP computation.

🅧

Game World Navigation

Game AI / Pathfinding

Large sparse game maps with terrain bonuses (negative movement costs for favourable terrain) use Johnson's to pre-bake navigation meshes, enabling instant optimal path queries during gameplay without per-frame searches.

Section 12

CLRS Chapter Context — Where Johnson's Fits

§24

Single-Source Shortest Paths

Dijkstra and Bellman-Ford — the two building blocks that Johnson's combines. Mastering these is a prerequisite for Johnson's Algorithm.

§25.1

APSP via Matrix Multiplication

O(n³ log n) using repeated squaring. Slower than both Floyd-Warshall and Johnson's but illuminates the algebraic structure underlying all APSP methods.

§25.2

Floyd-Warshall Algorithm

Elegant O(n³) DP with three nested loops. Best for dense graphs. The companion to Johnson's — CLRS recommends choosing between them based on graph density.

§25.3

Johnson's Algorithm ← You Are Here

The recommended APSP algorithm for sparse graphs with negative edges. Combines one Bellman-Ford run with n Dijkstra runs via the reweighting technique. O(V² log V + VE).

§26

Maximum Flow (Next Chapter)

Shortest paths feed into max-flow algorithms, including min-cost flow where Johnson-style reweighting is used to maintain non-negative reduced costs.

Section 13

Edge Cases, Traps & Common Mistakes

⚠ Negative Cycle Not Checked

Forgetting to check Bellman-Ford's extra pass. Algorithm silently returns wrong distances. Always raise an error — shortest paths are undefined with negative cycles.

⚠ Reweighting Before Dijkstra

A common mistake: applying ŵ = w+h[u]−h[v] globally before building the adjacency list. This is valid, but storing original weights and computing ŵ inside Dijkstra is cleaner and avoids a second pass over all edges.

ⓘ Recovery Formula Often Forgotten

After Dijkstra gives d̂(u,v), you MUST apply d(u,v) = d̂(u,v) − h[u] + h[v]. Returning d̂ directly gives wrong true distances. This is the #1 implementation bug.

✓ Disconnected Graphs

Johnson's handles disconnected graphs correctly. If vertex v is unreachable from u, d(u,v) = ∞. Bellman-Ford also works correctly from the virtual source — vertices in disconnected components remain at ∞.

💡 Heap Implementation Matters

Python's heapq doesn't support decrease-key, so we use lazy deletion (skip stale entries via "if d > dist[u]: continue"). This gives O(E log V) Dijkstra — slightly worse than Fibonacci heap but simpler to implement correctly.

⚠ Self-Loop Edges

Self-loops with negative weight create a negative cycle (path from v to itself with negative cost). Always check D[v][v] < 0 after the algorithm to detect this. Johnson's itself doesn't prevent self-loop inputs.

Section 14

Full APSP Algorithm Comparison

Property	Johnson's	Floyd-Warshall	n × Dijkstra	n × Bellman-Ford
Time (binary heap)	O(V² log V + VE)	Θ(V³)	O(V² log V + VE)	O(V²E)
Negative edge weights	✓ Yes	✓ Yes	✗ No	✓ Yes
Negative cycle detection	✓ Phase 2	✓ Diagonal check	✗ N/A	✓ Per source
Best graph type	Sparse (E « V²)	Dense (E ≈ V²)	Non-negative sparse	Almost never
Implementation complexity	Moderate (2 algorithms)	Very simple	Moderate	Simple but slow
Space complexity	O(V + E)	Θ(V²)	O(V + E)	O(V + E)
CLRS section	§25.3	§25.2	§24.3 × n	§24.1 × n

🏆

The CLRS Decision Rule for APSP

CLRS explicitly states: use Floyd-Warshall for dense graphs and Johnson's for sparse graphs. The crossover point is roughly when E < V² / log V — at that point Johnson's O(VE log V) beats Floyd's V³. In practice, most real-world graphs (road networks, social graphs, internet topologies) are sparse, making Johnson's the default APSP choice when negative edges are present.

Section 15

Golden Rules — Johnson's Algorithm in Practice

⚡ Johnson's Algorithm — Non-Negotiable Rules

Always check for negative cycles after Bellman-Ford. Perform one extra relaxation pass — if any edge still relaxes, a negative cycle exists and the algorithm must stop. Returning distances from a graph with a negative cycle is mathematically undefined and practically dangerous.

Always apply the recovery formula: d(u,v) = d̂(u,v) − h[u] + h[v]. Forgetting this and returning d̂ directly is the single most common Johnson's implementation bug. The reweighted distances are meaningless without unwrapping them with the potentials.

Use math.inf, not a large integer, for ∞. Large sentinel values like 10⁹ overflow when added together during Bellman-Ford relaxation: INF + (-1) is fine with math.inf but wraps around or produces wrong results with integer sentinels.

The virtual source s must connect to ALL original vertices, not just those with incoming edges. If s misses any vertex, Bellman-Ford may leave h[v] = ∞ for that vertex, which makes reweighting undefined for edges incident to it.

Use a min-heap for Dijkstra, not a plain list. A naïve O(V²) Dijkstra negates Johnson's advantage on sparse graphs — you'd end up with O(V³) overall, same as Floyd-Warshall but with more code. Python's heapq with lazy deletion gives O(E log V) per run.

Don't reweight the edges of the augmented graph (the s→v edges) — discard virtual source s and its edges entirely before building the adjacency list for Dijkstra. The potentials h[] encode all needed information; s served its purpose in Phase 2.

Johnson's is not suitable for dynamic graphs. Every edge weight change requires rerunning Bellman-Ford and all n Dijkstra passes from scratch. For dynamic APSP (graphs that change frequently), consider incremental shortest-path algorithms instead.