The Story That Explains Johnson's Algorithm
Your fastest tool, Dijkstra's algorithm, works brilliantly — but only on roads with non-negative costs. The moment a subsidised road appears, Dijkstra breaks down and gives wrong answers.
Your backup, Bellman-Ford, handles subsidies just fine — but it's painfully slow for a country of thousands of cities.
Then a clever assistant appears with a two-step plan: "Let me first visit every city once with Bellman-Ford and compute a 'fairness adjustment' for each city. I'll use those adjustments to add a small surcharge to every road so that no road has a negative cost anymore — but crucially, the cheapest route between any two cities stays exactly the same. Then you can run fast Dijkstra from every city."
That two-step trick — one Bellman-Ford + n Dijkstra runs, connected by reweighting — is Johnson's Algorithm.
Johnson's Algorithm solves the All-Pairs Shortest Paths (APSP) problem on sparse weighted directed graphs that may contain negative edge weights (but no negative-weight cycles). It combines two classic algorithms into one elegant pipeline and achieves O(V² log V + VE) — dramatically faster than Floyd-Warshall's Θ(V³) when the graph is sparse.
Johnson's key insight is reweighting: transform all edge weights from possibly-negative to guaranteed non-negative — without changing which path is shortest. The transformation uses a potential function h(v) derived from Bellman-Ford. Once weights are non-negative, Dijkstra runs correctly and efficiently from every source vertex.
The Four Phases — High-Level Pipeline
Johnson's algorithm runs in four clearly distinct phases. Click each phase to highlight it.
Create a new vertex s (not in the original graph) and add a directed edge s → v with weight 0 for every original vertex v. This gives Bellman-Ford a single origin from which it can reach all vertices, without disturbing any existing paths.
The Reweighting Formula — Why It Works
The heart of Johnson's Algorithm is this elegant transformation. Understanding why it preserves shortest paths is the key to mastering the algorithm.
ŵ(P) = Σ ŵ(vᵢ, vᵢ₊₁) = Σ [w(vᵢ,vᵢ₊₁) + h(vᵢ) − h(vᵢ₊₁)]
When you expand this sum, the intermediate h-values cancel out (telescope): h(v₂) appears as −h(v₂) from the first term and +h(v₂) from the second, and so on. Only the endpoints survive:
ŵ(P) = w(P) + h(v₁) − h(vₖ)
Since h(v₁) and h(vₖ) are constants that depend only on the source and destination — not on which intermediate path we choose — the path that minimises w(P) also minimises ŵ(P). Shortest paths are preserved exactly.
Negative edges break Dijkstra
All weights ≥ 0 → Dijkstra works
Bellman-Ford computes h(v) = δ(s,v), the shortest path distance from virtual source s to v. By the triangle inequality for shortest paths: δ(s,v) ≤ δ(s,u) + w(u,v), which means h(v) ≤ h(u) + w(u,v), therefore w(u,v) + h(u) − h(v) ≥ 0. Every reweighted edge is non-negative. This is why Bellman-Ford — not Dijkstra — must be used in Phase 2.
Animated Example — Phase 2: Bellman-Ford Trace
We run Bellman-Ford on this concrete 5-vertex graph (4 original + virtual source s). Click Next Relaxation to step through each edge relaxation and watch the h-values converge.
| Vertex | Round 0 | Round 1 | Round 2 | Round 3 |
|---|
Animated Example — Phase 3: Edge Reweighting
Using the potentials h(v) computed in Phase 2, we now transform every edge weight. Click through each edge to see the formula applied live.
| Edge | w(u,v) | h(u) | h(v) | ŵ(u,v) |
|---|
Animated Example — Phase 4: Dijkstra from Source 1
With all edges non-negative, we run Dijkstra once for each source. Here we trace Dijkstra from vertex 1 on the reweighted graph. Step through to see distances settle, then apply the recovery formula to get true distances.
| To vertex | d̂ (reweighted) | d (true) = d̂ − h(1) + h(v) |
|---|
Python Implementation — Full Code with Step Tracer
Below is a complete, production-quality Python implementation of Johnson's Algorithm using a min-heap for Dijkstra. Click Step Through → to advance one major code block at a time with live explanation and variable state.
import heapq import math def bellman_ford(graph, source, n): # Returns h[v] = shortest-path from source h = [math.inf] * (n + 1) h[source] = 0 for _ in range(n - 1): # n-1 passes for u, v, w in graph: # relax every edge if h[u] + w < h[v]: h[v] = h[u] + w # Negative cycle check for u, v, w in graph: if h[u] + w < h[v]: raise ValueError("Negative-weight cycle") return h def dijkstra(adj, source, n, h): # Min-heap Dijkstra on reweighted graph dist = [math.inf] * (n + 1) dist[source] = 0 pq = [(0, source)] # (distance, vertex) while pq: d, u = heapq.heappop(pq) if d > dist[u]: continue # stale entry for v, w in adj[u]: # neighbours of u w_hat = w + h[u] - h[v] # reweighted cost if dist[u] + w_hat < dist[v]: dist[v] = dist[u] + w_hat heapq.heappush(pq, (dist[v], v)) return dist def johnsons(n, edges): # n: vertex count (1-indexed) # edges: list of (u, v, w) # Phase 1: Add virtual source s = n+1 aug = edges + [(n+1, v, 0) for v in range(1, n+1)] # Phase 2: Bellman-Ford for potentials h = bellman_ford(aug, n+1, n+1) # Build adjacency list for Dijkstra adj = [[] for _ in range(n+2)] for u, v, w in edges: adj[u].append((v, w)) # original weights # Phase 4: Dijkstra from every source D = {} # D[u][v] = true shortest path for u in range(1, n+1): dist = dijkstra(adj, u, n, h) D[u] = {} for v in range(1, n+1): if dist[v] == math.inf: D[u][v] = math.inf else: # Recover true distance D[u][v] = dist[v] - h[u] + h[v] return D
Complete Worked Example — End to End
Let's trace Johnson's Algorithm on a 5-vertex graph with two negative edges and print the full all-pairs distance matrix.
# ── Johnson's Algorithm — complete demo ──────────────────────────
edges = [
(1, 2, 3), # 1→2 weight 3
(1, 3, 8), # 1→3 weight 8
(2, 4, 1), # 2→4 weight 1
(3, 2, -4), # 3→2 weight -4 ← negative
(4, 3, 2), # 4→3 weight 2
(4, 5, 6), # 4→5 weight 6
(5, 1, -3), # 5→1 weight -3 ← negative
(5, 4, 7), # 5→4 weight 7
]
D = johnsons(5, edges)
# Print all-pairs matrix
print("All-Pairs Shortest Path Matrix:")
print(f"{'':4s}", end="")
for j in range(1, 6):
print(f"{j:6d}", end="")
print()
for i in range(1, 6):
print(f"{i:4d}", end="")
for j in range(1, 6):
val = D[i][j]
sym = " ∞" if val == math.inf else f"{val:6.0f}"
print(sym, end="")
print()
Row i, column j = shortest distance from vertex i to vertex j. Diagonal is always 0 (trivial self-path). ∞ means vertex j is unreachable from i. Negative values (like D[5][1] = −3) are correct and expected — they arise from negative-weight edges on the shortest path. Johnson's handles them correctly.
Complexity Analysis — Why Johnson's Wins on Sparse Graphs
| Algorithm | Time Complexity | Negative Edges | Best For | Space |
|---|---|---|---|---|
| Johnson's | O(V² log V + VE) | ✓ Yes | Sparse graphs | O(V²) |
| Floyd-Warshall | Θ(V³) | ✓ Yes | Dense graphs | Θ(V²) |
| n × Dijkstra (no negatives) | O(V² log V + VE) | ✗ No | Non-negative only | O(V²) |
| n × Bellman-Ford | O(V²E) | ✓ Yes | Almost never | O(V²) |
The sparser the graph, the greater Johnson's advantage over Floyd-Warshall. At E ≈ V (sparse), Johnson's is roughly V/log V times faster.
Visual Pipeline — Full Algorithm Diagram
Real-World Use Cases & Benefits
CLRS Chapter Context — Where Johnson's Fits
Edge Cases, Traps & Common Mistakes
Full APSP Algorithm Comparison
| Property | Johnson's | Floyd-Warshall | n × Dijkstra | n × Bellman-Ford |
|---|---|---|---|---|
| Time (binary heap) | O(V² log V + VE) | Θ(V³) | O(V² log V + VE) | O(V²E) |
| Negative edge weights | ✓ Yes | ✓ Yes | ✗ No | ✓ Yes |
| Negative cycle detection | ✓ Phase 2 | ✓ Diagonal check | ✗ N/A | ✓ Per source |
| Best graph type | Sparse (E « V²) | Dense (E ≈ V²) | Non-negative sparse | Almost never |
| Implementation complexity | Moderate (2 algorithms) | Very simple | Moderate | Simple but slow |
| Space complexity | O(V + E) | Θ(V²) | O(V + E) | O(V + E) |
| CLRS section | §25.3 | §25.2 | §24.3 × n | §24.1 × n |
CLRS explicitly states: use Floyd-Warshall for dense graphs and Johnson's for sparse graphs. The crossover point is roughly when E < V² / log V — at that point Johnson's O(VE log V) beats Floyd's V³. In practice, most real-world graphs (road networks, social graphs, internet topologies) are sparse, making Johnson's the default APSP choice when negative edges are present.
Golden Rules — Johnson's Algorithm in Practice
d(u,v) = d̂(u,v) − h[u] + h[v].
Forgetting this and returning d̂ directly is the single most common
Johnson's implementation bug. The reweighted distances are meaningless
without unwrapping them with the potentials.
math.inf, not a large integer, for ∞.
Large sentinel values like 10⁹ overflow when added together during
Bellman-Ford relaxation: INF + (-1) is fine with math.inf
but wraps around or produces wrong results with integer sentinels.
heapq with lazy deletion gives O(E log V) per run.