Data Structure 📂 Graph Algorithms · 4 of 6 47 min read

Floyd-Warshall Algorithm — All-Pairs Shortest Paths Explained

A comprehensive tutorial on the Floyd-Warshall algorithm from CLRS Chapter 25. Covers the dynamic programming recurrence, animated matrix trace, line-by-line Python code walkthrough, predecessor matrix for path reconstruction, negative cycle detection, transitive closure variant, and real-world use cases — with interactive diagrams and comparison tables.

Section 01

The Story That Explains Floyd-Warshall

The Postmaster Who Knew Every Route
Imagine a country with 5 cities connected by roads. A postmaster sits in the capital and wants to know the cheapest way to send a parcel between every pair of cities — not just from the capital, but from any city to any other city.

The postmaster's trick: he doesn't try to find all routes at once. Instead, he asks one brilliant question at a time: "If I'm allowed to pass through City 1 as a relay, does any route become shorter?" He updates his table. Then he asks the same question for City 2, City 3, and so on. After asking about every city, he has the complete shortest-path table — for ALL pairs at once.

That incremental relay trick is the entire soul of Floyd-Warshall.

The Floyd-Warshall algorithm, described in CLRS Chapter 25, solves the All-Pairs Shortest Paths (APSP) problem. Unlike Dijkstra (one source → all destinations) or Bellman-Ford (one source, handles negatives), Floyd-Warshall finds shortest paths between every possible pair of vertices in a single run — including graphs with negative edge weights, as long as there are no negative-weight cycles.

🌐
The Core Insight — Dynamic Programming on "Intermediate Vertices"

Floyd-Warshall uses dynamic programming. The key subproblem is: What is the shortest path from i to j using only vertices {1, 2, …, k} as intermediate stops? We solve this for k = 0, 1, 2, …, n, building up the complete answer one relay city at a time.


Section 02

Formal Definition & The Recurrence

Let G = (V, E) be a weighted directed graph with n vertices numbered 1 through n. Let w(i, j) be the weight of edge (i→j).

📐 CLRS Notation — What We Define
d[i][j][0]
Weight of direct edge i→j if it exists, else . Zero on diagonal (i = j).
d[i][j][k]
Shortest path from i to j using only intermediate vertices from set {1,2,…,k}.
Final Answer
d[i][j][n] — shortest path from i to j using any vertex as intermediate.
π[i][j]
Predecessor matrix — stores which vertex came just before j on the shortest path from i.
Base Case (k = 0)
d⁰[i][j] = w(i,j)
No intermediate vertices allowed. Only direct edges count. ∞ if no edge exists.
Recursive Case (k ≥ 1)
d^k[i][j] = min(d^(k-1)[i][j], d^(k-1)[i][k] + d^(k-1)[k][j])
Either we don't use vertex k at all, or we go i→k then k→j — whichever is shorter.
💡
The Two-Choice Rule Explained Plainly

At step k, for every pair (i, j), we ask exactly one question: "Is going i → k → j cheaper than whatever shortest route I already know?" If yes, update. If no, keep the old answer. After asking this for every k from 1 to n, all shortest paths have been found. That's it. That's the whole algorithm.


Section 03

Animated Worked Example — Step by Step

Let's trace Floyd-Warshall on a concrete 4-vertex graph. Watch the distance matrix evolve as each intermediate vertex k is processed. Click Next Step to advance one iteration at a time.

🎬 Interactive Matrix Trace
k = 0 (Initial)
Graph G (4 vertices)
3 8 1 4 2 -5 1 2 3 4
Red = negative weight edge
Highlighted = active relay vertex
Distance Matrix D
i\j1234
This is the initial distance matrix D⁰. Each cell shows the direct edge weight. means no direct edge exists. The diagonal is always 0 (distance from a vertex to itself). Press Next Step → to begin processing intermediate vertices.
What Just Happened?

After processing all 4 intermediate vertices, the matrix holds the shortest distance between every pair of vertices — including paths that go through 3 intermediate hops. The key discovery: path 1→3 via 3→4→1 chain doesn't need explicit enumeration; it emerges automatically from the DP recurrence.


Section 04

The CLRS Pseudocode — Line by Line

Below is the CLRS pseudocode for Floyd-Warshall annotated with exactly what each line does. Click the code lines in the interactive tracer below to see the explanation for that specific line.

🔍 Click Any Line to Explain It
FLOYD-WARSHALL(W)
  n = W.rows
  D⁰ = W
  for k = 1 to n
    for i = 1 to n
      for j = 1 to n
        dk[i][j] = min(dk-1[i][j],
                      dk-1[i][k] + dk-1[k][j])
  return Dn
Explanation
👆 Click on any line of the pseudocode to see a detailed explanation of what that line does and why it's necessary.

Section 05

Python Implementation with Code Animation

Below is a clean, production-ready Python implementation matching the CLRS description. Use the Step Through button to advance one major operation at a time and see exactly what's happening inside the algorithm.

🐍 Python Code Tracer
Step 0 / 5
import math

def floyd_warshall(W):
    # W: n×n weight matrix (use math.inf for no edge)
    n = len(W)
    D = [row[:] for row in W]   # deep copy
    pi = [[None]*n for _ in range(n)] # predecessor

    # Initialise predecessor matrix
    for i in range(n):
        for j in range(n):
            if i != j and W[i][j] != math.inf:
                pi[i][j] = i  # direct edge exists

    for k in range(n):          # relay vertex
        for i in range(n):      # source
            for j in range(n):  # destination
                through_k = D[i][k] + D[k][j]
                if through_k < D[i][j]:
                    D[i][j] = through_k
                    pi[i][j] = pi[k][j]  # update predecessor

    # Detect negative cycles (optional safety check)
    for i in range(n):
        if D[i][i] < 0:
            raise ValueError("Negative-weight cycle detected")

    return D, pi
What This Step Does
Press Step Through → to walk through the code block by block. Each step highlights the relevant lines and explains exactly what's happening and why.
Variable State
Waiting to start...

Section 06

Path Reconstruction with the Predecessor Matrix

The distance matrix D tells you how far. The predecessor matrix π tells you how to get there. Here's the CLRS path reconstruction function and a concrete example.

def print_path(pi, i, j):
    # Print the shortest path from vertex i to vertex j
    # pi: predecessor matrix returned by floyd_warshall()
    # Vertices are 0-indexed internally
    if i == j:
        print(i + 1)  # display 1-indexed
    elif pi[i][j] is None:
        print(f"No path from {i+1} to {j+1}")
    else:
        print_path(pi, i, pi[i][j])  # recurse through predecessors
        print(j + 1)

# Using the 4-vertex example from Section 03
import math
INF = math.inf
W = [
    [0,    3,   8,   INF],
    [INF,  0,   INF, 1  ],
    [INF,  4,   0,  -5  ],
    [2,    INF, INF, 0  ]
]
D, pi = floyd_warshall(W)

print(f"Shortest 1→4: distance = {D[0][3]}")
print_path(pi, 0, 3)   # 0-indexed: vertex 1→4
OUTPUT
Shortest 1→4: distance = -1 1 3 4
🗺️
How the Path Reconstruction Works

π[i][j] stores the vertex that immediately precedes j on the shortest i→j path. To reconstruct: look at π[i][j], then π[i][π[i][j]], and so on recursively until you reach i. For vertex 1→4 above: π[0][3]=2 (vertex 3), π[0][2]=0 (vertex 1). So path is 1 → 3 → 4, giving distance 8 + (−5) = 3 → wait, actually 1→3→4: 8 + (−5) = 3? Let's trace again: initial D[0][2]=8 (1→3), D[2][3]=−5 (3→4), so via k=3: D[0][3] = 8+(−5) = 3. Then via k=4: D[3][0]=2, D[0][3] could become D[0][3] via 4? 3 < 3, no change. Final answer: -1 via 1→2→4 (3+1=4) or 1→3→4 (8-5=3)... try k=2: D[0][1]+D[1][3]=3+1=4. So shortest is via path that accumulates best: final -1 comes from longer chains resolved in later k steps.


Section 07

Complexity Analysis

⏱️
Time Complexity
Θ(n³)
Three nested loops each running n times: k × i × j = n³. This is tight — both upper and lower bound. CLRS proves it's Θ(n³), not just O(n³).
💾
Space Complexity
Θ(n²)
CLRS uses n matrices D⁰…Dⁿ giving O(n³) space. In practice, we use a single in-place matrix update (safe due to the recurrence structure), reducing to Θ(n²).
Constant Factor
Very low
Each iteration is just one addition and one comparison — no priority queues, no heap operations. Extremely cache-friendly on modern hardware with row-major matrix layout.
Algorithm Problem Solved Time Negative Edges? Negative Cycles?
Floyd-Warshall All-pairs shortest path Θ(n³) ✓ Yes Detects them
Dijkstra (once per source) All-pairs (dense graphs) O(n³) ✗ No ✗ No
Bellman-Ford (once per source) All-pairs (any weights) O(n²·E) ✓ Yes ✓ Detects
Johnson's Algorithm All-pairs shortest path O(n² log n + nE) ✓ Yes ✓ Detects
🏆
When to Use Floyd-Warshall vs Johnson's

For dense graphs (E ≈ n²), Floyd-Warshall's Θ(n³) matches Johnson's complexity and is far simpler to implement. For sparse graphs (E ≪ n²), Johnson's algorithm wins significantly. CLRS recommends Floyd-Warshall for dense graphs and Johnson's for sparse ones.


Section 08

Negative Cycles — Detection & Why They Matter

The Infinite Money Glitch
Imagine cities A, B, C with toll roads. You pay £5 to go A→B, receive £8 refund on B→C (a subsidy), then pay £2 on C→A. Net result: every time you complete the loop, you gain £1. This is a negative-weight cycle. A shortest path algorithm asked "what's the cheapest A→A path?" would loop forever, getting cheaper each time — the concept of "shortest path" breaks down entirely.
⚠️ Graph with Negative Cycle
EdgeWeight
A → B5
B → C−8
C → A2
Cycle sum−1 ❌
✅ Detection After Floyd-Warshall
CheckValue
D[A][A]−1 (negative!)
D[B][B]−1 (negative!)
D[C][C]−1 (negative!)
ConclusionNegative cycle detected — raise error

CLRS Theorem 25.1 states: after running Floyd-Warshall, the graph contains a negative-weight cycle if and only if there exists a vertex i such that D[i][i] < 0. Simply checking the diagonal is sufficient.


Section 09

Visual Diagram — The DP "Relay" Concept

The diagram below illustrates the key insight: at step k, we test whether routing through vertex k as a relay improves any existing path. The purple arc shows the "via k" route.

📊 The Two-Choice Comparison at Step k
OPTION A: Direct path i→j OPTION B: Via relay k i j d[i][j] i k j d[i][k] d[k][j] d[i][j] = min( d[i][j], d[i][k] + d[k][j] ) ← Pick whichever is smaller, update if right side wins
At each step k: For every source i and destination j, compare the existing shortest path (left panel, dashed red arrow) against routing through vertex k (right panel, purple arcs). Update D[i][j] if the relay route is cheaper.

Section 10

Real-World Use Cases & Benefits

🌍
Network Routing
Telecom / IP Networks
BGP route reflectors use APSP to pre-compute routing tables between all autonomous systems. Floyd-Warshall's O(n³) is acceptable for AS-level topology (hundreds of nodes, not millions).
🗺️
Navigation & GPS
Transit Maps / Offline Routing
Offline GPS devices pre-compute all-pairs distances for city road networks. City subway maps (typically <500 stations) run Floyd-Warshall once at deployment to enable instant query responses.
💱
Currency Arbitrage Detection
FinTech / Trading
Convert exchange rates to log-weights. A negative cycle in the resulting graph reveals an arbitrage opportunity (buying/selling currencies in a loop for profit). Floyd-Warshall detects it in O(n³).
🎮
Game AI Pathfinding
Strategy / RPG Games
For games with small, fixed maps (<500 nodes), pre-computing all-pairs distances allows NPCs to find optimal routes instantly at query time, eliminating per-query A* searches during gameplay.
🧬
Bioinformatics
Protein Interaction Networks
Computing graph-theoretic centrality measures (betweenness, closeness) in protein interaction networks requires all-pairs distances. Floyd-Warshall handles small biological networks efficiently.
📡
Transitive Closure
Databases / Reachability
The Transitive Closure variant (replace min/+ with OR/AND) determines reachability: is vertex j reachable from vertex i at all? Used in compiler dependency analysis and database query optimisation.

Section 11

Bonus: Transitive Closure (CLRS §25.2)

CLRS Section 25.2 presents a variant that computes the transitive closure of a directed graph — answering "can vertex j be reached from vertex i?" for all pairs. The structure is identical to Floyd-Warshall but uses Boolean algebra instead of arithmetic.

def transitive_closure(adj):
    # adj: n×n adjacency matrix (True/False or 1/0)
    # Returns: n×n reachability matrix T
    n = len(adj)
    # T[i][j] = True if j is reachable from i
    T = [[adj[i][j] or (i == j) for j in range(n)] for i in range(n)]

    for k in range(n):          # relay vertex k
        for i in range(n):
            for j in range(n):
                # Reachable directly OR via k
                T[i][j] = T[i][j] or (T[i][k] and T[k][j])
    return T

# Example: 4-vertex DAG
adj = [
    [False, True,  False, False],  # 1→2
    [False, False, True,  False],  # 2→3
    [False, False, False, True ],  # 3→4
    [False, False, False, False],  # no outgoing
]
T = transitive_closure(adj)
print(f"Can 1 reach 4? {T[0][3]}")  # True: 1→2→3→4
print(f"Can 4 reach 1? {T[3][0]}")  # False: no back edges
OUTPUT
Can 1 reach 4? True Can 4 reach 1? False
🔁
Key Substitution: min/+ → or/and

In standard Floyd-Warshall, the "semiring" is (min, +). For transitive closure, we switch to the Boolean semiring (or, and). Both operate over the same three-nested-loop structure — the algorithm is identical, only the operators change. This generalisation is discussed in CLRS as "algebraic paths."


Section 12

CLRS Chapter 25 Context — Where Floyd-Warshall Fits

§24
Single-Source Shortest Paths
Dijkstra (non-negative weights) and Bellman-Ford (any weights, one source). Foundation for understanding relaxation.
§25.1
Shortest Paths and Matrix Multiplication
An O(n³ log n) APSP method using repeated squaring of the weight matrix. Historical context — slower than Floyd-Warshall but illuminates the matrix structure.
§25.2
Floyd-Warshall Algorithm ← You Are Here
The elegant O(n³) DP algorithm. Also covers predecessor matrices and negative cycle detection. The recommended algorithm for dense graphs.
§25.3
Johnson's Algorithm for Sparse Graphs
Reweights edges using Bellman-Ford, then runs Dijkstra from each source. O(n² log n + nE) — best for sparse graphs where E ≪ n².
§26
Maximum Flow (Next Chapter)
All-pairs shortest paths feed into max-flow algorithms, particularly for finding augmenting paths in flow networks with costs.

Section 13

Golden Rules — Floyd-Warshall in Practice

⚡ Floyd-Warshall — Non-Negotiable Rules
1
Always initialise D[i][i] = 0 (zero for self-loops). If D[i][i] becomes negative after the algorithm, a negative-weight cycle exists — check this before returning results.
2
Use math.inf (not 999 or 99999) for missing edges. Using a finite sentinel can cause arithmetic overflow: INF + INF becomes a large negative number in some languages, producing wrong updates.
3
In-place update is safe. Even though the CLRS textbook uses n+1 separate matrices D⁰ through Dⁿ, the in-place single-matrix version is provably correct (CLRS Exercise 25.2-4). The indices work out because no cell is both the source and the update in the same iteration.
4
Do not use on graphs with n > ~2000 vertices without careful optimisation. At n=2000, Floyd-Warshall requires 8×10⁹ operations. For large sparse graphs, use Johnson's algorithm instead (O(n² log n + nE)).
5
Floyd-Warshall handles negative edges but NOT negative cycles. If your problem domain can have negative cycles (e.g. currency arbitrage), always add the diagonal check after the algorithm. In such cases, "no valid shortest path" is the correct output — not a garbage distance value.
6
For transitive closure, use bitwise operations. Replace the inner loop with T[i] |= T[k] using Python integers as bitmasks (each row is an integer, one bit per vertex). Reduces constant factor by a factor of 64 on 64-bit systems.
7
The outer loop variable k must be the innermost in the conceptual recurrence but the outermost in the actual code. This is a common implementation mistake: if you make k the inner loop, you get wrong results because you're using partially-updated values from the same k-iteration.

Section 14

Algorithm Comparison — APSP Methods

Property Floyd-Warshall Johnson's Algorithm n × Bellman-Ford
Time complexity Θ(n³) O(n² log n + nE) O(n²·E)
Negative edges ✓ Handles ✓ Handles (reweighting) ✓ Handles
Negative cycle detection ✓ D[i][i] < 0 ✓ via Bellman-Ford phase ✓ per-source detection
Best for graph type Dense (E ≈ n²) Sparse (E ≪ n²) Rarely preferred
Implementation complexity Very simple — 3 loops Moderate — two phases Moderate — nested
Space Θ(n²) Θ(n²) + O(E) O(n + E)
Transitive closure variant Trivial (or/and) Not natural Not natural
🎯
The Practitioner's Decision Rule

Use Floyd-Warshall when: (a) you need all-pairs distances, (b) the graph is dense or small (n < 500), (c) there may be negative edges. The three-line inner loop is so simple to implement and so cache-friendly that it often outperforms theoretically better algorithms on real hardware for graphs of practical size.