Arrays & Matrices in Algorithms (CLRS)

Section 01

The Story That Explains Arrays & Matrices

📖 Real World Analogy

The Wall of Numbered Lockers

Picture a long corridor with a wall of lockers, each one identical, each bolted right next to the last with no gaps. Every locker has a number painted on it: 0, 1, 2, 3… If a friend says “grab whatever is in locker 7”, you don’t walk past lockers 0 through 6 counting as you go — you stride straight to locker 7, because you know exactly where it is. The wall starts at a fixed spot, every locker is the same width, so locker 7 sits at start + 7 × width.

That single idea — identical cells packed in a row, addressable by a number — is the entire reason arrays are the most important data structure in computing. Stack a second wall of lockers on top, then a third, and you have a matrix: a grid you address with two numbers, a row and a column.

In Introduction to Algorithms (Cormen, Leiserson, Rivest & Stein — the book most people just call “CLRS” or “Cormen”), the array is the silent workhorse behind almost every algorithm. Sorting, dynamic programming, divide-and-conquer, graph adjacency — all of it sits on top of arrays and their two-dimensional cousins, matrices. This hands-on tutorial builds you up from a single row of cells to Strassen’s matrix multiplication, with runnable Python at every step.

💡

The Core Insight

An array trades flexibility for speed. Because every element is the same size and stored back-to-back in memory, the computer can compute the address of any element with one multiplication and one addition — so reading arr[i] takes the same constant time whether i is 0 or 7,000.

Section 02

What Exactly Is an Array?

An array is a collection of elements of the same type, stored in contiguous memory locations, each reachable by an integer index. CLRS treats the array as a primitive: A[i] is assumed to be a Θ(1) operation. The magic is the address formula.

Element Address

addr(A[i]) = base + i × size

Why indexing is O(1): one multiply, one add — no scanning, no counting.

2D Row-Major Address

addr(A[i][j]) = base + (i×cols + j) × size

A matrix is flattened into one long array; the row index is scaled by the row width.

🔗 Animated — Contiguous Memory & O(1) Random Access

No matter which index you request, the address is computed in a single step — this is why arrays beat linked lists for lookups.

Section 03

Array Operations & Their Cost

Not every operation on an array is cheap. Reading is instant, but anything that shifts elements pays a linear price. Knowing this table by heart is the difference between writing fast code and slow code.

Operation	Example	Time	Why
Access by index	`arr[i]`	O(1)	Direct address computation
Update by index	`arr[i] = x`	O(1)	Overwrite one known cell
Search (unsorted)	`x in arr`	O(n)	May scan every element
Search (sorted)	binary search	O(log n)	Halve the range each step
Append at end	`arr.append(x)`	O(1)*	Amortised; occasional resize
Insert at front/middle	`arr.insert(0, x)`	O(n)	Every later element shifts right
Delete front/middle	`del arr[0]`	O(n)	Every later element shifts left
Traverse all	`for x in arr`	O(n)	Visit each element once

Hands-On: Array Basics in Python

# Create an array (a Python list) of integers
arr = [17, 4, 29, 8, 33, 12]

# Random access by index -- O(1), the array's superpower
print(arr[3])           # 8  -> jumps straight to position 3

# Traverse every element -- O(n)
for i in range(len(arr)):
    print(i, arr[i])

# Append at the end -- amortised O(1)
arr.append(99)

# Insert at the front -- O(n), everything shifts right
arr.insert(0, 5)

# Delete from the middle -- O(n), everything shifts left
del arr[2]

print(arr)             # [5, 17, 8, 33, 12, 99]

OUTPUT

8 0 17 1 4 2 29 3 8 4 33 5 12 [5, 17, 8, 33, 12, 99]

⚡

Append at the end, never at the front

If you find yourself doing arr.insert(0, x) inside a loop, you have accidentally written an O(n²) algorithm. Build lists by appending to the end, then reverse once at the finish if order matters — or reach for a collections.deque.

Section 04

From a Row to a Grid — Matrices

A matrix is a two-dimensional array: rows stacked on rows. We address a cell with two indices, A[i][j] — row i, column j. But memory is one-dimensional, so the grid must be flattened. There are two ways to do it, and the choice has real performance consequences.

✓ Row-Major (C, Python, Java)

Order in memory	Cell
1st	A[0][0]
2nd	A[0][1]
3rd	A[0][2]
4th	A[1][0]
…	row by row

⚠ Column-Major (Fortran, MATLAB, R)

Order in memory	Cell
1st	A[0][0]
2nd	A[1][0]
3rd	A[2][0]
4th	A[0][1]
…	column by column

🧮 Animated — Row-Major Traversal: Grid → Linear Memory

Row-major flattening: index = i × cols + j. Walking a matrix in row order touches memory sequentially — which the CPU cache loves.

⚠️

Cache Locality Is Not Free Advice

In a row-major language, looping for i: for j: A[i][j] reads memory in order and runs fast. Swapping the loops to for j: for i: jumps across memory and can be several times slower on a large matrix — same Big-O, very different wall-clock time.

Section 05

Creating Matrices in Python — and the Trap

The single most common bug beginners hit with 2D arrays in Python is the shared-row trap. It looks harmless and silently corrupts your data.

# WRONG -- all rows are the SAME list object in memory
grid = [[0] * 3] * 2
grid[0][0] = 9
print(grid)        # [[9, 0, 0], [9, 0, 0]]  <- both rows changed!

# RIGHT -- each row is built as its own independent list
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 9
print(grid)        # [[9, 0, 0], [0, 0, 0]]  correct

OUTPUT

[[9, 0, 0], [9, 0, 0]] [[9, 0, 0], [0, 0, 0]]

💥

Why [[0]*3]*2 Breaks

The * 2 does not copy the inner list — it stores the same reference twice. Mutating one row mutates “both” because there is only one row underneath. Always build rows with a comprehension: [[0]*c for _ in range(r)].

Section 06

Traversing & Transforming a Matrix

matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

rows = len(matrix)
cols = len(matrix[0])

# Row-major traversal -- cache-friendly, O(rows x cols)
for i in range(rows):
    for j in range(cols):
        print(matrix[i][j], end=' ')
    print()

# Transpose -- swap rows and columns, O(rows x cols)
transpose = [[matrix[i][j] for i in range(rows)]
             for j in range(cols)]

# The Pythonic one-liner transpose using zip
transpose = [list(row) for row in zip(*matrix)]

print(transpose)

OUTPUT

1 2 3 4 5 6 7 8 9 10 11 12 [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

Section 07

Matrix Multiplication — The Heart of It

Multiplying an n × p matrix by a p × m matrix produces an n × m result. Each output cell C[i][j] is the dot product of row i of A with column j of B. The rule to memorise: the inner dimensions must match.

Definition

C[i][j] = Σ_k A[i][k] × B[k][j]

Sum over the shared dimension k. One cell = one dot product.

Naive Cost (CLRS)

T(n) = Θ(n³)

n² output cells, each costing n multiply-adds → the classic triple loop.

✖ Animated — One Output Cell Is One Dot Product

The amber scanners march in lockstep across A’s row and down B’s column, accumulating the dot product into a single result cell.

Hands-On: Naive Matrix Multiplication

# SQUARE-MATRIX-MULTIPLY style triple loop (CLRS Ch. 4)
def matrix_multiply(A, B):
    n = len(A)             # rows of A
    p = len(B)             # shared dimension (cols of A == rows of B)
    m = len(B[0])          # cols of B

    # Result C is n x m, initialised with zeros
    C = [[0] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            total = 0
            for k in range(p):
                total += A[i][k] * B[k][j]
            C[i][j] = total
    return C

A = [[2, 3, 4],
     [1, 0, 5]]
B = [[5, 7],
     [1, 2],
     [6, 0]]

print(matrix_multiply(A, B))   # [[37, 20], [35, 7]]

OUTPUT

[[37, 20], [35, 7]]

Section 08

Divide & Conquer — Strassen’s Algorithm

📖 Story

Buying Eight Things, Paying for Seven

For two centuries everyone assumed multiplying matrices had to cost Θ(n³). Split each matrix into four quadrants and the obvious recursive method needs eight smaller multiplications — no improvement at all. Then in 1969 Volker Strassen found an algebraic trick: by combining the quadrants cleverly, you can get the same answer with only seven multiplications and a few extra additions. Trading one multiplication for some cheap additions, applied recursively, drops the exponent from 3 to about 2.81.

Partition into Quadrants

Split each n × n matrix into four (n/2) × (n/2) blocks: A → A11, A12, A21, A22 and likewise for B.

Form 10 Sums/Differences

Create ten intermediate matrices by adding and subtracting the quadrants — these are cheap Θ(n²) operations.

7 Recursive Products

Compute exactly seven products P1…P7 by recursion — not eight. This is where the saving lives.

Recombine into C

Add and subtract the seven products to assemble the four quadrants of the result C.

# Strassen's 7 products for 2x2 blocks (CLRS Ch. 4)
# A = [[a, b],      B = [[e, f],
#      [c, d]]           [g, h]]
def strassen_2x2(a, b, c, d, e, f, g, h):
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)

    c11 = p5 + p4 - p2 + p6
    c12 = p1 + p2
    c21 = p3 + p4
    c22 = p1 + p5 - p3 - p7
    return [[c11, c12], [c21, c22]]

print(strassen_2x2(1, 2, 3, 4, 5, 6, 7, 8))
# [[19, 22], [43, 50]]  -- same as the naive product

OUTPUT

[[19, 22], [43, 50]]

Method	Recurrence	Complexity	When It Wins
Naive triple loop	T(n) = 8T(n/2) + Θ(n²)	Θ(n³)	Small matrices, simplicity
Strassen	T(n) = 7T(n/2) + Θ(n²)	Θ(n^lg 7) ≈ Θ(n^2.81)	Large dense matrices

⚠️

Strassen Is Not a Free Lunch

The hidden constant is large, recursion adds overhead, and the extra additions can hurt numerical stability. In practice Strassen only beats the naive method on big matrices, and real libraries (BLAS, NumPy) usually use cache-tuned Θ(n³) kernels instead. Strassen matters because it proved Θ(n³) was not a hard floor.

Section 09

The Maximum Subarray Problem

📖 Story

The Best Window to Have Held the Stock

CLRS introduces divide-and-conquer with a money question: given the daily change in a stock’s price, which contiguous stretch of days would have earned the most? That is the maximum subarray — the contiguous slice with the largest sum. The brute-force answer checks every pair of endpoints in Θ(n²). Divide-and-conquer gets it to Θ(n lg n), and Kadane’s clever scan finishes it in a single Θ(n) pass.

# Kadane's Algorithm -- maximum subarray sum in O(n)
def max_subarray(nums):
    best = nums[0]              # best sum seen anywhere so far
    current = nums[0]           # best sum ending at the current index

    for x in nums[1:]:
        # extend the running slice, or start fresh at x
        current = max(x, current + x)
        best = max(best, current)
    return best

changes = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray(changes))   # 6  -> the slice [4, -1, 2, 1]

OUTPUT

Approach	Idea	Time	Space
Brute force	Test every start/end pair	O(n²)	O(1)
Divide & conquer (CLRS)	Left / right / crossing	O(n log n)	O(log n)
Kadane (DP)	One running scan	O(n)	O(1)

Section 10

Matrix-Chain Multiplication (Dynamic Programming)

📖 Story

Same Answer, Wildly Different Bill

Matrix multiplication is associative: (A×B)×C equals A×(B×C). The answer is identical — but the cost is not. Depending on the order you bracket the chain, you might do 9,375 scalar multiplications or 11,875 for the very same product. CLRS Chapter 15 uses this as the flagship example of dynamic programming: find the cheapest parenthesisation without trying all of the exponentially many orderings.

# Matrix-Chain Order -- minimum scalar multiplications (CLRS Ch. 15)
def matrix_chain_order(dims):
    n = len(dims) - 1                  # number of matrices
    # m[i][j] = min cost to multiply matrices i..j
    m = [[0] * (n + 1) for _ in range(n + 1)]

    for length in range(2, n + 1):       # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float('inf')
            for k in range(i, j):
                cost = (m[i][k] + m[k + 1][j]
                        + dims[i - 1] * dims[k] * dims[j])
                if cost < m[i][j]:
                    m[i][j] = cost
    return m[1][n]

# A1(30x35) A2(35x15) A3(15x5) A4(5x10)
dims = [30, 35, 15, 5, 10]
print(matrix_chain_order(dims))   # 9375  (vs 11875 for a naive bracketing)

OUTPUT

9375

📐

The DP Pattern

Build a 2D table m[i][j] bottom-up: shortest chains first, then progressively longer ones, each reusing already-solved sub-chains. This fills an n × n matrix in Θ(n³) time — a perfect illustration of why arrays and matrices underpin dynamic programming itself.

Section 11

When Arrays & Matrices Shine — and When They Don’t

✅

Fast Random Access

Need element 7,000 instantly? Arrays give O(1) indexing that no linked structure can match.

lookups, hashing buckets, DP tables

✅

Dense Numeric Grids

Images, simulations, and linear algebra live in matrices. Contiguous storage feeds the CPU cache beautifully.

images, ML tensors, physics sims

✅

Algorithm Scaffolding

Sorting, prefix sums, sliding windows, and DP all assume O(1) indexed storage underneath.

dynamic programming, two-pointer

❌

Frequent Front Insertions

Inserting or deleting at the front is O(n) because everything shifts. Use a deque or linked list instead.

queues, undo stacks at head

❌

Unknown / Sparse Sizes

A mostly-empty 10000 × 10000 grid wastes huge memory. Use a dict or sparse matrix format.

sparse graphs, sparse matrices

❌

Constant Reordering

If elements are constantly inserted and removed mid-sequence, the shifting cost dominates. Reach for balanced trees.

ordered sets, priority structures

Section 12

Golden Rules

🧮 Arrays & Matrices — Non-Negotiable Rules

Indexing is O(1); shifting is O(n). Reading and writing arr[i] is free, but any insert or delete that is not at the end pays a linear price. Design loops with that asymmetry in mind.

Never build 2D lists with [[0]*c]*r. That shares one row object across every row. Always use [[0]*c for _ in range(r)].

Walk matrices in row-major order (for i: for j:) in Python, C, and Java. It touches memory sequentially and keeps the cache happy — same Big-O, far faster in practice.

Check the inner dimension before multiplying. An n × p times a p × m is valid only when the two p’s match. Mismatched dimensions are the number one matrix-multiplication bug.

Know your three matrix-multiply tiers: naive Θ(n³) for clarity, Strassen Θ(n^2.81) for very large matrices, and a tuned BLAS/NumPy kernel for anything real.

Reach for Kadane before brute force. The maximum-subarray scan is O(n) and O(1) space — a textbook reminder that a smarter pass often beats checking every pair.

A DP table is just a matrix. Matrix-chain ordering and countless other dynamic-programming solutions are nothing more than filling a 2D array in the right order. Master arrays and you have already mastered the scaffolding of DP.