The Story That Explains Arrays & Matrices
That single idea — identical cells packed in a row, addressable by a number — is the entire reason arrays are the most important data structure in computing. Stack a second wall of lockers on top, then a third, and you have a matrix: a grid you address with two numbers, a row and a column.
In Introduction to Algorithms (Cormen, Leiserson, Rivest & Stein — the book most people just call “CLRS” or “Cormen”), the array is the silent workhorse behind almost every algorithm. Sorting, dynamic programming, divide-and-conquer, graph adjacency — all of it sits on top of arrays and their two-dimensional cousins, matrices. This hands-on tutorial builds you up from a single row of cells to Strassen’s matrix multiplication, with runnable Python at every step.
An array trades flexibility for speed. Because every element is the same
size and stored back-to-back in memory, the computer can compute the address of
any element with one multiplication and one addition — so reading
arr[i] takes the same constant time whether i is 0 or 7,000.
What Exactly Is an Array?
An array is a collection of elements of the same type, stored in
contiguous memory locations, each reachable by an integer
index. CLRS treats the array as a primitive: A[i] is assumed
to be a Θ(1) operation. The magic is the address formula.
No matter which index you request, the address is computed in a single step — this is why arrays beat linked lists for lookups.
Array Operations & Their Cost
Not every operation on an array is cheap. Reading is instant, but anything that shifts elements pays a linear price. Knowing this table by heart is the difference between writing fast code and slow code.
| Operation | Example | Time | Why |
|---|---|---|---|
| Access by index | arr[i] | O(1) | Direct address computation |
| Update by index | arr[i] = x | O(1) | Overwrite one known cell |
| Search (unsorted) | x in arr | O(n) | May scan every element |
| Search (sorted) | binary search | O(log n) | Halve the range each step |
| Append at end | arr.append(x) | O(1)* | Amortised; occasional resize |
| Insert at front/middle | arr.insert(0, x) | O(n) | Every later element shifts right |
| Delete front/middle | del arr[0] | O(n) | Every later element shifts left |
| Traverse all | for x in arr | O(n) | Visit each element once |
Hands-On: Array Basics in Python
# Create an array (a Python list) of integers
arr = [17, 4, 29, 8, 33, 12]
# Random access by index -- O(1), the array's superpower
print(arr[3]) # 8 -> jumps straight to position 3
# Traverse every element -- O(n)
for i in range(len(arr)):
print(i, arr[i])
# Append at the end -- amortised O(1)
arr.append(99)
# Insert at the front -- O(n), everything shifts right
arr.insert(0, 5)
# Delete from the middle -- O(n), everything shifts left
del arr[2]
print(arr) # [5, 17, 8, 33, 12, 99]
If you find yourself doing arr.insert(0, x) inside a loop, you have accidentally
written an O(n²) algorithm. Build lists by appending to the end, then
reverse once at the finish if order matters — or reach for a collections.deque.
From a Row to a Grid — Matrices
A matrix is a two-dimensional array: rows stacked on rows. We address a cell
with two indices, A[i][j] — row i, column j. But
memory is one-dimensional, so the grid must be flattened. There are two ways to do it,
and the choice has real performance consequences.
| Order in memory | Cell |
|---|---|
| 1st | A[0][0] |
| 2nd | A[0][1] |
| 3rd | A[0][2] |
| 4th | A[1][0] |
| … | row by row |
| Order in memory | Cell |
|---|---|
| 1st | A[0][0] |
| 2nd | A[1][0] |
| 3rd | A[2][0] |
| 4th | A[0][1] |
| … | column by column |
Row-major flattening: index = i × cols + j. Walking a matrix in row order touches memory sequentially — which the CPU cache loves.
In a row-major language, looping for i: for j: A[i][j] reads memory in order
and runs fast. Swapping the loops to for j: for i: jumps across memory and can
be several times slower on a large matrix — same Big-O, very different
wall-clock time.
Creating Matrices in Python — and the Trap
The single most common bug beginners hit with 2D arrays in Python is the shared-row trap. It looks harmless and silently corrupts your data.
# WRONG -- all rows are the SAME list object in memory
grid = [[0] * 3] * 2
grid[0][0] = 9
print(grid) # [[9, 0, 0], [9, 0, 0]] <- both rows changed!
# RIGHT -- each row is built as its own independent list
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 9
print(grid) # [[9, 0, 0], [0, 0, 0]] correct
[[0]*3]*2 Breaks
The * 2 does not copy the inner list — it stores the same reference
twice. Mutating one row mutates “both” because there is only one row underneath.
Always build rows with a comprehension: [[0]*c for _ in range(r)].
Traversing & Transforming a Matrix
matrix = [
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
]
rows = len(matrix)
cols = len(matrix[0])
# Row-major traversal -- cache-friendly, O(rows x cols)
for i in range(rows):
for j in range(cols):
print(matrix[i][j], end=' ')
print()
# Transpose -- swap rows and columns, O(rows x cols)
transpose = [[matrix[i][j] for i in range(rows)]
for j in range(cols)]
# The Pythonic one-liner transpose using zip
transpose = [list(row) for row in zip(*matrix)]
print(transpose)
Matrix Multiplication — The Heart of It
Multiplying an n × p matrix by a p × m matrix
produces an n × m result. Each output cell C[i][j] is the
dot product of row i of A with column j of B. The
rule to memorise: the inner dimensions must match.
The amber scanners march in lockstep across A’s row and down B’s column, accumulating the dot product into a single result cell.
Hands-On: Naive Matrix Multiplication
# SQUARE-MATRIX-MULTIPLY style triple loop (CLRS Ch. 4)
def matrix_multiply(A, B):
n = len(A) # rows of A
p = len(B) # shared dimension (cols of A == rows of B)
m = len(B[0]) # cols of B
# Result C is n x m, initialised with zeros
C = [[0] * m for _ in range(n)]
for i in range(n):
for j in range(m):
total = 0
for k in range(p):
total += A[i][k] * B[k][j]
C[i][j] = total
return C
A = [[2, 3, 4],
[1, 0, 5]]
B = [[5, 7],
[1, 2],
[6, 0]]
print(matrix_multiply(A, B)) # [[37, 20], [35, 7]]
Divide & Conquer — Strassen’s Algorithm
# Strassen's 7 products for 2x2 blocks (CLRS Ch. 4)
# A = [[a, b], B = [[e, f],
# [c, d]] [g, h]]
def strassen_2x2(a, b, c, d, e, f, g, h):
p1 = a * (f - h)
p2 = (a + b) * h
p3 = (c + d) * e
p4 = d * (g - e)
p5 = (a + d) * (e + h)
p6 = (b - d) * (g + h)
p7 = (a - c) * (e + f)
c11 = p5 + p4 - p2 + p6
c12 = p1 + p2
c21 = p3 + p4
c22 = p1 + p5 - p3 - p7
return [[c11, c12], [c21, c22]]
print(strassen_2x2(1, 2, 3, 4, 5, 6, 7, 8))
# [[19, 22], [43, 50]] -- same as the naive product
| Method | Recurrence | Complexity | When It Wins |
|---|---|---|---|
| Naive triple loop | T(n) = 8T(n/2) + Θ(n²) | Θ(n³) | Small matrices, simplicity |
| Strassen | T(n) = 7T(n/2) + Θ(n²) | Θ(nlg 7) ≈ Θ(n2.81) | Large dense matrices |
The hidden constant is large, recursion adds overhead, and the extra additions can hurt numerical stability. In practice Strassen only beats the naive method on big matrices, and real libraries (BLAS, NumPy) usually use cache-tuned Θ(n³) kernels instead. Strassen matters because it proved Θ(n³) was not a hard floor.
The Maximum Subarray Problem
# Kadane's Algorithm -- maximum subarray sum in O(n)
def max_subarray(nums):
best = nums[0] # best sum seen anywhere so far
current = nums[0] # best sum ending at the current index
for x in nums[1:]:
# extend the running slice, or start fresh at x
current = max(x, current + x)
best = max(best, current)
return best
changes = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray(changes)) # 6 -> the slice [4, -1, 2, 1]
| Approach | Idea | Time | Space |
|---|---|---|---|
| Brute force | Test every start/end pair | O(n²) | O(1) |
| Divide & conquer (CLRS) | Left / right / crossing | O(n log n) | O(log n) |
| Kadane (DP) | One running scan | O(n) | O(1) |
Matrix-Chain Multiplication (Dynamic Programming)
# Matrix-Chain Order -- minimum scalar multiplications (CLRS Ch. 15)
def matrix_chain_order(dims):
n = len(dims) - 1 # number of matrices
# m[i][j] = min cost to multiply matrices i..j
m = [[0] * (n + 1) for _ in range(n + 1)]
for length in range(2, n + 1): # chain length
for i in range(1, n - length + 2):
j = i + length - 1
m[i][j] = float('inf')
for k in range(i, j):
cost = (m[i][k] + m[k + 1][j]
+ dims[i - 1] * dims[k] * dims[j])
if cost < m[i][j]:
m[i][j] = cost
return m[1][n]
# A1(30x35) A2(35x15) A3(15x5) A4(5x10)
dims = [30, 35, 15, 5, 10]
print(matrix_chain_order(dims)) # 9375 (vs 11875 for a naive bracketing)
Build a 2D table m[i][j] bottom-up: shortest chains first, then progressively
longer ones, each reusing already-solved sub-chains. This fills an
n × n matrix in Θ(n³) time — a
perfect illustration of why arrays and matrices underpin dynamic programming itself.
When Arrays & Matrices Shine — and When They Don’t
Golden Rules
arr[i] is free, but any insert or delete that is not at the end pays a linear
price. Design loops with that asymmetry in mind.
[[0]*c]*r. That shares one row
object across every row. Always use [[0]*c for _ in range(r)].
for i: for j:) in Python, C,
and Java. It touches memory sequentially and keeps the cache happy — same Big-O, far
faster in practice.