The Story That Makes Heapsort Click
That self-organising board is a max-heap: a structure that always keeps the largest item at the root, ready to grab in one step, and quickly repairs itself after each removal.
Heapsort is simply this idea applied to sorting: turn the array into a max-heap, then repeatedly pull the maximum off the top and place it at the end. Pull the biggest, park it, repair, repeat — until everything is in order.
Heapsort has two phases. First it rearranges the array into a binary max-heap in place. Then it repeatedly swaps the root (the maximum) with the last heap element, shrinks the heap by one, and sifts down the new root to restore the heap. Each extracted maximum lands in its final sorted position at the tail.
A binary heap gives you the maximum in O(1) and repairs itself in O(log n). Do that n times and you get a sort that is guaranteed O(n log n) in every case — with no extra memory. That guarantee, not raw speed, is heapsort's whole reason to exist.
The Foundation — A Binary Heap Lives in an Array
A binary heap is a complete binary tree (every level full, except possibly the last, filled left to right). Because it is complete, you never need real tree pointers — you can store it in a plain array and compute parent/child positions with arithmetic. In a max-heap, every parent is ≥ its children.
The tree above is stored as the array [9, 7, 8, 3, 5, 2, 1]. The position arithmetic is the magic that removes the need for pointers:
Storing a complete tree in an array means zero pointer overhead and perfect packing. Navigation is just multiply-and-add. This is exactly why heaps are the standard way to implement a priority queue — and why heapsort can run fully in place.
The Two Phases of Heapsort
Hands-On: Watch the Tree, the Array, and the Code Together
Press Play (or step manually). Three things move in lock-step: the
binary heap tree (top-left), the array that stores it
(below the tree), and the highlighted Python line (right). The status bar says
exactly what the active line is doing. Watch the blue comparisons inside heapify,
the red swaps, and the sorted-green tail growing from the right as each maximum is parked.
In the build phase, watch each subtree get fixed from the bottom up until
the biggest value reaches the root. In the sort phase, the root (max) swaps
to the tail, turns green, leaves the tree, and the new root sinks back down via
heapify. The dashed node is the current i; the amber node is the
running largest.
The Full Python Implementation
Here is the complete algorithm — heapify (sift down) plus the two-phase
heap_sort driver.
def heapify(arr, n, i):
# Sift node i down so the subtree of size n becomes a max-heap.
largest = i
left = 2 * i + 1
right = 2 * i + 2
if left < n and arr[left] > arr[largest]:
largest = left
if right < n and arr[right] > arr[largest]:
largest = right
if largest != i: # a child was bigger
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest) # keep sinking
def heap_sort(arr):
n = len(arr)
# Phase 1: build a max-heap (start at last non-leaf).
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Phase 2: repeatedly move the max to the end.
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0] # park current max
heapify(arr, i, 0) # repair the smaller heap
return arr
# --- driver ---
data = [5, 3, 8, 1, 9, 2, 7]
print("Before:", data)
heap_sort(data)
print("After: ", data)
A Verbose Version That Shows Both Phases
def heap_sort_verbose(arr):
n = len(arr)
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
print("Max-heap built:", arr)
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0]
heapify(arr, i, 0)
print(f"Parked {arr[i]:>2} at end -> {arr}")
return arr
heap_sort_verbose([5, 3, 8, 1, 9, 2, 7])
Why Use Heapsort? The Real Benefits
Heapsort sits exactly where you want a worst-case guarantee like Merge Sort
but with constant memory like Quick Sort. That combination is why it is the
safety-net fallback inside introsort (the algorithm behind many standard
library sort() functions): start with Quick Sort, and if recursion goes too
deep, switch to heapsort to lock in O(n log n).
When to Use It — and When Not To
Time & Space Complexity
| Case | Why | Time | Space |
|---|---|---|---|
| Best | Build O(n) + n extractions × O(log n) | O(n log n) | O(1) |
| Average | Same structure regardless of order | O(n log n) | O(1) |
| Worst | No input can force more than log-n sift-downs | O(n log n) | O(1) |
| Stable? | Far-apart swaps reorder equal keys | No (not stable) | |
| Adaptive? | Sorted input does not reduce the work | No (not adaptive) | |
It is tempting to think build-heap costs O(n log n), but it does not. Most nodes are near the bottom and sift down only a step or two; only the few near the root can sift the full log n. Summing the work across all levels converges to O(n). The O(n log n) total comes from phase 2 — the n extractions, each costing O(log n).
One Extraction, Step by Step
arr[0] is the largest remaining element — available in O(1).arr[0], arr[i] = arr[i], arr[0] — the max moves to the last heap slot, its final sorted home.heapify(arr, i, 0) — the small value now at the root sinks past its larger children until the heap property is restored and the next max surfaces.Heapsort vs Its Cousins
| Property | Heapsort | Quick Sort | Merge Sort | Insertion Sort |
|---|---|---|---|---|
| Best case | O(n log n) | O(n log n) | O(n log n) | O(n) |
| Worst case | O(n log n) | O(n²) | O(n log n) | O(n²) |
| Extra space | O(1) | O(log n) stack | O(n) | O(1) |
| Stable? | No | No | Yes | Yes |
| In practice | Slower (cache misses) | Usually fastest | Fast, needs memory | Great when tiny |
| Best for… | Guaranteed bound + low memory | General-purpose speed | Stable sort, linked lists | Small / nearly-sorted |
Heapsort is rarely the fastest sort, but it is the most dependable: an ironclad O(n log n) ceiling with O(1) memory. Use it when a blown worst case is unacceptable or memory is scarce — and reach for the heap structure itself whenever you need a priority queue.
Golden Rules
2i+1 and 2i+2; the parent is (i−1)//2. No
pointers needed.
n//2 − 1, and
walks backward to the root. Leaves are already trivial heaps.
heapify always compares a node against the larger of its two
children and sinks the smaller value down. Forgetting to check both children is the
classic bug.