Data Structure 📂 Sorting & Order Statistics · 3 of 10 41 min read

Heapsort, Visualized: A Hands-On Binary Heap Sort Guide

A practical, hands-on heapsort tutorial built around an interactive animation that shows the binary heap as a tree, the array that stores it, and the highlighted Python line all at once. Covers the heap-as-array foundation, build-then-extract phases, real benefits and use cases, why it's a guaranteed O(n log n) in-place sort, and how it compares to quick, merge, and insertion sort.

Section 01

The Story That Makes Heapsort Click

🚑 The Emergency Room Triage Board
In a busy ER, patients are not treated in arrival order — they are treated by urgency. The triage board is organised so the single most critical patient is always at the top, instantly visible. When that patient is taken into surgery, the board re-organises itself so the next-most-critical patient bubbles up to the top.

That self-organising board is a max-heap: a structure that always keeps the largest item at the root, ready to grab in one step, and quickly repairs itself after each removal.

Heapsort is simply this idea applied to sorting: turn the array into a max-heap, then repeatedly pull the maximum off the top and place it at the end. Pull the biggest, park it, repair, repeat — until everything is in order.

Heapsort has two phases. First it rearranges the array into a binary max-heap in place. Then it repeatedly swaps the root (the maximum) with the last heap element, shrinks the heap by one, and sifts down the new root to restore the heap. Each extracted maximum lands in its final sorted position at the tail.

🧠
The Core Insight

A binary heap gives you the maximum in O(1) and repairs itself in O(log n). Do that n times and you get a sort that is guaranteed O(n log n) in every case — with no extra memory. That guarantee, not raw speed, is heapsort's whole reason to exist.


Section 02

The Foundation — A Binary Heap Lives in an Array

A binary heap is a complete binary tree (every level full, except possibly the last, filled left to right). Because it is complete, you never need real tree pointers — you can store it in a plain array and compute parent/child positions with arithmetic. In a max-heap, every parent is ≥ its children.

9i=0 7i=1 8i=2 3i=3 5i=4 2i=5 1i=6

The tree above is stored as the array [9, 7, 8, 3, 5, 2, 1]. The position arithmetic is the magic that removes the need for pointers:

Left Child
2·i + 1
The left child of node i sits at index 2i+1 (if that index is inside the heap).
Right Child
2·i + 2
The right child of node i sits at index 2i+2 (if that index is inside the heap).
Parent
(i − 1) // 2
The parent of node i is at floor((i−1)/2). The root (i=0) has no parent.
Last Non-Leaf
n // 2 − 1
Everything past this index is a leaf, so building the heap starts here and walks back to the root.
🔑
Why an Array, Not Pointers?

Storing a complete tree in an array means zero pointer overhead and perfect packing. Navigation is just multiply-and-add. This is exactly why heaps are the standard way to implement a priority queue — and why heapsort can run fully in place.


Section 03

The Two Phases of Heapsort

🌲 PHASE 1 — BUILD A MAX-HEAP
1a
Start at the last non-leaf node (index n//2 − 1) and walk backwards to the root.
1b
At each node, heapify (sift down): if a child is bigger, swap with the largest child and keep sinking until the heap property holds.
🎯 PHASE 2 — EXTRACT THE MAX, REPEATEDLY
2a
Swap the root (the maximum) with the last element of the heap.
2b
Shrink the heap by one — the element just parked at the tail is now in its final sorted place.
2c
Heapify the root of the smaller heap to bring the next maximum to the top. Repeat until the heap has one element.

Section 04

Hands-On: Watch the Tree, the Array, and the Code Together

Press Play (or step manually). Three things move in lock-step: the binary heap tree (top-left), the array that stores it (below the tree), and the highlighted Python line (right). The status bar says exactly what the active line is doing. Watch the blue comparisons inside heapify, the red swaps, and the sorted-green tail growing from the right as each maximum is parked.

⚡ LIVE TRACE — THIS LINE IS DOING THIS TASK
PHASE: BUILD HEAP heap size = 7
Binary heap (tree view)
Array (how the heap is stored)
1def heapify(arr, n, i):
2 largest = i
3 left = 2 * i + 1
4 right = 2 * i + 2
5 if left < n and arr[left] > arr[largest]:
6 largest = left
7 if right < n and arr[right] > arr[largest]:
8 largest = right
9 if largest != i:
10 arr[i], arr[largest] = arr[largest], arr[i]
11 heapify(arr, n, largest)
 
12def heap_sort(arr):
13 n = len(arr)
14 for i in range(n // 2 - 1, -1, -1):
15 heapify(arr, n, i)
16 for i in range(n - 1, 0, -1):
17 arr[0], arr[i] = arr[i], arr[0]
18 heapify(arr, i, 0)
19 return arr
Press Play or Step ▶ to begin.
Speed Step 0 / 0
👀
How to Read the Animation

In the build phase, watch each subtree get fixed from the bottom up until the biggest value reaches the root. In the sort phase, the root (max) swaps to the tail, turns green, leaves the tree, and the new root sinks back down via heapify. The dashed node is the current i; the amber node is the running largest.


Section 05

The Full Python Implementation

Here is the complete algorithm — heapify (sift down) plus the two-phase heap_sort driver.

def heapify(arr, n, i):
    # Sift node i down so the subtree of size n becomes a max-heap.
    largest = i
    left  = 2 * i + 1
    right = 2 * i + 2

    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right

    if largest != i:                      # a child was bigger
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)        # keep sinking


def heap_sort(arr):
    n = len(arr)

    # Phase 1: build a max-heap (start at last non-leaf).
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Phase 2: repeatedly move the max to the end.
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]    # park current max
        heapify(arr, i, 0)              # repair the smaller heap
    return arr


# --- driver ---
data = [5, 3, 8, 1, 9, 2, 7]
print("Before:", data)
heap_sort(data)
print("After: ", data)
OUTPUT
Before: [5, 3, 8, 1, 9, 2, 7] After: [1, 2, 3, 5, 7, 8, 9]

A Verbose Version That Shows Both Phases

def heap_sort_verbose(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    print("Max-heap built:", arr)

    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)
        print(f"Parked {arr[i]:>2} at end -> {arr}")
    return arr

heap_sort_verbose([5, 3, 8, 1, 9, 2, 7])
OUTPUT
Max-heap built: [9, 5, 8, 1, 3, 2, 7] Parked 9 at end -> [8, 5, 7, 1, 3, 2, 9] Parked 8 at end -> [7, 5, 2, 1, 3, 8, 9] Parked 7 at end -> [5, 3, 2, 1, 7, 8, 9] Parked 5 at end -> [3, 1, 2, 5, 7, 8, 9] Parked 3 at end -> [2, 1, 3, 5, 7, 8, 9] Parked 2 at end -> [1, 2, 3, 5, 7, 8, 9]

Section 06

Why Use Heapsort? The Real Benefits

🛡️
Guaranteed O(n log n)
no worst-case blow-up
Best, average, and worst case are all O(n log n). Unlike Quick Sort, there is no pathological input that degrades it to O(n²) — the running time is rock solid no matter what data arrives.
📐
In-Place, O(1) Memory
no auxiliary array
It sorts inside the original array. With an iterative heapify it needs only constant extra space — a major edge over Merge Sort, which needs O(n) scratch memory.
🎯
Built on the Heap You Already Need
priority queue core
The binary heap powering heapsort is the same structure behind priority queues, Dijkstra's algorithm, and task schedulers. Learn it once, reuse it everywhere.
The Best-of-Both-Worlds Slot

Heapsort sits exactly where you want a worst-case guarantee like Merge Sort but with constant memory like Quick Sort. That combination is why it is the safety-net fallback inside introsort (the algorithm behind many standard library sort() functions): start with Quick Sort, and if recursion goes too deep, switch to heapsort to lock in O(n log n).


Section 07

When to Use It — and When Not To

Worst-Case Guarantees Matter
Real-time systems, embedded controllers, and security-sensitive code where an O(n²) Quick Sort worst case is unacceptable.
predictable timing
Tight Memory Budgets
When O(n) scratch space for Merge Sort is too expensive but you still need an n log n guarantee, heapsort's O(1) footprint wins.
memory-constrained
Top-k / Order Statistics
A heap naturally surfaces the largest (or smallest) elements, making it ideal for "find the k biggest" without a full sort.
priority queues
Raw Speed on Random Data
Quick Sort is typically 2–3× faster in practice. Heapsort's scattered memory access causes many cache misses.
poor cache locality
Stability Required
Heapsort is not stable — far-apart swaps reorder equal keys. If relative order of equal items must hold, use a stable sort.
equal-key order matters
Linked Lists / External Sort
Heaps need O(1) random access, so they are poor for linked lists, and the lack of locality hurts on data that does not fit in memory.
no random access

Section 08

Time & Space Complexity

CaseWhyTimeSpace
BestBuild O(n) + n extractions × O(log n)O(n log n)O(1)
AverageSame structure regardless of orderO(n log n)O(1)
WorstNo input can force more than log-n sift-downsO(n log n)O(1)
Stable?Far-apart swaps reorder equal keysNo (not stable)
Adaptive?Sorted input does not reduce the workNo (not adaptive)
📐
Why Building the Heap Is Only O(n)

It is tempting to think build-heap costs O(n log n), but it does not. Most nodes are near the bottom and sift down only a step or two; only the few near the root can sift the full log n. Summing the work across all levels converges to O(n). The O(n log n) total comes from phase 2 — the n extractions, each costing O(log n).


Section 09

One Extraction, Step by Step

01
Max Is at the Root
Because it is a max-heap, arr[0] is the largest remaining element — available in O(1).
02
Swap Root to the Tail
arr[0], arr[i] = arr[i], arr[0] — the max moves to the last heap slot, its final sorted home.
03
Shrink the Heap
The parked element leaves the heap. The heap size drops by one; that index is now sorted.
04
Sift the New Root Down
heapify(arr, i, 0) — the small value now at the root sinks past its larger children until the heap property is restored and the next max surfaces.

Section 10

Heapsort vs Its Cousins

PropertyHeapsortQuick SortMerge SortInsertion Sort
Best caseO(n log n)O(n log n)O(n log n)O(n)
Worst caseO(n log n)O(n²)O(n log n)O(n²)
Extra spaceO(1)O(log n) stackO(n)O(1)
Stable?NoNoYesYes
In practiceSlower (cache misses)Usually fastestFast, needs memoryGreat when tiny
Best for…Guaranteed bound + low memoryGeneral-purpose speedStable sort, linked listsSmall / nearly-sorted
🏆
The Practitioner's Take

Heapsort is rarely the fastest sort, but it is the most dependable: an ironclad O(n log n) ceiling with O(1) memory. Use it when a blown worst case is unacceptable or memory is scarce — and reach for the heap structure itself whenever you need a priority queue.


Section 11

Golden Rules

🌲 Heapsort — Things Worth Memorising
1
A binary heap is a complete tree stored in an array. Children of i are 2i+1 and 2i+2; the parent is (i−1)//2. No pointers needed.
2
Build-heap starts at the last non-leaf node, n//2 − 1, and walks backward to the root. Leaves are already trivial heaps.
3
heapify always compares a node against the larger of its two children and sinks the smaller value down. Forgetting to check both children is the classic bug.
4
Building the heap is O(n); the n extractions cost O(n log n) total. The guarantee holds for every input — there is no worst case to fear.
5
Use an iterative heapify to keep space at true O(1). The recursive version is clearer but adds an O(log n) call stack.
6
Heapsort is not stable and not adaptive, and its cache behaviour makes it slower than Quick Sort in practice. Pick it for guarantees and memory, not raw speed.