Data Structure 📂 Data Structures (Core) · 6 of 6 40 min read

Heaps (Min/Max) — Complete Tutorial

A full visual tutorial on min-heaps and max-heaps covering structure, array representation, sift-up/sift-down operations, Python's heapq module, the max-heap negation trick, and classic interview patterns (k-largest, k-smallest, running median).

Section 01

The Story That Explains Heaps

Cinema Seats — Always Sit in the Best Available Spot
Imagine a cinema that has a special rule: seats are numbered by comfort score. Every time someone walks in, the usher instantly hands them the best available seat — no searching, no scanning every row. The moment that seat is taken, the next-best seat automatically floats to the top of the usher's list.

Now imagine the opposite: a horror cinema where, by law, the usher must always hand out the worst seat first — the one furthest from the screen, most cramped, freezing cold — so that premium seats are never wasted on people who leave early.

The first cinema is a Max-Heap. The second is a Min-Heap. In both cases, the magic is the same: finding the extreme element (best or worst) costs you nothing — it's always sitting at the top, ready to go. Inserting a new seat-holder and evicting the top one each cost only O(log n) — even if the cinema has a million seats.

A heap is a complete binary tree stored as an array, satisfying one invariant: every parent is either ≥ (max-heap) or ≤ (min-heap) than both its children. That single rule is what makes the top always the extreme — and the whole structure self-maintaining with a handful of swap operations.

💡
The Core Insight

A heap does not keep all elements sorted. It only guarantees the top is always the extreme. That constraint is weaker than full sorting — and that's precisely why it's faster. You pay O(log n) per insert/remove, not O(n log n) for a full sort. Use a heap whenever you only need the smallest or largest element repeatedly.


Section 02

Heap Structure & Array Representation

A heap lives in a plain Python list. The array indices encode the tree shape — no pointers needed. For a node at index i:

🌳 Index Arithmetic — The Whole Tree in Three Formulas
Parent
(i − 1) // 2 — integer-divide index by 2 after offsetting
Left Child
2i + 1 — always the left branch below node i
Right Child
2i + 2 — always the right branch below node i
Root
Always at index 0 — the min or max is always here
🗂️
Why Array, Not a Real Tree?

A complete binary tree stored in an array has zero pointer overhead. Every node's children and parent are a single arithmetic operation away. Cache performance is excellent because children are nearby in memory. This is why Python's heapq module stores everything in a plain list.


Section 03

Min-Heap — Live Animation

Watch how the heap array changes as you push and pop elements. The highlighted row shows which line of code is executing and what swap is occurring.

⚙ Heap Animator Mode: Min-Heap
Array:
OPERATION LOG
Push a value to begin.
⚠️
Heap ≠ Sorted Array

The array [3, 8, 5, 15, 12, 10, 7] is a valid min-heap — every parent ≤ both children — but it is not sorted. Elements at the same level have no ordering guarantee between them. Iterating the heap array does not give you sorted output; you must pop repeatedly for that (heap sort).


Section 04

Python heapq — Min-Heap in Practice

Python's standard library implements a min-heap only. All operations work on a plain list — the heap is the list, no wrapper object needed.

import heapq

# ── Build a heap ───────────────────────────────────────────
data = [15, 3, 10, 8, 12, 5, 7]
heapq.heapify(data)          # O(n) — transforms list in-place
print("Heap array :", data)
print("Root (min) :", data[0])  # always index 0

# ── Push a new element ─────────────────────────────────────
heapq.heappush(data, 1)       # O(log n) — sift up
print("After push 1 :", data)

# ── Pop the minimum ────────────────────────────────────────
minimum = heapq.heappop(data)  # O(log n) — sift down
print("Popped min :", minimum)
print("Heap after pop:", data)

# ── Peek without removing ──────────────────────────────────
print("Current min :", data[0])  # O(1) — just read index 0

# ── Push + Pop atomically (more efficient) ─────────────────
result = heapq.heappushpop(data, 2)
print("heappushpop(2):", result)  # returns min of (pushed, current top)
OUTPUT
Heap array : [3, 8, 5, 15, 12, 10, 7] Root (min) : 3 After push 1 : [1, 8, 3, 15, 12, 10, 7, 5] Popped min : 1 Heap after pop: [3, 8, 5, 15, 12, 10, 7] Current min : 3 heappushpop(2): 2
heapify is O(n), Not O(n log n)

Building a heap from an existing list using heapify() is O(n) — not O(n log n) as you might expect. The algorithm sifts down from the last internal node to the root. The math works out because most nodes are near the bottom of the tree where sift-down has very little work to do. Use heapify instead of pushing elements one at a time when you already have a list.


Section 05

Max-Heap in Python — The Negation Trick

Python's heapq is min-only. To simulate a max-heap, negate all values on insert and negate again on pop.

import heapq

# ── Max-heap via negation ──────────────────────────────────
max_heap = []

def max_push(h, val):
    heapq.heappush(h, -val)       # store negative

def max_pop(h):
    return -heapq.heappop(h)        # negate on way out

def max_peek(h):
    return -h[0]                    # O(1) peek

# Push cinema seat comfort scores
for score in [42, 91, 17, 68, 55]:
    max_push(max_heap, score)

print("Best seat (max):", max_peek(max_heap))

# Hand out best seats one by one
while max_heap:
    print("Seat assigned:", max_pop(max_heap))
OUTPUT
Best seat (max): 91 Seat assigned: 91 Seat assigned: 68 Seat assigned: 55 Seat assigned: 42 Seat assigned: 17
🔁
Tuples Work Too — Priority Queues

Push (priority, item) tuples to build a full priority queue. Python compares tuples lexicographically — the first element (priority) determines heap order. Negate the priority for max-heap behaviour. Use (priority, counter, item) as a tiebreaker when items aren't comparable.


Section 06

Building a Heap From Scratch — Animated Step-by-Step

This section shows exactly which comparisons and swaps happen, line-by-line, when you implement sift-up and sift-down manually.

🔬 Sift-Up Trace — step through line by line
def sift_up(heap, i):
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] < heap[parent]:
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break
STATE
Press ▶ to begin
HEAP
[3, 8, 5, 15, 12, 10, 2]
ACTION

Section 07

Full Min-Heap Implementation

class MinHeap:
    def __init__(self):
        self.heap = []

    def push(self, val):         # O(log n)
        self.heap.append(val)
        self._sift_up(len(self.heap) - 1)

    def pop(self):                 # O(log n)
        if not self.heap: raise IndexError("pop from empty heap")
        self._swap(0, len(self.heap) - 1)
        val = self.heap.pop()
        self._sift_down(0)
        return val

    def peek(self):                # O(1)
        return self.heap[0] if self.heap else None

    def _sift_up(self, i):
        while i > 0:
            p = (i - 1) // 2
            if self.heap[i] < self.heap[p]:
                self._swap(i, p); i = p
            else: break

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            l, r = 2*i+1, 2*i+2
            if l < n and self.heap[l] < self.heap[smallest]: smallest = l
            if r < n and self.heap[r] < self.heap[smallest]: smallest = r
            if smallest != i:
                self._swap(i, smallest); i = smallest
            else: break

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]

# ── Demo ───────────────────────────────────────────────────
mh = MinHeap()
for v in [15, 3, 10, 8, 12, 5]:
    mh.push(v)

print("Peek:", mh.peek())
print("Pop sequence:", [mh.pop() for _ in range(6)])
OUTPUT
Peek: 3 Pop sequence: [3, 5, 8, 10, 12, 15]
🎯
Pop Sequence Is Sorted!

Popping all elements from a min-heap gives them in ascending sorted order. This is the basis of Heap Sort — O(n log n) time, O(1) extra space. A max-heap gives descending order. Unlike merge sort, heap sort requires no auxiliary array, making it attractive when memory is constrained.


Section 08

Classic Patterns — K Largest & K Smallest Elements

🔽
K Smallest
Use a max-heap of size k. Push each element; if heap exceeds k, pop the max. At the end, the heap contains the k smallest elements. The root is the k-th smallest.
Time: O(n log k)
🔼
K Largest
Use a min-heap of size k. Push each element; if heap exceeds k, pop the min. The heap holds the k largest. The root is the k-th largest (the smallest among them).
Time: O(n log k)
import heapq

def k_largest(nums, k):
    # Min-heap of size k — root is k-th largest
    min_h = []
    for n in nums:
        heapq.heappush(min_h, n)
        if len(min_h) > k:
            heapq.heappop(min_h)   # evict the smallest so far
    return sorted(min_h, reverse=True)   # heap doesn't guarantee order

def k_smallest(nums, k):
    # heapq.nsmallest is optimised — uses heapify + nlargest internally
    return heapq.nsmallest(k, nums)

data = [7, 3, 19, 1, 15, 8, 4, 11]

print("3 largest :", k_largest(data, 3))
print("3 smallest:", k_smallest(data, 3))

# LeetCode 215: Kth Largest Element in an Array
def kth_largest(nums, k):
    return heapq.nlargest(k, nums)[-1]

print("2nd largest:", kth_largest(data, 2))
OUTPUT
3 largest : [19, 15, 11] 3 smallest: [1, 3, 4] 2nd largest: 15

Section 09

Advanced Pattern — Median of a Data Stream

One of the most elegant heap applications: maintain the running median of a stream using two heaps — a max-heap for the lower half and a min-heap for the upper half.

1
Two-Heap Structure
lo = max-heap (lower half, negated in Python) · hi = min-heap (upper half). Invariant: lo always has the same size as hi, or one more element.
2
Add a Number
Push to lo. If top of lo > top of hi, rebalance. Then if hi has more elements, move one back to lo.
3
Find Median
If sizes equal: median = average of both tops. Otherwise: median = top of lo (the bigger half). Both operations are O(1).
import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []   # max-heap (negated) — lower half
        self.hi = []   # min-heap — upper half

    def addNum(self, num):            # O(log n)
        heapq.heappush(self.lo, -num)
        # ensure max(lo) <= min(hi)
        if self.hi and -self.lo[0] > self.hi[0]:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # balance sizes: |lo| == |hi| or |lo| == |hi|+1
        if len(self.lo) > len(self.hi) + 1:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        elif len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self):             # O(1)
        if len(self.lo) == len(self.hi):
            return (-self.lo[0] + self.hi[0]) / 2
        return -self.lo[0]

mf = MedianFinder()
for n in [5, 15, 1, 3]:
    mf.addNum(n)
    print(f"Added {n:2d}  →  median = {mf.findMedian()}")
OUTPUT
Added 5 → median = 5 Added 15 → median = 10.0 Added 1 → median = 5 Added 3 → median = 4.0
🧮
Why Not Just Sort the Stream?

Sorting a stream of n numbers after every insertion costs O(n²) total — each insertion triggers an O(n log n) sort. The two-heap approach costs O(n log n) total across all insertions, and each median query is O(1). For a real-time price feed or sensor stream, this difference is enormous.


Section 10

Time & Space Complexity Reference

Operation Time Complexity Space heapq Method Notes
Build heap (heapify) O(n) O(1) heapify(list) In-place; better than n×push
Push (insert) O(log n) O(1) heappush(h, val) Sift-up after append
Pop (remove min/max) O(log n) O(1) heappop(h) Sift-down after root removal
Peek (read top) O(1) O(1) h[0] Just array index 0
Search arbitrary element O(n) O(1) Not provided Must scan full array
Delete arbitrary element O(n) O(1) Not provided Find O(n) + fix O(log n)
K largest / smallest O(n log k) O(k) nlargest / nsmallest Better than full sort for small k
Heap sort O(n log n) O(1) n×heappop In-place, not cache-friendly
Merge k sorted lists O(n log k) O(k) heapq.merge() k = number of lists
⚠️
When NOT to Use a Heap

Heaps do not support fast arbitrary element search, update, or deletion. If you need to frequently update priorities of existing elements, use a Fibonacci Heap (O(1) decrease-key) or simply use a "lazy deletion" pattern with a visited set. For fully sorted iteration, a sorted list or BST is faster to query.


Section 11

Real-World Use Cases

🗓️
Task Schedulers
Min-Heap on deadline
OS schedulers and job queues use a min-heap keyed by priority or deadline. The next task to run is always the root — O(1) selection, O(log n) insertion.
✓ Powers cron, Celery, AWS SQS priority queues
🗺️
Dijkstra's Algorithm
Min-Heap on distance
Shortest-path algorithms store (distance, node) pairs in a min-heap. Always expanding the closest unvisited node reduces time to O((V+E) log V).
✓ Google Maps, network routing, game AI pathfinding
🏥
Hospital Triage
Max-Heap on severity
Patients are served by severity score, not arrival time. A max-heap ensures the most critical patient is always at the top, regardless of when they checked in.
✓ ER systems, event-driven simulations
📈
Stock Market Feed
Min/Max-Heap on price
Order books maintain best bid (max-heap) and best ask (min-heap). Matching engine always pops the best offer against the best bid in O(log n).
✓ High-frequency trading, limit order books
🔗
Merge K Sorted Lists
Min-Heap on first elements
Push the head of each list. Pop the min, emit it, push the next element from that list. O(n log k) vs O(nk) for naive approach.
✓ External sort, merge phase in MapReduce
📊
Running Percentiles
Two-Heap median / p90
Extend the two-heap median pattern with size ratios to compute any percentile. Monitoring systems (Prometheus, Datadog) use this for P50/P95/P99 latency.
✓ APM tools, SLA dashboards, sensor analytics

Section 12

Before / After — Replacing a Sorted List

❌ Before — Sorted List (Slow Inserts)
OperationCost
Insert taskO(n) — must shift
Get next taskO(1)
1000 inserts~500,000 ops
MemoryO(n)
✅ After — Min-Heap (Fast Everything)
OperationCost
Insert taskO(log n)
Get next taskO(log n) pop
1000 inserts~10,000 ops
MemoryO(n)

Section 13

Golden Rules

🏆 Heaps (Min/Max) — 6 Non-Negotiable Rules
1
Python's heapq is min-heap only. To simulate a max-heap, negate values on push and un-negate on pop. Never try to use heapq directly for max without negation — there is no max_heap flag or mode.
2
Use heapify() over n×heappush(). Building a heap from an existing list via heapq.heapify() is O(n). Pushing n items one at a time is O(n log n). For any list you already have, always heapify.
3
Peek is O(1) — never pop just to look. The minimum (or maximum) is always at heap[0]. Reading heap[0] is instant. Popping and pushing back is O(log n) and corrupts performance in tight loops.
4
Use (priority, counter, item) tuples for stability. If two items have equal priority and the item type is not comparable, Python will raise a TypeError. Add a monotonically-increasing counter as a tiebreaker: (priority, next(counter), item).
5
Use lazy deletion for "decrease key". heapq has no update-in-place operation. The standard pattern is to push the updated entry and mark the old one as invalid in a visited set or with a REMOVED sentinel. Skip stale entries when popping.
6
Choose a heap when you only need the extreme repeatedly. If you need the min/max repeatedly but not a fully sorted structure, a heap beats sorting (O(n log n)) and beats a sorted list for insertions (O(n)). If you need both ends equally, use two heaps. If you need random access, use a BST.
You have completed Data Structures (Core). View all sections →