B-Trees: Structure, Search, Insert & Delete Guide

Section 01

The Story That Explains B-Trees

📖 Real World Analogy

The University Library Index System

Imagine a university library with 10 million books. There are two ways to find a book:

Option A — Binary Search Tree approach: One giant alphabetical list on a single long scroll. To find any book, you walk half the scroll, then half again… It works, but each "fold" requires a separate trip to a different storage room. With 10 million books, that's up to 23 trips.

Option B — B-Tree approach: A hierarchical cabinet system. The top cabinet has 500 labels (A–Z plus ranges). Each drawer has sub-labels. You navigate 2–3 cabinets and you're done — regardless of whether the library has 10,000 or 10 billion books.

That's the magic of B-Trees. Each node holds many keys, so the tree stays very shallow even with billions of records. Fewer levels = fewer disk reads = blazing speed.

A B-Tree (Balanced Tree) is a self-balancing search tree designed specifically for systems that read and write large blocks of data — like databases and file systems. Unlike a binary tree where each node has at most 2 children, a B-Tree node can have hundreds of children, which keeps the tree height tiny.

🌳

The Core Insight

Disk I/O is the bottleneck in databases — reading one value from disk takes ~10ms, but reading 1000 sequential values from the same disk block takes almost the same time. B-Trees exploit this by packing many keys into one node (one disk block), minimising the number of disk reads to find any record.

Section 02

B-Tree vs Binary Search Tree — The Key Difference

🌲 Binary Search Tree (BST)

Property	Value
Keys per node	1
Children per node	0 or 2
Height (1M records)	~20 levels
Disk reads per lookup	~20
Balance guarantee	❌ No (standard)

🏛️ B-Tree (order 100)

Property	Value
Keys per node	1 to 199
Children per node	2 to 200
Height (1M records)	3–4 levels
Disk reads per lookup	3–4
Balance guarantee	✅ Always

Section 03

B-Tree Structure — Anatomy

🏗️ Formal Rules for a B-Tree of Order t (minimum degree)

Rule 1

Every node has at most 2t − 1 keys (the node is "full" when it has exactly 2t−1 keys)

Rule 2

Every non-root node has at least t − 1 keys. The root has at least 1 key

Rule 3

A non-leaf node with k keys has exactly k + 1 children

Rule 4

All leaves are at the same depth — this is what keeps the tree perfectly balanced

Rule 5

Keys within each node are stored in sorted (ascending) order

🎨 Animated B-Tree Structure Diagram (Order t=2, max 3 keys per node)

▲ Each node appears with a fade-in animation. Yellow = root, Blue = internal, Green = leaf. All leaf nodes sit at the same depth — this is the balance guarantee.

Section 04

Operation 1 — Search

📖 Story

Finding a Student in a School Register

The school office has a cabinet. The top drawer says "A–G, H–P, Q–Z". You want student "Maya". You open the H–P drawer. Inside it says "H–L, M–N, O–P". You open M–N. In seconds you have Maya's record — without reading every student's name. That's B-Tree search: navigate, compare keys, descend, repeat.

🔍 B-Tree Search Algorithm — Step by Step for key k

Step 1

Start at the root node. Load the node into memory (one disk read).

Step 2

Scan the node's keys from left to right. Find the smallest index i such that k ≤ keys[i].

Step 3

If keys[i] == k, the key is found — return the record.

Step 4

If the current node is a leaf and key was not found → key does not exist.

Step 5

Otherwise, descend into child[i] (the subtree between keys[i-1] and keys[i]) and repeat from Step 2.

🎬 Animated Search: Find key 40 in the B-Tree above

▲ Watch the search path animate: root → internal node (35,44) → leaf with 40. Only 3 disk reads total.

Python Code — B-Tree Search

class BTreeNode:
    def __init__(self, leaf=False):
        self.keys     = []       # sorted list of keys in this node
        self.children = []       # child node references (len = len(keys)+1 for internal)
        self.leaf     = leaf     # True if this is a leaf node


class BTree:
    def __init__(self, t):
        self.root = BTreeNode(leaf=True)  # tree starts as single empty leaf
        self.t    = t                        # minimum degree: each node has t-1..2t-1 keys

    # ─── SEARCH ───────────────────────────────────────────────
    def search(self, node, k):
        """Return (node, index) if k found, else None."""
        i = 0
        while i < len(node.keys) and k > node.keys[i]:
            i += 1                   # advance past keys smaller than k

        if i < len(node.keys) and node.keys[i] == k:
            return (node, i)         # ✅ key found at index i

        if node.leaf:
            return None              # ❌ leaf reached, key not in tree

        return self.search(node.children[i], k)  # recurse into child[i]

Complexity

Time: O(log n) — proportional to tree height Space: O(1) — just a few pointers, no extra memory needed Disk reads: O(log_t n) — e.g. with t=500, 1 billion keys → ≤ 4 reads

Section 05

Operation 2 — Insert

📖 Story

Adding a New Student to the School Register

The school has a rule: each drawer holds at most 3 student files. When you add a 4th student to a full drawer, the librarian splits the drawer in two and promotes the middle student's name up to the parent index. If the parent's section is also full — split that too. This might cascade all the way to the root. If even the root splits, a new root is created — the tree grows taller by exactly one level, maintaining balance.

➕ B-Tree Insert Algorithm (minimum degree t)

Step 1

If root is full (has 2t−1 keys): create new empty root, make old root its child, then split the old root. Tree height increases by 1.

Step 2

Walk down from the root toward the correct leaf. Before descending into any full child, split it proactively (so we never need to backtrack).

Step 3

When we reach a non-full leaf, insert the key in sorted order.

Split

Splitting a full child: the median key (index t−1) is promoted to the parent. Left child keeps keys[0..t-2]. Right child keeps keys[t..2t-2].

🎬 Animated Node Split: inserting key 13 into a full node [10, 20, 30]

Python Code — B-Tree Insert with Split

    # ─── INSERT PUBLIC ENTRY POINT ───────────────────────────
    def insert(self, k):
        root = self.root

        if len(root.keys) == (2 * self.t - 1):   # root is FULL → must split first
            new_root          = BTreeNode()
            new_root.children = [self.root]           # old root becomes child
            self._split_child(new_root, 0)             # split old root, promote median
            self.root         = new_root               # tree grows one level taller

        self._insert_non_full(self.root, k)

    # ─── INSERT INTO A GUARANTEED NON-FULL NODE ──────────────
    def _insert_non_full(self, node, k):
        i = len(node.keys) - 1                         # start from rightmost key

        if node.leaf:
            node.keys.append(None)                    # make room
            while i >= 0 and k < node.keys[i]:
                node.keys[i + 1] = node.keys[i]        # shift right to keep sorted order
                i -= 1
            node.keys[i + 1] = k                       # insert k in correct position
        else:
            while i >= 0 and k < node.keys[i]:
                i -= 1                                   # find correct child to descend
            i += 1
            if len(node.children[i].keys) == (2 * self.t - 1):
                self._split_child(node, i)              # proactively split full child
                if k > node.keys[i]:
                    i += 1                               # decide which of the 2 new children to use
            self._insert_non_full(node.children[i], k)  # recurse down

    # ─── SPLIT A FULL CHILD ──────────────────────────────────
    def _split_child(self, parent, i):
        t       = self.t
        full    = parent.children[i]                   # the child to split
        sibling = BTreeNode(leaf=full.leaf)

        # Median key rises to parent
        parent.keys.insert(i, full.keys[t - 1])       # promote median
        parent.children.insert(i + 1, sibling)

        # Right half goes to sibling
        sibling.keys = full.keys[t:]                   # keys after median → new right sibling
        full.keys    = full.keys[:t - 1]              # keys before median stay in full

        if not full.leaf:                               # split children pointers too
            sibling.children = full.children[t:]
            full.children    = full.children[:t]

⚡

Why Split Proactively on the Way Down?

Classical B-Tree insertion splits on the way back up (after finding the leaf). The proactive approach splits on the way down, so we only need a single pass and never backtrack. This is the algorithm used in real database engines like PostgreSQL and SQLite — it's more I/O efficient.

Section 06

Operation 3 — Delete

📖 Story

Removing a Student from the Register

Removing a student is trickier than adding one, because each drawer must have at least 1 file (t−1 minimum). If removing leaves a drawer too empty, you either borrow a file from a neighbouring drawer (rotation), or merge two drawers back into one (the reverse of a split). This may cascade upward, but the tree always stays balanced.

🗑️ B-Tree Delete — 3 Cases

Case 1

Key is in a leaf node with ≥ t keys → simply remove the key. No restructuring needed.

Case 2a

Key is in an internal node; left child has ≥ t keys → Replace key with its in-order predecessor (largest key in left child), then delete that predecessor from the left subtree.

Case 2b

Key is in an internal node; right child has ≥ t keys → Replace key with its in-order successor (smallest key in right child), then delete that successor from the right subtree.

Case 2c

Both children have only t−1 keys → Merge the key + right child into the left child. Left child now has 2t−1 keys. Delete the key from the merged node.

Case 3

Key not in current node; child has only t−1 keys → Ensure child has ≥ t keys by either rotating from a sibling (borrow) or merging with a sibling, then recurse.

🎬 Animated Case 2c: Merge when deleting key 50 (both children have t-1 keys)

Python Code — B-Tree Delete

    # ─── DELETE PUBLIC ENTRY POINT ───────────────────────────
    def delete(self, node, k):
        t = self.t
        i = 0
        while i < len(node.keys) and k > node.keys[i]:
            i += 1

        if i < len(node.keys) and node.keys[i] == k:
            # Key found in this node
            if node.leaf:
                node.keys.pop(i)                      # Case 1: simply remove from leaf
            elif len(node.children[i].keys) >= t:
                # Case 2a: replace with in-order predecessor
                pred = self._get_pred(node, i)
                node.keys[i] = pred
                self.delete(node.children[i], pred)
            elif len(node.children[i+1].keys) >= t:
                # Case 2b: replace with in-order successor
                succ = self._get_succ(node, i)
                node.keys[i] = succ
                self.delete(node.children[i+1], succ)
            else:
                # Case 2c: both children have t-1 keys → merge
                self._merge(node, i)
                self.delete(node.children[i], k)
        else:
            # Key not in this node — go deeper
            if node.leaf:
                return                                  # key not in tree
            if len(node.children[i].keys) < t:
                self._fill(node, i)                     # Case 3: ensure child has ≥ t keys
            # After fill, child may have merged — re-check index
            if i >= len(node.keys) and i > 0:
                self.delete(node.children[i-1], k)
            else:
                self.delete(node.children[i], k)

    def _get_pred(self, node, i):
        """Largest key in left subtree (in-order predecessor)."""
        curr = node.children[i]
        while not curr.leaf:
            curr = curr.children[-1]
        return curr.keys[-1]

    def _get_succ(self, node, i):
        """Smallest key in right subtree (in-order successor)."""
        curr = node.children[i + 1]
        while not curr.leaf:
            curr = curr.children[0]
        return curr.keys[0]

    def _merge(self, parent, i):
        """Merge child[i+1] into child[i]; drop key[i] from parent."""
        child   = parent.children[i]
        sibling = parent.children[i + 1]
        child.keys.append(parent.keys[i])         # parent key drops down
        child.keys.extend(sibling.keys)            # sibling keys join child
        child.children.extend(sibling.children)   # sibling children move too
        parent.keys.pop(i)                         # remove key from parent
        parent.children.pop(i + 1)               # remove sibling from parent

Section 07

Complete Interactive Demo — Build a B-Tree Live

The code below builds a B-Tree step by step and shows the tree structure after each insertion. Each line is annotated to show what it is doing:

# ══════════════════════════════════════════════════════════
#  FULL B-TREE DEMO — Build, Search, Delete
# ══════════════════════════════════════════════════════════

class BTreeNode:
    def __init__(self, leaf=False):
        self.keys     = []          # keys stored in ascending order
        self.children = []          # child pointers (len = len(keys)+1 for internal)
        self.leaf     = leaf        # True for leaf nodes


class BTree:
    def __init__(self, t=2):       # t=2 → classic 2-3-4 tree (max 3 keys per node)
        self.root = BTreeNode(leaf=True)
        self.t    = t

    def insert(self, k):
        root = self.root
        if len(root.keys) == 2 * self.t - 1:  # root full → grow tree
            s = BTreeNode()
            s.children = [self.root]
            self._split_child(s, 0)
            self.root = s
        self._insert_non_full(self.root, k)

    def _insert_non_full(self, node, k):
        i = len(node.keys) - 1
        if node.leaf:
            node.keys.append(None)
            while i >= 0 and k < node.keys[i]:
                node.keys[i+1] = node.keys[i]   # shift right
                i -= 1
            node.keys[i+1] = k
        else:
            while i >= 0 and k < node.keys[i]: i -= 1
            i += 1
            if len(node.children[i].keys) == 2*self.t-1:
                self._split_child(node, i)
                if k > node.keys[i]: i += 1
            self._insert_non_full(node.children[i], k)

    def _split_child(self, parent, i):
        t = self.t
        y = parent.children[i]      # full child to split
        z = BTreeNode(leaf=y.leaf)   # new right sibling
        parent.keys.insert(i, y.keys[t-1])   # promote median
        parent.children.insert(i+1, z)
        z.keys = y.keys[t:]          # right half to sibling
        y.keys = y.keys[:t-1]        # left half stays
        if not y.leaf:
            z.children = y.children[t:]
            y.children = y.children[:t]

    def search(self, node, k):
        i = 0
        while i < len(node.keys) and k > node.keys[i]: i += 1
        if i < len(node.keys) and node.keys[i] == k: return (node, i)
        if node.leaf: return None
        return self.search(node.children[i], k)

    def print_tree(self, node=None, indent=0):
        if node is None: node = self.root
        print("  " * indent + f"[{', '.join(map(str, node.keys))}]")
        for child in node.children:
            self.print_tree(child, indent + 1)


# ── Build and demonstrate ──────────────────────────────────
bt = BTree(t=2)   # minimum degree 2 → each node: 1 to 3 keys

keys = [10, 20, 5, 6, 12, 30, 7, 17]
for k in keys:
    bt.insert(k)
    print(f"After inserting {k}:")
    bt.print_tree()
    print()

# Search
result = bt.search(bt.root, 12)
print(f"Search 12: {'FOUND at node ' + str(result[0].keys) if result else 'NOT FOUND'}")

OUTPUT — Step-by-step tree growth

After inserting 10: [10] After inserting 20: [10, 20] After inserting 5: [5, 10, 20] After inserting 6: ← node was full! Split occurred [10] [5, 6] [20] After inserting 12: [10] [5, 6] [12, 20] After inserting 30: [10] [5, 6] [12, 20, 30] After inserting 7: ← split again! [10, 20] [5, 6, 7] [12] [30] After inserting 17: [10, 20] [5, 6, 7] [12, 17] [30] Search 12: FOUND at node [12, 17]

Section 08

Complexity Summary

Operation	Time Complexity	Disk I/Os	Space	Notes
Search	O(log n)	O(log_t n)	O(1)	With t=500 and n=10⁹: only 4–5 I/Os
Insert	O(log n)	O(log_t n)	O(1)	Splits add a constant overhead per level
Delete	O(log n)	O(log_t n)	O(1)	Merges/rotations bounded by tree height
Space	—	—	O(n)	Each key stored once; up to 50% utilisation
In-order traversal	O(n)	O(n/B)	O(h)	h = tree height; B = keys per block

Tree Height

h ≤ log_t((n+1)/2)

With t=1000 and n=10⁹: h ≤ 3. Three disk reads to find any record in a billion-row table.

Node capacity

t − 1 ≤ keys ≤ 2t − 1

Minimum degree t controls the fanout. Higher t = shallower tree = fewer I/Os but larger nodes.

Max keys total

n ≤ (2t)^h+1 − 1

Maximum keys a B-Tree of height h can hold, given minimum degree t.

Space utilisation

50% minimum

Every non-root node is at least half full (t−1 keys). Average utilisation ~69% in practice.

Section 09

B-Tree vs B+ Tree vs B* Tree

Variant	Data in internal nodes?	Leaf linking	Min fill	Best used in
B-Tree	✅ Yes	❌ No leaf chain	50%	General key-value stores; Btrfs filesystem
B+ Tree	❌ Internal nodes: keys only	✅ Leaves linked as a list	50%	MySQL InnoDB, PostgreSQL — range queries blazing fast
*B Tree**	✅ Yes	❌ No	67%	Where space utilisation matters — delays splits, redistributes first

💡

Why Databases Almost Always Use B+ Trees

In a B+ Tree, all actual data records live in the leaf nodes. Internal nodes only store separator keys for navigation. Because the leaves are linked in order (leaf₁ → leaf₂ → leaf₃ → …), a range query like WHERE age BETWEEN 20 AND 30 can find the start position in O(log n) and then scan forward linearly — no tree traversal needed for the range portion.

🌿 B+ Tree Leaf Layer — Linked List for Range Queries

Section 10

Real-World Use Cases

🗄️

Relational Databases

MySQL InnoDB and PostgreSQL use B+ Trees for all primary key and secondary indexes. Every SQL query that uses an index navigates a B+ Tree behind the scenes.

MySQL · PostgreSQL · Oracle · SQLite

💽

File Systems

NTFS (Windows), Btrfs (Linux), HFS+ (macOS) use B-Tree or B+ Tree structures to index file metadata — letting the OS find any file among millions in milliseconds.

NTFS · Btrfs · HFS+ · ext4

🔑

Key-Value Stores

LevelDB, RocksDB (used by Facebook, LinkedIn, Instagram) and WiredTiger (MongoDB's engine) all use B-Tree variants as their core storage structure.

RocksDB · LevelDB · WiredTiger · LMDB

📦

NoSQL Databases

MongoDB (WiredTiger engine uses B-Trees), CouchDB, and many others use B-Tree variants to index documents for fast lookup and range queries.

MongoDB · CouchDB · ArangoDB

🔍

Search Indexes

Full-text search inverted indexes (like Elasticsearch's Lucene underneath) use B-Tree structures for their term dictionaries — mapping words to postings lists.

Elasticsearch · Lucene · Solr

📊

Columnar Storage

Apache Parquet and ORC files use B-Tree-like row-group indexes to enable predicate pushdown — skipping entire file sections without reading them.

Parquet · ORC · BigQuery · Redshift

Section 11

Benefits of B-Trees

⚖️

Always Balanced

O(log n) guaranteed

Unlike a BST that can degrade to O(n) on sorted input, a B-Tree's self-balancing guarantees worst-case O(log n) for all operations, always.

✅ Predictable performance

💿

Disk I/O Optimised

Node size = disk page

Nodes are sized to match disk pages (typically 4KB–16KB). One disk read fetches an entire node. High fanout minimises the number of reads per lookup.

✅ 100–1000× fewer I/Os than BST

📈

Scales to Billions

log growth

A B-Tree with t=1000 and height 3 can hold over 10⁹ keys. Adding 100× more data only adds 1–2 levels. The tree grows logarithmically, not linearly.

✅ 3 disk reads for 1 billion records

🔄

Dynamic Structure

No rebuild needed

Unlike a sorted array or static index, a B-Tree handles insertions and deletions without a full rebuild. The tree reorganises itself locally and efficiently.

✅ Online updates with no downtime

🔍

Range Queries

In-order traversal

Because keys are stored in sorted order, range queries (find all keys between A and B) are naturally efficient. B+ Tree variant makes this even faster via linked leaves.

✅ Perfect for SQL BETWEEN, ORDER BY

🧵

Sequential Access

Cache-friendly

An in-order traversal of a B-Tree accesses disk pages sequentially. Sequential disk reads are 100× faster than random reads on HDDs — B-Trees exploit this.

✅ Sequential scans are fast

Section 12

Comparison Table — Data Structures for Search

Structure	Search	Insert	Delete	Range Query	Disk Friendly	Balanced?
Array (sorted)	O(log n)	O(n)	O(n)	O(log n + k)	Medium	N/A
Hash Table	O(1) avg	O(1) avg	O(1) avg	❌ No	Poor	N/A
BST (unbalanced)	O(n) worst	O(n) worst	O(n) worst	O(n) worst	Very Poor	❌
AVL / Red-Black	O(log n)	O(log n)	O(log n)	O(log n + k)	Poor (1 key/node)	✅
Skip List	O(log n) avg	O(log n) avg	O(log n) avg	O(log n + k)	Moderate	Probabilistic
B-Tree ✅	O(log n)	O(log n)	O(log n)	O(log n + k)	Excellent	✅ Always
B+ Tree ✅	O(log n)	O(log n)	O(log n)	O(log n + k)*	Best	✅ Always

* B+ Tree range scan: after O(log n) to find start, remaining scan is O(k) — sequential leaf traversal, no tree re-traversal.

Section 13

Golden Rules for B-Trees

🌳 B-Tree — Key Rules to Remember

Choose order t based on disk page size. For a 4KB disk page with 8-byte keys and 8-byte pointers: t ≈ 4096 / (8 + 8) ≈ 256. This gives an enormous fanout and a tree height of ≤ 4 for a billion records.

Use B+ Tree for databases, not plain B-Tree. B+ Tree stores all data in leaves, internal nodes are navigation only. This maximises fanout (more keys per internal node) and enables blazing-fast range scans via the leaf linked list.

Every split and merge is local. A split promotes the median key up one level. A merge absorbs a parent key and one sibling. Neither operation affects more than O(1) nodes per level — the cascade stops quickly.

The tree only grows from the root, never from the leaves. This is what guarantees all leaves are always at the same depth — the balance invariant is trivially maintained because height increases happen at the top.

Avoid random inserts when bulk-loading. If you have N sorted records to insert at once, use bulk loading (fill pages 69% full from left to right). This avoids O(N log N) splits and builds the tree in O(N) time with better space utilisation.

Keep a small buffer pool (cache) for root and upper nodes. In real databases, the root node is accessed for every single query. Caching the top 2–3 levels of the B-Tree in RAM effectively eliminates all but the last disk read per lookup.

🎯

The One Sentence Summary

A B-Tree is a self-balancing tree where each node holds many keys and pointers — keeping the tree very short, so any record in a billion-row database can be found with just 3–4 disk reads, and the tree stays perfectly balanced after every insert and delete without any extra effort.