Data Structure 📂 Tree in Data Structure · 1 of 7 46 min read

Trees - Binary Tree: Structure, Traversals & Complexity

A complete embeddable tutorial for a Binary Search Tree covering node structure, insertion, search, in-order / pre-order / post-order / level-order traversals, height & balance analysis, time complexity, and real-world use cases — with a fully interactive step-by-step animation panel that highlights each code line alongside the live tree.

Section 01

What Is a Binary Tree?

The Cinema Seat Booking System
Picture a cinema with a seating chart organised by row number. The usher starts at the centre seat (the root). Every seat to its left has a smaller number; every seat to its right has a larger number. To find seat 42, the usher looks at the current seat, says "is 42 smaller or larger?" and walks in exactly one direction — never backtracking.

That instant, one-direction decision is the entire power of a binary tree. Instead of scanning every seat in the hall (O(n)), the usher eliminates half the remaining seats at every single step (O(log n)). A 1,000-seat theatre is solved in at most 10 steps. A 1,000,000-seat arena — still only 20 steps.

A Binary Search Tree (BST) is precisely this: a tree where every node stores one value, has at most two children, and obeys the rule — left child < parent < right child.

Every binary tree is made of nodes. Each node holds a value and up to two child pointers: left and right. The topmost node is the root; nodes with no children are leaves.

🌳
Why Bother with Trees?

Arrays give O(1) access but O(n) search. Hash maps give O(1) average search but no ordering. A balanced BST gives O(log n) search, insert, and delete — all three at once — while keeping elements in sorted order. This combination makes trees foundational in databases, file systems, compilers, and every autocomplete you've ever used.


Section 02

Building the Node — The Atom of Every Tree

Before building a tree you need one thing: a Node class. Every node stores its value and references to its left and right children (both default to None). That's it — no magic, no complexity.

# ── The building block of every binary tree ────────────────
class Node:
    def __init__(self, val):
        self.val   = val       # the data stored in this node
        self.left  = None     # pointer to left child
        self.right = None     # pointer to right child

# Manually build a small tree:
#        10
#       /  \
#      5    15
#     / \     \
#    3   7    20
root       = Node(10)
root.left  = Node(5)
root.right = Node(15)
root.left.left  = Node(3)
root.left.right = Node(7)
root.right.right = Node(20)

print(root.val, root.left.val, root.right.val)
OUTPUT
10 5 15
💡
Node Is Just a Container

The Node class has exactly three fields. All tree behaviour — insertion, search, traversal — lives in the BST class that orchestrates these nodes. Keeping them separate is a clean design pattern you'll see in production codebases.


Section 03

Insert & Search — The Core BST Operations

Both operations follow the same logic: compare the target value against the current node, then recurse left or right. Insert stops when it finds a None slot; search stops when it finds the value or hits None (not found).

class BST:
    def __init__(self):
        self.root = None

    # ── INSERT ────────────────────────────────────────────────
    def insert(self, val):
        self.root = self._insert(self.root, val)

    def _insert(self, node, val):
        if node is None:          # empty slot → place node here
            return Node(val)
        if val < node.val:
            node.left  = self._insert(node.left,  val)
        elif val > node.val:
            node.right = self._insert(node.right, val)
        # if val == node.val: ignore duplicate
        return node

    # ── SEARCH ────────────────────────────────────────────────
    def search(self, val):
        return self._search(self.root, val)

    def _search(self, node, val):
        if node is None: return False
        if val == node.val: return True
        if val < node.val:
            return self._search(node.left,  val)
        return self._search(node.right, val)

# ── Usage ──────────────────────────────────────────────────
bst = BST()
for v in [10, 5, 15, 3, 7, 20]:
    bst.insert(v)

print(bst.search(7))    # True
print(bst.search(99))   # False
OUTPUT
True False
Every Recursive Call Halves the Problem

At each node, exactly one branch is explored. With a balanced tree of 1,000,000 nodes, search completes in at most 20 comparisons — because log₂(1,000,000) ≈ 20. This is why trees underpin every database index you have ever benefited from.


Section 04

Tree Traversals — Three Ways to Walk a Tree

Traversal means visiting every node exactly once. The order you visit them determines what you get. All three classic traversals are DFS (Depth-First Search) and differ only in when you process the current node relative to its children.

⬅️
In-Order (L → Root → R)
Visits nodes in sorted ascending order. The go-to for BST — produces a sorted list with zero extra work.
Result on [10,5,15,3,7,20] → 3 5 7 10 15 20
⬆️
Pre-Order (Root → L → R)
Processes the root before children. Useful for copying a tree or serialising its structure.
Result → 10 5 3 7 15 20
⬇️
Post-Order (L → R → Root)
Processes the root after both children. Essential for deleting a tree (children first, then parent).
Result → 3 7 5 20 15 10
def inorder(node, result=[]):
    if node is None: return
    inorder(node.left,  result)   # 1. recurse LEFT
    result.append(node.val)       # 2. PROCESS root ← key line
    inorder(node.right, result)   # 3. recurse RIGHT
    return result

def preorder(node, result=[]):
    if node is None: return
    result.append(node.val)       # 1. PROCESS root first
    preorder(node.left,  result)
    preorder(node.right, result)
    return result

def postorder(node, result=[]):
    if node is None: return
    postorder(node.left,  result)
    postorder(node.right, result)
    result.append(node.val)       # 1. PROCESS root LAST
    return result

bst = BST()
for v in [10, 5, 15, 3, 7, 20]: bst.insert(v)

print("In-Order  :", inorder(bst.root,   []))
print("Pre-Order :", preorder(bst.root,  []))
print("Post-Order:", postorder(bst.root, []))
OUTPUT
In-Order : [3, 5, 7, 10, 15, 20] Pre-Order : [10, 5, 3, 7, 15, 20] Post-Order: [3, 7, 5, 20, 15, 10]
⚠️
Mutable Default Argument Trap

The result=[] default argument is a Python gotcha: the same list persists across calls. In production always call with an explicit empty list: inorder(root, []) — or use result=None and initialise inside the function body with if result is None: result = [].


Section 05

Live Animation — Watch Each Operation Step by Step

Select a phase below, then step through the code line-by-line. The highlighted line on the left matches the node being processed in the tree on the right.

Phase: Insert
Step 0 / 0
Press ▶ Next to begin.
10 5 15 3 7 20
🔍
Reading the Animation

The amber highlight marks the line currently executing. Lines that have already run appear dimmed. In the tree, amber nodes are actively being compared; green nodes are confirmed visited/found.


Section 06

Height, Balance & the Degenerate Tree Problem

A tree's height is the number of edges on the longest root-to-leaf path. Height determines worst-case complexity — it's the maximum number of comparisons any operation will ever make. An ideal balanced tree has height ⌊log₂ n⌋; a degenerate (sorted insertion) tree degrades to height n − 1 — as slow as a linked list.

⚠ Degenerate BST — Sorted Insert
Insert orderShape
1, 2, 3, 4, 5Right-skewed chain
Height4 (= n−1)
SearchO(n)
SpaceO(n) stack depth
✅ Balanced BST — Random/Shuffled Insert
Insert orderShape
3, 1, 4, 2, 5Wide, balanced shape
Height2 (≈ log₂ n)
SearchO(log n)
SpaceO(log n) stack depth
def height(node):
    if node is None: return -1         # empty tree height = -1
    return 1 + max(height(node.left), height(node.right))

def is_balanced(node):
    if node is None: return True
    lh = height(node.left)
    rh = height(node.right)
    balanced_here = abs(lh - rh) <= 1
    return balanced_here and is_balanced(node.left) and is_balanced(node.right)

# Balanced tree [10, 5, 15, 3, 7, 20]
bst_good = BST()
for v in [10, 5, 15, 3, 7, 20]: bst_good.insert(v)

# Degenerate tree — sorted insertion
bst_bad = BST()
for v in [1, 2, 3, 4, 5]: bst_bad.insert(v)

print(f"Good tree — height: {height(bst_good.root)}, balanced: {is_balanced(bst_good.root)}")
print(f"Bad  tree — height: {height(bst_bad.root)},  balanced: {is_balanced(bst_bad.root)}")
OUTPUT
Good tree — height: 2, balanced: True Bad tree — height: 4, balanced: False
⚠️
Always Check for Degenerate Input

Inserting an already-sorted list into a plain BST creates a linked list in disguise. If your data might arrive sorted, use a self-balancing tree (AVL, Red-Black) or shuffle before insertion. Python's sortedcontainers.SortedList handles this transparently in production code.


Section 07

Level-Order Traversal — Breadth-First Search

All three traversals above are DFS (using the call stack). Level-order traversal visits nodes row-by-row using an explicit queue. It's the foundation of BFS and essential for problems like "find the shortest path" or "print the tree level by level."

01
Enqueue root
Start with queue = [root]. The queue is the engine — it guarantees we always process a full level before descending.
02
Dequeue, process, enqueue children
Pop the front node, record its value, then push its left and right children (if they exist) to the back of the queue.
03
Repeat until queue is empty
When the queue empties, every node has been visited exactly once in level order. No recursion, no stack — just a queue loop.
04
Return result
The result list holds all values in breadth-first order: [10, 5, 15, 3, 7, 20] for our example tree.
from collections import deque

def level_order(root):
    if root is None: return []
    result, queue = [], deque([root])   # deque = efficient queue
    while queue:
        node = queue.popleft()           # O(1) pop from front
        result.append(node.val)
        if node.left:  queue.append(node.left)
        if node.right: queue.append(node.right)
    return result

def level_order_grouped(root):
    # Returns a list of lists — one per level
    if root is None: return []
    result, queue = [], deque([root])
    while queue:
        level_size = len(queue)             # snapshot current level width
        level = []
        for _ in range(level_size):
            node = queue.popleft()
            level.append(node.val)
            if node.left:  queue.append(node.left)
            if node.right: queue.append(node.right)
        result.append(level)
    return result

bst = BST()
for v in [10, 5, 15, 3, 7, 20]: bst.insert(v)

print("Level-order flat  :", level_order(bst.root))
print("Level-order grouped:", level_order_grouped(bst.root))
OUTPUT
Level-order flat : [10, 5, 15, 3, 7, 20] Level-order grouped: [[10], [5, 15], [3, 7, 20]]
Always Use deque, Never list, as a Queue

Python's list.pop(0) is O(n) — it shifts every remaining element. collections.deque.popleft() is O(1). For large trees, this difference is the boundary between a correct solution and a TLE on a coding interview.


Section 08

Time & Space Complexity — Full Reference

Operation Balanced BST Degenerate BST AVL / Red-Black Space (call stack) Guaranteed?
Insert O(log n) O(n) O(log n) O(h) Avg only
Search O(log n) O(n) O(log n) O(h) Avg only
Delete O(log n) O(n) O(log n) O(h) Avg only
In-Order Traversal O(n) O(n) O(n) O(h) Always
Level-Order (BFS) O(n) O(n) O(n) O(w)* Always
Height O(log n) O(n) O(log n) O(h) Avg only
Min / Max O(log n) O(n) O(log n) O(h) Avg only

h = tree height; w = max level width (BFS queue space). For a balanced tree h = ⌊log₂n⌋; worst case h = n−1.

📊
O(log n) Is Only Guaranteed with Balance

A plain BST's complexity is average-case O(log n), not guaranteed. Use an AVL tree or Red-Black tree (Python: sortedcontainers.SortedList) when you need worst-case O(log n) guarantees — for example, in real-time systems or when input order is adversarial.


Section 09

Real-World Use Cases

🗄️
Database Indexes
B-Tree / B+ Tree
Every SQL INDEX you create stores data in a B-tree variant. Range queries (WHERE age BETWEEN 20 AND 30) are in-order traversals over disk blocks.
🔎
Autocomplete & Spell Check
Trie (Prefix Tree)
Every character you type narrows the search to a subtree. A trie with 100,000 words returns all completions for "pyt" in microseconds by descending three levels.
📁
File Systems
Directory Tree
Your hard drive's directory structure is a tree. os.walk() in Python is a post-order traversal — it processes the deepest nested folders before their parents.
🖥️
HTML / DOM
Document Object Model
Every webpage is a tree. document.querySelector() is a DFS search; innerHTML renders via a pre-order traversal that writes parent tags before children.
📦
Priority Queues
Binary Heap
Python's heapq is a binary tree stored as an array. Dijkstra's algorithm, A*, and task schedulers all depend on O(log n) heap operations.
🤖
Machine Learning
Decision Tree / Random Forest
Each prediction traverses a binary tree from root to leaf — one comparison per node. A forest of 500 trees is 500 simultaneous tree traversals combined by majority vote.

Section 10

Golden Rules

🌳 Binary Trees — 6 Non-Negotiable Rules
1
Always handle the None base case first. Every recursive tree function must start with if node is None: return …. Forgetting this causes AttributeError on empty trees or after reaching leaves — the single most common binary tree bug in production and in interviews.
2
Use deque for BFS, never list.pop(0). Python's list pop-from-front is O(n). On a tree with 10,000 nodes, that turns an O(n) BFS into an O(n²) crawl. Import collections.deque and use popleft() for all queue-based tree problems.
3
Never insert sorted data into a plain BST. Sequential insertion (1, 2, 3, 4 …) degrades the tree to a linked list — O(n) for every operation. Either shuffle before inserting, or use a self-balancing structure like sortedcontainers.SortedList which maintains O(log n) always.
4
Know which traversal you need before writing a single line. In-order → sorted output from BST. Pre-order → copy/serialise a tree. Post-order → delete/free memory. Level-order → shortest paths, level-by-level output. Confusing them produces silent wrong answers, not exceptions.
5
Track height, not just depth. Height is measured from a node upward to the farthest leaf. A tree's height is the single number that controls all operation complexity. Any is_balanced function that computes height naively inside a loop runs O(n²) — use a combined helper that returns height and balance status in one pass.
6
Avoid mutable default arguments in recursive tree functions. def inorder(node, result=[]) reuses the same list across all calls in a Python session — a silent data corruption bug. Always pass an empty list explicitly or initialise inside the function body with if result is None: result = [].