What Is a Binary Tree?
That instant, one-direction decision is the entire power of a binary tree. Instead of scanning every seat in the hall (O(n)), the usher eliminates half the remaining seats at every single step (O(log n)). A 1,000-seat theatre is solved in at most 10 steps. A 1,000,000-seat arena — still only 20 steps.
A Binary Search Tree (BST) is precisely this: a tree where every node stores one value, has at most two children, and obeys the rule — left child < parent < right child.
Every binary tree is made of nodes. Each node holds a value and up to two
child pointers: left and right. The topmost node is the
root; nodes with no children are leaves.
Arrays give O(1) access but O(n) search. Hash maps give O(1) average search but no ordering. A balanced BST gives O(log n) search, insert, and delete — all three at once — while keeping elements in sorted order. This combination makes trees foundational in databases, file systems, compilers, and every autocomplete you've ever used.
Building the Node — The Atom of Every Tree
Before building a tree you need one thing: a Node class. Every node stores
its value and references to its left and right children (both default to None).
That's it — no magic, no complexity.
# ── The building block of every binary tree ────────────────
class Node:
def __init__(self, val):
self.val = val # the data stored in this node
self.left = None # pointer to left child
self.right = None # pointer to right child
# Manually build a small tree:
# 10
# / \
# 5 15
# / \ \
# 3 7 20
root = Node(10)
root.left = Node(5)
root.right = Node(15)
root.left.left = Node(3)
root.left.right = Node(7)
root.right.right = Node(20)
print(root.val, root.left.val, root.right.val)
The Node class has exactly three fields. All tree behaviour — insertion, search,
traversal — lives in the BST class that orchestrates these nodes.
Keeping them separate is a clean design pattern you'll see in production codebases.
Insert & Search — The Core BST Operations
Both operations follow the same logic: compare the target value against the current node, then
recurse left or right. Insert stops when it finds a None slot; search stops when it
finds the value or hits None (not found).
class BST:
def __init__(self):
self.root = None
# ── INSERT ────────────────────────────────────────────────
def insert(self, val):
self.root = self._insert(self.root, val)
def _insert(self, node, val):
if node is None: # empty slot → place node here
return Node(val)
if val < node.val:
node.left = self._insert(node.left, val)
elif val > node.val:
node.right = self._insert(node.right, val)
# if val == node.val: ignore duplicate
return node
# ── SEARCH ────────────────────────────────────────────────
def search(self, val):
return self._search(self.root, val)
def _search(self, node, val):
if node is None: return False
if val == node.val: return True
if val < node.val:
return self._search(node.left, val)
return self._search(node.right, val)
# ── Usage ──────────────────────────────────────────────────
bst = BST()
for v in [10, 5, 15, 3, 7, 20]:
bst.insert(v)
print(bst.search(7)) # True
print(bst.search(99)) # False
At each node, exactly one branch is explored. With a balanced tree of 1,000,000 nodes, search completes in at most 20 comparisons — because log₂(1,000,000) ≈ 20. This is why trees underpin every database index you have ever benefited from.
Tree Traversals — Three Ways to Walk a Tree
Traversal means visiting every node exactly once. The order you visit them determines what you get. All three classic traversals are DFS (Depth-First Search) and differ only in when you process the current node relative to its children.
def inorder(node, result=[]):
if node is None: return
inorder(node.left, result) # 1. recurse LEFT
result.append(node.val) # 2. PROCESS root ← key line
inorder(node.right, result) # 3. recurse RIGHT
return result
def preorder(node, result=[]):
if node is None: return
result.append(node.val) # 1. PROCESS root first
preorder(node.left, result)
preorder(node.right, result)
return result
def postorder(node, result=[]):
if node is None: return
postorder(node.left, result)
postorder(node.right, result)
result.append(node.val) # 1. PROCESS root LAST
return result
bst = BST()
for v in [10, 5, 15, 3, 7, 20]: bst.insert(v)
print("In-Order :", inorder(bst.root, []))
print("Pre-Order :", preorder(bst.root, []))
print("Post-Order:", postorder(bst.root, []))
The result=[] default argument is a Python gotcha: the same list persists
across calls. In production always call with an explicit empty list:
inorder(root, []) — or use result=None and initialise inside
the function body with if result is None: result = [].
Live Animation — Watch Each Operation Step by Step
Select a phase below, then step through the code line-by-line. The highlighted line on the left matches the node being processed in the tree on the right.
The amber highlight marks the line currently executing. Lines that have already run appear dimmed. In the tree, amber nodes are actively being compared; green nodes are confirmed visited/found.
Height, Balance & the Degenerate Tree Problem
A tree's height is the number of edges on the longest root-to-leaf path.
Height determines worst-case complexity — it's the maximum number of comparisons any operation
will ever make. An ideal balanced tree has height ⌊log₂ n⌋; a degenerate (sorted
insertion) tree degrades to height n − 1 — as slow as a linked list.
| Insert order | Shape |
|---|---|
| 1, 2, 3, 4, 5 | Right-skewed chain |
| Height | 4 (= n−1) |
| Search | O(n) |
| Space | O(n) stack depth |
| Insert order | Shape |
|---|---|
| 3, 1, 4, 2, 5 | Wide, balanced shape |
| Height | 2 (≈ log₂ n) |
| Search | O(log n) |
| Space | O(log n) stack depth |
def height(node):
if node is None: return -1 # empty tree height = -1
return 1 + max(height(node.left), height(node.right))
def is_balanced(node):
if node is None: return True
lh = height(node.left)
rh = height(node.right)
balanced_here = abs(lh - rh) <= 1
return balanced_here and is_balanced(node.left) and is_balanced(node.right)
# Balanced tree [10, 5, 15, 3, 7, 20]
bst_good = BST()
for v in [10, 5, 15, 3, 7, 20]: bst_good.insert(v)
# Degenerate tree — sorted insertion
bst_bad = BST()
for v in [1, 2, 3, 4, 5]: bst_bad.insert(v)
print(f"Good tree — height: {height(bst_good.root)}, balanced: {is_balanced(bst_good.root)}")
print(f"Bad tree — height: {height(bst_bad.root)}, balanced: {is_balanced(bst_bad.root)}")
Inserting an already-sorted list into a plain BST creates a linked list in disguise.
If your data might arrive sorted, use a self-balancing tree (AVL, Red-Black)
or shuffle before insertion. Python's sortedcontainers.SortedList handles this
transparently in production code.
Level-Order Traversal — Breadth-First Search
All three traversals above are DFS (using the call stack). Level-order traversal visits nodes row-by-row using an explicit queue. It's the foundation of BFS and essential for problems like "find the shortest path" or "print the tree level by level."
queue = [root]. The queue is the engine — it guarantees we always process a full level before descending.[10, 5, 15, 3, 7, 20] for our example tree.from collections import deque
def level_order(root):
if root is None: return []
result, queue = [], deque([root]) # deque = efficient queue
while queue:
node = queue.popleft() # O(1) pop from front
result.append(node.val)
if node.left: queue.append(node.left)
if node.right: queue.append(node.right)
return result
def level_order_grouped(root):
# Returns a list of lists — one per level
if root is None: return []
result, queue = [], deque([root])
while queue:
level_size = len(queue) # snapshot current level width
level = []
for _ in range(level_size):
node = queue.popleft()
level.append(node.val)
if node.left: queue.append(node.left)
if node.right: queue.append(node.right)
result.append(level)
return result
bst = BST()
for v in [10, 5, 15, 3, 7, 20]: bst.insert(v)
print("Level-order flat :", level_order(bst.root))
print("Level-order grouped:", level_order_grouped(bst.root))
Python's list.pop(0) is O(n) — it shifts every remaining
element. collections.deque.popleft() is O(1). For large trees,
this difference is the boundary between a correct solution and a TLE on a coding interview.
Time & Space Complexity — Full Reference
| Operation | Balanced BST | Degenerate BST | AVL / Red-Black | Space (call stack) | Guaranteed? |
|---|---|---|---|---|---|
| Insert | O(log n) | O(n) | O(log n) | O(h) | Avg only |
| Search | O(log n) | O(n) | O(log n) | O(h) | Avg only |
| Delete | O(log n) | O(n) | O(log n) | O(h) | Avg only |
| In-Order Traversal | O(n) | O(n) | O(n) | O(h) | Always |
| Level-Order (BFS) | O(n) | O(n) | O(n) | O(w)* | Always |
| Height | O(log n) | O(n) | O(log n) | O(h) | Avg only |
| Min / Max | O(log n) | O(n) | O(log n) | O(h) | Avg only |
h = tree height; w = max level width (BFS queue space). For a balanced tree h = ⌊log₂n⌋; worst case h = n−1.
A plain BST's complexity is average-case O(log n), not guaranteed.
Use an AVL tree or Red-Black tree (Python: sortedcontainers.SortedList)
when you need worst-case O(log n) guarantees — for example, in real-time systems or
when input order is adversarial.
Real-World Use Cases
INDEX you create stores data in a B-tree variant. Range queries (WHERE age BETWEEN 20 AND 30) are in-order traversals over disk blocks.os.walk() in Python is a post-order traversal — it processes the deepest nested folders before their parents.document.querySelector() is a DFS search; innerHTML renders via a pre-order traversal that writes parent tags before children.heapq is a binary tree stored as an array. Dijkstra's algorithm, A*, and task schedulers all depend on O(log n) heap operations.Golden Rules
None base case first.
Every recursive tree function must start with if node is None: return ….
Forgetting this causes AttributeError on empty trees or after reaching leaves — the
single most common binary tree bug in production and in interviews.
deque for BFS, never list.pop(0).
Python's list pop-from-front is O(n). On a tree with 10,000 nodes, that turns an O(n)
BFS into an O(n²) crawl. Import collections.deque and use
popleft() for all queue-based tree problems.
sortedcontainers.SortedList which maintains O(log n) always.
is_balanced function that computes height naively inside a loop runs
O(n²) — use a combined helper that returns height and balance status in one pass.
def inorder(node, result=[]) reuses the same list across all calls in a
Python session — a silent data corruption bug. Always pass an empty list explicitly
or initialise inside the function body with if result is None: result = [].