What Is a Graph?
Now think about your phone contacts. Every person is a point. Every friendship or follow between two people is a link. Instagram, Twitter, LinkedIn — all graphs.
A graph is simply a way to model anything that has things and relationships between those things. Almost every hard computer science problem — routing, scheduling, recommendation engines, compilers, AI pathfinding — reduces to a graph problem underneath.
Formally, a graph G = (V, E) is a pair of two sets: V (a set of vertices / nodes) and E (a set of edges / connections). That is the entire definition. Everything else is built on top of it.
Types of Graphs — A Visual Reference
Graphs come in many flavours. The type of graph determines which algorithms you can use and what guarantees they provide. Misidentifying your graph type is the #1 source of algorithm bugs.
Graph Vocabulary — Every Term You'll Encounter
Every data structure tree you know — binary tree, BST, AVL tree, heap, trie — is a connected, undirected, acyclic graph with exactly V−1 edges. BFS and DFS work on all of them. Level-order traversal of a BST is just BFS. Pre/in/post-order traversal is just DFS. The graph algorithms you learn here apply directly to every tree problem you'll ever encounter.
Representing Graphs in Python
Before you can run BFS or DFS, you need to store the graph in memory. The two main approaches — adjacency list and adjacency matrix — have wildly different performance characteristics.
# ══════════════════════════════════════════════════════════════
# METHOD 1 — Adjacency List (dict of lists) ← use this 99% of time
# ══════════════════════════════════════════════════════════════
# Undirected graph: A—B, A—C, B—D, C—D, D—E
undirected = {
'A': ['B', 'C'],
'B': ['A', 'D'],
'C': ['A', 'D'],
'D': ['B', 'C', 'E'],
'E': ['D'],
}
# Directed graph: A→B, A→C, B→D, C→D
directed = {
'A': ['B', 'C'], # A points TO B and C
'B': ['D'], # B points TO D only
'C': ['D'],
'D': [], # D has no outgoing edges
}
# Weighted graph: edge (A,B) has weight 4, (A,C) has weight 7
weighted = {
'A': [('B', 4), ('C', 7)],
'B': [('A', 4), ('D', 2)],
'C': [('A', 7), ('D', 9)],
'D': [('B', 2), ('C', 9)],
}
# ══════════════════════════════════════════════════════════════
# METHOD 2 — Adjacency Matrix (2D list) ← use for dense graphs
# ══════════════════════════════════════════════════════════════
# Nodes indexed: A=0, B=1, C=2, D=3
# matrix[i][j] = 1 if edge (i,j) exists, else 0
matrix = [
[0, 1, 1, 0], # A: connected to B, C
[1, 0, 0, 1], # B: connected to A, D
[1, 0, 0, 1], # C: connected to A, D
[0, 1, 1, 0], # D: connected to B, C
]
# ── Helper: build graph from edge list (common in interviews) ──
def build_graph(n, edges, directed=False):
"""Build adjacency list from edge list.
n: number of nodes (0-indexed)
edges: list of [u, v] pairs
"""
graph = {i: [] for i in range(n)}
for u, v in edges:
graph[u].append(v)
if not directed:
graph[v].append(u) # add reverse for undirected
return graph
# Usage:
g = build_graph(5, [[0,1],[0,2],[1,3],[2,3],[3,4]])
print(g)
# {0:[1,2], 1:[0,3], 2:[0,3], 3:[1,2,4], 4:[3]}
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Space | O(V + E) | O(V²) |
| Check edge (u,v) exists | O(degree(u)) | O(1) |
| Iterate all neighbours of u | O(degree(u)) | O(V) |
| Add a node | O(1) | O(V²) rebuild |
| BFS / DFS performance | O(V+E) — optimal | O(V²) — scans all |
| Best for | Sparse (real world) | Dense or tiny graphs |
A graph with 1,000,000 nodes and 2,000,000 edges (very common in social networks) needs only ~16 MB as an adjacency list. The same graph as an adjacency matrix needs 8 TB — completely impossible. Real-world graphs are almost always sparse. In coding interviews, default to adjacency list unless the problem explicitly says "dense graph".
The Story That Explains Graph Traversal
Strategy A — BFS (Breadth-First): You check every seat in Row 1, then every seat in Row 2, then Row 3 — sweeping across the entire cinema level by level. Your friend is close to the entrance? You find them immediately. Far back? You visit every closer seat first.
Strategy B — DFS (Depth-First): You pick an aisle and sprint all the way to the back of the cinema, checking column by column as deep as possible before backtracking to the next aisle. Your friend is in the back corner? You find them fast. Near the front? You might walk the entire cinema first.
Neither strategy is wrong — they are optimised for different goals. BFS finds the shortest path. DFS explores the deepest structure.
BFS and DFS are the two fundamental algorithms for visiting every node in a graph — the backbone of hundreds of real-world applications. Both run in O(V + E) time. The difference is entirely in their order of exploration — which comes down to one data structure choice: queue vs stack.
Live Animation — BFS vs DFS on the Same Graph
Watch BFS sweep level-by-level and DFS dive deep before backtracking — on the identical 9-node graph. Press Step → to go one node at a time, or Auto Play to watch the full traversal.
Yellow nodes are discovered but not yet processed (in queue/stack). Blue = BFS visited. Purple = DFS visited. Green = currently being processed. BFS visits A → B → C → D → E → F → G → H → I (sweeps levels). DFS visits A → B → D → H → E → C → F → I → G (dives deep first).
BFS — Breadth-First Search
BFS uses a queue (FIFO). When you visit a node, you enqueue all its unvisited neighbours. Because the queue processes nodes in the order they were discovered, you always finish all nodes at distance d before visiting any node at distance d+1.
Queue: [A] · Visited: {A}Queue: [B, C]Queue: [C, D, E] — C still waiting from level 1!Queue: [D, E, F, G]from collections import deque
def bfs(graph: dict, start) -> list:
"""
Breadth-First Search — returns nodes in visit order.
graph: adjacency list {node: [neighbour, ...]}
"""
visited = {start} # mark on ENQUEUE, not dequeue
queue = deque([start]) # O(1) popleft — critical!
order = []
while queue:
node = queue.popleft() # FIFO — front of queue
order.append(node)
for neighbour in graph[node]:
if neighbour not in visited:
visited.add(neighbour)
queue.append(neighbour) # enqueue at BACK
return order
graph = {'A':['B','C'],'B':['A','D','E'],'C':['A','F','G'],'D':['B','H'],'E':['B','H'],'F':['C','I'],'G':['C','I'],'H':['D','E'],'I':['F','G']}
print("BFS order:", " → ".join(bfs(graph, 'A')))
list.pop(0) is O(n) — it shifts every remaining element left.
collections.deque.popleft() is O(1).
On a 10,000-node graph, using a list as a queue makes BFS run in O(V²) instead of O(V+E).
This single mistake turns a fast algorithm into an impossibly slow one.
BFS Superpower — Shortest Path on Unweighted Graphs
BFS guarantees the shortest path (fewest hops) between two nodes on an unweighted graph, because it explores all paths of length 1 before length 2, and so on. This is the algorithm behind social "degrees of separation", web crawl depth limits, and game pathfinding.
from collections import deque
def bfs_shortest_path(graph, start, end):
"""Returns shortest path list from start→end, or [] if unreachable."""
if start == end: return [start]
visited = {start}
queue = deque([(start, [start])]) # (node, path_so_far)
while queue:
node, path = queue.popleft()
for nb in graph[node]:
if nb not in visited:
new_path = path + [nb]
if nb == end: return new_path # ← guaranteed shortest
visited.add(nb)
queue.append((nb, new_path))
return [] # unreachable
def bfs_distances(graph, start):
"""Returns dict of shortest distances from start to every node."""
dist = {start: 0}
queue = deque([start])
while queue:
node = queue.popleft()
for nb in graph[node]:
if nb not in dist:
dist[nb] = dist[node] + 1 # one hop further
queue.append(nb)
return dist
path = bfs_shortest_path(graph, 'A', 'H')
print(f"Shortest path A→H: {' → '.join(path)} ({len(path)-1} hops)")
distances = bfs_distances(graph, 'A')
print("Distances from A:", distances)
BFS finds the path with the fewest edges, not the lowest total weight. If edges have costs (road distances, prices), a path with 2 expensive edges can be worse than a path with 5 cheap edges. Use Dijkstra's algorithm (priority queue instead of plain queue) for weighted shortest paths.
DFS — Depth-First Search
DFS uses a stack (LIFO). You push a node, pop it, visit it, and immediately push all its unvisited neighbours. The last pushed neighbour is the next one visited — so you always go deeper before going wider.
# ── Iterative DFS ─────────────────────────────────────────────
def dfs_iterative(graph, start):
visited = set()
stack = [start]
order = []
while stack:
node = stack.pop() # LIFO — last in, first out
if node in visited: continue
visited.add(node)
order.append(node)
for nb in reversed(graph[node]): # reverse → natural order
if nb not in visited:
stack.append(nb)
return order
# ── Recursive DFS ──────────────────────────────────────────────
def dfs_recursive(graph, node, visited=None, order=None):
if visited is None: visited = set()
if order is None: order = []
visited.add(node)
order.append(node) # PRE-ORDER: process before recursing
for nb in graph[node]:
if nb not in visited:
dfs_recursive(graph, nb, visited, order)
return order
# POST-ORDER: move order.append(node) to HERE (after the for loop)
print("Iterative:", dfs_iterative(graph, 'A'))
print("Recursive:", dfs_recursive(graph, 'A'))
DFS Power Applications
# ══════════════════════════════════════════════════════════════
# 1. CYCLE DETECTION — directed graph
# ══════════════════════════════════════════════════════════════
def has_cycle_directed(graph):
"""True if directed graph contains a cycle (back-edge detection)."""
visited = set()
rec_stack = set() # nodes in CURRENT DFS path
def dfs(node):
visited.add(node); rec_stack.add(node)
for nb in graph.get(node, []):
if nb not in visited:
if dfs(nb): return True
elif nb in rec_stack:
return True # back-edge = cycle
rec_stack.discard(node)
return False
return any(dfs(n) for n in graph if n not in visited)
# ══════════════════════════════════════════════════════════════
# 2. TOPOLOGICAL SORT — DFS post-order on a DAG
# ══════════════════════════════════════════════════════════════
def topo_sort(graph):
"""Topological ordering of a DAG using DFS post-order."""
visited = set()
result = []
def dfs(node):
visited.add(node)
for nb in graph.get(node, []):
if nb not in visited: dfs(nb)
result.append(node) # POST-ORDER: after all children
for node in graph:
if node not in visited: dfs(node)
return result[::-1] # reverse → topological order
# ══════════════════════════════════════════════════════════════
# 3. CONNECTED COMPONENTS
# ══════════════════════════════════════════════════════════════
def connected_components(graph):
"""Returns list of components; each component is a set of nodes."""
visited = set()
components = []
def dfs(node, component):
visited.add(node); component.add(node)
for nb in graph[node]:
if nb not in visited: dfs(nb, component)
for node in graph:
if node not in visited:
comp = set()
dfs(node, comp)
components.append(comp) # each DFS run = 1 component
return components
# ── Tests ──
dag = {'A':['C'],'B':['C','D'],'C':['E'],'D':['F'],'E':['F'],'F':[]}
print("Topo sort:", topo_sort(dag))
disc_graph = {'A':['B'],'B':['A'],'C':['D'],'D':['C'],'E':[]}
print("Components:", connected_components(disc_graph))
A node is appended to the result only after all nodes it depends on (all its
successors in the DAG) have been fully processed. Reversing this list gives you the
order in which nodes should be processed: dependencies always come before dependents.
This is exactly what pip install and build systems like make
compute under the hood.
Animated Code Walkthrough — Line by Line
Step through each algorithm's execution. The highlighted line shows what is running now; the state panel shows every variable at that exact moment.
Time & Space Complexity — Full Comparison
| Property | BFS | DFS | Notes |
|---|---|---|---|
| Time complexity | O(V + E) | O(V + E) | Both visit every node and edge exactly once. |
| Space — data structure | O(V) worst | O(h) depth | BFS queue can hold entire level. DFS stack holds only current path. |
| Shortest path (unweighted) | Guaranteed ✓ | Not guaranteed | BFS guarantees fewest hops; DFS finds A path, not THE shortest. |
| Wide / bushy graphs | More memory | Less memory | Social networks with millions of friends-of-friends favour DFS. |
| Deep / narrow graphs | Less memory | Stack overflow risk | Recursive DFS crashes on depth > ~1000 in Python. |
| Cycle detection | Yes (with tracking) | Natural via back-edge | DFS back-edge approach is more elegant and direct. |
| Topological sort | Use Kahn's instead | Yes — post-order | DFS post-order is the textbook topological sort algorithm. |
| Connected components | Yes | Yes | Either works; pick whichever is simpler for your code. |
| Complete traversal guarantee | Yes | Yes | Both visit every reachable node from the start — given no bugs. |
| Suitable for trees | Level-order | Pre/In/Post order | Both are essential in tree interview problems. |
Both algorithms are identical on the time axis. The choice almost always comes down to space and what you need from the result. Wide graphs (social networks) favour DFS for memory. Shortest-path demands BFS regardless of width. Topological sort and cycle detection demand DFS.
BFS vs DFS — When to Use Which
✦ Level-order traversal of a tree.
✦ Target is close to start — BFS finds nearby nodes fast.
✦ Web crawl with depth limit.
✦ "Minimum steps to reach X" problems.
✦ Topological sort of a DAG.
✦ Path exists problems (does path exist? — not shortest).
✦ Memory limited on wide graphs.
✦ Backtracking, maze solving, all-paths enumeration.
✦ Checking if graph is connected (A reach B?).
✦ Marking all reachable nodes.
✦ Flood fill (image processing, paint bucket tool).
Real-World Use Cases
Complete Graph Type × Algorithm Reference
| Graph Type | BFS Works? | DFS Works? | Notes |
|---|---|---|---|
| Undirected, unweighted | Yes | Yes | BFS for shortest path; DFS for components/cycles. |
| Directed, unweighted | Yes | Yes | DFS needed for topo sort and directed cycle detection. |
| Weighted (non-negative) | Traversal only | Traversal only | Use Dijkstra's for shortest path. |
| Weighted (negative edges) | Not for shortest path | Not for shortest path | Use Bellman-Ford. |
| DAG | Yes (Kahn's for topo) | Yes (post-order topo) | No cycle detection needed — DAGs are by definition acyclic. |
| Disconnected | Outer loop needed | Outer loop needed | Loop over all nodes; start BFS/DFS on each unvisited one. |
| Tree | Level-order | Pre/In/Post-order | Trees are just acyclic connected graphs — both work natively. |
Golden Rules
collections.deque for BFS — never list.pop(0).
list.pop(0) is O(n); deque.popleft() is O(1).
On a 100,000-node graph, the wrong choice turns an O(V+E) algorithm into O(V²) —
hours instead of milliseconds.
sys.setrecursionlimit() for quick fixes, or convert to an explicit stack
for production code. Iterative DFS is always safer.