What Is Breadth-First Search? The Level-by-Level Algorithm Behind Shortest Paths

- BFS guarantees the shortest path in unweighted graphs because it exhausts every path of length N before trying any path of length N+1.
- A queue is what makes BFS work: swap it for a stack and you have DFS — same code, opposite traversal order.
- Mark nodes visited on enqueue, not dequeue or dense graphs cascade into redundant visits and degraded performance.
- Time complexity is O(V+E): every vertex visited once, every edge traversed once; space is O(V) worst case for the frontier.
- The level-tracking template snapshots
len(queue)before the inner loop to process exactly one level per outer iteration without storing depth in each tuple. - Multi-source BFS seeds all starting nodes before the loop begins — the right tool when spreading starts simultaneously from multiple points.
- "Minimum steps" in an unweighted setting is the clearest signal to reach for BFS over DFS.
Drop a stone in still water. Ripples spread outward in rings, one at a time, each a little farther from where the stone hit. That's basically it. That's BFS. You start at one node, visit every neighbor before any of their neighbors, then every one of those neighbors before going deeper. Layer by layer, like a very patient flood.
And here's the part that matters for interviews: the first time BFS arrives at any node, it used the shortest possible route. Not approximately shortest. Not pretty short if you squint. Shortest, full stop.
That single property is why BFS shows up constantly on interview whiteboards.
Why DFS Gets the Wrong Answer Here
Imagine you're modeling a social network. Every person is a node, every friendship is an edge. You want the shortest chain of connections between two people.
Depth-first search would charge down one path until it hit a dead end, backtrack, charge down another one. It might find a valid chain. It will not find the shortest one unless you get lucky or run it exhaustively, at which point you've just done more work for the same result.
BFS finds the shortest path in an unweighted graph, guaranteed, because it exhausts all paths of length one before trying any path of length two.
DFS doesn't make that promise. It tunnels. BFS floods. Different shapes, different guarantees.
A Queue Is the Whole Secret
What separates BFS from DFS at the implementation level comes down to one data structure.
BFS uses a queue (first-in, first-out). When you enqueue a node's neighbors, they wait behind every node already in line. That waiting is what creates the level-by-level behavior. You finish processing every node at depth one before anything at depth two ever gets dequeued.
DFS uses a stack (last-in, first-out). The most recently discovered node goes first. That's how it ends up tunneling deep into one branch before it considers the others.
Swap the queue for a stack and you have DFS. Same algorithm structure, different data structure, completely different traversal order. This is worth understanding at the level of your bones, not just being able to recite it.
BFS in Slow Motion
Here's a graph simple enough to trace by hand:
A
/ \
B C
/ \ \
D E F
Starting from A:
- Level 0: Enqueue A. Visit A.
- Level 1: Dequeue A, enqueue B and C. Visit B. Visit C.
- Level 2: Dequeue B, enqueue D and E. Dequeue C, enqueue F. Visit D. Visit E. Visit F.
Order visited: A, B, C, D, E, F.
If you were searching for F, BFS finds it in exactly two steps. You can also reach D or E in two steps. There is no shorter route to F. BFS knows this without trying anything else.
The Implementation
from collections import deque def bfs(graph: dict[str, list[str]], start: str, target: str) -> int: visited = {start} queue = deque([(start, 0)]) # (node, distance from start) while queue: node, distance = queue.popleft() if node == target: return distance for neighbor in graph[node]: if neighbor not in visited: visited.add(neighbor) queue.append((neighbor, distance + 1)) return -1 # target not reachable
Use collections.deque for O(1) popleft. A regular Python list's pop(0) is O(n) because it shifts every remaining element left. For a grid problem instead of an adjacency list, the structure is identical: replace graph[node] with a four-directional neighbor generator.
Notice that nodes are marked visited when enqueued, not when dequeued. This is where most people get burned.
The Classic Bug: Mark Visited on Enqueue, Not Dequeue
The most common BFS mistake is marking a node visited when you pop it, not when you add it to the queue.
Here's why that explodes. Say node X has two neighbors, both pointing to node Y. You only mark Y visited on dequeue. Before Y is ever dequeued, both of X's neighbors have already enqueued it. Now Y is in the queue twice. In a dense graph with many such overlapping paths, this cascades. Nodes accumulate in the queue over and over. Something that should run in O(V + E) quietly degrades into something much worse, often visiting the same node hundreds or thousands of times.
The worst part: this bug doesn't show up on small test cases. On a tree it works fine. On a five-node graph it works fine. It only surfaces when the graph is large enough or cyclic enough to create real contention, which is exactly the kind of input an interviewer will try.
The fix is two adjacent lines:
for neighbor in graph[node]: if neighbor not in visited: visited.add(neighbor) # mark here queue.append(neighbor) # then enqueue
Mark it, then queue it. In that order. Every time.
O(V + E), and Why Space Is the Gotcha
BFS visits every vertex once and traverses every edge once, so time is O(V + E).
V is vertices, E is edges. In a tree, E = V - 1, so this simplifies to O(n). In a dense graph where everything connects to everything, E approaches V², and time approaches O(V²).
Space is O(V) worst case, because the queue can hold an entire level at once. For a balanced binary tree, the widest level has about n/2 nodes. For a general graph, you can enqueue every vertex before dequeuing a single one.
This is where BFS and DFS actually diverge in practice. DFS only keeps a single path in memory at a time. BFS holds an entire frontier. For very wide graphs, that's a real cost. For very deep ones, the situation flips: DFS can blow the call stack or require an explicit stack of the full depth, while BFS's memory is bounded by width. Neither is universally better. They're just shaped differently.
When BFS Is the Right Call
Shortest path in an unweighted graph. The canonical use. BFS finds it in one pass by construction. DFS cannot make this guarantee without exhaustive search.
Level-order traversal. Printing a binary tree level by level, finding all nodes at distance K from a root, anything that requires processing nodes layer by layer. BFS gives you those layers for free. DFS requires extra bookkeeping to recover them.
Multi-source BFS. Seed the queue with multiple starting nodes simultaneously and BFS finds the shortest path from any of those sources at once. This is the trick in "Rotting Oranges" (LeetCode 994), where infection spreads from every rotten orange at once. You enqueue all starting positions before the loop begins, and the algorithm handles the rest.
DFS is usually the right call when you need cycle detection, full traversal, or path enumeration where order doesn't matter. The one-question test: do you care about shortest distance? BFS. Otherwise, DFS is probably fine.
How to Spot BFS in a Coding Interview
Interviewers expect you to recognize BFS from the problem structure, not from keywords in the problem statement. The signals to watch for:
- "Minimum number of steps / moves / turns"
- "Shortest path" in a maze or grid
- "Level order" traversal of a tree
- Anything spreading outward from a starting set (infection, fire, water rising, signal propagation)
When you see "minimum steps" in an unweighted setting, BFS should be your first thought.
The level-tracking template worth memorizing:
queue = deque([start]) visited = {start} steps = 0 while queue: for _ in range(len(queue)): # process exactly one level node = queue.popleft() if node == target: return steps for neighbor in get_neighbors(node): if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) steps += 1
The for _ in range(len(queue)) snapshot processes exactly one level per outer loop iteration. It's slightly cleaner than storing distance in the queue tuple and makes the level structure explicit to anyone reading your code under time pressure, including you.
You can practice recognizing and implementing these patterns under timed interview conditions at SpaceComplexity, which runs voice-based mock interviews with rubric feedback on exactly this kind of problem.
Further Reading
- BFS on Wikipedia - formal definition, pseudocode, and history
- GeeksforGeeks: BFS for Graphs - worked examples with adjacency list and matrix representations
- Python deque documentation - why deque is O(1) at both ends where list is not
- LeetCode: Binary Tree Level Order Traversal - the canonical BFS tree problem
- LeetCode: Rotting Oranges - the canonical multi-source BFS problem
Want to practice BFS under real interview conditions? See the full BFS algorithm deep-dive for implementation details in every major language, BFS vs DFS for the decision framework, BFS pattern recognition to train problem identification, and top BFS problems for a ranked list of the most interview-relevant problems to drill.