Depth-First Search: The Algorithm Behind Half Your Graph Problems

Depth-first search commits to a direction and follows it until there is nowhere left to go. Then it backs up and tries the next option. That is the whole algorithm. One sentence. The reason it solves hundreds of different problems is that "commit to a direction until stuck, then backtrack" turns out to be the right model for an enormous class of questions: can I reach X from Y, how many connected regions exist, does this dependency graph have a cycle, what are all possible paths?

If a problem involves exhaustively exploring a space, tracing reachability, or analyzing the structure of a graph, DFS is almost certainly what the interviewer expects. Understanding it deeply means understanding the memory model, the iteration-order trap, and the two-set rule. Get any of them wrong in an interview and your code looks reasonable but produces incorrect results on the cases that matter.

How Depth-First Search Commits to a Path

Start at a node. Mark it visited. Pick an unvisited neighbor and move to it. Keep doing that until no unvisited neighbors remain. Then back up to the nearest ancestor that still has unexplored neighbors. Continue from there.

On a tree this is clean because there are no cycles. On a graph you need a visited set to avoid spinning forever on back edges.

def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node, end=" ")
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

The code is almost insultingly short. The complexity is all in understanding what "the call stack holds" means.

DFS explores A completely via B before ever touching C. The amber numbers show visit order; the dashed line shows backtracking.

The Call Stack Is the Data Structure

In recursive DFS, the call stack is the traversal state, not just an implementation detail. Each frame stores the current node, the position in the neighbor list (implicitly, via where the for-loop is paused), and any accumulated local variables like the path so far.

This is why recursive DFS has O(h) space on trees and O(V) space on general graphs: the call stack depth equals the longest unbroken path DFS is currently exploring.

For a balanced binary tree with n nodes, that depth is O(log n), shallow and fast. For a graph that is just a long chain (A -> B -> C -> ... -> Z with 10,000 nodes), depth is O(n), and Python will crash with RecursionError around depth 1000. It will not warn you first. No deprecation notice, no friendly suggestion to try the iterative version. Just a stack trace that is, ironically, too long to scroll through. You can call sys.setrecursionlimit(100000), but the CPython docs warn that too high a limit "can lead to a crash" because you are fighting the C stack underneath. That stack is whatever the OS gave the thread, typically 2 MB on most Linux architectures and 8 MB by default for the main thread. The crash moves, it does not disappear.

Iterative DFS moves the state to the heap by using an explicit stack:

def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        print(node, end=" ")
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)

Heap memory is orders of magnitude larger than the call stack. Iterative DFS handles millions of nodes without complaint.

Same algorithm, same state. Left: call frames eating OS stack. Right: a Python list on the heap that does not care how deep you go.

The Neighbor-Order Trap

Naive iterative DFS does not produce the same traversal order as recursive DFS. This trips up everyone who writes iterative DFS for the first time and then tests against expected output. Yes, everyone.

The reason is the stack's LIFO behavior. If you push neighbors [B, C] for node A, you push B first, then C. Popping gives C before B. Recursive DFS visits B before C because it iterates in forward order. To match recursive order, push neighbors in reverse so the first neighbor ends up on top.

That is what the reversed() call in the iterative version above does. It is only mandatory if you need order to match. For most graph problems (connectivity, cycle detection, flood fill) the traversal order does not matter, and you can skip it.

O(V + E) Time, O(V) Space: Where the Numbers Come From

Operation	Time	Space	Notes
DFS traversal on tree	O(n)	O(h)	h = height; no visited set needed
DFS traversal on graph	O(V + E)	O(V)	visited set + call stack
Iterative DFS on graph	O(V + E)	O(V)	explicit stack lives on heap
Cycle detection, undirected	O(V + E)	O(V)	back edge = cycle
Cycle detection, directed	O(V + E)	O(V)	two-set approach required
Topological sort	O(V + E)	O(V)	postorder, then reverse
Connected components	O(V + E)	O(V)	outer loop over unvisited nodes
Path existence check	O(V + E)	O(V)	stop on first reach of target

Why O(V + E)? DFS visits each vertex exactly once because visited nodes are skipped. Each edge is processed at most once (for directed) or twice (for undirected, once from each endpoint). The total work is proportional to vertices plus edges, not vertices squared.

Why O(V) space? The visited set holds at most V entries. The call stack (or explicit stack) holds at most V entries at peak depth, because DFS cannot revisit nodes. Both structures are bounded by vertex count.

Note what is absent from this table: amortized analysis. There are no amortized bounds in DFS. Every node is visited once, every edge crossed once. The math is exact, not amortized-over-time.

For trees the space simplifies further to O(h) because no visited set is needed (trees have no cycles) and the maximum stack depth equals tree height, not vertex count. A balanced tree with n nodes has h = O(log n), which is why binary tree DFS runs in O(log n) auxiliary space in the average case.

Cycle Detection: Why One Visited Set Is Never Enough

This is the most common DFS bug in interviews. You want to detect cycles in a directed graph. You use a single visited set. You return true if you encounter a node already in the set. It looks right. It passes all your tests. You push it up.

It is wrong.

Consider this graph:

A -> B -> D
A -> C -> D

DFS from A: visits A, B, D. D has no neighbors, returns. Backtracks to B, finishes, backtracks to A. Now visits C. C tries to visit D. D is already in visited. Bug: you report a cycle. There is no cycle.

The problem is that "visited in a previous DFS branch" looks identical to "visited in the current path" if you only use one set. A cycle requires an edge back to a node on the current path, specifically an ancestor. A node from a completed branch is not an ancestor.

For directed graph cycle detection, you need two structures: visited for all nodes ever seen, and rec_stack for nodes on the active DFS path.

When DFS enters a node, it goes into both sets. When DFS finishes a node (all neighbors explored), remove it from rec_stack but keep it in visited. A cycle exists only when you find an edge to a node still in rec_stack.

def has_cycle_directed(graph, node, visited, rec_stack):
    visited.add(node)
    rec_stack.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            if has_cycle_directed(graph, neighbor, visited, rec_stack):
                return True
        elif neighbor in rec_stack:
            return True
    rec_stack.remove(node)
    return False

This is the white/gray/black coloring made famous by CLRS. White is unvisited, gray is in rec_stack (active), black is finished. A gray-to-gray edge (back edge) is a cycle.

For undirected graphs, the two-set approach is unnecessary. Any back edge is a cycle because edges are bidirectional. You only need to exclude the parent node from "visited neighbor = cycle" to avoid counting the edge you came from.

$Two directed graphs side by side. Left: diamond graph A-B-D, A-C-D with no cycle. rec_stack = {A, C} when C reaches D, showing D is not in rec_stack. Right: cycle A-B-C-A with rec_stack = {A, B, C} when the back edge from C to A is found.$ Left: D is in visited but not rec_stack. No cycle. Right: A is in rec_stack when C processes the edge back to A. Cycle found.

One Structure, Every Language

Each implementation shows recursive DFS and iterative DFS on a directed graph represented as an adjacency list. The graph A -> [B, C], B -> [D, E], C -> [F] is used as the example throughout.

from collections import defaultdict
from typing import Optional

def dfs_recursive(graph: dict[str, list[str]], start: str, visited: Optional[set[str]] = None) -> None:
    if visited is None:
        visited = set()
    visited.add(start)
    print(start, end=" ")
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited)

def dfs_iterative(graph: dict[str, list[str]], start: str) -> None:
    visited: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        print(node, end=" ")
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)

graph: dict[str, list[str]] = defaultdict(list, {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [], "E": [], "F": [],
})
dfs_recursive(graph, "A")   # A B D E C F
print()
dfs_iterative(graph, "A")   # A B D E C F

What DFS Is Actually Good For

Most DFS problems in interviews fall into one of three categories.

Exhaustive exploration. When you need to visit every node in a connected region, DFS is the simplest tool. Flood fill (repainting a pixel region), counting islands in a grid, and finding all paths between two nodes all reduce to "start DFS, collect everything reachable." BFS works here too, but DFS usually needs less boilerplate.

Backtracking and constraint satisfaction. DFS's commit-and-backtrack pattern is the skeleton of every backtracking algorithm. N-queens, Sudoku solving, generating all permutations, word search on a grid. The pattern is: try a choice, recurse, undo the choice when returning. The undo step is what separates backtracking DFS from plain DFS.

Graph structure analysis. Cycle detection, topological sort, finding strongly connected components (Tarjan's 1972 algorithm, Kosaraju's algorithm), articulation points, and bridges all rely on properties visible in the DFS tree. Tarjan's trick is the lowlink value: during DFS, each node tracks the smallest index it can reach through its subtree, and an SCC root is found exactly when a node's index equals its own lowlink. Entry and exit timestamps from DFS also let you answer ancestor queries in O(1): node i is an ancestor of j if entry[i] < entry[j] and exit[i] > exit[j]. The whole tree structure is encoded in those timestamps.

BFS wins for shortest-path on unweighted graphs because it explores level by level. For everything else, DFS is usually the right default. If you are unsure which to reach for, start with DFS. You can always switch when someone asks for the shortest path.

Two Problems That Show the Pattern

Does the problem ask you to explore an entire region? Does it ask about connectivity, reachability, or cycles?

If yes to either, DFS is the tool.

Number of Islands (LeetCode 200)

You have a 2D grid of '1' (land) and '0' (water). Count the distinct islands.

Think of each land cell as a node. Two cells are connected if adjacent horizontally or vertically. The problem is asking for the number of connected components in this implicit graph.

DFS fits because you want to fully explore each island before counting it as distinct. Start at any unvisited '1', run DFS to visit every land cell reachable from it (marking each visited), then increment the island count. Scan for the next unvisited '1' and repeat.

def numIslands(grid: list[list[str]]) -> int:
    rows, cols = len(grid), len(grid[0])
    count = 0

    def dfs(r: int, c: int) -> None:
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != '1':
            return
        grid[r][c] = '0'  # mark visited by sinking the cell
        dfs(r + 1, c)
        dfs(r - 1, c)
        dfs(r, c + 1)
        dfs(r, c - 1)

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                dfs(r, c)
                count += 1
    return count

Time: O(m * n) because every cell is visited at most once. Space: O(m * n) in the worst case where the entire grid is land and DFS depth equals the number of cells.

DFS sinks cells to '0' as it goes. The outer loop finds the next unvisited '1' and starts a new island count.

Course Schedule (LeetCode 207)

You have n courses and a list of [course, prerequisite] pairs. Can you finish all courses?

This is asking: does the prerequisite graph contain a cycle? A cycle means "A requires B requires A." Impossible to satisfy.

Build a directed graph (course -> list of prerequisites). Run the two-set DFS cycle detection on it. If any back edge exists (a node on the current path points to an ancestor in rec_stack), return false.

from collections import defaultdict

def canFinish(numCourses: int, prerequisites: list[list[int]]) -> bool:
    graph = defaultdict(list)
    for course, prereq in prerequisites:
        graph[course].append(prereq)

    visited: set[int] = set()
    rec_stack: set[int] = set()

    def has_cycle(node: int) -> bool:
        visited.add(node)
        rec_stack.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                if has_cycle(neighbor):
                    return True
            elif neighbor in rec_stack:
                return True
        rec_stack.remove(node)
        return False

    for course in range(numCourses):
        if course not in visited:
            if has_cycle(course):
                return False
    return True

The outer loop handles disconnected graphs. If some courses have no prerequisites and no dependents, DFS from course 0 never reaches them. The loop ensures every node is eventually visited.

The amber glow marks nodes still on the active DFS path (rec_stack). A red back edge to an amber node is a cycle.

The Mental Model, Compressed

DFS commits to a path until it cannot proceed, then backtracks. The call stack (recursive) or explicit stack (iterative) holds the current path.
Time complexity is O(V + E): each vertex visited once, each edge crossed once. No amortization needed.
Space complexity is O(h) for trees (h = height), O(V) for graphs (visited set + stack).
Recursive DFS on deep graphs crashes. Python's default limit is 1000 frames. Use iterative DFS when graph depth is unbounded.
Iterative DFS does not match recursive traversal order unless neighbors are pushed in reverse. For most problems, order does not matter.
Cycle detection in directed graphs requires two structures: visited for all seen nodes and rec_stack for nodes on the active path. One set is insufficient.
Topological sort is DFS postorder, reversed. When a node finishes (all neighbors explored), append it to the result, then reverse at the end.
Undirected graphs produce only tree edges and back edges during DFS. Forward edges and cross edges only exist in directed graphs.
Back edges mean cycles. Gray-to-gray in CLRS coloring, rec_stack member in code.
The outer loop trick (iterate over all nodes, call DFS only if unvisited) handles disconnected graphs.

Practice applying DFS under realistic time pressure with SpaceComplexity, which runs voice-based mock interviews with rubric-based feedback on your explanation, not just your code. Knowing the algorithm is table stakes. Narrating your reasoning while you implement it is the actual skill.

For the communication layer, how you explain your approach matters as much as the solution itself. And when a problem is ambiguous about whether the graph is directed or undirected, or whether cycles are possible, asking the right clarifying questions before you start changes what algorithm you reach for.