Graph Data Structure Interview: Adjacency Lists, BFS, DFS, and Thinking in Graphs

You have a set of cities and roads between them. Or a set of courses and prerequisites. Or a grid of cells where some are land and some are water. These are completely different problems. They're all graphs.

The graph is the most flexible data structure in your interview toolkit: anything that can be described as "things and relationships between things" is a graph. Once you can recognize when a problem is secretly a graph, you already know which algorithm to reach for. The hard part is seeing it.

This deep dive covers how graphs are stored in memory, what BFS and DFS actually do to that memory (not just "queue" and "stack"), when each representation wins, and how to spot graph problems in disguise.

Two Ways to Store a Graph

A graph is a set of vertices V and edges E. You need to store that somewhere. You have two main choices.

The Adjacency Matrix

An adjacency matrix is a V×V grid. If there's an edge from vertex i to vertex j, you set matrix[i][j] = 1 (or the edge weight). That's it.

Vertices: 0, 1, 2, 3
Edges: 0-1, 0-2, 1-3

     0  1  2  3
  0 [0, 1, 1, 0]
  1 [1, 0, 0, 1]
  2 [1, 0, 0, 0]
  3 [0, 1, 0, 0]

Edge lookup is O(1). "Is there a road from city 2 to city 7?" Just check matrix[2][7]. Done.

The cost: you always allocate V² space. For a graph with 10,000 vertices, that's 100 million cells. Whether you have 10 edges or 10 billion, you pay the same. It's like booking a 10,000-seat venue every time you call a meeting. The venue has no opinions about your attendance.

The Adjacency List

An adjacency list stores, for each vertex, the list of its neighbors.

0 -> [1, 2]
1 -> [0, 3]
2 -> [0]
3 -> [1]

Space is O(V + E). For undirected graphs, each edge appears twice (once in each direction), so the total neighbor-list entries is 2E. You only allocate what you actually use.

Edge lookup is O(degree(v)), meaning you scan the neighbor list of vertex v. For sparse graphs where each vertex has a handful of neighbors, this is fast in practice. Worst case for a star graph (one hub connected to every other vertex), the hub has degree V-1, making lookup O(V).

Which One to Use

The deciding factor is density: how many edges do you have relative to V²?

Condition	Use
Dense graph (E close to V²)	Adjacency matrix
Sparse graph (E much smaller than V²)	Adjacency list
Need O(1) edge lookup frequently	Adjacency matrix
Need to iterate over neighbors efficiently	Adjacency list
Unknown or varying graph size	Adjacency list

Most real-world graphs are sparse. Social networks, road maps, dependency graphs, web link graphs: they all have far fewer edges than V². The default for almost every interview problem is the adjacency list.

What Actually Lives in RAM

Adjacency Matrix Layout

The matrix is a flat 2D array (or array of arrays). All cells live in contiguous memory. When you scan a row (iterating all neighbors of a vertex), you get beautiful cache locality: the CPU prefetches the next cell before you ask for it.

The problem is the wasted space. A matrix representing a graph with 1,000 vertices and 2,000 edges still allocates one million cells. 998,000 of them are zero.

Memory:
[Row 0: 0,1,1,0,...][Row 1: 1,0,0,1,...][Row 2: ...]
 contiguous, cache-friendly, but mostly empty for sparse graphs

Adjacency matrix for a 4-vertex graph: 6 edge cells highlighted in blue, 10 zero cells dimmed. O(V²) space allocated regardless of edge count. 16 cells for 4 vertices. Scale to 10,000 vertices and you're allocating 100 million cells before you've processed a single edge.

Adjacency List Layout

In most language implementations, an adjacency list is an array (or hash map) where each entry points to a dynamic array (or linked list) of neighbors.

heads array:    [ptr0] [ptr1] [ptr2] [ptr3]
                  |      |      |      |
                 [1,2]  [0,3]  [0]   [1]

Each neighbor list may live at a different heap location. When you traverse from one vertex to its neighbors, then to those neighbors' lists, you're pointer-chasing across memory. This is cache-unfriendly compared to the matrix scan. For very hot traversal code (Dijkstra on a dense graph), this pointer-chasing cost shows up in profiles.

In practice, the memory savings of the adjacency list win decisively for sparse graphs. The extra cache misses are a small constant against the savings from not allocating V² space.

Adjacency list memory layout: a heads array with pointer slots for vertices 0-3, each pointing to a linked chain of neighbor nodes scattered in the heap. Head pointers live in one array. The neighbor lists are scattered across the heap. Cache misses are real but cheap compared to not allocating V² cells.

Operation Costs and Why They Hold

Complexity Table

Operation	Adjacency List	Adjacency Matrix
Space	O(V + E)	O(V²)
Add edge	O(1)	O(1)
Remove edge	O(degree(v))	O(1)
Edge lookup (u, v)	O(degree(u))	O(1)
Iterate all neighbors of v	O(degree(v))	O(V)
Add vertex	O(1)	O(V²)
Remove vertex	O(V + E)	O(V²)

Why the Bounds Hold

Space: The adjacency list stores one list entry per edge endpoint, so 2E entries total for an undirected graph. Plus V entries for the heads array. Hence O(V + E). The matrix is always V rows by V columns, so O(V²) regardless of E.

Edge lookup in the list: To check if vertex u has an edge to v, you scan u's neighbor list until you find v or exhaust it. That list has exactly degree(u) entries. Average degree for a graph with E edges is 2E/V, so the average case is O(E/V).

Iterating neighbors: The list gives you exactly degree(v) entries to read, so O(degree(v)). The matrix must scan the entire row of V cells to find non-zero entries, so O(V) regardless of actual degree.

Add vertex to matrix: You need a (V+1)×(V+1) matrix. You have to allocate new memory and copy the old data. That copy touches all V² existing cells.

BFS vs DFS: What They Actually Do to Memory

Both traversals visit every vertex once and process every edge once: time is O(V + E) for both. The difference is in which memory they use and how much of it.

BFS: The Queue

BFS explores level by level. Put the start node in a queue. Dequeue a node, enqueue all unvisited neighbors, repeat.

Graph:          BFS from 0:
0 -- 1          Level 0: [0]
|    |          Level 1: [1, 2]
2 -- 3          Level 2: [3]

The queue holds the current "frontier": all nodes you know about but haven't processed yet. The queue size is bounded by the maximum number of nodes at any one level. For a perfect binary tree, the last level has n/2 nodes, so the queue hits O(n) at that level. For a star graph (one hub, V-1 leaves), BFS from the hub enqueues all V-1 leaves at once.

Worst-case space for BFS: O(V). You might hold nearly every vertex in the queue simultaneously.

BFS traversal: five nodes colored by level (amber for level 0, blue for level 1, green for level 2), with queue contents shown below at each step. BFS explores level by level. The queue holds the entire frontier at each step.

DFS: The Stack (or Call Stack)

DFS dives as deep as it can before backtracking. In the recursive implementation, the call stack is the stack. In iterative form, you use an explicit stack.

Graph (path): 0 -- 1 -- 2 -- 3 -- 4

DFS from 0 recurses: explore(0) -> explore(1) -> explore(2) -> explore(3) -> explore(4)
Call stack depth: 5 frames

The call stack depth equals the depth of the DFS tree, which in the worst case (a path graph) equals V. For a path graph with 10,000 vertices, recursive DFS will maintain 10,000 stack frames. In Python, the default recursion limit is 1,000. You'll hit RecursionError before you finish traversing a moderately long chain. Python will not explain itself. It will just stop.

This is why you switch to iterative DFS with an explicit stack for large inputs. You can also raise the limit with sys.setrecursionlimit(), which feels like the right answer until you wonder why you're arguing with the runtime about how deep you're allowed to go.

Worst-case space for DFS: O(V). But the memory shape is different from BFS: DFS uses a deep narrow stack (one path at a time), while BFS uses a wide shallow queue (one level at a time).

DFS call stack: a 5-node path graph with 5 nested stack frames, each level showing explore(n). The deepest frame, explore(4), is highlighted in amber. Each recursive call adds a frame. On a path graph, every node is someone else's only neighbor, so the stack reaches depth V.

When BFS Memory Blows Up vs. When DFS Memory Blows Up

Wide, shallow graph (like a social network hub): BFS queues the entire first level. DFS recurses only one step deep. DFS wins.
Narrow, deep graph (like a path or a deep tree): DFS recurses to depth V. BFS holds at most a few nodes at each level. BFS wins.
Complete graph: Both are O(V). BFS queues all neighbors of the start node; DFS recurses V deep in the worst case.

Neither algorithm is universally better on memory. The graph's shape determines which hurts more.

Side by side: a star graph (BFS nightmare, all 6 leaves queue up at once) and a vertical path graph (DFS nightmare, recursion reaches depth V=5). Wide graph kills BFS. Deep graph kills DFS. Know your graph before you pick your traversal.

Implementations

from collections import defaultdict, deque

class Graph:
    def __init__(self):
        self.adjacency = defaultdict(list)

    def add_edge(self, u, v):
        self.adjacency[u].append(v)
        self.adjacency[v].append(u)

    def bfs(self, start):
        visited = {start}
        queue = deque([start])
        order = []
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in self.adjacency[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        return order

    def dfs(self, start):
        visited = set()
        order = []

        def explore(node):
            visited.add(node)
            order.append(node)
            for neighbor in self.adjacency[node]:
                if neighbor not in visited:
                    explore(neighbor)

        explore(start)
        return order

Use collections.deque for BFS. A regular list's pop(0) is O(n) because it shifts every element; deque.popleft() is O(1).

What Graphs Are Actually Good For

Connectivity. Is vertex A reachable from vertex B? How many connected components exist? A single DFS or BFS answers these in O(V + E).

Shortest paths in unweighted graphs. BFS finds the shortest path by level: the first time you reach a node, you've reached it in the fewest steps. Add weights and you need Dijkstra (non-negative weights) or Bellman-Ford (negative weights allowed).

Cycle detection. In an undirected graph, DFS finds a cycle when it encounters a visited node that isn't the current node's parent. In a directed graph, you look for back edges: edges pointing to a node that's currently on the recursion stack.

Topological ordering. For a directed acyclic graph (DAG), topological sort gives you a linear ordering where all edges go forward. DFS-based topological sort, or Kahn's algorithm (BFS-based), both run O(V + E). This is the mechanism behind dependency resolution.

Flood fill and connected regions. Any grid-based problem about connected regions is a graph problem where each cell is a vertex and edges connect adjacent cells.

Recognizing Graph Problems in Coding Interviews

This is the real skill. Many interview problems don't announce themselves as graph problems.

Signal: "Things with relationships"

If a problem describes entities and connections between them, you're looking at a graph. The signals:

"Adjacent" cells in a grid
"Prerequisite" or "depends on" relationships between tasks
"Can transform into" relationships between states
"Connected" or "reachable" in any sense
Anything involving paths, routes, or sequences of steps

Worked Example 1: Number of Islands

Problem: Given a 2D grid of '1's (land) and '0's (water), count the number of islands.

Why it's a graph: Each cell is a vertex. An edge exists between two cells if they're adjacent (up, down, left, right) and both are land. An island is a connected component.

Solution: Iterate over every cell. When you find an unvisited '1', run BFS or DFS from it, marking all reachable land cells as visited. Increment the island count by one. Each BFS/DFS call visits one complete connected component.

def num_islands(grid):
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    visited = set()
    count = 0

    def bfs(row, col):
        queue = deque([(row, col)])
        visited.add((row, col))
        while queue:
            r, c = queue.popleft()
            for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited and grid[nr][nc] == '1':
                    visited.add((nr, nc))
                    queue.append((nr, nc))

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1' and (r, c) not in visited:
                bfs(r, c)
                count += 1
    return count

The grid is never explicitly converted into an adjacency list. The graph is implicit: the for dr, dc loop is just "iterate neighbors."

Worked Example 2: Course Schedule (Cycle Detection in a Directed Graph)

Problem: There are n courses (0 to n-1) and a list of prerequisites [a, b] meaning "take b before a." Can you finish all courses?

Why it's a graph: Each course is a vertex. Each prerequisite pair [a, b] is a directed edge from b to a. You can finish all courses if and only if the directed graph has no cycle (circular dependencies make it impossible).

Solution: Build a directed adjacency list. Run DFS. Maintain a "currently on the recursion stack" set in addition to the "visited" set. If DFS reaches a node that's currently on the stack, there's a cycle.

def can_finish(num_courses, prerequisites):
    adjacency = defaultdict(list)
    for course, prereq in prerequisites:
        adjacency[prereq].append(course)

    visiting = set()
    visited = set()

    def has_cycle(node):
        if node in visiting:
            return True
        if node in visited:
            return False
        visiting.add(node)
        for neighbor in adjacency[node]:
            if has_cycle(neighbor):
                return True
        visiting.remove(node)
        visited.add(node)
        return False

    for course in range(num_courses):
        if has_cycle(course):
            return False
    return True

The visiting set tracks nodes on the current path. The visited set is a memo: if you've already fully explored a node and found no cycle, don't explore it again. This gives O(V + E) instead of exponential.

More Signals That Suggest a Graph

"Minimum number of steps/moves": BFS on a state graph where each state is a vertex and each valid move is an edge.
"Can you reach X from Y": reachability query, DFS or BFS.
"All paths from X to Y": DFS with backtracking.
"Ordering with constraints": topological sort.
"Are these two things in the same group": union-find or connected components.

If the dynamic programming framework is about recognizing overlapping subproblems, the graph framework is about recognizing hidden connectivity. Many DP problems (counting paths in a DAG, shortest path with constraints) are graph problems underneath.

The Non-Obvious Things People Get Wrong

Forgetting to mark visited before enqueuing. If you mark a node visited only when you dequeue it, you can enqueue the same node multiple times before it's processed. For a dense graph, this causes redundant work and can blow up the queue size. Mark visited when you enqueue. This bug passes every small test case and then silently doubles your queue in production.

Using recursive DFS on large inputs in Python. The default recursion limit is 1,000. A path graph with 1,001 nodes will hand you a RecursionError mid-traversal with no further explanation. Use sys.setrecursionlimit() if you need a quick fix, or write iterative DFS with an explicit stack if you want something you can actually defend.

Iterating neighbors of the adjacency matrix. Beginners sometimes build an adjacency list and then check matrix[u][v] for every possible v. That's O(V) per vertex, turning O(V + E) algorithms into O(V²). Use the list if you have it.

Not handling disconnected graphs. BFS and DFS from a single source only visit the connected component containing that source. If you need to visit every vertex, loop over all vertices and start a new traversal for any that haven't been visited. You test with a connected graph. Your interviewer has a disconnected one ready.

Recap

Adjacency list: O(V + E) space, O(degree(v)) neighbor iteration, default for sparse graphs.
Adjacency matrix: O(V²) space, O(1) edge lookup, use only when the graph is dense or constant-time edge queries matter.
BFS uses a queue, explores level by level, finds shortest paths in unweighted graphs, worst-case O(V) queue memory.
DFS uses the call stack (or explicit stack), explores depth-first, enables cycle detection and topological sort, worst-case O(V) stack depth.
Both traversals run O(V + E) time.
The queue memory in BFS balloons on wide graphs; the stack depth in DFS blows up on deep graphs.
Mark nodes visited at enqueue time, not dequeue time.
Recursive DFS in Python hits the default 1,000-frame limit; use iterative DFS for large inputs.
Grid problems, dependency problems, and transformation problems are often graph problems in disguise.

Graph problems show up constantly in technical interviews, and the ones that trip people up are rarely the ones labeled "graph." They're the grid flood-fill, the dependency cycle, the word transformation. Practicing recognition matters as much as practicing the algorithms themselves.

If you want to practice spotting graph problems under interview pressure, SpaceComplexity runs voice-based mock interviews with rubric-based feedback covering exactly this kind of pattern recognition. The goal is to internalize the "this is a graph" instinct so it fires automatically.

For more on interview problem patterns, see Nested Loops Will Cost You the Offer for a similar deep dive into the sliding window pattern, and Ask These Clarifying Questions First for how to handle the modeling step before you write any code.