Graph Algorithms Cheat Sheet: Every Formula Before Your Interview

You have three hours. Maybe two. The interview is real, the recruiter's calendar invite does not care about your feelings, and you just realized you've been fuzzy on Bellman-Ford for six months. This page is the one you skim right before you log on: every graph algorithm, its complexity, the one condition that breaks it, and the single thing that bites people mid-interview. Bookmark it. Print it. Tattoo it somewhere useful.

Representations First

Before you pick an algorithm, pick a representation.

Your default is adjacency list. The only reason to switch to a matrix is O(1) edge lookup or a graph so dense that V² and E are roughly the same.

Representation	Space	Add Edge	Check Edge	Iterate Neighbors
Adjacency list	O(V + E)	O(1)	O(degree)	O(degree)
Adjacency matrix	O(V²)	O(1)	O(1)	O(V)

Grid problems are adjacency lists wearing a costume. Each cell (r, c) has at most four neighbors. The adjacency list is implicit: [(r-1,c), (r+1,c), (r,c-1), (r,c+1)] filtered for bounds. No separate graph object needed.

BFS: Shortest Path on Unweighted Graphs

Use BFS when the graph is unweighted and you need the shortest path, or when you need to process nodes level by level.

Time: O(V + E)
Space: O(V) for the queue
Guarantee: nodes are visited in non-decreasing distance order from the source

from collections import deque

def bfs(graph, start):
    visited = {start}
    queue = deque([(start, 0)])  # (node, distance)
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))

Classic mistake: list.pop(0) instead of deque.popleft(). The former is O(n) per call. You've just turned your O(V + E) BFS into O(V * E). The algorithm still produces correct answers. It's just slow enough that the interviewer has time to order lunch and finish it before you're done.

DFS: Traversal, Cycle Detection, Paths

Use DFS for cycle detection, connected components, topological sort, and any problem asking about reachability or paths through a graph.

Time: O(V + E)
Space: O(V) on the call stack (the deepest path, not total nodes)

def dfs(graph, node, visited):
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

Python's default recursion limit is 1000 frames. On a deep graph, your code will throw a RecursionError and die before it finishes. Switch to iterative DFS when the graph is deep enough to matter.

def dfs_iterative(graph, start):
    visited = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)

For a detailed comparison of when to reach for each, see BFS vs DFS: One Question Decides It Every Time.

Topological Sort: Ordering Dependencies

Only valid on DAGs. Two approaches, same complexity.

If Kahn's output has fewer than V nodes, the graph has a cycle. Start with all nodes of in-degree 0. Process them, decrement neighbor in-degrees, add new zeros to the queue.

from collections import deque

def topological_sort(graph, in_degree, n):
    queue = deque([i for i in range(n) if in_degree[i] == 0])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    return order if len(order) == n else []  # empty list means cycle

DFS-based: Process nodes in postorder (add to stack after all descendants are visited). Reverse the stack. Cycle detection uses three colors: white (unvisited), gray (in current DFS path), black (fully processed). A back edge to a gray node means a cycle.

Time: O(V + E)
Space: O(V)

See What Is Topological Sort? The Coding Interview Guide for the full walkthrough.

Dijkstra's Algorithm: Weighted Shortest Path

Shortest path from a single source. Works only with non-negative edge weights. The priority queue is the entire algorithm.

Time: O((V + E) log V) with a binary heap
Space: O(V)

import heapq

def dijkstra(graph, start, n):
    dist = [float('inf')] * n
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale entry, skip
        for neighbor, weight in graph[node]:
            new_dist = dist[node] + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist

The if d > dist[node]: continue line is lazy deletion. You push duplicate entries rather than updating the heap in place; this guard skips the stale ones. Without it, correctness holds but you do extra work.

One negative edge anywhere breaks Dijkstra's correctness entirely. The algorithm assumes that once a node is popped from the heap, its distance is final. A negative edge can create a shorter path through a node you already closed. It won't catch this. It will return wrong answers with complete confidence and zero apology.

Bellman-Ford: Negative Weights and Cycle Detection

Shortest path from a single source. Handles negative edge weights. Detects negative-weight cycles.

Time: O(V * E)
Space: O(V)

Yes, O(V * E). That's the tax you pay for handling negative weights. Dijkstra is faster, but Dijkstra lies to you when edges go negative. Bellman-Ford is slower and honest.

Relax all edges V-1 times. Any shortest path visits at most V-1 edges. Run one more pass after that. Any distance that still improves means a negative-weight cycle is reachable.

def bellman_ford(edges, n, start):
    dist = [float('inf')] * n
    dist[start] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] != float('inf') and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:
        if dist[u] != float('inf') and dist[u] + w < dist[v]:
            return None  # negative cycle detected
    return dist

For the full decision between these two, see Dijkstra vs Bellman-Ford: One Negative Edge Changes Everything.

Floyd-Warshall: All-Pairs Shortest Path

Every shortest path between every pair of nodes. Handles negative weights, not negative cycles.

Time: O(V³)
Space: O(V²)

Three nested loops. You see them, you wince, and then you remember this is the price for all-pairs. There is no faster general solution.

The outer loop k is the "relay node." For every pair (i, j), you ask: is there a shorter path from i to j that routes through k?

def floyd_warshall(n, edges):
    dist = [[float('inf')] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = w
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

To detect negative cycles after running: check if any dist[i][i] < 0.

Union-Find: Connectivity in Near-O(1)

Are two nodes in the same component? With path compression and union by rank, each operation costs O(α(n)), where α is the inverse Ackermann function. Call it constant.

The inverse Ackermann function grows so slowly it never exceeds 4 for any input you will encounter in real life, on real hardware, before the heat death of the universe. The complexity proof is genuinely wild. The result is practically O(1).

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # already connected
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

Use Union-Find when: you're merging components online (edge by edge), detecting cycles in undirected graphs, or answering repeated connectivity queries. BFS/DFS also detect cycles, but Union-Find handles the merge-and-query pattern more cleanly.

Minimum Spanning Tree

Two algorithms. Same result. Different performance profiles.

Algorithm	Time	When to Use
Kruskal's	O(E log E)	Sparse graphs
Prim's	O(E log V) with binary heap	Dense graphs

Kruskal's: Sort all edges by weight. Add the cheapest edge that doesn't form a cycle. Use Union-Find for the cycle check. Stops when you have V-1 edges.

Prim's: Start from any node. Greedily add the cheapest edge connecting the existing tree to a new node. Use a min-heap to find the next cheapest edge.

Both are greedy, both are correct, and the choice is purely about graph density. The proof rests on the cut property: the minimum-weight edge crossing any cut must belong to some MST. See What Is a Minimum Spanning Tree? for the proof walkthrough.

Strongly Connected Components

A strongly connected component (SCC) is a maximal set of nodes where every node is reachable from every other node. Directed graphs only.

Kosaraju's (two-pass DFS):

Run DFS on the original graph. Push nodes to a stack in finish order.
Transpose the graph (reverse all edges).
Pop from the stack, run DFS on the transposed graph. Each DFS tree is one SCC.

Time: O(V + E). Pick Kosaraju's in an interview. The two-pass logic is easy to explain out loud.

Tarjan's (one-pass DFS):

Tracks disc[u] (discovery time) and low[u] (lowest discovery time reachable from u's subtree via back/cross edges). A node u is an SCC root when low[u] == disc[u].

Time: O(V + E). One pass, but harder to implement under pressure. Save it for when you want to impress someone who's already impressed.

Graph Algorithms Complexity Cheat Sheet

Algorithm	Time	Space	Condition
BFS	O(V + E)	O(V)	Unweighted shortest path
DFS	O(V + E)	O(V)	General traversal
Topological sort	O(V + E)	O(V)	DAG only
Dijkstra	O((V+E) log V)	O(V)	Non-negative weights
Bellman-Ford	O(VE)	O(V)	Handles negative weights
Floyd-Warshall	O(V³)	O(V²)	All-pairs
Union-Find	O(α(n))	O(V)	Undirected connectivity
Kruskal's MST	O(E log E)	O(V)	Sparse graphs
Prim's MST	O(E log V)	O(V)	Dense graphs
Kosaraju's SCC	O(V + E)	O(V)	Directed graphs
Tarjan's SCC	O(V + E)	O(V)	Directed graphs

The Decision Tree

When a problem hands you a graph and stares at you expectantly, run through this.

Shortest path algorithm selection flowchart

Unweighted shortest path? BFS.
Weighted, non-negative weights, one source? Dijkstra.
Negative weights, one source? Bellman-Ford.
All-pairs shortest path? Floyd-Warshall.
Ordering with dependencies? Topological sort. If Kahn's returns fewer than V nodes, there's a cycle.
Cycle detection, undirected? Union-Find (or DFS).
Cycle detection, directed? DFS with three-color marking.
Minimum spanning tree, sparse? Kruskal's.
Minimum spanning tree, dense? Prim's.
Strongly connected components? Kosaraju's if you need clarity, Tarjan's if you want one pass.
Repeated "are these connected?" queries? Union-Find over repeated BFS.

Five Things That Will Bite You Mid-Interview

1. Dijkstra with a negative edge. No error. No warning. Just confidently wrong answers delivered with full conviction. If the problem mentions negative weights at all, switch to Bellman-Ford before writing a single line.

2. Forgetting the if d > dist[node]: continue guard in Dijkstra. Without it, you process stale heap entries and redo work. Still correct, just slower. Your interviewer will notice; they have a rubric for this.

3. Using list.pop(0) instead of deque.popleft(). Silent O(n) per call. Your BFS is now O(V * E) and you have no idea why the test cases are timing out.

4. Topological sort on a cyclic graph. Kahn's catches this: len(order) != n means there was a cycle. DFS-based topo sort just quietly produces a wrong order with no complaint. Always check the output length.

5. Skipping path compression in Union-Find. Without it, the tree degenerates into a linked list and find runs in O(n). You need both path compression and union by rank to get the O(α(n)) guarantee. Drop either one and you're back in linear time.

If you want to practice applying these under real interview pressure with voice-based feedback on your explanations, SpaceComplexity runs mock interviews that score both your algorithm choices and how you communicate them.