Shortest Path Algorithm: BFS, Dijkstra, Bellman-Ford, and When to Use Each

- BFS finds the shortest path in unweighted graphs in O(V+E); no priority queue needed
- Dijkstra handles weighted non-negative graphs in O((V+E) log V) using a min-heap with lazy deletion
- Bellman-Ford runs O(V·E) but works with negative edges and detects negative cycles
- Floyd-Warshall solves all-pairs shortest paths in O(V³) with O(V²) space for dense graphs
- The decision chain: no weights → BFS; positive weights → Dijkstra; negative weights → Bellman-Ford; all pairs → Floyd-Warshall
- The K stops constraint breaks vanilla Dijkstra; extend state to (cost, node, hops_used) or use Bellman-Ford with fixed passes
You open Maps, tap an address, and a blue line appears in under a second. Somewhere in that second, a shortest path algorithm did its thing. And somewhere in your next technical interview, someone is going to ask you to do what Maps does, except from scratch, in 45 minutes, while explaining your reasoning out loud.
The good news: there are really only four algorithms you need, and picking the right one is a two-question flowchart. The bad news: most candidates pick wrong because they read the problem description too fast.
What "Shortest" Actually Means
A graph is a set of nodes (vertices) connected by edges. Edges can be unweighted (every hop costs the same) or weighted (each edge carries a number representing cost, distance, or time).
The shortest path is the one that minimizes total cost to get from a source node to a destination. In an unweighted graph, that means fewest edges. In a weighted graph, it means the minimum sum of edge weights along the route.
Here's a small concrete example. Four cities, connected like this:
A --1-- B
| |
4 2
| |
D --3-- C
The path A → B → C uses two edges and costs 3. The path A → D → C also uses two edges but costs 7. Same number of hops, wildly different costs. Hop count alone stops telling the full story the moment weights show up.
Scale that to millions of nodes, add negative-weight edges, or ask for shortest paths between every pair simultaneously. Each variation requires a different algorithm. Interviewers specifically test whether you can tell them apart before you write any code.
Four Shapes of the Problem
Getting the shape right is more than half the battle.
- Unweighted graph: every edge costs 1, or you only care about minimum hops
- Weighted graph, non-negative edges: edges carry positive weights
- Weighted graph with negative edges: some edges have negative cost
- All-pairs shortest path: you need the shortest distance between every pair of nodes, not just from one source
Each shape has a preferred algorithm. Picking the wrong one isn't just inefficient. It can silently produce wrong answers and you'll never know because the code compiles fine.
Unweighted Graphs: BFS Is Enough
If every edge has equal cost, or you just want the fewest steps, Breadth-First Search is both correct and optimal. It processes nodes level by level from the source. The first time BFS reaches the destination, the path length is guaranteed minimal because it exhausts all paths of length k before exploring anything of length k+1.
BFS finds the shortest path in O(V + E) time using O(V) space. No priority queue, no relaxation, no complexity overhead.
from collections import deque def shortest_path_bfs(graph: dict, start: str, end: str) -> list: queue = deque([(start, [start])]) visited = {start} while queue: node, path = queue.popleft() if node == end: return path for neighbor in graph[node]: if neighbor not in visited: visited.add(neighbor) queue.append((neighbor, path + [neighbor])) return []
A very common mistake: seeing "shortest path" and immediately reaching for Dijkstra. Then spending 10 minutes implementing a min-heap for a problem that has no edge weights. BFS is simpler, equally correct, and faster. If the problem says "minimum steps," "minimum moves," or "shortest path in a maze" with no mention of weights, BFS is almost certainly the intended solution.
Classic problems that want BFS: shortest path in a binary matrix, word ladder, minimum knight moves on a chessboard.
Weighted Non-Negative Graphs: Dijkstra
Once edges carry different positive weights, BFS breaks. A path with fewer edges can cost more than a longer path with smaller weights. You need something that tracks accumulated cost, not just hop count.
Dijkstra's algorithm fixes this with a min-heap of (cost, node) pairs. It always processes the cheapest unfinished node first. When it visits a node, it tries to relax all neighbors: if reaching a neighbor through the current node is cheaper than any previously known route, update the neighbor's cost and push it onto the heap.
import heapq def dijkstra(graph: dict, start: str) -> dict: distances = {node: float('inf') for node in graph} distances[start] = 0 heap = [(0, start)] while heap: cost, node = heapq.heappop(heap) if cost > distances[node]: continue for neighbor, weight in graph[node]: new_cost = cost + weight if new_cost < distances[neighbor]: distances[neighbor] = new_cost heapq.heappush(heap, (new_cost, neighbor)) return distances
Dijkstra runs in O((V + E) log V) with a binary heap, but every edge weight must be non-negative. One negative edge breaks the whole thing. Dijkstra marks nodes "finalized" once they're popped from the heap. If a negative edge could later reduce the cost of a finalized node, Dijkstra never revisits it, so it happily returns wrong answers with full confidence.
The if cost > distances[node]: continue line is lazy deletion. A node can be pushed onto the heap multiple times as shorter paths are discovered. Stale heap entries are discarded here rather than removed, because removing arbitrary elements from a heap is expensive.
Negative Weights: Bellman-Ford
Some graphs have negative edge weights. Currency exchange is the textbook example: some trades are profitable arbitrages, and if you model them as negative-weight edges, you're asking whether a profitable loop exists in the graph. Dijkstra cannot answer that. It will either give wrong results or crash your assumptions.
Bellman-Ford works completely differently. Instead of greedily selecting the cheapest next node, it relaxes every edge in the graph V-1 times. After k iterations, it has correctly computed shortest paths using at most k edges. Since any simple path in a V-node graph uses at most V-1 edges, V-1 passes is always sufficient.
def bellman_ford(edges: list, num_vertices: int, start: int) -> list | None: distances = [float('inf')] * num_vertices distances[start] = 0 for _ in range(num_vertices - 1): for u, v, weight in edges: if distances[u] + weight < distances[v]: distances[v] = distances[u] + weight for u, v, weight in edges: if distances[u] + weight < distances[v]: return None # negative cycle detected return distances
Bellman-Ford costs O(V * E), significantly slower than Dijkstra, but it handles negative edges and can detect negative cycles. The final loop is the negative cycle check: if after V-1 passes you can still relax any edge, a cycle exists whose weights sum to negative. Shortest paths are undefined in that case because you could loop that cycle forever, shaving off cost each time.
A classic interview variant: "Cheapest Flights Within K Stops." Standard Dijkstra has no concept of hop count. Solve it with Bellman-Ford limited to K+1 relaxation passes, or by encoding hops into the Dijkstra state as (cost, node, stops_remaining).
All Pairs: Floyd-Warshall
Sometimes there is no fixed source. You need the shortest distance between every pair of nodes, like a routing table that has to know every possible path. One option is running Dijkstra from every vertex, costing O(V * (V + E) log V). For dense graphs, Floyd-Warshall is often cleaner: it solves all-pairs in O(V³) with O(V²) space.
The idea: for each candidate intermediate node k, check whether routing through k improves the known distance between every pair (i, j).
def floyd_warshall(dist: list[list[float]]) -> list[list[float]]: n = len(dist) for k in range(n): for i in range(n): for j in range(n): dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]) return dist
Initialize dist with direct edge weights, 0 on the diagonal, and inf where no edge exists. Floyd-Warshall handles negative edges but not negative cycles. For sparse graphs, running Dijkstra from each source typically wins on time. Floyd-Warshall shines when the graph is dense and you genuinely need every pairwise distance.
Picking the Right Algorithm (the Whole Decision in One Place)

| Condition | Algorithm | Time | Space |
|---|---|---|---|
| Unweighted, minimum hops | BFS | O(V + E) | O(V) |
| Weighted, non-negative, single source | Dijkstra | O((V + E) log V) | O(V) |
| Weighted, negative edges, single source | Bellman-Ford | O(V * E) | O(V) |
| Weighted, all pairs, dense graph | Floyd-Warshall | O(V³) | O(V²) |
The decision follows a short chain: does the problem mention weights? No: BFS. Yes: can any weight be negative? No: Dijkstra. Yes: Bellman-Ford. Need all pairs? Floyd-Warshall.
Two questions. That is the entire decision. The reason candidates freeze is not that the logic is hard. It is that they try to remember the decision tree under pressure instead of deriving it from first principles.
What Trips Candidates Up in Interviews
The most common error is misidentifying the graph shape. Read the constraints before picking an algorithm.
The "K stops" constraint on cheapest flights is a trap specifically designed to break vanilla Dijkstra. Standard Dijkstra optimizes cost alone. Adding a hop-count constraint means you need extra state. Two candidates can arrive at the same node with the same cost but different remaining hops, and both states may produce different optimal solutions for different destinations. Either extend the Dijkstra state to (cost, node, hops_used) or use Bellman-Ford with a fixed number of passes.
The second common trap: missing the negative cycle case entirely. If a problem involves circular dependencies, currency conversions, or rewards that compound across loops, ask explicitly whether cycles with negative total weight are possible. The answer changes your algorithm choice completely.
Third: forgetting disconnected graphs. BFS and Dijkstra naturally return infinity or simply fail to reach the destination. Bellman-Ford leaves unreachable nodes at infinity. Check for that in your return value, and mention it to the interviewer before they have to ask.
Practice verbalizing the decision before touching the keyboard: "This is a weighted graph with no mention of negative weights. I'll use Dijkstra." That one sentence earns partial credit even if your code has a bug, and it is what separates a strong hire from a borderline call. If you want to get that sentence right under actual interview pressure, SpaceComplexity runs voice-based mock interviews for exactly these DSA problems, with rubric feedback on your reasoning and your code.