Dijkstra vs Bellman-Ford: One Negative Edge Changes Everything

You have a weighted graph and you need shortest paths. Most engineers reach for Dijkstra by default, having heard it was fast. That works fine until a single negative edge appears, at which point Dijkstra quietly returns the wrong answer with no warning whatsoever. No exception. No stack trace. Just confident, silent incorrectness.

The Dijkstra vs Bellman-Ford decision comes down to one question: can any edge weight be negative? Yes means Bellman-Ford. No means Dijkstra. The rest is context worth having, because it explains why each algorithm exists, where exactly it breaks, and what to reach for when neither handles the job alone.

The Counterexample That Breaks Dijkstra

Before the proofs, a concrete example. Three nodes: S, A, B, with edges S→A (weight 2), S→B (weight 5), B→A (weight −4).

Graph with three nodes S, A, B where B→A has weight −4, showing how Dijkstra finalizes A too early The greedy approach locks in d[A]=2 before discovering the cheaper path through B. It never looks back.

The shortest path from S to A is S→B→A with total weight 5 + (−4) = 1, not the direct edge with weight 2.

Dijkstra processes S first, then pops A from the priority queue because d[A] = 2 is the smallest tentative distance. It marks A as finalized. When it later discovers the cheaper path through B, it won't reconsider A. Finalized vertices never get updated. Dijkstra returns 2. The correct answer is 1.

This isn't a bug you can patch. It's a structural limitation of the greedy approach.

Why Dijkstra Is Correct Without Negative Edges

Dijkstra is a greedy algorithm. When it pops vertex u from the min-heap, it declares d[u] finalized. That claim rests on one assumption: any path not yet processed must cost at least as much as d[u], so it can't improve d[u].

Every vertex still in the heap has a tentative distance >= d[u], because we just extracted the minimum. Any path to u through those unprocessed vertices passes through at least one of them, paying at least that cost. d[u] is already optimal.

That argument collapses the moment a negative edge exists. A path through an expensive intermediate vertex could dip below d[u] via a negative shortcut, arriving cheaper than what's already recorded. Dijkstra has no mechanism to reopen finalized vertices. It committed too early and there is no un-committing.

The proof is induction on extraction order. Every popped vertex has a correct distance at the moment of popping, because the heap minimum lower-bounds all remaining paths and all weights are non-negative. Remove that last condition and the induction falls apart at the base case.

How Bellman-Ford Actually Works

Bellman-Ford isn't greedy. It's dynamic programming parameterized on edge count. No priority queue, no finalization, no hubris.

Bellman-Ford running three passes on the S/A/B graph, showing d[A] correcting from ∞ to 2 to 1 Each pass extends the frontier by one hop. By pass V−1, every simple shortest path is covered.

Define OPT(k, v) as the length of the shortest path from the source to v using at most k edges. The recurrence is:

OPT(k, v) = min(
    OPT(k-1, v),                          # don't use the k-th edge
    min over all edges (u→v): OPT(k-1, u) + weight(u, v)
)

Base case: OPT(0, source) = 0, OPT(0, v) = ∞ for all v ≠ source.

Any simple path (no repeated vertices) has at most V−1 edges. So OPT(V−1, v) gives the true shortest path for every vertex, assuming no negative cycles.

In practice, this is V−1 passes over all E edges, relaxing d[v] = min(d[v], d[u] + w) for every edge (u→v). No heap. No sorted structure. Just edge iteration.

def bellman_ford(n, edges, src):
    dist = [float('inf')] * n
    dist[src] = 0

    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w

    # Negative cycle detection: one more pass
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None  # negative cycle reachable from src

    return dist

The inductive proof: after k passes, d[v] holds the shortest path using at most k edges. Each pass extends the frontier by one edge. By pass V−1, every shortest simple path is covered. Bellman-Ford doesn't care about edge sign because it never makes irreversible commitments.

Two Completely Different Mental Models

The algorithms look similar on the surface. Both find shortest paths from a source. Both maintain a distance array. But they solve the problem in fundamentally different ways.

Side-by-side showing Dijkstra's min-heap extraction model versus Bellman-Ford's edge-relaxation DP model Dijkstra: commit fast and never look back. Bellman-Ford: grind through every edge, every pass, no shortcuts.

Dijkstra extracts the minimum-distance vertex and finalizes it. One shot. Bellman-Ford relaxes every edge in every pass, V−1 times, with no concept of a finalized vertex. It's the methodical approach versus the confident one. The confident one breaks on negatives.

Detecting Negative Cycles

If you run a V-th pass and any edge still relaxes, a negative-weight cycle is reachable from the source. A path through that cycle can be made arbitrarily short by looping around it, so "shortest path" becomes undefined. You can make the distance as negative as you want by going around the loop more times.

Bellman-Ford is the standard tool for negative cycle detection. Floyd-Warshall can detect them globally (any negative cycle, not just source-reachable ones) by checking if dist[i][i] < 0 after the algorithm finishes.

Complexity Comparison

Algorithm	Time	Space	Negative edges	Negative cycle detection
Dijkstra (binary heap)	O((V + E) log V)	O(V + E)	No	No
Dijkstra (Fibonacci heap)	O(E + V log V)	O(V + E)	No	No
Bellman-Ford	O(VE)	O(V)	Yes	Yes
SPFA (optimized BF)	O(E) avg, O(VE) worst	O(V)	Yes	Yes

For a sparse graph where E ≈ V, Dijkstra's O(V log V) beats Bellman-Ford's O(V²) comfortably. For a dense graph where E ≈ V², both reach O(V³) and Dijkstra's constant is smaller.

The gap is most dramatic on large sparse graphs, the typical case for road networks, routing tables, and game maps. GPS systems and game pathfinders use Dijkstra or A*, not Bellman-Ford. Bellman-Ford shows up when the problem specifically involves negative weights or cycle detection, not as a general-purpose workhorse.

The In-Place Trick (and Why It's Fine)

The DP formulation uses OPT(k−1, u) for relaxation, which technically requires preserving the previous iteration's distances. Most implementations relax in-place, updating d[v] immediately without a snapshot.

This is fine. In-place updates can only converge faster, never produce incorrect results. If a vertex gets a better distance mid-pass, using that value for subsequent relaxations in the same pass can only help. The V−1 pass upper bound still holds, and many graphs converge in far fewer passes in practice.

For the constrained variant (shortest path using exactly K edges), you must use a copy. See LeetCode 787.

SPFA: Queue-Based Bellman-Ford

SPFA (Shortest Path Faster Algorithm) is Bellman-Ford with a queue. The observation: there's no point relaxing an edge (u→v) unless d[u] changed since the last attempt. Maintain a queue of recently-updated vertices and only process those.

Average case drops to O(E). In competitive programming, SPFA is common. In production, it's avoided: an adversary can construct a graph that forces O(VE) behavior. Plain Bellman-Ford is predictable. If you need speed, use Dijkstra.

When You Don't Need Either: DAGs

If the graph has no cycles, you don't need Bellman-Ford for negative edges or Dijkstra's heap for efficiency. Topological sort gives the correct processing order in one pass; one relaxation sweep over that order runs in O(V + E).

def dag_shortest_path(n, adj, src):
    # adj[u] = list of (v, weight)
    order = topological_sort(n, adj)
    dist = [float('inf')] * n
    dist[src] = 0
    for u in order:
        if dist[u] == float('inf'):
            continue
        for v, w in adj[u]:
            dist[v] = min(dist[v], dist[u] + w)
    return dist

This handles negative edges without Bellman-Ford's V−1 iterations, because cycles are impossible. It's the fastest option for dependency graphs, build systems, and critical-path scheduling. If you're in a DAG and reaching for Bellman-Ford, stop.

All Pairs With Negative Edges: Johnson's Algorithm

Bellman-Ford gives single-source shortest paths in O(VE). Running it from every vertex gives all-pairs in O(V²E), which becomes O(V⁴) on dense graphs. Floyd-Warshall does all-pairs in O(V³). Johnson's does better than both for sparse graphs.

Johnson's reweighting: adding phantom source q, running Bellman-Ford once for potentials, then making all edges non-negative The potential h[v] from Bellman-Ford telescopes out of any path length, so shortest-path structure is preserved after reweighting.

Johnson's runs Bellman-Ford once from a phantom source to compute a potential function h[v], then reweights every edge as w'(u,v) = w(u,v) + h[u] − h[v]. This reweighting makes all edges non-negative while preserving shortest-path structure, after which Dijkstra runs from every vertex. Total: O(VE + V² log V). For sparse graphs this beats Floyd-Warshall significantly.

The proof that reweighted distances are non-negative uses the Bellman-Ford optimality condition: for any edge (u,v), h[v] ≤ h[u] + w(u,v), so w'(u,v) = w(u,v) + h[u] − h[v] ≥ 0.

Currency Arbitrage: Negative Cycles as a Feature

One case where negative cycles are the point, not the problem: currency exchange.

Given exchange rates between currencies, you want to know if a profitable cycle exists. Buy USD, convert to EUR, convert to JPY, convert back to USD, end up with more than you started. Take the negative logarithm of each exchange rate. A profitable cycle in the original graph becomes a negative cycle in the transformed graph. Run Bellman-Ford's V-th pass check to detect it in O(VE).

import math

def has_arbitrage(rates):
    # rates[i][j] = exchange rate from currency i to j
    n = len(rates)
    dist = [0.0] * n  # log-space, start from any vertex

    for _ in range(n - 1):
        for i in range(n):
            for j in range(n):
                if dist[i] - math.log(rates[i][j]) < dist[j]:
                    dist[j] = dist[i] - math.log(rates[i][j])

    for i in range(n):
        for j in range(n):
            if dist[i] - math.log(rates[i][j]) < dist[j]:
                return True  # negative cycle = arbitrage opportunity
    return False

Probably not how hedge funds actually work in 2026, but it's a genuinely elegant use of the algorithm.

When to Use Dijkstra vs Bellman-Ford

Need shortest paths from a single source?
│
├── Graph is a DAG?
│   └── YES → Topological sort + 1 pass. O(V+E). Handles negative edges.
│
├── Any negative edge weights?
│   ├── YES → Any negative cycles? Need to detect them?
│   │         └── Use Bellman-Ford. O(VE).
│   └── NO  → Use Dijkstra with binary heap. O((V+E) log V).
│
Need shortest paths between all pairs?
│
├── No negative edges → Dijkstra from every vertex. O(V(V+E) log V).
├── Negative edges, sparse graph → Johnson's Algorithm. O(VE + V² log V).
└── Negative edges, dense graph → Floyd-Warshall. O(V³).

The Part People Get Wrong

A common interview mistake: running Bellman-Ford from every vertex to get all-pairs shortest paths and calling the total cost O(V²E). For dense graphs where E ≈ V², that's O(V⁴). Floyd-Warshall at O(V³) is strictly better. Reach for Floyd-Warshall on dense graphs with negatives; reach for Johnson's on sparse ones.

A second mistake: using Dijkstra when the graph might have negative weights, then being confused when it returns wrong answers without erroring. There is no exception. No warning. Dijkstra will produce incorrect results with the same confidence it produces correct ones. If your problem might involve negative weights (costs, profits, adjusted distances), check the constraints before reaching for the priority queue. Seriously. Check the constraints.

Recap

Dijkstra is greedy: when a vertex is popped, its distance is final. Requires non-negative weights. O((V+E) log V) with a binary heap.
Bellman-Ford is DP: V−1 passes relax all edges. Handles negative weights. Detects negative cycles on the V-th pass. O(VE).
One negative edge breaks Dijkstra structurally. No fix exists short of switching algorithms.
DAGs are a special case: topological order plus one pass equals O(V+E), handles negatives, no Bellman-Ford needed.
Johnson's algorithm reweights edges to non-negative via Bellman-Ford potentials, then runs Dijkstra from each vertex, giving O(VE + V² log V) all-pairs on sparse graphs.
SPFA is queue-based Bellman-Ford with O(E) average but O(VE) worst case. Avoid in production.
Negative cycles are detectable with Bellman-Ford's V-th pass, and useful for applications like currency arbitrage detection.

If you want to practice explaining this decision under pressure, SpaceComplexity runs voice-based mock interviews where a real rubric scores your reasoning, not just your answer.