Maximum Flow: The O(VE²) Algorithm That Finds Every Bottleneck

May 27, 202630 min read
dsaalgorithmsinterview-prepgraphs
Maximum Flow: The O(VE²) Algorithm That Finds Every Bottleneck
TL;DR
  • Residual graph is the core idea: forward capacity (how much more you can push) plus backward capacity (how much existing flow you can cancel).
  • Edmonds-Karp (1972) fixes Ford-Fulkerson by using BFS to always pick the shortest augmenting path, giving O(VE²) time independent of capacity values.
  • Max-flow min-cut theorem: when no augmenting path remains in the residual graph, the accumulated flow exactly equals the minimum cut capacity.
  • O(VE²) proof rests on two lemmas: BFS distances are monotone non-decreasing, and each edge becomes the bottleneck at most V/2 times.
  • Back-edge index trick: store the reverse edge index at construction time so augmentation updates both forward and backward residuals in O(1).
  • Bipartite matching, disjoint paths, and minimum vertex or edge cuts all reduce to a single max flow computation.

You have a network. Water pipes, shipping lanes, internet links. Each connection has a maximum throughput, and you want to push as much as possible from one end to the other. The answer is always limited by some set of edges whose combined capacity is the real bottleneck. Maximum flow finds exactly that limit and, in the same computation, tells you where the bottleneck lives.

Edmonds-Karp is the standard implementation you reach for: a BFS-based variant of Ford-Fulkerson, O(VE²) time. The termination proof is one of the cleanest in graph algorithms.

Mental model: you repeatedly find the shortest available path from source to sink in a "residual" graph and push as much flow through it as the tightest edge allows. When no such path exists, you're done, and the flow you've accumulated is maximum.

Expanding brain meme: bottom tier says "find and eliminate bottlenecks", which is exactly what max flow min cut does You could eyeball the bottleneck. Or you could prove it's minimum with max-flow min-cut.

The Problem, Precisely Stated

A flow network is a directed graph G = (V, E) with a source s and a sink t. Every edge (u, v) has a capacity c(u, v) ≥ 0. A flow f assigns a value to each edge satisfying two constraints.

Capacity: 0 ≤ f(u, v) ≤ c(u, v) for every edge.

Conservation: for every vertex v that is neither s nor t, the total flow entering v equals the total flow leaving v.

The goal is to maximize |f|, the total flow leaving s (which equals the total flow entering t by conservation).

Directed flow network with nodes s, a, b, t and labeled edge capacities: s to a cap 10, s to b cap 8, a to b cap 4, a to t cap 10, b to t cap 9 A typical flow network. Find the maximum flow from s to t.

The Residual Graph: The Only Real Idea

Everything else in this algorithm is bookkeeping. The residual graph is the one insight.

For every edge (u, v) with capacity c and current flow f:

  • Forward residual: r(u, v) = c(u, v) − f(u, v). How much more can you push.
  • Backward residual: r(v, u) = f(u, v). How much existing flow you can cancel.

The backward edge is the non-obvious part. If you've already pushed 7 units along (u, v), you can effectively "undo" 3 of those units by pushing 3 units along the backward edge (v, u). The algorithm doesn't literally move flow backward. It just adjusts the accounting to reach an equivalent flow that might be globally better.

This cancellation mechanism is what lets greedy-looking augmentations still produce a globally optimal answer. Without backward edges, bad early choices could leave you stuck below the true maximum.

Two-panel diagram: left shows a flow network before any flow; right shows the same network after pushing 8 units along s to a to t, with forward and backward residual edges labeled Before and after pushing 8 units. The backward edges (orange, dashed) represent flow you could "un-send" if a better path needs it.

The Back-Edge Index Trick

When you add edge (u, v) with capacity cap to the adjacency list, immediately add the reverse edge (v, u) with capacity 0. Store the index of each edge's reverse so you can update both in O(1) during augmentation.

def add_edge(graph, u, v, cap): # graph[u][i] = [destination, capacity, reverse_edge_index] graph[u].append([v, cap, len(graph[v])]) graph[v].append([u, 0, len(graph[u]) - 1])

After augmenting by f units along edge at graph[u][i]:

rev = graph[u][i][2] graph[u][i][1] -= f # reduce forward capacity graph[v][rev][1] += f # restore backward capacity

Memory layout diagram showing graph[u] and graph[v] adjacency list arrays, with arrows showing the rev index linking forward and backward edges, plus the two-line augmentation update The rev index is set once at construction time. Augmentation is two array writes.

Ford-Fulkerson: Right Idea, Wrong Path-Finder

The Ford-Fulkerson method (Ford and Fulkerson, 1956) is:

  1. Start with zero flow.
  2. Find any augmenting path from s to t in the residual graph G_f.
  3. Let bottleneck = minimum residual capacity along the path.
  4. Push bottleneck units: decrease each forward edge's residual by bottleneck, increase each backward edge's residual by bottleneck.
  5. Repeat until no augmenting path exists.

Correct? Yes. Efficient? Depends entirely on how you find the augmenting path.

With DFS, the algorithm can oscillate. With integer capacities you're guaranteed termination, but it can take O(E × f) iterations where f is the max flow value. If the max flow is 10⁹, that's 10⁹ DFS calls. Your laptop will be busy.

With irrational capacities the algorithm can fail to terminate entirely, converging toward the true maximum as a limit but never reaching it in finite steps. Ford and Fulkerson constructed an explicit example. The smallest known non-terminating instance has 6 vertices. Six. Six vertices are enough to break a correct algorithm forever.

The problem is that DFS has no bound on path length, so there's no bound on how many augmentations you need.

Edmonds-Karp: BFS Changes Everything

Jack Edmonds and Richard Karp published the fix in 1972 (JACM 19:248-264): always find the shortest augmenting path (fewest edges) using BFS.

This one change gives O(VE²) time, regardless of capacity values. The bound no longer depends on the flow magnitude at all. Infinite-capacity edges? Fine. Irrational capacities? Also fine.

Side-by-side diagram: left shows DFS picking a long four-hop winding path through the residual graph; right shows BFS picking the direct two-hop shortest path, with annotation showing BFS monotone distance guarantee DFS gets distracted by the scenic route. BFS takes the highway and can prove it's the shortest.

The BFS shortest-path property is what makes the proof work. It gives you a key monotonicity: the shortest-path distance from s to any vertex can only increase as the algorithm runs. That monotonicity is what bounds the number of augmentations.

Why No Augmenting Path Means Maximum Flow

Before the complexity proof, we need to know the algorithm is correct.

A cut (S, T) is a partition of V where s ∈ S and t ∈ T. The capacity of the cut is the sum of capacities of all edges going from S to T (backward edges don't count).

Max-Flow Min-Cut Theorem: the value of the maximum flow equals the capacity of the minimum cut.

Proof:

  1. Suppose BFS finds no augmenting path. Let S = all vertices reachable from s in the residual graph G_f. Since no augmenting path exists, t ∉ S. Let T = V \ S.
  2. For every edge (u, v) with u ∈ S and v ∈ T: the residual capacity r(u, v) = 0 (otherwise v would be reachable). So f(u, v) = c(u, v). Every forward edge crossing the cut is saturated.
  3. For every edge (u, v) with u ∈ T and v ∈ S: the backward residual r(v, u) = f(u, v) = 0, so no flow crosses in the wrong direction.
  4. Total flow = Σ f(u,v) across (S,T) edges = Σ c(u,v) = the cut capacity.
  5. Every flow is bounded above by every cut. So this flow is maximum and this cut is minimum.

The algorithm stops exactly when the flow saturates a minimum cut. The set of saturated edges at termination is the bottleneck you were looking for.

Flow network with S/T partition line drawn: S-side nodes on the left, T-side on the right. Edges crossing S to T are highlighted in red labeled f=c FULL. The min-cut partition line is drawn in gold. Every edge crossing the cut is saturated. That's your minimum cut. That's your bottleneck.

What Each Operation Costs

OperationTimeSpaceNotes
add_edge(u, v, cap)O(1)O(1)Append forward + reverse entry
BFS (one augmentation)O(V + E)O(V)BFS queue + parent array
Full flow(s, t)O(VE²)O(V + E)O(VE) augmentations × O(E) each
Build graphO(V + E)O(V + E)Adjacency list with back-edge indices

The O(V + E) space is tight. The graph itself takes O(V + E). BFS uses O(V) for the queue and the parent array. The residual capacities live in the same adjacency list as the graph, so no extra storage.

The O(VE²) Proof, No Hand-Waving

The proof has two lemmas. The second uses the first.

BFS Distances Are Monotone Non-Decreasing

Let d_f(v) = BFS distance from s to v in residual graph G_f (counting edges, not capacities).

Lemma 1: after any augmentation producing flow f', d_{f'}(v) ≥ d_f(v) for all v.

Proof by contradiction. Suppose some vertex v has d_{f'}(v) < d_f(v). Pick the vertex closest to s in G_{f'} with this property. Let (u, v) be the last edge on the shortest path to v in G_{f'}, so d_{f'}(u) + 1 = d_{f'}(v) and by our choice of v, d_{f'}(u) ≥ d_f(u).

Case 1: Edge (u, v) existed in G_f. Then d_f(v) ≤ d_f(u) + 1 ≤ d_{f'}(u) + 1 = d_{f'}(v). Contradiction.

Case 2: Edge (u, v) is new in G_{f'} (didn't exist in G_f). New residual edges appear only when the augmentation traveled the reverse direction (v, u). Since that augmenting path was a BFS shortest path: d_f(u) = d_f(v) + 1. Then d_{f'}(v) = d_{f'}(u) + 1 ≥ d_f(u) + 1 = d_f(v) + 2 > d_f(v). Contradiction again.

Both cases fail. Distances are monotone non-decreasing. This is the engine behind everything.

Each Edge Becomes Critical at Most V/2 Times

Call an edge (u, v) critical on an augmenting path if it is the bottleneck edge: r(u, v) = path bottleneck. After augmentation, r(u, v) drops to zero and the edge disappears from the residual graph.

For (u, v) to reappear, the algorithm must later augment along the reverse edge (v, u). At that moment (v, u) is on a BFS shortest path, so d(v) = d(u) + 1, meaning d(u) = d(v) + 1.

But the last time (u, v) was critical, it was on a BFS path going forward: d(u) + 1 = d(v), so d(u) = d(v) − 1.

After the reverse augmentation restores (u, v), d(v) has increased since the last time (u, v) was critical (by Lemma 1). Concretely: the next time (u, v) can be critical, d(u) has increased by at least 2 compared to the previous time it was critical.

Since d(u) ≤ V − 1, each edge can be critical at most (V − 1)/2 times, giving O(VE) total augmentations.

Each augmentation runs one BFS: O(E). Total time: O(VE) × O(E) = O(VE²).

One Structure, Every Language

All implementations below share the same interface:

  • addEdge(u, v, cap): add a directed edge with given capacity.
  • flow(s, t): return the maximum flow from s to t.

Nodes are zero-indexed integers from 0 to n−1.

from collections import deque class MaxFlow: def __init__(self, n: int) -> None: self.n = n self.graph: list[list[list[int]]] = [[] for _ in range(n)] def add_edge(self, u: int, v: int, cap: int) -> None: # [destination, capacity, reverse_edge_index] self.graph[u].append([v, cap, len(self.graph[v])]) self.graph[v].append([u, 0, len(self.graph[u]) - 1]) def _bfs(self, s: int, t: int, parent: list) -> bool: visited = [False] * self.n visited[s] = True queue = deque([s]) while queue: u = queue.popleft() for i, (v, cap, _) in enumerate(self.graph[u]): if not visited[v] and cap > 0: visited[v] = True parent[v] = (u, i) if v == t: return True queue.append(v) return False def flow(self, s: int, t: int) -> int: total = 0 while True: parent: list = [None] * self.n if not self._bfs(s, t, parent): break path_flow = float('inf') v = t while v != s: u, i = parent[v] path_flow = min(path_flow, self.graph[u][i][1]) v = u v = t while v != s: u, i = parent[v] rev = self.graph[u][i][2] self.graph[u][i][1] -= path_flow self.graph[v][rev][1] += path_flow v = u total += path_flow return total

Where Maximum Flow Shows Up

Network Throughput

Any network with a source, a sink, and limited connections is a max flow problem. Internet routing, oil pipelines, logistics networks, supply chains. The domain doesn't matter; the structure does. If it can be represented as a directed graph with capacities, you can find the exact maximum.

Bipartite Matching

Maximum bipartite matching reduces to a single max flow computation. Given a bipartite graph with left set L and right set R:

  • Add super-source s connected to every vertex in L with capacity 1.
  • Add super-sink t connected from every vertex in R with capacity 1.
  • Give each original bipartite edge capacity 1.

Max flow = maximum matching. The unit capacities on s-to-L and R-to-t edges enforce the "at most one" constraint: each vertex can be matched at most once.

Edge-Disjoint and Vertex-Disjoint Paths

Maximum number of paths from s to t that share no edges: assign capacity 1 to every edge and run max flow. This is Menger's theorem stated algorithmically.

For vertex-disjoint paths: split each vertex v (except s and t) into v_in and v_out with a single edge of capacity 1. Connect all original edges as v_out → w_in. Max flow on this expanded graph gives the maximum number of vertex-disjoint paths.

Project Selection and Max Weight Closure

Some problems ask: given projects with profits or costs and dependency constraints ("if you take A, you must take B"), find the subset with maximum net value. This maps to a minimum cut problem. Source connects to positive-value projects (capacity = value), negative-value projects connect to sink (capacity = |cost|), dependencies become infinite-capacity edges. Min cut = the projects you exclude to maximize net profit. And min cut equals max flow.

Image Segmentation

Minimum cut partitions image pixels into foreground and background. Energy functions on pixel pairs become edge capacities. Graph cuts dominated computer vision for years before deep learning, and maximum flow was the engine.

How to Spot a Max Flow Problem

Five signals that a problem is asking for maximum flow:

  1. "Maximum number of X that can occur simultaneously" where different choices compete for shared resources.
  2. Bipartite assignment: match items from set A to items from set B with one-to-one constraints.
  3. "Minimum number of edges/vertices to remove" to disconnect two nodes. Min cut equals max flow.
  4. The problem literally involves throughput: bandwidth, pipelines, flow rates.
  5. The feasibility version is easy ("can we route k units?") but you need the maximum.

Signal 3 is the one that trips people up most often. Min-cut problems don't look like flow problems until you know the theorem. Once you do, you see them everywhere.

Worked Example 1: Maximum Bipartite Matching

Problem: N workers and M tasks. Worker i can do task j if edge (i, j) exists. Each worker handles at most one task, each task needs exactly one worker. Find the maximum number of assignments.

Why max flow: Build a flow network with N + M + 2 nodes. Source s connects to each worker with capacity 1. Each worker connects to all their compatible tasks with capacity 1. Each task connects to sink t with capacity 1.

Run max flow. Each unit of flow represents one assignment. The capacity-1 edges from s enforce that no worker is matched twice. The capacity-1 edges to t enforce that no task is assigned twice.

The max flow equals the max matching. Check which original bipartite edges carry flow 1 at termination to read off the actual assignments.

Worked Example 2: Minimum Edge Cuts

Problem: A computer network has N servers and E connections. Find the minimum number of connections you must cut to isolate server 0 from server N−1.

Why max flow: By Menger's theorem (equivalent to max-flow min-cut), the minimum number of edge-disjoint paths from 0 to N−1 equals the minimum edge cut. Assign capacity 1 to every edge (make it bidirectional by adding edges in both directions), run max flow from 0 to N−1. The result is both the maximum number of edge-disjoint paths and the minimum number of edges you need to cut.

The structural insight: any set of k edge-disjoint paths means you need to cut at least k edges to disconnect the endpoints. A cut of size k means at most k edge-disjoint paths can exist. These two bounds meet, so they're equal.

For vertex-disjoint paths instead (minimum vertex cut), use the node-splitting trick: each vertex v becomes v_in → v_out with capacity 1.

The Quick Recap

  • A flow network is a directed graph with capacities; you want the maximum flow from source to sink.
  • The residual graph tracks remaining forward capacity and cancellable backward capacity for every edge.
  • Ford-Fulkerson (1956) finds augmenting paths and pushes flow until none remain. With DFS it can take O(Ef) iterations and may not terminate on irrational capacities. Six vertices is enough to break it forever.
  • Edmonds-Karp (1972) uses BFS to always pick the shortest augmenting path, giving O(VE²) time regardless of capacity values.
  • Proof: BFS distances are monotone non-decreasing. Each edge is critical at most V/2 times. Total augmentations: O(VE). Time per augmentation: O(E). Total: O(VE²).
  • Max-flow min-cut theorem: when no augmenting path exists, you have found both the maximum flow and the minimum cut simultaneously.
  • Main interview patterns: bipartite matching (unit-capacity construction), disjoint paths (Menger's theorem), minimum vertex/edge cuts, project selection via min-cut.
  • Implementation key: store each edge's reverse index at construction time. Augmentation updates both in O(1).

If you want to practice recognizing these patterns under realistic interview conditions, SpaceComplexity runs voice-based DSA mock interviews where you explain your reduction aloud and get rubric-based feedback on your reasoning, not just your code.

For graph algorithm foundations, BFS vs DFS: One Question Decides It Every Time covers why path-finding strategy matters, and Dijkstra's Algorithm shows another instance where BFS-flavored relaxation gives clean complexity bounds. Graph Data Structure Interview covers adjacency list representation, which is the substrate all these algorithms run on.

Further Reading