Dutch National Flag Problem: Sort Three Buckets in One Pass, O(1) Space

May 26, 202619 min read
dsaalgorithmsinterview-prepdata-structures
Dutch National Flag Problem: Sort Three Buckets in One Pass, O(1) Space
TL;DR
  • The Dutch national flag problem maintains four zones at all times: confirmed-0s, confirmed-1s, unknown, confirmed-2s; the unknown zone shrinks by exactly 1 per iteration
  • Time is O(n) exact, not amortized: each element enters and permanently exits the unknown zone once, so the loop runs exactly n iterations
  • Space is O(1): three integer pointers (low, mid, high), no auxiliary array, no recursion
  • Critical bug to avoid: when swapping from high, do not advance mid; the arriving element came from the unknown zone and must be inspected
  • Three-way quicksort applies this partition to handle duplicate-heavy inputs optimally, permanently placing the equal-pivot zone and excluding it from both recursive sub-arrays
  • Not stable: relative order within each group is not preserved; use counting sort with prefix sums and backward placement if stability matters
  • Pattern signal: exactly three distinct categories, in-place rearrangement required, O(1) space constraint (canonical example: LeetCode 75 Sort Colors)

Edsger Dijkstra invented this one in 1973 using colored pebbles on a beach. The insight he wrote up in EWD398 now lives inside the standard library quicksort of several languages. That's a pretty good pebble.

The problem: given an array of elements that fall into exactly three categories, rearrange them in-place so all category-A elements come first, then category-B, then category-C. No counting pass. No extra array. One loop.

The mental model: maintain four zones at once, shrink the middle "unknown" zone one step at a time, and stop when nothing is left to inspect. One pass, three pointers, zero extra allocations.

Reach for this whenever you see an array with exactly three distinct categories and need O(n) time and O(1) space, or whenever you're writing a quicksort partition that must handle lots of duplicates without degrading to O(n²).


The Flag Is the Invariant

The Dutch flag has three horizontal bands: red on top, white in the middle, blue at the bottom. The algorithm's job is to impose that structure on an arbitrary array.

The swaps are not the clever part. Anyone can write a swap. The clever part is the four-zone invariant that makes every swap provably correct before you even run the code.

At any point during the algorithm, the array looks like this:

Four-zone invariant: array divided into confirmed 0s, confirmed 1s, unknown, and confirmed 2s zones with low/mid/high pointer positions

The unknown zone [mid..high] is the only part still in play. Everything outside it is already settled.

In code:

[ all 0s | all 1s | unknown | all 2s ]
  0..low-1  low..mid-1  mid..high  high+1..n-1

Four zones, maintained at all times:

  • Everything before low is a confirmed 0.
  • Everything from low to mid - 1 is a confirmed 1.
  • Everything from mid to high is unknown (not yet inspected).
  • Everything after high is a confirmed 2.

At the start, low = 0, mid = 0, high = n - 1. Zones 1, 2, and 4 are empty. Zone 3 is the entire array. The invariant holds trivially because all three boundary zones are empty. The loop terminates when mid > high, which means the unknown zone has collapsed to nothing. At that point the invariant says the array is done.


Walking Through a Live Example

Six elements. Six moves. Let's trace [2, 0, 1, 2, 0, 1] with low = 0, mid = 0, high = 5 and watch the unknown zone drain to empty.

Step-by-step pointer movement through [2,0,1,2,0,1] showing each swap and pointer update

Each row is one state. The amber cells are still in the unknown zone. Once a cell gets a color, it never gets re-examined.

Here's the same trace in text form:

Before:   [ 2  0  1  2  0  1 ]
Pointers:   ^                ^
           mid,low          high

Step 1. arr[mid] = 2. Swap arr[mid] with arr[high]. Decrement high. Do NOT advance mid.

After:    [ 1  0  1  2  0 | 2 ]
            ^            ^
           mid,low       high=4

We don't move mid because the element that just landed at mid came from the unknown zone. We have no idea what it is.

Step 2. arr[mid] = 1. Advance mid only.

After:    [ 1  0  1  2  0 | 2 ]
            ^  ^          ^
           low mid=1     high=4

Step 3. arr[mid] = 0. Swap arr[low] with arr[mid]. Advance both low and mid.

Why is it safe to advance mid here? Because arr[low] was in zone 2 (all 1s), so after the swap arr[mid] = 1, which is correct for zone 2. We already know what it is.

After:    [ 0 | 1  1  2  0 | 2 ]
               ^  ^       ^
            low=1 mid=2  high=4

Step 4. arr[mid] = 1. Advance mid.

             ^     ^     ^
          low=1  mid=3  high=4

Step 5. arr[mid] = 2. Swap arr[mid] with arr[high]. Decrement high.

After:    [ 0 | 1  1  0 | 2  2 ]
               ^  ^  ^
            low=1 mid=3 high=3

Step 6. arr[mid] = 0. Swap arr[low] with arr[mid]. Advance both.

After:    [ 0  0 | 1  1 | 2  2 ]
                  ^     ^
               low=2   high=3, mid=4

mid > high. Done. Result: [0, 0, 1, 1, 2, 2].


Why mid Doesn't Advance When You Swap From the Right

This is the bug. Not "a common bug." THE bug. Every first implementation either gets it or it doesn't, and most don't.

When arr[mid] == 2, you swap arr[mid] with arr[high] and decrement high. The element now sitting at mid came from the unknown zone. It could be 0, 1, or 2. You have no idea. You must inspect it before moving on.

When arr[mid] == 0, you swap arr[low] with arr[mid]. Since low..mid-1 is entirely 1s, the element at arr[low] is definitely a 1. After the swap, arr[mid] = 1, which belongs in zone 2. Safe to advance.

The invariant tells you what you already know. A right-side swap imports an unknown element into mid. A left-side swap puts a known 1 there. Those are not symmetric and the code treats them differently for exactly that reason.

The wrong version:

# WRONG: this one will haunt you at 2am before an interview if nums[mid] == 2: nums[mid], nums[high] = nums[high], nums[mid] high -= 1 mid += 1 # skips the newly arrived element, which hasn't been inspected

Core Operations

OperationTimeSpaceNotes
Three-way partition (one call)O(n)O(1)Single pass, each element inspected exactly once
Best caseO(n)O(1)Already sorted, mid advances every step
Worst caseO(n)O(1)No worst case degradation; bounded by invariant

The O(n) bound is exact, not amortized. Define Φ = high - mid + 1, the size of the unknown zone. Initially Φ = n. Each iteration shrinks Φ by exactly 1: mid increases in two of the three cases, and high decreases in the third. One pointer always moves. The loop runs exactly n times.

Every element is inspected exactly once: elements in zones 1, 2, and 4 are never re-examined after placement. Elements in zone 3 are examined once and then exit zone 3 permanently, either by advancing mid (join zone 2) or by being swapped to zone 1 (0s) or zone 4 (2s). The element arriving at mid from a right-side swap was itself in zone 3 and has never been inspected before, so it counts as a first inspection on the next iteration, not a repeat.

Space is O(1): three integer variables, no auxiliary array, no recursion stack.


One Structure, Every Language

The implementations below partition arr in-place around a pivot value: elements less than pivot move left, elements equal to pivot collect in the middle, and elements greater than pivot move right.

def dutch_national_flag(arr: list[int], pivot: int) -> None: low, mid, high = 0, 0, len(arr) - 1 while mid <= high: if arr[mid] < pivot: arr[low], arr[mid] = arr[mid], arr[low] low += 1 mid += 1 elif arr[mid] == pivot: mid += 1 else: arr[mid], arr[high] = arr[high], arr[mid] high -= 1

What Problems It Solves

Sorting three-value arrays in one pass

The canonical form: given an array of exactly three distinct values (0, 1, 2 or red/white/blue or any three categories), group them in order. No counting, no auxiliary array, single scan. This is LeetCode 75: Sort Colors verbatim. Call dutchNationalFlag(arr, 1) with the implementations above and you're done.

Quicksort pivot partitioning with duplicates

Standard Lomuto and Hoare partitions degrade to O(n²) on arrays with many identical elements. Feed [3,3,3,3,3,3] to a naive quicksort and watch it pick a pivot, put nothing on the left, put everything-minus-one on the right, and recurse n times. You just built an O(n²) loop out of an algorithm that promises O(n log n). Fun!

Three-way partitioning fixes this: all elements equal to the pivot land in the middle zone in a single pass, and neither sub-array contains any copies of the pivot.

Standard partition vs. three-way partition on a duplicate-heavy input. Standard recurses n times; three-way terminates immediately.

On an all-same-value array, the three-way partition finishes in one pass with zero recursive calls. The Lomuto version does n recursive calls.

Bentley and McIlroy applied this idea in their 1993 qsort implementation (now used as the basis of several standard library sort functions). Sedgewick and Bentley later proved in 1997 that three-way quicksort is entropy-optimal: on inputs where keys have known frequencies, it achieves the information-theoretic lower bound in terms of comparisons.

Binary classification (two-zone degenerate case)

You can degenerate the algorithm to two zones by treating one boundary as unreachable. If you want to partition even numbers before odd numbers: use low = mid = 0, high = n - 1, swap even values left (case < pivot branch), and skip odd values (case == pivot branch), never triggering the > pivot branch. You get the standard two-pointer partition for free as a special case.

In-place stream categorization

Any problem that says "rearrange so that type-A items come before type-B items before type-C items, in place" maps directly to the Dutch flag. Color of nodes in a graph traversal, priority buckets in a task scheduler, categories in an in-place filter: same invariant, same three-pointer structure.


Recognizing the Dutch National Flag Problem

The signal isn't "sort". The signal is "three categories, one pass, no extra space."

More concretely:

  1. The problem names exactly three distinct values or categories. The array contains only 0, 1, and 2, or "low/medium/high", or "before/at/after threshold." You're in Dutch flag territory.

  2. The problem asks for in-place rearrangement. If O(n) extra space were allowed, you'd use counting sort. The in-place constraint is the hint.

  3. You're writing a quicksort partition and the input has many duplicates. Standard partitions scatter equal elements across both halves and cause O(n²) on nearly-sorted or all-same-value inputs. Three-way partition fixes it.

  4. The problem uses "group" rather than "sort". Grouping implies categories; sorting implies total order. When there are only three categories, grouping IS sorting, and the Dutch flag is optimal for it.

Worked Example 1: LeetCode 75, Sort Colors

Problem: Given an array of integers where each value is 0, 1, or 2, sort it in-place so that all 0s come first, then all 1s, then all 2s. You may not use the library sort function. Do it in a single pass.

Why this structure fits: Three distinct values. In-place constraint. One-pass requirement written explicitly. This is the Dutch flag problem with the metaphor dropped. Call dutchNationalFlag(nums, 1) and the partition logic places 0s left of low, 1s between low and mid, and 2s right of high.

The alternative approach (count 0s, 1s, and 2s in one pass, then overwrite) also works in O(n) time and O(1) space, but it requires two passes and destroys any satellite data attached to each element (relevant if the problem involves objects rather than integers). Dutch flag does one pass and keeps satellite data with each element, but it is NOT stable. Relative order within each group is not preserved. The two-pass counting approach (overwrite positions with the counted values) also destroys element identity entirely. If stability genuinely matters, use a stable counting sort with prefix sums and backward placement. If you need single-pass in-place and stability doesn't matter, use Dutch flag.

Worked Example 2: Three-Way Quicksort on a Duplicate-Heavy Input

Problem: You have an array of n integers where many values repeat. You want to sort it efficiently. A standard quicksort on [3, 3, 3, 3, 3, 3] (all same value) picks a pivot of 3, partitions into [] and [3, 3, 3, 3, 3], recurses on size n-1, and degrades to O(n²).

Why this structure fits: The pivot value is the "middle" category. Everything less goes left; everything equal stays in the middle; everything greater goes right. After partitioning with Dutch flag, the equal-pivot zone needs no further recursion. On the all-same array, the unknown zone collapses in one pass and the algorithm terminates immediately. On any input, each pivot value can appear in recursive calls only once (elements equal to the pivot are permanently placed and excluded from both sub-arrays), so the algorithm handles duplicate-heavy inputs optimally.

The recurrence for three-way quicksort on distinct values is the same as standard quicksort: O(n log n) expected. On inputs with k distinct values and the k-th value appearing with frequency f_k, the comparison count approaches the information-theoretic lower bound: sum of -f_k * log(f_k) * n. For all-same input that's 0 * n = 0 comparisons beyond the pivot check, which is why the one-pass termination above is exact.

Worked Example 3: Partition Array Around a Pivot (LeetCode 2161)

Problem: Given an integer array and a pivot, rearrange the array so that all elements less than pivot come before elements equal to pivot, which come before elements greater than pivot. Relative order within each group must be preserved.

Where Dutch flag falls short here: Standard Dutch flag is not stable. It can change the relative order of elements within the less-than and greater-than zones. LeetCode 2161 requires stability. The fix: allocate three buckets, scan into them, then write back. O(n) time, O(n) space. The Dutch flag trades stability for the O(1) space bound. Knowing which guarantee you actually need before reaching for a tool is 90% of engineering.


The Generalization: More Than Three Categories

For k categories, the algorithm generalizes with k-1 boundary pointers. k=2 reduces to a single-pointer scan (the standard partition). k=3 is the Dutch flag. k=4 or higher: maintain k-1 boundaries and k zones, advancing or retreating based on which zone the current element belongs to.

Each generalization still runs O(n) per pass. The constant factor grows with k, and the invariant gets harder to reason about, but the structural argument is the same: the unknown zone shrinks by one per iteration.

In practice, for large k, just use counting sort. Better cache behavior, no repeated swapping, simpler invariant. The Dutch flag wins specifically at k=3, where its in-place O(1) space beats anything counting-based. Past that point, counting sort wins on all fronts.


The Non-Obvious Mistake: Counting Swaps

Beginners sometimes count total swaps to verify the O(n) bound. Understandable instinct. Wrong quantity.

The right measure is loop iterations, which is exactly n. Swaps happen between 0 and n-1 times. Some elements get swapped twice: a 2 arrives at mid from the right, then later gets swapped again into its final zone-4 slot. But every iteration shrinks the unknown zone by exactly 1, so the loop runs n times regardless of swap count.

If an interviewer asks "how many swaps does this do?", the correct answer is: "at most n-1, possibly fewer, but the loop always runs exactly n iterations." Swap count is a derived fact, not the fundamental measure.


Recap

  • Invariant: four zones at all times. Unknown zone [mid..high] shrinks by exactly 1 per iteration.
  • Time: O(n) exact, not amortized. The unknown zone starts at size n and hits zero monotonically.
  • Space: O(1). Three integers. That's it.
  • The bug: when swapping from high, do not advance mid. The incoming element is unknown.
  • Not the bug: when swapping from low, advancing mid is safe because arr[low] was a confirmed 1.
  • Not stable. Relative order within each group is not preserved.
  • Degenerates gracefully. Two distinct values: one boundary disappears, standard partition. One distinct value: n iterations, zero swaps.
  • Quicksort application: three-way partition makes quicksort entropy-optimal on duplicate-heavy inputs. Equal-pivot elements are placed in one pass and excluded from both recursive calls.
  • Pattern signal: exactly three categories, in-place rearrangement, O(1) space, one pass.

If you want to practice explaining the invariant out loud under pressure (which is exactly when most people flip the mid-advance logic), SpaceComplexity runs voice-based DSA interviews with rubric feedback. Not just "did you get the right output" but "did you explain why the invariant holds."

The Dutch flag is a common medium. Interviewers reach for it because the code is short but the justification separates candidates who understand it from candidates who memorized it. Fifteen lines of code. A whole interview's worth of signal.


For three-way partitioning in the context of a full sorting algorithm, see the merge sort vs quicksort analysis for how three-way quicksort compares against merge sort on duplicate-heavy inputs. The two pointer technique covers the family of algorithms where two boundaries converge or diverge, of which the Dutch flag is a specialized three-boundary case. And for the invariant-based reasoning style Dijkstra used to derive this algorithm, the binary search invariant deep dive shows the same methodology applied to a simpler problem.


Further Reading