Cyclic Sort Algorithm: O(n) Sorting When the Array Is Its Own Index

Imagine you're a postal worker handed a stack of 1,000 envelopes. Each envelope already has a slot number printed on it. You don't sort the stack by comparing envelopes to each other. You just pick up each envelope and drop it in its slot. One pass, done.

That is the cyclic sort algorithm. When every element in an array has a deterministic address computed directly from its value, comparison-based sorting is overkill. You can route each element home in O(n) time without touching a comparison function once.

The prerequisite: the array contains n integers from a known bounded range, typically [1, n] or [0, n]. That sounds narrow until you realize it describes exactly the "missing number," "find duplicates," and "first missing positive" problem families that appear constantly in interviews. For those problems, cyclic sort cuts right through them: sort in O(n), scan in O(n), done.

The Array Already Knows Where Everything Goes

Most sorting algorithms are general-purpose. They receive an array of arbitrary values and figure out the order through comparisons. The best any comparison-based sort can do is O(n log n), because the decision tree has n! leaves and needs at least log₂(n!) ≈ n log n levels.

Cyclic sort cheats that lower bound because it never uses comparisons. It uses indexing. If the problem tells you the array holds integers in [1, n], then the integer 7 belongs at index 6. The integer 3 belongs at index 2. The address is part of the value. That extra information puts us outside the comparison model entirely, the same reason counting sort beats O(n log n): it reads values as indices, not as things to compare.

The O(n) bound follows from a simple observation. There are n positions. Each swap places at least one element into its permanent home. Once there, it is never disturbed again. So the total swaps across the entire run is at most n. The outer pointer moves forward exactly n times. Total work: 2n. O(n).

Array elements routing directly to their correct indices by value. No comparisons needed. Every value v already knows its home is index v-1. The array is its own hash table.

How the Cyclic Sort Algorithm Works

The loop has one job: keep the invariant that every position i with i < current_index already holds the value i + 1.

i = 0
while i < n:
    j = nums[i] - 1        // where nums[i] belongs (1-indexed)
    if nums[i] != nums[j]: // that slot doesn't have the right value yet
        swap(nums[i], nums[j])
        // DON'T advance i: the new nums[i] might also need to move
    else:
        i += 1             // slot is correct (or duplicate), advance

Two things matter here. First, i never decrements. It either stays while a swap runs or steps forward. The pointer only goes right. Second, the condition is nums[i] != nums[j], not nums[i] != i + 1. That distinction is the most common bug in interview code. Your code will look correct, run instantly, and then hang forever on duplicates. More on that shortly.

Tracing [3, 1, 4, 2]

Here is a complete trace. Each row shows the array state and what the algorithm decides.

Array         i   nums[i]  j    nums[j]  Decision
-----------  ---  -------  ---  -------  -----------------------
[3, 1, 4, 2]  0     3       2     4      3 ≠ 4  → swap(0, 2)
[4, 1, 3, 2]  0     4       3     2      4 ≠ 2  → swap(0, 3)
[2, 1, 3, 4]  0     2       1     1      2 ≠ 1  → swap(0, 1)
[1, 2, 3, 4]  0     1       0     1      1 == 1 → i++
[1, 2, 3, 4]  1     2       1     2      2 == 2 → i++
[1, 2, 3, 4]  2     3       2     3      3 == 3 → i++
[1, 2, 3, 4]  3     4       3     4      4 == 4 → i++
Done.

Three swaps for four elements. That is the worst case: n - 1 swaps. The outer pointer visited index 0 four times (three swaps plus one final check) and each of indices 1, 2, 3 once. Seven total loop-body executions for n = 4, confirming the 2n - 1 bound.

Notice what each swap accomplishes. The first swap moves 3 to position 2 (its home). The second swap moves 4 to position 3. The third swap moves 2 to position 1, and 1 falls into position 0 as a side effect. After those three swaps, every element is home. The remaining four i-increments just verify this.

Array states during the trace

Initial:  [3, 1, 4, 2]
           ^               i=0, value 3 should be at index 2

After 1:  [4, 1, 3, 2]
           ^               i=0, value 4 should be at index 3

After 2:  [2, 1, 3, 4]
           ^               i=0, value 2 should be at index 1
                    ✓✓     positions 2 and 3 are now correct

After 3:  [1, 2, 3, 4]
           ^               i=0, value 1 is now home
           ✓               position 0 is correct, advance

Final:    [1, 2, 3, 4]
           ✓  ✓  ✓  ✓      scan confirms, i runs to n

Step-by-step trace of cyclic sort on [3,1,4,2], showing each swap, the pointer i, and the total iteration count. 3 swaps + 4 pointer advances = 7 iterations. The pointer never goes backward.

Why the Nested Loops Don't Nest

The code looks O(n²). You see a while loop. Inside it, a swap that doesn't advance i. Your brain screams "this is going to be quadratic." It's not. Here's why.

Track two quantities: the pointer i and a counter placed for the number of elements permanently settled.

Every swap sends nums[i] to position j = nums[i] - 1. After that swap, position j holds the value j + 1. It is correct. Because the algorithm only targets a position when it holds the wrong value, position j will never be targeted again. So placed increases by 1 for each swap.

placed is bounded by n. That bounds the total swaps at n.

i advances by 1 in every non-swap iteration. i starts at 0 and stops at n. So the number of non-swap iterations is exactly n.

Total iterations = swaps + non-swaps ≤ n + n = 2n. O(n) time, O(1) space.

The pointer only goes right

Iterations where i advances (no swap needed):
 ← n total increments, one per index →
 i: 0  1  2  3  4  ...  n-1

Swap iterations: counted by elements placed
 Total swaps ≤ n (one per element placed)

Grand total ≤ 2n

The nested appearance of the loops is a red herring. The inner "loop" is bounded by the global swap budget, not by n per outer step.

Why It's Called Cyclic

The "cyclic" in cyclic sort is not marketing. It refers to a classical result from combinatorics: every permutation decomposes uniquely into disjoint cycles.

Given an array, the permutation σ maps each position i to the target position of the element currently sitting there. For [3, 1, 4, 2]:

Position 0 holds 3, which belongs at position 2: σ(0) = 2
Position 1 holds 1, which belongs at position 0: σ(1) = 0
Position 2 holds 4, which belongs at position 3: σ(2) = 3
Position 3 holds 2, which belongs at position 1: σ(3) = 1

Starting from position 0 and following σ: 0 → 2 → 3 → 1 → 0. One cycle of length 4.

A cycle of length k requires exactly k - 1 swaps to resolve. The first swap places the displaced element, the second places the one that was displaced, and so on until the last element falls into the original position.

For [2, 1, 4, 3]:

σ(0) = 1, σ(1) = 0: cycle (0, 1) of length 2
σ(2) = 3, σ(3) = 2: cycle (2, 3) of length 2

Two cycles of length 2. Total swaps: 1 + 1 = 2. In general: total swaps = n - (number of cycles). An already-sorted array has n cycles of length 1 (every element is a fixed point) and needs zero swaps.

The two-cycle case: [2, 1, 4, 3]

Position:  0    1    2    3
Value:     2    1    4    3

Cycles (where each value wants to go):
   0 ←→ 1        2 ←→ 3
   (swap once)   (swap once)

After cycle (0,1) resolved: [1, 2, 4, 3]
After cycle (2,3) resolved: [1, 2, 3, 4]

Permutation cycle decomposition: [3,1,4,2] forms one 4-cycle (cost: 3 swaps) while [2,1,4,3] forms two 2-cycles (cost: 2 swaps). Total swaps = n minus the number of cycles. A fully sorted array has n fixed-point cycles. Zero swaps needed.

This decomposition is also why the original (general) cycle sort minimizes memory writes. Each element is placed directly in its final position in one write, never touched again. For write-expensive hardware like flash memory or EEPROM, this matters. For the interview pattern, the write-optimal property is a bonus, not the goal.

The Duplicate Trap

The condition nums[i] != nums[j] is not a coincidence. It is the only condition that handles duplicates safely.

Suppose the array is [1, 2, 3, 2]. When i = 3: nums[3] = 2, j = 1, nums[1] = 2. Both positions hold the value 2.

If the condition were nums[i] != i + 1, the check would be 2 != 4, which is true. The algorithm would swap position 3 with position 1, getting [1, 2, 3, 2] back. Still true. Still swapping. You would be there for a while.

The condition nums[i] != nums[j] evaluates to 2 != 2, which is false. The algorithm increments i and moves on.

This skip is information. In missing-number problems, when you see nums[i] == nums[j] but nums[i] != i + 1, that position holds a duplicate instead of the expected value. The missing value is i + 1. The duplicate is nums[i].

# WRONG: loops forever on duplicates
if nums[i] != i + 1:
    swap(nums[i], nums[j])

# CORRECT: breaks safely when duplicate detected
if nums[i] != nums[j]:
    swap(nums[i], nums[j])

Side-by-side comparison of the wrong condition (infinite loop) vs the correct condition (safe skip) on a duplicate input. One character of difference. One sends you home at normal time. The other is still running.

This is also the meme at the heart of sorting algorithm humor.

rateMySortingAlgorithm from r/ProgrammerHumor (7.8k upvotes) When you write nums[i] != i + 1 and submit.

Two Variants: 0-Indexed vs. 1-Indexed

The implementation differs slightly depending on the problem's range.

Range [1, n] (1-indexed): correct position for value v is index v - 1. This is the most common interview variant.

j = nums[i] - 1
if nums[i] != nums[j]:
    swap

Range [0, n] (0-indexed), with n+1 elements including one missing: correct position for value v is index v. The value n has no valid slot (indices only go to n-1), so guard against it.

j = nums[i]
if nums[i] < len(nums) and nums[i] != nums[j]:
    swap

Range arbitrary, find first missing positive (LC 41): elements outside [1, n] are noise. The guard becomes two conditions.

j = nums[i] - 1
if 1 <= nums[i] <= n and nums[i] != nums[j]:
    swap

The only two things you adjust between variants are the index formula (nums[i] - 1 vs nums[i]) and the out-of-range guard. The swap condition nums[i] != nums[j] never changes.

Cyclic Sort Time Complexity

Operation	Time	Space	Notes
Sort, range [1, n]	O(n)	O(1)	At most n swaps total
Sort, range [0, n]	O(n)	O(1)	Guard: `nums[i] < n` before indexing
Sort, general (arbitrary array)	O(n²)	O(1)	Count smaller elements to find rank
Find missing number (post-sort scan)	O(n)	O(1)	One pass after sort
Find all disappeared numbers	O(n)	O(1)	Collect all mismatch indices
Find first missing positive	O(n)	O(1)	Needs out-of-range guard on sort
Detect duplicate	O(n)	O(1)	Duplicate detected during sort when `nums[i] == nums[j]`

The time complexity is O(n) for constrained variants because the total swap count is bounded by n regardless of array size. The two-loop structure is misleading: those loops share a single swap budget.

For the general cycle sort (arbitrary arrays): finding each element's correct position requires counting all smaller elements, which costs O(n) per element and O(n²) overall. This is write-optimal (each element written exactly once to its final location) but impractical for speed.

One Structure, Every Language

All implementations below sort an array of integers in [1, n] in-place. The 1-indexed variant is shown because it covers the widest class of interview problems.

def cyclic_sort(nums: list[int]) -> list[int]:
    i = 0
    while i < len(nums):
        j = nums[i] - 1
        if nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            i += 1
    return nums

Cyclic Sort LeetCode Problems

Cyclic sort is the right tool for one specific category: bounded-range integer arrays where the question asks what's missing, what's duplicated, or what's the smallest missing positive. The workflow is always the same: sort in O(n), then scan for mismatches in O(n).

Missing Number (LeetCode 268)

Array has n integers from [0, n], one is missing. Use the 0-indexed variant since values equal their correct index.

def missing_number(nums: list[int]) -> int:
    i = 0
    n = len(nums)
    while i < n:
        j = nums[i]
        if nums[i] < n and nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            i += 1
    for i in range(n):
        if nums[i] != i:
            return i
    return n

All Disappeared Numbers (LeetCode 448)

Array has n integers from [1, n], some repeated (so others are absent). After sort, every index i where nums[i] != i + 1 names a missing value.

def find_disappeared_numbers(nums: list[int]) -> list[int]:
    i = 0
    while i < len(nums):
        j = nums[i] - 1
        if nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            i += 1
    return [i + 1 for i in range(len(nums)) if nums[i] != i + 1]

Find the Duplicate Number (LeetCode 287)

Array has n + 1 integers in [1, n], exactly one is duplicated. During the sort, when nums[i] == nums[j] but nums[i] != i + 1, the current value is the duplicate.

def find_duplicate(nums: list[int]) -> int:
    i = 0
    while i < len(nums):
        j = nums[i] - 1
        if nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            if nums[i] != i + 1:
                return nums[i]
            i += 1
    return -1

Set Mismatch (LeetCode 645)

One value is duplicated, one is missing. After sort, the single index i where nums[i] != i + 1 gives both: nums[i] is the duplicate, i + 1 is the missing.

All Duplicates in an Array (LeetCode 442)

After sort, collect every nums[i] where nums[i] != i + 1. Each is a duplicate.

First Missing Positive (LeetCode 41)

The array can contain negatives, zeros, and values greater than n. Only values in [1, n] can be placed; everything else is treated as noise with the out-of-range guard.

def first_missing_positive(nums: list[int]) -> int:
    n = len(nums)
    i = 0
    while i < n:
        j = nums[i] - 1
        if 1 <= nums[i] <= n and nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            i += 1
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1
    return n + 1

The guard 1 <= nums[i] <= n ensures j never goes out of bounds. A value like -3 would compute j = -4, an illegal index. A value like 1000 in an array of length 5 would compute j = 999, also illegal. The guard skips both cases.

Post-sort scan: mismatches name the missing values

Array [4, 3, 2, 7, 8, 2, 3, 1] after cyclic sort:

Index: 0   1   2   3   4   5   6   7
Value: 1   2   3   4   3   2   7   8
Check: ✓   ✓   ✓   ✓   ✗   ✗   ✓   ✓
             nums[4]=3 ≠ 5       nums[5]=2 ≠ 6

Missing: 5, 6

Post-sort scan on [4,3,2,7,8,2,3,1]: positions 4 and 5 hold duplicates instead of 5 and 6, revealing the missing values in one pass. Duplicates squatted in the slots that 5 and 6 should have occupied. One O(n) scan reads the answer directly.

How to Spot It in the Wild

Interview problems don't announce themselves. Nobody writes "hint: use cyclic sort" in the constraints. You have to read between the lines.

Cyclic sort hides behind problem descriptions, not algorithm names. You have to decode the constraints.

The signals:

"Array of n integers" combined with "each integer is in the range [1, n]" or "[0, n]"
"Find the missing number" or "find all missing numbers"
"Find the duplicate" in a range-constrained array
"O(1) extra space" + "O(n) time" + the array is mutable
"First missing positive" (the range [1, n+1] is implicit)

When you see those signals, the approach is:

Sort with cyclic sort
Scan for positions where nums[i] != i + 1
The answer is directly visible

Cyclic sort beats a hash set for these problems because the hash set costs O(n) space. The array itself plays the role of the hash table, mapping indices to expected values. If you want to understand why arrays can serve as hash structures, the hash map deep dive covers the underlying mechanics.

Worked Example 1: Find All Numbers Disappeared in an Array (LeetCode 448)

Problem: Given an array of n integers where each integer is in [1, n], return all integers in [1, n] that do not appear in the array.

Example: nums = [4, 3, 2, 7, 8, 2, 3, 1]

Why cyclic sort fits: The range [1, 8] maps exactly onto indices [0, 7]. Every element has a definable home. The question asks what's absent, which a post-sort scan answers directly without a hash set.

Step 1: cyclic sort.

The sort places duplicates in their correct positions first, then skips when it detects a collision. After the sort, positions holding duplicates are the ones that cannot be filled by the missing values.

After running cyclic sort on [4, 3, 2, 7, 8, 2, 3, 1], the array becomes [1, 2, 3, 4, 3, 2, 7, 8].

Position 6 holds 7 because 7 belongs at index 6 (7 - 1 = 6). The array has two 3s and two 2s, but also 7 and 8, which settle correctly. The duplicates (3 and 2) get stranded at positions 4 and 5, the exact slots where 5 and 6 should be.

Step 2: scan for mismatches.

Index:  0  1  2  3  4  5  6  7
Value:  1  2  3  4  3  2  7  8
Expect: 1  2  3  4  5  6  7  8

Mismatches at indices 4 and 5 → missing values 5 and 6.

Result: [5, 6].

def find_disappeared_numbers(nums: list[int]) -> list[int]:
    i = 0
    while i < len(nums):
        j = nums[i] - 1
        if nums[i] != nums[j]:
            nums[i], nums[j] = nums[j], nums[i]
        else:
            i += 1
    return [i + 1 for i in range(len(nums)) if nums[i] != i + 1]

Time: O(n). Space: O(1), not counting the output list.

Worked Example 2: First Missing Positive (LeetCode 41)

Problem: Given an unsorted integer array, find the smallest missing positive integer in O(n) time and O(1) space.

Example: nums = [3, 4, -1, 1]

Why cyclic sort fits: The smallest missing positive must be in [1, n + 1]. So any integer outside [1, n] is irrelevant noise. After filtering them out during the sort, the first mismatch in the sorted array is the answer.

Step 1: cyclic sort with out-of-range guard.

n=4, valid range [1, 4]

State       i  nums[i]  j  nums[j]  Decision
[3, 4,-1,1] 0    3      2   -1      valid(3), 3 ≠ -1 → swap(0,2)
[−1,4, 3,1] 0   -1      -   -       invalid(-1) → i++
[−1,4, 3,1] 1    4      3    1      valid(4), 4 ≠ 1 → swap(1,3)
[−1,1, 3,4] 1    1      0   -1      valid(1), 1 ≠ -1 → swap(1,0)
[ 1,-1, 3,4] 1   -1     -   -       invalid(-1) → i++
[ 1,-1, 3,4] 2    3      2    3     valid(3), 3 == 3 → i++
[ 1,-1, 3,4] 3    4      3    4     valid(4), 4 == 4 → i++
Done: [1, -1, 3, 4]

Step 2: scan for first mismatch.

Index:  0   1   2   3
Value:  1  -1   3   4
Expect: 1   2   3   4

First mismatch at index 1: answer is 1 + 1 = 2.

Result: 2.

The out-of-range guard does two things. It prevents j = nums[i] - 1 from producing a negative index (for values like -1 or 0). And it prevents it from producing an out-of-bounds index (for values larger than n). The result is that all noise values get stranded wherever they land, and only valid values compete for their correct positions.

Recap

Cyclic sort works on arrays whose elements are integers in a known bounded range, typically [1, n] or [0, n].
The correct index for value v in a [1, n] array is v - 1. This eliminates comparisons entirely.
The loop invariant: every index less than the current pointer holds its correct value.
Total swaps are bounded by n because each swap permanently settles one element. Combined with n pointer increments, total work is 2n = O(n).
The underlying theory is permutation cycle decomposition. A cycle of length k costs k - 1 swaps. Total: n minus the number of cycles.
The condition must be nums[i] != nums[j], not nums[i] != i + 1. The former handles duplicates safely; the latter loops forever on them.
For ranges that include out-of-bound values (LC 41), add the guard 1 <= nums[i] <= n before computing j.
After sorting, a single O(n) scan reads off missing numbers, duplicates, or the first missing positive directly from mismatched positions.
Cyclic sort is the O(1)-space alternative to a hash set for bounded-range problems.

Knowing the algorithm and deploying it cold in 45 minutes are different skills. If you want to close that gap with voice-based, rubric-graded mocks that test exactly this kind of pattern identification, SpaceComplexity is built for it.

The two-pointer technique is another O(n) array pattern worth having ready. Where cyclic sort exploits value-as-address, two pointers exploit sorted order or the convergence structure of a range. Together, they cover a large fraction of the "O(n), O(1) space" interview problems.

For understanding why using an array as a hash table works in the first place, the hash map time complexity breakdown traces the math from key to bucket to O(1) expected lookup.