What Is Quickselect? The O(n) Algorithm for the k-th Largest Element

June 5, 20268 min read
dsaalgorithmsinterview-prepdata-structures
What Is Quickselect? The O(n) Algorithm for the k-th Largest Element
TL;DR
  • Quickselect finds the k-th largest or smallest element in O(n) average time by partitioning once per pass and discarding the irrelevant half.
  • The partition step places a pivot at its final sorted index; only the half containing position k is recursed into, never both.
  • Worst case is O(n²) on sorted input with a deterministic pivot; randomizing the pivot defends against any adversarial input and keeps expected comparisons under 4n.
  • Use a heap when k is small (O(n log k), O(k) space) or for streaming data; Quickselect wins when k is large, roughly when k exceeds n / log n.
  • The iterative version replaces recursion with a while loop, achieving true O(1) space with no call-stack overhead.
  • LeetCode 215 (Kth Largest Element in an Array) is the canonical interview problem; the same pattern applies to LC 973, 347, and 324.

Finding the kth largest element in an unsorted array sounds like a job for sorting. Sort it, index the right position, done. That works. It's also O(n log n), which is like hiring a full moving crew to carry one box across the room. Quickselect finds the kth largest element in an array in O(n) average time without ever fully sorting it.

Tony Hoare invented it in 1961, the same year he published Quicksort. Same primitive, one change: only recurse into the half that contains your target. Throw everything else away.

The k-th Largest Element in an Array

You have an unsorted array. You want the k-th smallest element, or the k-th largest by flipping the index.

arr = [7, 2, 1, 6, 5, 3, 4] k = 3 # Find the 3rd smallest element # Answer: 3

The canonical interview version is LeetCode 215: Kth Largest Element in an Array. It appears at Meta and Amazon, and knowing quickselect cold is worth the 30 minutes it takes to understand.

The Partition Step Is Everything

Quickselect is built on one primitive: the partition operation from Quicksort. It sounds unexciting. It's actually the whole trick.

Partition picks a pivot, rearranges the array so everything less than the pivot comes before it and everything greater comes after it, and returns the pivot's final index.

After partitioning, the pivot is in its final sorted position. Nothing will ever move it again.

Here's the Lomuto partition scheme:

def partition(arr: list[int], left: int, right: int) -> int: pivot = arr[right] store = left for i in range(left, right): if arr[i] <= pivot: arr[i], arr[store] = arr[store], arr[i] store += 1 arr[store], arr[right] = arr[right], arr[store] return store

After this runs, arr[store] is the pivot. Everything to its left is smaller, everything to its right is larger. The pivot's final rank in the sorted array is now known, no sorting required.

One Recursive Call, Not Two

This is where Quickselect diverges from Quicksort. And where it gets fun.

Quicksort recurses on both halves to produce a full ordering. Quickselect recurses on only the half that contains position k. If the pivot landed exactly at position k, you're done without touching anything else.

Half the data, thrown away. Guilt-free.

def quickselect(arr: list[int], left: int, right: int, k: int) -> int: if left == right: return arr[left] pivot_index = partition(arr, left, right) if pivot_index == k: return arr[pivot_index] elif pivot_index < k: return quickselect(arr, pivot_index + 1, right, k) else: return quickselect(arr, left, pivot_index - 1, k)

Three cases, and only one recursive call:

  • Pivot landed exactly at k. Done. Return arr[k].
  • Pivot landed left of k. The answer is in the right subarray. Recurse right.
  • Pivot landed right of k. The answer is in the left subarray. Recurse left.

Two Passes. Never Sorted.

Find the 3rd smallest element in [7, 2, 1, 6, 5, 3, 4]. Zero-indexed: k=2.

Pass 1. Pivot = 4 (last element). After partitioning:

[2, 1, 3, 4, 5, 6, 7]
          ^
   pivot_index = 3

k=2, pivot_index=3. Pivot is right of k. Recurse left on [2, 1, 3], indices 0 to 2.

Pass 2. Pivot = 3. After partitioning:

[2, 1, 3]
       ^
pivot_index = 2

k=2, pivot_index=2. Match. Return arr[2] = 3.

Two passes. The entire right half of the original array was never touched after pass 1.

Partition step: pivot lands at index 3, right half discarded

The pivot locks into its final position after one pass. Everything to the right goes in the bin.

Why It's Faster Than Sorting

Sorting processes every element at every level of the recursion tree. Quickselect throws away the irrelevant half after each partition.

In the average case, each pass halves the problem: T(n) = T(n/2) + O(n). That solves to O(n).

The math is just a geometric series: n + n/2 + n/4 + ... converges to 2n. You do roughly twice the work of a single scan. Compare that to Quicksort's n log n passes over everything.

MethodAverageWorstSpace
Sort + indexO(n log n)O(n log n)O(1) to O(n)
Min-heap of size kO(n log k)O(n log k)O(k)
Quickselect (randomized)O(n)O(n²)O(1)
Median of MediansO(n)O(n)O(1)

For most interview problems, O(n log k) with a heap is acceptable. Quickselect pulls ahead when k is large, or when the interviewer explicitly asks for a linear bound.

The Sorted-Array Trap

Here's a fun way to ruin your day.

Quickselect's worst case is the same as Quicksort's: sorted input with a naive pivot. If you always pick the last element as pivot and the input is already sorted, every partition produces one subarray of size n-1 and one of size 0. You thought you were getting O(n). You're getting O(n²). The array came in sorted. It breaks you.

The fix is a one-liner:

import random def partition(arr: list[int], left: int, right: int) -> int: rand_index = random.randint(left, right) arr[rand_index], arr[right] = arr[right], arr[rand_index] pivot = arr[right] store = left for i in range(left, right): if arr[i] <= pivot: arr[i], arr[store] = arr[store], arr[i] store += 1 arr[store], arr[right] = arr[right], arr[store] return store

With a random pivot, the expected comparisons are at most 4n, and no adversarial input can reliably trigger the worst case.

If you need a guaranteed O(n) worst case, the Median of Medians algorithm finds a pivot guaranteed to partition at least 30/70. The constant factor is large enough that nobody uses it in practice, but knowing it exists is what separates "I randomize the pivot" from "I can also guarantee the bound."

O(1) Space With the Iterative Version

Quickselect is in-place. The partition step uses O(1) extra space.

The recursive version uses O(log n) call stack on average. The iterative version eliminates the stack entirely:

def quickselect_iterative(arr: list[int], k: int) -> int: left, right = 0, len(arr) - 1 while left < right: pivot_index = partition(arr, left, right) if pivot_index == k: break elif pivot_index < k: left = pivot_index + 1 else: right = pivot_index - 1 return arr[k]

True O(1) space. Worth knowing if an interviewer asks you to optimize stack usage.

Every k-th Problem Is a Quickselect Candidate

The canonical problem is LeetCode 215. It's often the expected follow-up after you give a heap solution: "Can you do better than O(n log k)?"

Other problems where quickselect applies:

  • K Closest Points to Origin (LeetCode 973). Compute squared distances, use quickselect to find the k-th smallest distance, return all points within that distance. No sorting needed.
  • Top K Frequent Elements (LeetCode 347). Use a frequency map, then quickselect on the frequency values.
  • Wiggle Sort II (LeetCode 324). Uses the median as a pivot found via quickselect.

The recognition signal: any "find the k-th [largest|smallest|closest|most frequent]" problem on a static array is a quickselect candidate. The top-k heap pattern is the easier implementation; quickselect is the faster algorithm.

Quickselect vs Heap: When to Use Each

A min-heap of size k for the k-th largest:

import heapq def kth_largest_heap(nums: list[int], k: int) -> int: heap = [] for num in nums: heapq.heappush(heap, num) if len(heap) > k: heapq.heappop(heap) return heap[0]

O(n log k), O(k) space, clean implementation. For small k this is excellent. Finding the 3rd largest in a million numbers means log k ≈ 1.6, barely above linear. Just use the heap.

Quickselect wins when k is large. Finding the median (k = n/2) with a heap costs O(n log n/2) = O(n log n). Quickselect stays O(n). The crossover is roughly when k exceeds n / log n.

For streaming data, the heap is the only option since quickselect requires the full array to partition. The heap data structure goes into the mechanics.

Name Both Approaches Before Touching the Keyboard

When you see a k-th element problem, name both solutions before writing any code. Sketch the heap solution first if O(n log k) is acceptable. Then offer quickselect as the O(n) option and implement it only if the interviewer pushes.

The mistake to avoid: coding quickselect with a deterministic last-element pivot and saying nothing about the worst case. The interviewer will ask. Lead with "I'll randomize the pivot to get O(n) expected time and defend against any sorted input." Then code it, hit the edge cases, narrate as you go.

If you want to practice pivot strategy, randomization, and complexity trade-offs out loud under pressure, SpaceComplexity runs voice-based mock DSA interviews with rubric feedback so you practice the actual performance, not just the code.

Further Reading