What Is a Comparison Sort? The O(n log n) Floor You Can't Break

Every sorting algorithm you've ever written probably works by comparing pairs of elements. Greater than? Swap. Less than? Leave it. That's the intuition. But there's a mathematical ceiling baked into that approach, and it's one of the most elegant proofs in computer science.

Comparison sorts are bounded from below. No matter how clever your implementation, if your algorithm decides order purely by comparing elements, you can't sort n items in fewer than Θ(n log n) comparisons in the worst case. Not sometimes. Always. And there's a proof.

At some point, every developer has the thought: what if I write a sort that's just... smarter? What if I pick comparisons in a cleverer order, or cache the intermediate results, or approach it from a completely fresh angle? The lower bound doesn't care. You can be as clever as you want. The tree still needs n! leaves, and that means O(n log n) depth. Information theory beat you here before you even started.

What Makes a Sort a Comparison Sort?

A comparison sort is any sorting algorithm whose only tool for determining order is pairwise element comparisons. It can ask "is A less than B?" as many times as it wants, in any order. What it cannot do is look at the actual values, their bit patterns, or their positions in any way other than through a comparison operator.

This constraint matters more than it sounds. The moment your algorithm knows something about the values themselves (they're integers in a range, they're strings of bounded length) it has a way out. But if all it can do is ask "which one's bigger," then it's working with exactly one bit of information per question. That's the limitation the proof exploits.

The canonical examples you've seen:

Merge sort
Quicksort
Heapsort
Insertion sort
Bubble sort (yes, even bubble sort counts, though using it in an interview will count against you for different reasons)
Selection sort

All comparison sorts. All fundamentally limited by the same lower bound.

Three Elements, Six Possibilities

Say you have [3, 1, 2]. How many comparisons does it take to sort them?

You compare 3 and 1. You compare 1 and 2. Maybe you compare 3 and 2. The sequence depends on your algorithm, but you're always asking "which is larger?" and branching on the answer.

Think about it from the algorithm's perspective. Before it starts, any of the 3! = 6 possible orderings could be the correct answer. Each comparison produces one bit of information, splitting the possibility space in two. To distinguish 6 cases, you need at least log₂(6) ≈ 2.58 comparisons, which rounds up to 3.

This is information theory, not algorithm cleverness. You can't extract more than one bit per comparison. And you need exactly log₂(n!) bits to identify which of the n! permutations is the right sorted order. There's no shortcut hiding somewhere in the comparison sequence.

With n elements, you have n! possible orderings, so you need at least log₂(n!) comparisons. That quantity is Θ(n log n).

The Decision Tree Lower Bound Proof

This is the clean formal version. Model any comparison sort as a binary decision tree. Each internal node is a comparison. Each edge is the outcome (≤ or >). Each leaf is one possible sorted order.

For the sort to be correct, the tree must have at least n! leaves (one per permutation of the input). A binary tree with L leaves has height at least log₂(L). So the height of the decision tree is at least log₂(n!).

Decision tree for sorting 3 elements: 5 comparison nodes, 6 permutation leaves, height 3 matching ⌈log₂ 6⌉

What's log₂(n!)? Stirling's approximation gives:

ln(n!) = n ln(n) - n + O(log n)

In base 2: log₂(n!) ≈ n log₂(n) - n log₂(e) + O(log n)

The dominant term is n log₂(n), which means any comparison sort must make at least Ω(n log n) comparisons in its worst case. Merge sort and heapsort both achieve this. They're optimal comparison sorts.

The Stirling approximation can feel like a magic trick. The intuition is pretty clean though: n! is roughly (n/e)^n. Take the log and you get n·log(n/e) = n·log(n) - n·log(e). The second term is O(n), which is lower-order than n log n, so it falls out of the asymptotic picture. What's left is n log n.

One important nuance: randomization doesn't help. Even randomized comparison sorts like randomized quicksort hit the same lower bound on average. The bound is information-theoretic, not algorithmic. You're just not getting more than one bit of information per comparison, and log₂(n!) bits are required.

When someone asks you to walk through the decision tree proof

The face of someone who just realized they have to explain Stirling's approximation before they can explain why merge sort is optimal.

Comparison Sort Time and Space at a Glance

"Stable" means equal elements keep their original relative order. It matters when you're sorting on one key and want to preserve a previous sort on another.

Algorithm	Time (worst)	Time (average)	Space	Stable?
Merge sort	O(n log n)	O(n log n)	O(n)	Yes
Heapsort	O(n log n)	O(n log n)	O(1)	No
Quicksort	O(n²)	O(n log n)	O(log n)	No
Insertion sort	O(n²)	O(n²)	O(1)	Yes
Bubble sort	O(n²)	O(n²)	O(1)	Yes

Merge sort and heapsort are both worst-case optimal. Quicksort's worst case is O(n²), but its average-case constant factors are so good that it usually wins in practice. The catch: if you give naive quicksort a sorted or nearly-sorted array with a first-element pivot, it will politely deliver O(n²) performance and technically be correct about it. Modern implementations use randomized pivoting or median-of-three to avoid this. Heapsort gets O(1) space but poor cache behavior, which costs it in benchmarks despite matching merge sort asymptotically.

When You Can Beat O(n log n)

The lower bound only applies if you're using comparisons. Non-comparison sorts sidestep the proof entirely by exploiting structure in the values themselves.

Counting sort allocates a frequency array indexed by value. You count how many times each value appears, then reconstruct the sorted array. No comparisons at all. Time: O(n + k) where k is the range of values. Sorting user ages (all between 0 and 120)? O(n + 120) = O(n). Merge sort would take O(n log n) on the same input and be provably worse for this case.

Radix sort processes integers digit by digit, using a stable sort at each pass. Time: O(d(n + b)) where d is the number of digits and b is the base.

Bucket sort distributes elements into buckets based on value ranges, sorts each bucket (often with insertion sort), then concatenates. With a uniform distribution, this hits O(n) on average.

The catch: these algorithms don't sort arbitrary comparable objects. They work on specific input types (integers, fixed-length strings) with bounded value ranges. You can't radix sort a list of custom objects without first mapping them to integers somehow. The moment you're back to asking "which one comes first?" about abstract objects, you're in comparison sort territory, lower bound and all.

What Languages Actually Use

Python's list.sort() and sorted() use Timsort, a hybrid of merge sort and insertion sort developed by Tim Peters in 2002. It's stable and runs in O(n log n) worst case, with O(n) best case on nearly-sorted input. Timsort looks for "runs" (already-sorted subsequences) in the data and merges them rather than sorting from scratch. Real-world data is often partially sorted (logs, time-ordered records, user input that arrives in rough order), and Timsort is built for that world. It's still a comparison sort, bounded by the same proof, but engineered to exploit what real inputs actually look like.

Java's Arrays.sort() splits depending on what you're sorting. For primitive types (int[], double[], etc.), it uses dual-pivot quicksort: a comparison sort with better constants than classic quicksort. For object arrays (Integer[], String[], custom objects), it uses Timsort, because stability matters when sorting objects.

JavaScript's Array.prototype.sort is technically implementation-defined, but V8 (Chrome, Node) switched to Timsort in 2019. It's a comparison sort.

The takeaway: in an interview, when you call a built-in sort, you're almost certainly invoking a comparison sort with O(n log n) guaranteed or expected time. Knowing that matters when you're reasoning about complexity.

Why Interviews Test This

The O(n log n) lower bound is a classic interviewer question for one specific reason: it lets them check if you know when you're stuck and when you're not.

If someone asks "can we sort this in O(n)?" your answer should depend on what you're sorting. Arbitrary comparable objects? No. You're bounded by Ω(n log n). Integers in a known range? Possibly. Counting sort or radix sort might apply.

A few common interview scenarios where this comes up:

"Sort this array of integers from 0 to 100." The range is bounded. Counting sort works. O(n + k) = O(n + 100) = O(n). Reaching for merge sort here is correct but leaves an obvious optimization on the table, and interviewers notice. It signals you're pattern-matching ("sorting means merge sort") rather than reasoning about the structure of the input.

"Sort these strings." Strings compare character by character. The comparison itself takes O(m) time per string where m is the length. A sorting algorithm makes O(n log n) comparisons, but each comparison costs O(m), so the total time is O(nm log n). A 20-character string multiplies your log factor by 20. This is easy to miss under pressure, and stating "O(n log n)" without the string-length caveat is the kind of subtle miss that shows up in feedback write-ups.

"Can we do better than O(n log n) to sort these user objects?" No, not with comparisons. That's a solid answer backed by a proof, not a guess. Say it confidently: "The decision tree lower bound shows any comparison sort needs Ω(n log n) comparisons in the worst case. We'd need to exploit structure in the values themselves to break out of that." That answer is better than "I don't think so."

The other reason this comes up: stability. Heapsort is worst-case optimal but unstable. Merge sort is worst-case optimal and stable. That tradeoff decides which one you pick when equal elements need to preserve their original relative order, as in a multi-key sort or stable ranking. See the stable sort deep dive for where this actually bites you in practice.

If you want to practice articulating this reasoning under pressure, voice-based mock interviews at SpaceComplexity will push you to explain the why, not just the answer. The lower bound is exactly the kind of thing an interviewer expects you to justify out loud, not just state.