C++ Sort and Custom Comparators: The Interview Reference

May 29, 20269 min read
dsaalgorithmsinterview-prepdata-structures
C++ Sort and Custom Comparators: The Interview Reference
TL;DR
  • std::nth_element rearranges in O(n) average so position k holds its sorted value — reaching for full sort instead is a missed optimization interviewers notice
  • Strict weak ordering requires comp(a, a) == false; using <= instead of < is undefined behavior, not just wrong output
  • Subtraction comparators (a - b) overflow at INT_MIN and have the wrong return type in C++ — always use <
  • Priority queue comparator semantics are inverted: comp(a, b) == true means a pops last; pass greater<int> for a min-heap
  • Lambda with priority_queue requires decltype(cmp) as the template type argument and cmp as the constructor argument — omitting either fails to compile
  • std::tie breaks for mixed-direction multi-field sorts — use explicit branching when per-field sort directions differ
  • Equivalent elements in std::set/std::map under a custom comparator are treated as duplicate keys and silently discarded

You write the comparator. It "looks right." The tests pass. Then it silently corrupts the array on duplicate inputs, and when you add a print statement to debug it, the bug disappears. Welcome to undefined behavior. This guide covers the full sorting toolkit, every comparator form, and the traps that surface at the worst possible time.

Which Sort Function Do You Actually Need?

std::sort handles the general case in O(n log n). Three other functions often fit better, and reaching for the right one signals depth.

FunctionComplexityUse when
std::sortO(n log n), unstableGeneral purpose; equal element order doesn't matter
std::stable_sortO(n log² n) / O(n log n) with memoryEqual element order must be preserved
std::partial_sortO(n log k)Need first k sorted; rest don't matter
std::nth_elementO(n) averageNeed the k-th element; no full sort required

std::sort is introsort internally: quicksort by default, heapsort if recursion depth exceeds 2·log₂(n), insertion sort for small subarrays. O(n log n) guaranteed, small constant.

std::nth_element is the one most candidates miss. It rearranges the array so position k holds exactly the element that would sit there sorted. Everything before is ≤, everything after is ≥. The algorithm runs in O(n) average via introselect. For "k-th largest" problems, reaching for full sort is the algorithmic equivalent of calling a cab to cross the street. Interviewers notice.

vector<int> nums = {3, 1, 4, 1, 5, 9, 2, 6}; // k-th smallest in O(n), not O(n log n) int k = 3; nth_element(nums.begin(), nums.begin() + k, nums.end()); int kth_smallest = nums[k]; // correct, rest of array unordered // top 3 values in O(n log 3) partial_sort(nums.begin(), nums.begin() + 3, nums.end(), greater<int>()); // nums[0..2] are the three largest, sorted

For what nth_element is doing under the hood, see the Quickselect algorithm walkthrough.

Three Custom Comparator Forms: Pick Lambdas

C++ gives you three syntactic options. Lambdas win in interview code.

Lambda (default choice):

vector<pair<int,int>> points = {{3, 1}, {1, 4}, {2, 5}}; // sort by second element descending, break ties by first element ascending sort(points.begin(), points.end(), [](const pair<int,int>& a, const pair<int,int>& b) { if (a.second != b.second) return a.second > b.second; return a.first < b.first; });

Functor (struct with operator()):

struct ByAbsValue { bool operator()(int a, int b) const { return abs(a) < abs(b); } }; sort(nums.begin(), nums.end(), ByAbsValue{});

Mark operator() as const. The sort machinery may call it on const objects, and a non-const operator() won't compile. This trips candidates who write functors quickly without thinking about constness.

Free function:

bool byLength(const string& a, const string& b) { return a.size() < b.size(); } sort(words.begin(), words.end(), byLength);

Free functions are the most readable but can't capture local variables. Use lambdas when you need to close over something: a target value, a reference map, a custom ordering table.

The Law You Cannot Break: Strict Weak Ordering

Every comparator passed to std::sort, std::set, std::map, or std::priority_queue must satisfy strict weak ordering. Violating it is undefined behavior. Not wrong output. Undefined behavior, which means the standard gives the implementation permission to do literally anything. Infinite loop, segfault, silently wrong results that pass every test you run. The compiler is not obligated to tell you, and in release builds it will not.

The four requirements:

  1. Irreflexivity: comp(a, a) must return false. An element is never less than itself.
  2. Asymmetry: if comp(a, b) is true, then comp(b, a) must be false.
  3. Transitivity: if comp(a, b) and comp(b, c), then comp(a, c).
  4. Transitivity of equivalence: if a and b are equivalent (neither is less than the other) and b and c are equivalent, then a and c are equivalent.

Your comparator means strict "less than," and equal elements must return false in both directions.

The most common violation is <= instead of <. If your comparator returns true for equal elements, comp(a, a) is true, breaking irreflexivity. The standard library now has permission to rearrange your array in whatever order it finds amusing, silently produce wrong output, or just segfault. Depending on build flags, all three are possible from the same source file.

// BAD: <= violates irreflexivity on equal elements sort(v.begin(), v.end(), [](int a, int b) { return a <= b; }); // undefined behavior // GOOD sort(v.begin(), v.end(), [](int a, int b) { return a < b; });

The Subtraction Bug Will Find You

You've seen subtraction comparators in old C tutorials and maybe Java code. They don't belong in C++. C++ comparators return bool, not int. But the deeper problem is integer overflow, and that breaks any language.

// BAD: overflows when a = INT_MIN, b = 1 (INT_MIN - 1 wraps to INT_MAX) sort(v.begin(), v.end(), [](int a, int b) { return a - b; }); // overflow + wrong return type // GOOD sort(v.begin(), v.end(), [](int a, int b) { return a < b; });

a - b compiles in C++ because non-zero integers implicitly convert to true, but the semantics break for negative differences and the overflow is real. Use <, never subtraction. This exact bug lived inside Java's Arrays.sort documentation for nine years before Joshua Bloch documented it in 2006. Nine years in a widely-read reference, on one of the most-used functions in the standard library. The fact that it survived says everything about how rarely the bad inputs show up in normal test cases.

Sorting By Multiple Fields

When sorting by multiple fields, chain conditionals explicitly or delegate to std::tie.

struct Job { int deadline; int profit; }; // explicit chaining: clear about direction per field sort(jobs.begin(), jobs.end(), [](const Job& a, const Job& b) { if (a.deadline != b.deadline) return a.deadline < b.deadline; return a.profit > b.profit; // tie-break by profit descending });

std::tie builds a tuple of references and compares lexicographically. It works cleanly when all fields sort in the same direction:

// both fields ascending: tie is clean sort(jobs.begin(), jobs.end(), [](const Job& a, const Job& b) { return tie(a.deadline, a.profit) < tie(b.deadline, b.profit); });

When directions differ, tie breaks down. You can negate numeric fields (-a.profit), but that overflows for INT_MIN and doesn't work for strings. Stick with explicit branching when directions differ.

For LC 179 (largest number), the comparator is non-obvious: sort strings by concatenation order. This satisfies strict weak ordering because string lexicographic comparison does.

sort(strs.begin(), strs.end(), [](const string& a, const string& b) { return a + b > b + a; });

Your Priority Queue Comparator Is Backwards (On Purpose)

Priority queues default to max-heap. priority_queue<int> pops the largest element. The comparator semantics are the opposite of std::sort, and the standard library will not warn you about this.

In a priority queue, comp(a, b) == true means a has lower priority than b. B gets popped first. To get a min-heap, you pass greater<int>, which tells the queue "larger means lower priority, so pop the minimum." The mental gymnastics required here are real, and the only cure is writing it enough times that the muscle memory takes over.

// max-heap: default priority_queue<int> max_heap; // min-heap priority_queue<int, vector<int>, greater<int>> min_heap; // custom: pop the pair with smallest second element auto cmp = [](const pair<int,int>& a, const pair<int,int>& b) { return a.second > b.second; // true means a has LOWER priority }; priority_queue<pair<int,int>, vector<pair<int,int>>, decltype(cmp)> pq(cmp);

The decltype(cmp) as template argument and cmp as constructor argument is the required idiom for lambdas. Lambdas don't have default constructors, so passing only the type without the instance fails to compile. Every candidate who skips the constructor argument finds this out mid-interview.

Write a quick mental test before coding: "what do I want at the top?" Then write the comparator so that everything else is lower priority relative to that element.

SpaceComplexity runs voice-based mock interviews where you have to narrate the priority direction before writing the comparator. That narration step catches the inversion before the code does.

Sorted Containers: The Comparator Is a Type, Not a Value

std::set and std::map take the comparator as a template type parameter. Passing a lambda requires the decltype pattern.

// functor: clean, no boilerplate struct ByAbsValue { bool operator()(int a, int b) const { return abs(a) < abs(b); } }; set<int, ByAbsValue> s; // lambda: needs decltype and constructor argument auto cmp = [](int a, int b) { return abs(a) < abs(b); }; set<int, decltype(cmp)> s(cmp);

If two elements are equivalent under your comparator (neither comp(a,b) nor comp(b,a)), the container treats them as the same key. Build a set<string> sorted by length, insert "abc" after "xyz", and it silently discards "abc". Both have length 3. Technically correct per the spec. Almost never what you wanted.

For a broader look at C++ standard library tools in interviews, the C++ for coding interviews cheat sheet covers collections, gotchas, and patterns that sort questions build on.

Before You Write Anything: Two Checklists

Before reaching for sort:

  1. Do I need the k-th element only? Use nth_element. O(n).
  2. Do I need the smallest k elements? Use partial_sort. O(n log k).
  3. Do I need equal elements to stay in their original order? Use stable_sort.
  4. Otherwise, sort.

Before writing a comparator:

  1. Is it for a priority queue? true means lower priority.
  2. Does it return false for equal inputs? It must. If not, UB.
  3. Am I using subtraction? Don't. Use <.

Further Reading