What Is the Closest Pair of Points? The O(n log n) Algorithm Explained

June 17, 20269 min read
dsaalgorithmsinterview-prepdata-structures
What Is the Closest Pair of Points? The O(n log n) Algorithm Explained
TL;DR
  • Brute force checks all O(n²) pairs; the closest pair of points algorithm uses divide and conquer to find the answer in O(n log n) by splitting on x-coordinates.
  • The strip check narrows the merge to points within distance d of the dividing line, where d = min(left closest distance, right closest distance).
  • The 8-point bound proves each strip point has at most 7 neighbors to compare: a d × 2d box above any point divides into 8 cells, each holding at most one prior point.
  • Re-sorting the strip by y inside each merge gives O(n log² n); pre-sorting by y before recursion achieves true O(n log n).
  • Three classic bugs: strip width d instead of 2d, missing base case for n ≤ 3, and mixing squared distances with real distances in the inner-loop comparison.
  • The lower bound is tight: closest pair requires Ω(n log n) in the algebraic decision tree model, so no algorithm can do asymptotically better.

You have a million points scattered across a 2D plane. A production system that needs to know which two are closest together. And five seconds before you ship it, you remember that your nested-loop solution is O(n²).

This guide walks through the closest pair of points algorithm from scratch: the brute force baseline, the divide and conquer approach that beats it, and the non-obvious geometric insight that makes the whole thing work in O(n log n).


What You're Actually Solving

Given a set of points, find the pair with the minimum Euclidean distance.

Points: (2,3), (12,30), (40,50), (5,1), (12,10), (3,4)
Answer: (2,3) and (3,4), distance ≈ 1.414

No special constraints. No graph structure. Just a list of (x, y) coordinates and the nearest pair.

This shows up in computational geometry, collision detection, and clustering. It also shows up in interviews where the examiner wants to see whether you can replace an O(n²) loop with a geometric argument.


Brute Force: Technically Correct, Practically Useless

import math def distance(p1, p2): return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2) def closest_pair_brute(points): min_dist = float('inf') n = len(points) for i in range(n): for j in range(i + 1, n): min_dist = min(min_dist, distance(points[i], points[j])) return min_dist

Check every pair, keep the minimum. O(n²) time, O(1) space. Correct on every input. Unusable past a few hundred thousand points.

At n = 10⁶, brute force needs roughly 5 × 10¹¹ distance calculations. Around eight minutes of wall time. The divide and conquer version handles the same input in under a second.

Friend asks how long you've been debugging. You say since 5pm. Friend says it's 4pm.

Your brute force on n = 10⁶. Still running.


How to Actually Beat O(n²)

The approach mirrors merge sort. Split the problem, solve the halves recursively, combine results in the merge step.

  1. Sort all points by x-coordinate.
  2. Split at the median into left half and right half.
  3. Recursively find the closest pair in each half. Call those distances d_L and d_R.
  4. Let d = min(d_L, d_R).
  5. Check whether a pair crossing the dividing line has distance less than d.

Steps 1 through 4 are standard. Step 5 is where the interesting work happens, and where most people get the analysis wrong.


Why Crossing the Line Is the Hard Part

A pair that crosses the dividing line must have one point on each side. For that pair to beat d, both points must be within d of the dividing line. So you only need to examine points inside a vertical strip of width 2d centered at the median x-coordinate.

        |
 ←─d─→  |  ←─d─→
        |
   (dividing line)

You collect all points in this strip and ask: what is the shortest distance among them? The naive approach is O(|strip|²). The key question is whether the strip comparison can be made linear.

It can. But the proof requires a geometric observation.


The 8-Point Bound: Why the Inner Loop Runs Seven Times, Not n Times

Sort the strip by y-coordinate. For each point p, scan upward while the y-distance from p is less than d. On the surface this looks O(n²). In practice the inner scan terminates after at most 7 comparisons.

Consider the rectangular box of width 2d and height d sitting above point p. Divide that box into 8 squares of size d/2 × d/2:

+--------+--------+--------+--------+
|        |        |        |        |  ← row 1 (height d/2)
+--------+--------+--------+--------+
|        |        |        |        |  ← row 2 (height d/2)
+--------+--------+--------+--------+
←d/2→  ←d/2→  ←d/2→  ←d/2→
←───────────── 2d ─────────────────→

Eight cells. Each d/2 × d/2 square has diagonal d/√2 ≈ 0.707d.

Within any single cell, two points would be at most 0.707d apart, which is less than d. But both halves already have minimum distance d by the recursive result, so each cell holds at most one point. Eight cells, at most seven other points in the box above p.

The inner scan terminates in constant time. The strip check is O(n).

This is the part most candidates wave their hands through. "And then the strip is fast because geometry." The specific bound is what the interviewer is waiting for.


Closest Pair of Points Algorithm: Full Python Code

import math def distance(p1, p2): return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2) def brute_force(points): min_dist = float('inf') for i in range(len(points)): for j in range(i + 1, len(points)): min_dist = min(min_dist, distance(points[i], points[j])) return min_dist def strip_closest(strip, d): strip.sort(key=lambda p: p[1]) # sort by y-coordinate min_dist = d for i in range(len(strip)): j = i + 1 while j < len(strip) and strip[j][1] - strip[i][1] < min_dist: min_dist = min(min_dist, distance(strip[i], strip[j])) j += 1 return min_dist def closest_rec(pts): n = len(pts) if n <= 3: return brute_force(pts) mid = n // 2 mid_x = pts[mid][0] d_left = closest_rec(pts[:mid]) d_right = closest_rec(pts[mid:]) d = min(d_left, d_right) strip = [p for p in pts if abs(p[0] - mid_x) < d] return min(d, strip_closest(strip, d)) def closest_pair(points): pts = sorted(points, key=lambda p: p[0]) return closest_rec(pts)

One subtlety: strip_closest re-sorts by y inside each recursive call. That costs O(n log n) per merge step. The recurrence becomes T(n) = 2T(n/2) + O(n log n), which the Master Theorem resolves to O(n log² n). For most interview purposes this is fine. The theoretical O(n log n) requires maintaining a separate y-sorted order across all recursion levels, which adds implementation complexity without changing the core insight.


Following the Split

Take points sorted by x: (0,0), (1,2), (3,1), (4,5), (5,4), (7,3). Split at index 3.

Left half: d_L = distance((0,0),(1,2)) = √5 ≈ 2.24. Right half: d_R = distance((4,5),(5,4)) = √2 ≈ 1.41.

d = 1.41. Strip = points within 1.41 of mid_x = 3: namely (1,2), (3,1), (4,5). Sort by y: (3,1), (1,2), (4,5).

Strip check: (3,1) to (1,2) = √5 ≈ 2.24, worse than d. (1,2) to (4,5): y-gap = 3 > d, while loop exits. Final answer: √2 from the right half.

The strip added nothing. That is the common case.


O(n²) vs O(n log n): The Numbers

ApproachTimeSpace
Brute forceO(n²)O(1)
Divide and conquer (strip re-sort)O(n log² n)O(n)
Divide and conquer (y-presort)O(n log n)O(n)

The O(n log n) divide and conquer is asymptotically optimal. In the algebraic decision tree model, every correct algorithm for closest pair requires Ω(n log n) comparisons. Sorting reduces to closest pair, and sorting is Ω(n log n), so the bound is tight. See comparison sort lower bound for that argument.

The space usage is O(n) for the strip arrays. Even with O(log n) recursion levels, strip arrays at any single level contain at most n total points, and they are not all alive simultaneously.

Cartoon villain presenting four CS theory classes: P, NP-Complete, Halting Problems, and Cryptography I Didn't Write

When the interviewer asks why O(n log n) is optimal and you have to remember that information-theoretic lower bound argument from sophomore year.


What the Interviewer Wants to Hear

Closest pair appears in three contexts.

Direct algorithm question. "Design an algorithm to find the closest pair of points in better than O(n²)." The interviewer wants to see that you recognize divide and conquer as the pattern, split on x-coordinate, and describe the strip merge with the 8-point bound. Most candidates can describe the split. Fewer can explain why the strip check is linear. That gap is exactly where the interesting signal comes from.

Follow-up depth. After a merge sort question: "Can you think of a problem where the merge step costs more than O(n)?" Closest pair is the canonical answer. The strip sort bumps the total from O(n log n) to O(n log² n).

Pattern identification. The 8-point bound is an instance of a broader technique: bounding a seemingly quadratic loop with a packing argument. The same shape appears in computational geometry, randomized algorithm analysis, and interval problems. Once you see it here, you start recognizing it everywhere.

To practice explaining the strip bound argument out loud, SpaceComplexity runs voice-based mock interviews where you walk through your reasoning in real time, which is exactly how this question gets asked in person.


Where People Get It Wrong

Strip width is 2d, not d. The strip extends d on each side of the dividing line. Using just d misses crossover pairs near the boundary. This is the most common mistake and it produces wrong answers on exactly the cases that matter.

Base case too small. Handle n ≤ 3 via brute force. Recursing to n = 1 or n = 2 without a guard causes index errors or infinite recursion. Three lines of code, but candidates skip it half the time.

Mixing squared and real distances. You can skip the square root in most comparisons, but strip[j][1] - strip[i][1] < min_dist uses actual distance. If min_dist is squared, the comparison is wrong.

Forgetting to update strip width as min_dist shrinks. The while loop uses < min_dist, not < d. As the scan finds closer pairs, the window narrows. This is correct and makes the algorithm slightly faster in practice.


Further Reading