What Is the Sweep Line Algorithm? The Pattern That Turns 2D Into 1D

June 17, 20269 min read
dsaalgorithmsinterview-prepleetcode
What Is the Sweep Line Algorithm? The Pattern That Turns 2D Into 1D
TL;DR
  • Sweep line algorithm converts O(n²) interval pair comparisons into O(n log n) sorted event processing.
  • Events are the only moments that matter: interval starts (+1) and ends (-1), sorted by timestamp.
  • Two flavors: a counter for "how many active" problems, a heap or sorted structure for "current max/min" problems like the Skyline Problem.
  • Tie-breaking order matters when two events share a timestamp — end before start is the most common rule and getting it wrong shifts the answer by one.
  • O(n log n) is always the floor because the sort is the bottleneck, not the event traversal.
  • Recognition signal: intervals, brute force O(n²), answer depends on simultaneous activity rather than which specific pairs overlap.

You have a list of meeting intervals. Someone needs to know how many conference rooms to book. Your first instinct: compare every meeting against every other meeting. That's O(n²), and it's also the sound of your interviewer quietly reaching for the "no hire" column.

The sweep line algorithm is the better idea.

Instead of comparing all pairs, imagine a vertical line sliding left to right across a timeline. Every time it hits something interesting, you update a count. That's genuinely the whole thing. A 2D overlap problem becomes a 1D scan in sorted order, and the power comes from processing events in sequence rather than comparing every pair at once.

What the Line Is Actually Doing

Picture a horizontal timeline with intervals drawn as bars. You want the maximum number of bars stacked on top of each other at any point.

Brute force: for every point on the timeline, count how many intervals contain it. O(n²) if you're naive about it. Technically correct. Technically the wrong answer in an interview.

Sweep line: you don't care about every point. You only care about the points where the count changes. Those are the events. An interval starting is an event. An interval ending is an event. Between events, nothing changes, so nothing to check.

So you:

  1. Create one event for each interval start (type: +1)
  2. Create one event for each interval end (type: -1)
  3. Sort all events by time
  4. Walk through them left to right, tracking a running total

The maximum value of that running total is your answer.

Sort the events, process them left to right, maintain running state. Everything else is a detail.

Three Parts Every Sweep Line Algorithm Needs

Every sweep line solution has the same skeleton. Once you recognize it, you'll start seeing it everywhere, which is either satisfying or slightly haunting.

Event points. The x-coordinates (or timestamps, or values) where something changes. For intervals, that's starts and ends. For a building skyline, it's left and right edges of each building.

Event queue. Sort all event points and process them in order. The sort is the O(n log n) cost. Tie-breaking rules matter here and will bite you if you ignore them.

Sweep line status. A data structure tracking what's currently active as the line sweeps past. For simple problems this is just an integer counter. For harder problems it's a heap or sorted map.

A Worked Example: Meeting Rooms II

LeetCode 253. You get a list of meeting intervals like [[0,30],[5,10],[15,20]]. Find the minimum number of rooms required.

Brute force approach: build a conflict graph, find the maximum clique. Counting maximum cliques in a general graph is NP-hard. Genuinely NP-hard, not just "hard." So that's out.

The actual insight: you don't care which specific meetings conflict, only how many are happening at once. The maximum number of overlapping intervals at any moment equals the number of rooms you need.

def minMeetingRooms(intervals): events = [] for start, end in intervals: events.append((start, 1)) events.append((end, -1)) # End events before start events at same time: # a room freed at 5 can be reused by a meeting starting at 5. events.sort(key=lambda x: (x[0], x[1])) rooms = 0 max_rooms = 0 for time, delta in events: rooms += delta max_rooms = max(max_rooms, rooms) return max_rooms

For [[0,30],[5,10],[15,20]]: events fire at (0,+1), (5,+1), (10,-1), (15,+1), (20,-1), (30,-1). Running totals: 1, 2, 1, 2, 1, 0. Answer: 2 rooms.

O(n log n) time for the sort, O(n) space for the event list.

The tie-breaking detail is the one that gets people: if a meeting ends at time 5 and another starts at time 5, the end should be processed first. A room freed at T is available for a meeting starting at T. Sorting by (time, delta) handles this because -1 sorts before +1 at the same timestamp. Get it wrong and your answer is off by one on exactly the edge case your interviewer will test.

The Skyline Problem: When a Counter Isn't Enough

LeetCode 218. You get a list of buildings as [left, right, height] and need to return the "skyline" as a list of [x, height] critical points where the silhouette changes.

Same sweep line skeleton. Harder status structure. Instead of counting active intervals, you need the maximum active height at each point. That requires a heap.

Events: each building enters at its left edge and leaves at its right edge. Status: a max-heap of currently active building heights (with their right edges so you know when they expire).

At each event, update the active heights and check if the maximum changed. If it did, that's a critical point in the skyline.

import heapq def getSkyline(buildings): events = [] for left, right, height in buildings: events.append((left, -height, right)) # negative height for max-heap events.append((right, 0, 0)) events.sort() result = [] heap = [(0, float('inf'))] # ground level, never expires prev_max = 0 for x, neg_h, end in events: if neg_h != 0: heapq.heappush(heap, (neg_h, end)) while heap[0][1] <= x: heapq.heappop(heap) curr_max = -heap[0][0] if curr_max != prev_max: result.append([x, curr_max]) prev_max = curr_max return result

The lazy-deletion trick means you only remove heap entries when they reach the top, not the moment they expire. This avoids needing a full ordered set and keeps the code clean. O(n log n) time, O(n) space.

Why It's Always O(n log n)

ComponentCost
Building eventsO(n)
Sorting eventsO(n log n)
Processing each event with a heapO(log n) per event
Total timeO(n log n)
SpaceO(n)

The sort is the bottleneck, not the traversal. The comparison sort lower bound proves you can't sort n items faster than O(n log n) in the comparison model. Since sweep line depends on sorted events, that floor transfers directly.

For a difference-array variant, you can sometimes reach O(n + K) where K is the coordinate range, but only when K is small enough to hold in memory. If your coordinates go up to 10^9, that's not happening.

Two Flavors of the Same Pattern

There's a spectrum here, and knowing which end you're on determines your data structure choice.

Counter-based. Status is just an integer. You're counting how many intervals are active. Meeting Rooms II lives here. Pattern: +1 for starts, -1 for ends, track the max. The code fits in a napkin.

Status-structure-based. Status is a sorted data structure tracking individual active elements. The Skyline Problem lives here. You need the maximum or minimum of a changing set, which requires a heap (Python), TreeMap (Java), or multiset (C++).

Python's standard library has no built-in sorted multiset. sortedcontainers.SortedList fills the gap when it's available, or use heap with lazy deletion as shown above. This is fine. The interview isn't testing your knowledge of Python's stdlib gaps.

How to Spot It Mid-Interview

Sweep line problems tend to announce themselves. These phrases are basically the pattern tapping you on the shoulder:

  • "Given a list of intervals..." followed by anything involving overlap
  • "Minimum number of [rooms / servers / workers]..."
  • "Find the maximum number of [things] active at any time"
  • "Merge overlapping intervals" (see merge intervals)
  • "Find the skyline" or "elevation profile"
  • "Employee free time" or "common available windows"

The structural signal: you have n segments, brute force is O(n²) because you're comparing all pairs, and the answer only depends on what's simultaneously active, not which specific pairs overlap.

Three mistakes that come up enough to be worth naming:

  1. Wrong tie-breaking. When two events share an x-coordinate, the order often matters. Work through a small example before committing to a sort key. This is responsible for more "why is my answer off by one" moments than any other part of the algorithm.
  2. Wrong status structure. A counter answers "how many are active." A heap answers "which is currently tallest." Using a counter for a skyline problem will produce confidently wrong output.
  3. Open vs. closed interval confusion. A meeting ending at T and one starting at T may or may not conflict depending on how the problem defines it. Read the statement carefully, then test your tie-breaking on a two-interval example.

Practice Problems

ProblemDifficultyVariant
LeetCode 56 - Merge IntervalsMediumCounter-based
LeetCode 253 - Meeting Rooms IIMediumCounter-based
LeetCode 218 - The Skyline ProblemHardStatus-structure-based
LeetCode 759 - Employee Free TimeHardCounter-based (finding gaps)
LeetCode 1851 - Minimum Interval to Include Each QueryHardStatus-structure-based

Meeting Rooms II and Merge Intervals are the most frequently asked. The Skyline Problem earns its place because the events-plus-heap-plus-lazy-deletion pattern shows up in a handful of other hard problems. Learn it once, recognize it three more times.

Want to practice explaining your approach before you code it? SpaceComplexity runs AI-powered mock interviews where you work through problems like these out loud and get rubric-based feedback on your reasoning, not just whether your code passes.

Takeaways

  • Sweep line converts O(n²) overlap into O(n log n) sorted traversal. Sort events, process them in order, maintain running state.
  • Two flavors. A counter for "how many are active," a sorted structure for "what's the current max/min."
  • The sort is the bottleneck. O(n log n) comes from sorting the events, not traversing them.
  • Tie-breaking is a trap. Same-timestamp events often have a required order. Test with a small example.
  • Recognition signal: intervals or time ranges, O(n²) brute force, answer depends on simultaneous activity.

Further Reading