Sparse Table: Two Lookups to Answer Any Range Minimum Query

May 26, 202625 min read
dsaalgorithmsinterview-prepdata-structures
Sparse Table: Two Lookups to Answer Any Range Minimum Query
TL;DR
  • Sparse table precomputes minimums for every power-of-two-length subrange, giving O(n log n) build time and O(n log n) space.
  • O(1) range minimum queries work by covering [l, r] with two overlapping intervals of length 2^k; double-counting is safe because min is idempotent.
  • The overlap trick only applies to idempotent operations (min, max, gcd, bitwise AND/OR). Sum, XOR, and product need O(log n) queries.
  • No point updates: any single change to the array invalidates entries across all levels and forces a full O(n log n) rebuild. Use a segment tree for mutable data.
  • LCA via Euler tour reduces lowest common ancestor to an RMQ on a 2n−1 depth array, giving O(1) LCA queries after O(n log n) preprocessing.
  • Computing floor(log2(len)) in O(1) requires a precomputed table or a bit intrinsic—this is what makes the two-lookup query truly constant time.
  • The level-first memory layout (st[level][index]) keeps both build and query access patterns sequential and cache-friendly.

You have an array that never changes. You need to answer thousands of "what's the minimum between index 3 and index 847?" queries. Fast.

The naive answer is O(n) per query. Scan the subarray each time. For a million queries on a million-element array, that's 10^12 operations. Your laptop does not have that kind of patience. Neither does the person running your code.

A sparse table preprocesses the array in O(n log n) and answers range minimum queries in exactly two array lookups. Not O(log n). Not amortized O(1). Two lookups, every time, worst case.

The trick is counterintuitive enough to stop most engineers cold the first time they see it. Two overlapping intervals, intentional double-counting, and the property that makes none of that matter. When you tell someone this is O(1), they nod politely and wait for the catch. The catch is upfront memory. That's it.


Tweet: "No mom it's not a messy pile of clothes on my chair, it's an L1 cache for fast random access to my frequently used clothes in O(1) time"

The sparse table mindset in its natural habitat. Trade RAM upfront, answer queries instantly. Your PM does not need to understand why.


Why Naive Range Minimum Query Approaches All Fail

Say your array is [3, 1, 4, 1, 5, 9, 2, 6]. You want min(arr[2..6]).

Approach 1: Scan. O(n) per query. Hopeless for repeated queries.

Approach 2: Prefix minimums. Precompute prefix_min[i] = min(arr[0..i]). Doesn't help: prefix_min[6] - prefix_min[1] has no meaning for minimum. You cannot subtract minimums. The universe just does not work that way. You can be annoyed about it.

Approach 3: Precompute everything. Store the answer for all O(n²) pairs (l, r). O(n²) space and preprocessing. For n = 10^6, that's a terabyte. Your laptop says no.

The sparse table threads the needle: O(n log n) space and preprocessing, O(1) queries. The key insight is that you only need to precompute answers for ranges whose lengths are powers of two. There are only O(log n) such lengths, giving O(n log n) total entries. The rest of the article shows why that's enough to answer any query.


How the Table Is Built

Memory Layout

The sparse table is a 2D array st[level][index] where:

  • Row i (level i): stores minimum values for all ranges of length 2^i
  • Column j (index j): starting position in the original array

So st[i][j] = minimum of arr[j .. j + 2^i - 1].

Array: [3, 1, 4, 1, 5, 9, 2, 6]  (indices 0..7)

st[0][j] = min of arr[j..j]   (ranges of length 1)
st[1][j] = min of arr[j..j+1] (ranges of length 2)
st[2][j] = min of arr[j..j+3] (ranges of length 4)
st[3][j] = min of arr[j..j+7] (ranges of length 8)

Filling that in for our example:

Level 0 (length 1):  [3, 1, 4, 1, 5, 9, 2, 6]
Level 1 (length 2):  [1, 1, 1, 1, 5, 2, 2, _]
Level 2 (length 4):  [1, 1, 1, 1, 2, _, _, _]
Level 3 (length 8):  [1, _, _, _, _, _, _, _]

(underscore = unused entry: that range would extend past the array)

The sparse table 2D layout showing 4 levels with active and unused cells for the example array [3,1,4,1,5,9,2,6]

Each row stores minimums over ranges of a fixed power-of-two length. Unused cells (dashed) would extend past the end of the array. Total entries: O(n log n).

The level-first layout (st[level][index]) isn't arbitrary. During the build phase, the inner loop iterates over starting positions j with a fixed level i. Accessing st[i][j] and st[i][j+1] hits adjacent memory locations, which is cache-friendly.

The Build Recurrence

Each level is built from the previous one. A range of length 2^i starting at j splits cleanly into two ranges of length 2^(i-1):

st[i][j] = min(st[i-1][j], st[i-1][j + 2^(i-1)])

Left half: [j, j + 2^(i-1) - 1]. Right half: [j + 2^(i-1), j + 2^i - 1]. Together they cover [j, j + 2^i - 1] exactly, no overlap, no gap.

st[2][1] = min(st[1][1], st[1][3])
         = min(min(arr[1..2]), min(arr[3..4]))
         = min(1, 1)
         = 1
         ✓ (actual min of arr[1..4] = min(1,4,1,5) = 1)

The build in code:

# base case: ranges of length 1 st[0] = arr[:] for i in range(1, levels): for j in range(n - (1 << i) + 1): st[i][j] = min(st[i-1][j], st[i-1][j + (1 << (i-1))])

Correctness by induction: base case trivially holds. Assuming st[i-1] is correct, splitting the range at the midpoint and taking the minimum is correct because min is associative. The split is exact (two halves, no overlap), so no correctness concern at build time.

Why O(n log n): there are floor(log2(n)) + 1 levels. Each level has at most n entries. Each entry takes O(1) to compute. Total: O(n log n).

Build Dependency

Arrows showing how each level of the sparse table is built from the previous level, with left and right halves merging to form each entry

Each entry is the min of two entries from the level above, spaced 2^(i-1) apart. Blue arrows show the left half; amber arrows show the right half. The gap doubles at each level.

Each entry depends on exactly two entries from the level above, both at distance 2^(i-1) apart. The gap doubles at every level: adjacent pairs at level 0, distance-2 pairs at level 1, distance-4 pairs at level 2.


Why the Query Is O(1): The Overlapping Trick

Most people expect O(log n) here. It's not.

For a query [l, r], you cannot split it into disjoint power-of-two ranges in O(1). The number of disjoint pieces is proportional to the number of set bits in r - l + 1, which is at most log n. That gives O(log n) per query, not O(1).

The trick: stop caring about disjoint. For minimum queries, double-counting an element doesn't change the answer. min(a, a) = a. If this feels like cheating, that's because it kind of is. The math makes it legal.

So instead of partitioning [l, r] into disjoint pieces, use exactly two overlapping ranges that together cover [l, r]:

  1. Find k = floor(log2(r - l + 1)). This is the largest power of 2 that fits in the range.
  2. Take the range starting at l with length 2^k: interval [l, l + 2^k - 1].
  3. Take the range ending at r with length 2^k: interval [r - 2^k + 1, r].
  4. Return min(st[k][l], st[k][r - 2^k + 1]).

Diagram showing two overlapping power-of-two intervals covering a query range, with the overlap region highlighted and the result computed from exactly two table lookups

Query [2, 6]: the left interval (blue) covers [2, 5], the right interval (amber) covers [3, 6]. Elements 3, 4, 5 appear in both. That's fine. min(x, x) = x, so double-counting does not corrupt the answer.

Proof That the Two Intervals Cover [l, r]

Let len = r - l + 1 and k = floor(log2(len)). By definition: 2^k <= len < 2^(k+1).

You need to show there is no gap between l + 2^k - 1 (end of left interval) and r - 2^k + 1 (start of right interval). If l + 2^k - 1 >= r - 2^k + 1, they meet or overlap.

Rearranging: 2^(k+1) >= r - l + 2 = len + 1.

From len < 2^(k+1), integer arithmetic gives len + 1 <= 2^(k+1). QED. The two intervals always cover the full range with no gap.

The critical requirement: idempotence. An operation f is idempotent if f(x, x) = x. Mathematicians love this word. It means "doing something twice to the same input is the same as doing it once." For overlapping elements, the query computes f(f(left half), f(right half)) where the overlap region appears in both. For minimum: min(min(..., x, ...), min(x, ...)) = min(..., x, ...). Idempotence absorbs the double-count.

Operations that work: min, max, gcd, bitwise AND, bitwise OR.

Operations that do not: sum (adds overlap twice), XOR (cancels overlap to zero, because XOR cancels when you double-count, that's its whole deal), product.

Computing k in O(1)

The whole scheme only works if computing k = floor(log2(len)) is itself O(1). Three common approaches:

Precomputed table (recommended):

log = [0] * (n + 1) for i in range(2, n + 1): log[i] = log[i // 2] + 1 # log[i] = floor(log2(i)) for all i >= 1

Built in O(n) before the first query. Then k = log[r - l + 1] is a single array access.

Bit intrinsics: In C++, k = 31 - __builtin_clz(len). In Java, k = 31 - Integer.numberOfLeadingZeros(len). In Python, k = (len).bit_length() - 1. In Rust, k = (usize::BITS - len.leading_zeros() - 1) as usize. All O(1) at the hardware level.


Build and Query Costs

OperationTimeSpaceNotes
Build (preprocessing)O(n log n)O(n log n)One pass per level
Range query (idempotent: min, max, gcd)O(1)O(1)Two lookups via overlapping intervals
Range query (non-idempotent: sum)O(log n)O(1)Decompose into disjoint power-of-2 parts
Point updateO(n log n)O(n log n)Full rebuild required
Point queryO(1)O(1)st[0][i] is arr[i]
SpaceN/AO(n log n)Approx. 20n entries for n up to 10^6

Why O(n log n) Build Is Tight

You cannot avoid the O(n log n) preprocessing. Each of the floor(log2(n)) + 1 levels needs O(n) entries computed. The first level is just a copy. Each subsequent level requires one min per entry. There is no shortcut.

Why No Updates

Every entry st[i][j] encodes the minimum of a specific contiguous subarray. Changing any single element of the original array potentially invalidates O(n log n) entries across every level. Compare this to a segment tree, which localizes each update to O(log n) nodes. If your array changes between queries, use a segment tree. If it is static, a sparse table is unbeatable.

The segment tree is the flexible friend who handles change gracefully. The sparse table is the specialist: shows up, answers queries at maximum speed, refuses to update. Choose accordingly.

Memory Arithmetic

For n = 10^6 elements, you need floor(log2(10^6)) + 1 = 20 levels. At 4 bytes per int, that's 20 × 10^6 × 4 = 80 MB. That's about the RAM cost of opening Chrome with four tabs. Manageable in most contest and production settings, but worth knowing upfront. For n = 10^5, you need 17 levels and about 6.8 MB.


What "Sparse" Actually Means

The name is slightly misleading. The table is not sparse in the CS sense of having mostly zero entries. It is sparse in coverage: there are O(n²) possible (l, r) pairs you might query, but the table stores only O(n log n) entries to answer all of them. A sparse sampling of the range space that still answers every query.

Computer science naming conventions are always a little off. This one is not the worst offender.


One Structure, Every Language

All implementations: build the sparse table for range minimum queries on a 0-indexed integer array. The query method returns min(arr[l..r]) inclusive.

class SparseTable: def __init__(self, arr: list[int]) -> None: n = len(arr) levels = n.bit_length() # floor(log2(n)) + 1 self.log = [0] * (n + 1) for i in range(2, n + 1): self.log[i] = self.log[i // 2] + 1 self.st = [[0] * n for _ in range(levels)] self.st[0] = arr[:] for i in range(1, levels): for j in range(n - (1 << i) + 1): self.st[i][j] = min(self.st[i - 1][j], self.st[i - 1][j + (1 << (i - 1))]) def query(self, left: int, right: int) -> int: k = self.log[right - left + 1] return min(self.st[k][left], self.st[k][right - (1 << k) + 1])

Where Sparse Tables Show Up

Range Minimum and Maximum Queries

The canonical application. Static array, offline queries, minimum or maximum over any subrange. Sparse table is the fastest known static RMQ structure for the common case.

Range GCD Queries

gcd(a, a) = a, so GCD is idempotent. Replace min with gcd in the implementations above. Range GCD in O(1) after O(n log n log(max_val)) preprocessing (each GCD call is O(log(max_val))).

Lowest Common Ancestor (LCA) via Euler Tour

The LCA of two nodes in a rooted tree can be reduced to an RMQ problem. Do a DFS traversal and record each node visit in an array (the "Euler tour"). Each node of depth d appears in this array whenever you visit or leave it. The LCA of nodes u and v is the node with minimum depth in the Euler tour between the first appearances of u and v. The tour has length 2n - 1 for a tree with n nodes. Build a sparse table on the depths. Every LCA query becomes one RMQ query: O(1). Total: O(n log n) preprocessing, O(1) queries.

Binary Lifting for LCA

A related technique worth knowing: precompute up[v][k] = the 2^k-th ancestor of node v. Build it with the same recurrence (up[v][k] = up[up[v][k-1]][k-1]). This is a sparse table on ancestor chains, not on array values. Queries are O(log n), not O(1), but you do not need an Euler tour. Standard in interviews when LCA is an intermediate step.

Sliding Window Problems (Static)

If your array is fixed and you need to precompute all range minimums (for every possible window size over every starting position), sparse table is the right structure. Dynamic sliding windows with a changing array use a monotonic deque instead.

Offline Static Query Workloads

Any time you have the full query set upfront, the array is immutable, and the operation is idempotent, a sparse table will beat a segment tree on both constants and cache performance (no pointer chasing, perfectly sequential access pattern).


When to Reach for Sparse Table

Four signals point to sparse table:

  • The array is read-only after construction. No point updates, no insertions.
  • You need many range queries. The O(n log n) build cost pays off at roughly O(log n) queries.
  • The operation is idempotent. f(x, x) = x. Check this first.
  • Query latency matters more than flexibility. You need O(1), not O(log n).

Updates push you toward a segment tree. Range sums or other non-idempotent operations with updates belong to a Fenwick tree, which handles both in O(log n).

Worked Example 1: Stock Price Range Minimum

Problem: You receive n stock prices for a fixed historical period. Then you receive q queries: "what was the minimum price between day l and day r?"

Brute force: O(n) per query. For n = 10^5 and q = 10^5, that is 10^10 operations. Way too slow.

Why sparse table fits:

  1. The prices are fixed after input (static array). Check.
  2. Minimum is idempotent. Check.
  3. Queries arrive offline and are frequent. Check.

Approach: Build a sparse table in O(n log n). Answer each query in O(1). Total: O(n log n + q).

prices = [100, 80, 90, 70, 110, 60, 95] table = SparseTable(prices) print(table.query(1, 5)) # min(80, 90, 70, 110, 60) = 60

Worked Example 2: Lowest Common Ancestor

Problem: Given a tree with n nodes and q queries of the form "find the LCA of nodes u and v."

Naive LCA: Walk both nodes up to the root repeatedly. O(n) per query.

Binary lifting: O(n log n) preprocessing, O(log n) per query.

Euler tour + sparse table: O(n log n) preprocessing, O(1) per query.

Why sparse table fits:

  1. The Euler tour is a static array (the tree doesn't change). Check.
  2. Minimum depth is an idempotent operation over the tour. Check.
  3. q is large and we want constant-time queries.

How the reduction works:

Tree:         1
            /   \
           2     3
          / \
         4   5

Euler tour (DFS, record on entry and exit):
  [1, 2, 4, 2, 5, 2, 1, 3, 1]

Depths:
  [0, 1, 2, 1, 2, 1, 0, 1, 0]

First visit indices:
  node 1 → index 0
  node 2 → index 1
  node 4 → index 2
  node 5 → index 4
  node 3 → index 7

LCA(4, 5):
  Tour slice between first(4)=2 and first(5)=4: depths[2..4] = [2, 1, 2]
  Min depth = 1, at index 3, which is node 2.
  LCA(4, 5) = 2  ✓

Build a sparse table on the depths array. Every LCA(u, v) becomes rmq(first[u], first[v]) (or rmq(first[v], first[u]) if v appears first). Two lookups.


The Non-Idempotent Case: Don't Give Up

If your operation is not idempotent but is associative (like sum, XOR, product), you still have two options:

Option 1: O(log n) query. Decompose [l, r] into at most floor(log n) disjoint power-of-2 ranges. Walk from highest to lowest level:

def query_sum(st, log, left, right): total = 0 while left <= right: k = log[right - left + 1] total += st[k][left] left += 1 << k return total

Option 2: Disjoint Sparse Table. An extension that precomputes "directional" aggregates from each position toward block midpoints. Query time is still O(1) and supports any associative operation with the same O(n log n) preprocessing. The build is more complex but the query is identical in speed. Worth knowing for competitive programming; less common in production code.


Sparse Table or Segment Tree?

FactorSparse TableSegment Tree
Build timeO(n log n)O(n)
Query time (idempotent)O(1)O(log n)
Query time (non-idempotent)O(log n)O(log n)
Point updateO(n log n) rebuildO(log n)
MemoryO(n log n)O(n)
Cache behaviorExcellent (linear arrays)Moderate (implicit tree)
Implementation complexityLowMedium

The rule of thumb: static data, idempotent operation, many queries = sparse table. Anything with updates = segment tree.


What to Take Away

  • A sparse table precomputes minimum values for all ranges whose length is a power of two: st[i][j] = min of arr[j .. j + 2^i - 1].
  • Build traverses levels bottom-up: st[i][j] = min(st[i-1][j], st[i-1][j + 2^(i-1)]). Time: O(n log n).
  • Range minimum queries use two overlapping power-of-two intervals that together cover [l, r]. Two lookups. O(1).
  • The O(1) trick only works for idempotent operations (min, max, gcd, bitwise AND/OR). For sum or XOR, queries are O(log n).
  • No updates: the structure is static. A full rebuild costs O(n log n).
  • Memory is O(n log n): roughly 20 × n entries for n up to 10^6.
  • Computing k = floor(log2(len)) in O(1) requires either a precomputed table or a bit intrinsic.
  • The level-first memory layout (st[level][index]) is cache-friendly during both build and query.
  • LCA via Euler tour + sparse table: O(n log n) preprocessing, O(1) per query.

Practicing these kinds of range query problems out loud, under time pressure, is a different skill from solving them at your desk. SpaceComplexity runs voice-based mock interviews with rubric-based feedback so you can train the version of problem-solving that actually matters on interview day. Give it a try.


Further Reading