Cartesian Tree: O(n) Build, O(1) Range Queries, Zero Rotations

You have an array. Someone wants a thousand range minimums. The naive approach scans each range in O(n), which works fine until it really doesn't. The standard fix is a Sparse Table: O(n log n) preprocessing, O(1) per query. But there is a third path, and it starts with a tree that builds in O(n), encodes every possible range minimum into its own structure, and happens to be the skeleton underneath treaps and suffix trees as well.

A Cartesian tree is a binary tree built from a sequence where two invariants hold simultaneously: an in-order traversal yields the original index order (BST property on positions), and every node has a smaller value than its children (min-heap property on values). The key insight: the minimum of any subarray arr[l..r] is exactly the lowest common ancestor of nodes l and r in the tree. Build it once, and range minimum queries become tree navigation.

You reach for a Cartesian tree when you need range minimum or maximum queries on a static array, or when each subarray's structure is determined by its dominant element. It is also the skeleton inside both Treaps and suffix trees.

A two-panel office meme: "Job Candidate: I coded a binary tree from scratch!" is APPROPRIATE; the same claim from an Employee triggers HR "I coded a binary tree from scratch." Appropriate in an interview. The Cartesian tree just makes the next fifteen minutes worth it.

Two Invariants, One Unique Tree

A Cartesian tree of arr[0..n-1] satisfies exactly two rules.

In-order invariant. Traversing the tree left-to-right (inorder) visits nodes in index order 0, 1, 2, ..., n-1. Each node stores an index into arr, and the BST property holds on those indices: all nodes in the left subtree have smaller indices, all in the right subtree have larger indices.

Heap invariant. Every node's value is less than or equal to its children's values. The root holds the minimum of the entire array.

Together, these two rules uniquely determine the tree for any sequence of distinct elements. No tinkering with node colors, no height fields, no rebalancing passes. Given distinct elements, there is exactly one Cartesian tree. It just is what it is.

Jean Vuillemin introduced and named the structure in 1980 in "A Unifying Look at Data Structures" (CACM). The name comes from Cartesian coordinates: think of each element as a point (index, value) in a 2D plane, with index as x and value as y. The BST property governs the x-axis, the heap property governs the y-axis. The structure lives in Cartesian space.

What the Structure Looks Like

Array: [3, 2, 6, 1, 9]

Indices:  0    1    2    3    4
Values:   3    2    6    1    9

The minimum of the entire array is 1 at index 3. That becomes the root. The left subtree covers indices 0, 1, 2 (the subarray [3, 2, 6]); its minimum is 2 at index 1, which becomes that subtree's root. The right subtree is just index 4.

Cartesian tree of [3, 2, 6, 1, 9] showing the BST-on-indices and min-heap-on-values dual invariants, with the original array displayed below Each node shows its index and value. In-order traversal reads 0, 1, 2, 3, 4. Every parent's value is smaller than its children's.

Verify:

In-order traversal: 0 → 1 → 2 → 3 → 4. Same as the original index order. ✓
Heap: val(1) = 1 < 2, 1 < 9, 2 < 3, 2 < 6. ✓

Every subtree is itself a valid Cartesian tree of the corresponding subarray. The left subtree rooted at index 1 is the Cartesian tree of [3, 2, 6], with 2 as its minimum at index 1. This recursive property is what makes RMQ work.

How the Tree Lives in Memory

A Cartesian tree node stores the index into the original array, the value at that index, and pointers (or indices) to left and right children. For n elements, you get exactly n nodes, so the tree uses O(n) space.

In competitive programming, three parallel arrays (left[i], right[i], parent[i] indexed by position) avoid pointer overhead and fit cache lines better than linked nodes. For general use, pointer-based nodes are more natural.

Diagram: Memory layout for Cartesian tree of [3, 2, 6, 1, 9]

Index:  0    1    2    3    4
Val:    3    2    6    1    9
Left:  -1   0   -1    1   -1    (index of left child, -1 = none)
Right: -1   2   -1    4   -1    (index of right child)
Root:  3                        (index 3 holds global minimum)

Building a Cartesian Tree in O(n): The Right-Spine Stack

The naive construction is O(n²): scan the array for the minimum to build the root, then recurse on left and right subarrays. Sounds fine. Then you try a sorted array. On [1, 2, 3, ..., n], every element is the minimum of its remaining subarray, every scan touches the full remaining length, and you've accidentally built a linked list with O(n²) extra steps to get there.

The stack algorithm does it in O(n), and the proof is one sentence: each element is pushed onto the stack exactly once and popped at most once.

What the Stack Is Tracking

At each step, the stack holds the right spine of the current partial Cartesian tree: the path from root to the rightmost node, in increasing order of values from bottom to top. Think of it as a waiting room. Every element in the stack is hoping to become someone's parent someday. When a smaller element arrives, anyone larger gets evicted immediately and becomes a left subtree instead.

The bottom of the stack (pushed earliest and never popped) is the current root, which holds the smallest value seen so far. The top is the most recently inserted node.

Diagram: Right spine at step i=2 (after processing [3, 2, 6])

Tree so far:
    node(1, val=2)    ← root (stack bottom)
    /          \
node(0, val=3)  node(2, val=6)   ← rightmost (stack top)

Stack (bottom → top): [node(1, val=2), node(2, val=6)]
                       ↑ root                ↑ rightmost

When a new element arrives, it belongs somewhere on this right spine. If its value is smaller than the top, it should sit higher in the tree (smaller values live closer to the root in a min-heap). So we pop elements larger than the new one. Those elements become the left subtree of the new element (they came before it in the array, so they have smaller indices, so in-order they go to the left). The remaining stack top, if any, becomes the parent, with the new element as its right child.

The Pseudocode

initialize empty stack, root = null
for i in 0..n-1:
    create node(i) with value arr[i]
    last_popped = null

    while stack not empty and stack.top.val > arr[i]:
        last_popped = stack.pop()

    node(i).left = last_popped    // displaced right-spine becomes left subtree
    if stack not empty:
        stack.top.right = node(i) // attach as right child of new spine parent
    else:
        root = node(i)            // no parent: new global minimum, new root
    stack.push(node(i))

Step-by-Step Trace: `[3, 2, 6, 1, 9]`

i=0, val=3:
  Stack empty → root = node(0).  Stack: [3]
  Tree: node(0)

i=1, val=2:
  Top 3 > 2 → pop node(0), last = node(0). Stack empty.
  node(1).left = node(0).  root = node(1).  Stack: [2]
  Tree:    2
           /
           3

i=2, val=6:
  Top 2 < 6 → no pops.  node(1).right = node(2).  Stack: [2, 6]
  Tree:    2
          / \
         3   6

i=3, val=1:
  Top 6 > 1 → pop node(2), last = node(2).
  Top 2 > 1 → pop node(1), last = node(1).
  Stack empty.  node(3).left = node(1).  root = node(3).  Stack: [1]
  Tree:       1
             /
            2
           / \
          3   6

i=4, val=9:
  Top 1 < 9 → no pops.  node(3).right = node(4).  Stack: [1, 9]
  Final tree:   1
               / \
              2   9
             / \
            3   6

Five-panel build trace showing the Cartesian tree's right-spine stack state after each insertion of [3, 2, 6, 1, 9]; step 4 (val=1) triggers a cascade that makes 1 the new root The gold dashed ring marks the newly inserted node at each step. Notice step 4: inserting val=1 pops two elements and becomes the new root.

Why O(n)?

Look at the while loop. Across all n iterations of the outer for loop, the total number of stack pops is at most n. Why? Every element is pushed exactly once. An element can only be popped once (after it's popped, it's gone). So total pushes = n, total pops ≤ n. The while loop accounts for all those pops. Total operations across the entire algorithm: O(n).

Each element's entire lifecycle on the stack, across all iterations, costs O(1) amortized. This is the same argument that makes the Next Greater Element problem O(n).

Why Range Minimum = LCA

Claim: In the min-heap Cartesian tree of arr, RMQ(l, r) (the index of the minimum in arr[l..r]) equals LCA(l, r) (the lowest common ancestor of nodes l and r).

Proof:

Let m = argmin(arr[l..r]). We want to show node_m is the LCA of node_l and node_r.

Step 1: node_m is an ancestor of both node_l and node_r. The Cartesian tree is built recursively by always placing the minimum of a subrange at the root of the corresponding subtree. Since m is the minimum of arr[l..r], node m is the root of the subtree covering positions l through r. Both l and r are in this subtree, so they are descendants of m.

Step 2: No proper descendant of node_m can be an ancestor of both. When we split at m, positions less than m go to the left subtree and positions greater than m go to the right subtree. Since l < m (assuming m is strictly between l and r for the interesting case), node_l is in the left subtree. Since r > m, node_r is in the right subtree. Any descendant of node_m lives entirely in one of those subtrees and cannot be an ancestor of nodes in both.

Therefore node_m is the deepest common ancestor of node_l and node_r. QED.

RMQ equals LCA visualization: the Cartesian tree of [5, 1, 3, 8, 2, 6] with query RMQ(2,5) annotated; the root (idx=1) is out of range so we navigate right to node(idx=4), which is in [2,5] and is the answer Navigate from the root: if the current node's index falls below l, go right; above r, go left; in range, return it. That first in-range node is the LCA and the range minimum.

Querying the Tree

The first node whose index falls in [l, r], encountered from the root, is the LCA, which is the range minimum. Navigate down: if the current node's index is below l, go right; if above r, go left; otherwise return it.

function rmq(node, l, r):
    if node.idx < l:  return rmq(node.right, l, r)  // l is to the right
    if node.idx > r:  return rmq(node.left,  l, r)  // r is to the left
    return node.idx                                  // first node in [l,r] = LCA

This query runs in O(h) time, where h is the tree height. Average case O(log n) for random data. Worst case O(n) for sorted input.

For guaranteed O(1) queries, use the Farach-Colton-Bender pipeline: take an Euler tour of the Cartesian tree (produces a sequence of length 2n-1 where consecutive entries differ by exactly ±1 in depth), build a block-decomposed Sparse Table on that sequence, and each RMQ query becomes an O(1) LCA lookup on the tour. Total preprocessing is O(n). But for most practical purposes, the simple traversal is fast enough and far easier to implement.

Operation Costs

Operation	Time	Space	Notes
Build (stack)	O(n)	O(n)	Each element pushed and popped at most once
RMQ query	O(h)	O(1)	O(log n) avg, O(n) worst for sorted input
RMQ with Euler tour + Sparse Table	O(1)	O(n log n)	Full ⟨O(n), O(1)⟩ pipeline
Root (global min)	O(1)	O(1)	Always at the root
Inorder traversal	O(n)	O(h)	Recovers original array order
BST-style search by index	O(h)	O(1)	O(log n) avg, O(n) worst

Height is O(log n) expected for random input, O(n) worst case for sorted input. For a uniformly random permutation, the Cartesian tree has the same shape distribution as a random BST: expected height ~4.31 log₂ n (Reed 2003 proved the exact constant is 4.311). For sorted ascending input [1, 2, ..., n], the minimum is always at index 0, each right subtree's minimum is at its first element, and the tree degenerates into a right-leaning chain of height n-1. Reverse-sorted input gives a left-leaning chain of the same height.

Side-by-side comparison of Cartesian tree height for random input (balanced, O(log n)) versus sorted input (right-leaning chain, O(n)); the sorted case is labeled "basically a linked list" Random input produces a tree you can query efficiently. Sorted input produces a chain. Both are correct Cartesian trees. Only one is useful.

Feed a sorted array into the construction and you get a chain leaning all the way to the right. Technically a valid tree. Practically a linked list with extra steps. If your data is sorted or nearly sorted and you need fast RMQ, use a Sparse Table directly. It doesn't care about element order.

One Structure, Every Language

from __future__ import annotations
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    idx: int
    val: int
    left: Optional[Node] = field(default=None, repr=False)
    right: Optional[Node] = field(default=None, repr=False)

def build(arr: list[int]) -> Optional[Node]:
    """Build a min-heap Cartesian tree in O(n) time."""
    stack: list[Node] = []
    root: Optional[Node] = None
    for i, v in enumerate(arr):
        node = Node(idx=i, val=v)
        last: Optional[Node] = None
        while stack and stack[-1].val > v:
            last = stack.pop()
        node.left = last
        if stack:
            stack[-1].right = node
        else:
            root = node
        stack.append(node)
    return root

def rmq(node: Optional[Node], lo: int, hi: int) -> int:
    """Return index of minimum in arr[lo..hi]."""
    if node is None:
        return -1
    if node.idx < lo:
        return rmq(node.right, lo, hi)
    if node.idx > hi:
        return rmq(node.left, lo, hi)
    return node.idx

What Problems It Solves

Range minimum and maximum queries on static arrays. Build once in O(n), then answer each query in O(h). The pure-RMQ use case is often better served by a Sparse Table (simpler, handles O(1) query with O(n log n) preprocessing without any tree), but the Cartesian tree is the conceptual foundation and the only path to the optimal ⟨O(n), O(1)⟩ bound via LCA.

The Maximum Binary Tree (LeetCode 654). Build a tree where the root is the maximum of the array, the left subtree is built recursively on the left portion, and the right subtree on the right portion. This is exactly a max-heap Cartesian tree. Flip one comparison in the algorithm (pop while stack.top.val < arr[i] instead of >), and the O(n) algorithm applies directly. The naive recursive O(n²) solution is the standard answer given on LeetCode, but the O(n) stack algorithm is the interview-differentiating answer.

Sorting data with known structure. The Levcopoulos-Petersson algorithm sorts an array in O(n log k) comparisons, where k measures disorder (roughly: the average number of elements within any element's "natural neighborhood"). For nearly-sorted data, k is small and you get sub-O(n log n) performance. The algorithm uses a Cartesian tree as its core data structure, extracting minimums in heap order while maintaining a priority queue of candidates.

Suffix tree construction from suffix and LCP arrays. Given a string's suffix array and LCP array, the Cartesian tree of the LCP array encodes the internal nodes of the corresponding suffix tree. Each internal node in the suffix tree corresponds to a node in the LCP Cartesian tree: the subtree rooted at the LCP minimum covers the range of suffixes sharing that common prefix. This connection powers compressed suffix tree libraries that replace explicit suffix tree pointers with compact array representations.

The Treap. A Treap is a Cartesian tree that solved the sorted-input problem by cheating. The keys are BST search keys; the values are random priorities assigned at insertion. Random priorities mean no adversarial input can force a degenerate tree, so you get O(log n) expected height with no rebalancing, no rotations, and no color bits. Vuillemin described this in 1980. Aragon and Seidel independently rediscovered and popularized it in 1989. If you already understand Cartesian trees, you understand Treaps. The only difference is that priorities are random rather than taken from a fixed input sequence.

When to Reach for This

Five Signals That Point Here

1. Range minimum or maximum with many queries on a fixed array. If the problem gives you an array and asks you to answer many queries of the form "what is the min/max in range [l, r]?", the Cartesian tree is the conceptual answer. For implementation, Sparse Table is usually simpler.

2. Build a tree by repeatedly finding the min/max of a subarray. The problem explicitly asks you to construct a binary tree where the root is the min or max of the input and subtrees are built recursively. This is a Cartesian tree, and the O(n) stack algorithm applies.

3. Subarray structure governed by dominant elements. Problems where the largest or smallest element in each subarray "controls" the structure of that subarray often have a Cartesian tree at their core, even if the problem statement doesn't say "tree."

4. The problem reduces to "all nearest smaller values." Finding, for each element, the nearest smaller element to its left and right is exactly the parent-finding step in Cartesian tree construction. Problems asking for this pattern solve directly with the stack algorithm.

5. You need a balanced BST without rotation logic. If you want O(log n) BST operations and don't want to implement AVL or red-black trees, a Treap is the answer, and a Treap is a Cartesian tree with random priorities.

Worked Example 1: Maximum Binary Tree (LeetCode 654)

Problem: Given nums = [3, 2, 6, 1, 9], construct the Maximum Binary Tree. The root is the maximum of the array. The left subtree is the Maximum Binary Tree of the left portion, the right subtree is the Maximum Binary Tree of the right portion (both measured relative to the root's position).

Why Cartesian tree: This is the definition of a max-heap Cartesian tree. The maximum plays the role of minimum in a min-heap. The BST-on-indices property (in-order traversal returns original positions) is exactly the "left portion / right portion" split.

O(n) solution: Apply the stack algorithm with the comparison flipped: pop from the stack while stack.top.val < arr[i] (a larger element displaces smaller ones from the right spine).

def max_cartesian_tree(nums):
    stack = []
    root = None
    for i, v in enumerate(nums):
        node = Node(idx=i, val=v)
        last = None
        while stack and stack[-1].val < v:   # flip: pop SMALLER elements
            last = stack.pop()
        node.left = last
        if stack:
            stack[-1].right = node
        else:
            root = node
        stack.append(node)
    return root

Trace on [3, 2, 6, 1, 9]:

i=0, val=3:  stack=[3], root=node(0)
i=1, val=2:  2 < 3, no pops. node(0).right = node(1). stack=[3,2]
i=2, val=6:  6 > 2, pop node(1). 6 > 3, pop node(0). last=node(0).
             node(2).left = node(0). root=node(2). stack=[6]
i=3, val=1:  1 < 6, no pops. node(2).right = node(3). stack=[6,1]
i=4, val=9:  9 > 1, pop. 9 > 6, pop. last=node(2).
             node(4).left = node(2). root=node(4). stack=[9]

Diagram: Maximum Binary Tree of [3, 2, 6, 1, 9]

       node(4, val=9)         ← root = global maximum
       /
  node(2, val=6)
  /           \
node(0, val=3)  node(3, val=1)
     \
  node(1, val=2)

In-order traversal: 0, 1, 2, 3, 4. Values: 3, 2, 6, 1, 9. Original array recovered. ✓ Max-heap: 9 > 6 > 3, 9 > 6 > 1, 3 > 2. ✓

Worked Example 2: Range Minimum Queries

Problem: Given arr = [5, 1, 3, 8, 2, 6] and queries (0, 2), (1, 4), (2, 5), find the minimum in each range efficiently.

Why Cartesian tree: Multiple range minimum queries on a static array. Build the tree once, answer each query in O(h).

Build: The global minimum is arr[1] = 1 at index 1. Root is node(1). Left subtree covers index 0, right subtree covers indices 2 through 5. In the right subtree, minimum is arr[4] = 2 at index 4, which becomes that subtree's root. And so on.

Diagram: Cartesian tree of [5, 1, 3, 8, 2, 6]

               node(1, val=1)              ← global minimum
              /               \
       node(0, val=5)     node(4, val=2)   ← min of [3,8,2,6]
                          /             \
                   node(2, val=3)   node(5, val=6)
                        \
                    node(3, val=8)

Queries:

RMQ(0, 2): root.idx=1 ∈ [0,2]. Return 1. arr[1]=1. ✓
RMQ(1, 4): root.idx=1 ∈ [1,4]. Return 1. arr[1]=1. ✓
RMQ(2, 5): root.idx=1 < 2, go right. Now at node(4). idx=4 ∈ [2,5]. Return 4. arr[4]=2. ✓

Each query touched at most two nodes.

What Gets Subtle

Duplicate values break uniqueness. For sequences with repeated values, the two invariants no longer uniquely determine the tree. You need a stable tie-breaking rule: treat the first occurrence as strictly smaller, for example, or use a (value, index) pair for comparisons. Whichever rule you choose, apply it consistently in the stack algorithm's while loop condition.

The O(n) build uses a monotonic stack, not a heap. This trips people up every time. You're producing a heap-shaped tree, so you expect to need a heap. You don't. The heap shape emerges from the construction process. The stack works because you're building the right spine incrementally, not inserting into an existing heap. The amortized O(1) per element is the same argument used for the Next Greater Element problem: each element enters and exits the stack exactly once.

The height worst case matters for competitive programming. If your input is sorted or nearly sorted and you use Cartesian tree RMQ without LCA preprocessing, each query can hit O(n) nodes. On n=10^5 with 10^5 queries, that is 10^10 operations. Use Sparse Table for RMQ if you cannot guarantee random-order input.

Max-heap vs min-heap is a choice. The structure works identically for both. Min-heap Cartesian trees correspond to range minimum queries. Max-heap Cartesian trees correspond to range maximum queries and to problems like Maximum Binary Tree. Flip one comparison.

The Essentials

A Cartesian tree satisfies two invariants simultaneously: BST on indices (in-order traversal recovers original order) and heap on values (root is always the min or max).
The O(n) stack algorithm maintains the right spine as a monotonic stack. Every element is pushed once and popped at most once, giving amortized O(1) per element.
RMQ(l, r) = LCA(node_l, node_r) in the Cartesian tree. The LCA is the first node in [l, r] encountered from the root, because the minimum of any range must be an ancestor of all nodes in that range.
Height is O(log n) expected for random input, O(n) worst case for sorted input. Plan accordingly.
A Treap is a Cartesian tree with random priorities. Same structure, same O(n) build intuition, random priorities give O(log n) expected height for any key sequence.
The Cartesian tree of an LCP array encodes the internal nodes of the corresponding suffix tree.
LeetCode 654 (Maximum Binary Tree) is a direct Cartesian tree problem solved in O(n) with the stack algorithm.

Explaining the RMQ=LCA equivalence under time pressure is the kind of structural reasoning that separates strong candidates in FAANG-level interviews. SpaceComplexity runs voice-based mock interviews with rubric feedback on data structure problems like this, so you can practice the full explanation, not just the code.

For pure range queries where you want the simplest O(1) implementation, the Sparse Table handles idempotent range queries without any tree structure. For range queries where the underlying array changes, the Segment Tree gives O(log n) updates and queries. And if you want a balanced BST that uses the same structural idea as a Cartesian tree, the Treap is the direct application: same stack intuition, random priorities instead of fixed values.