Red-Black Tree: The Self-Balancing BST Every Ordered Map Depends On

May 26, 202645 min read
dsaalgorithmsinterview-prepdata-structures
Red-Black Tree: The Self-Balancing BST Every Ordered Map Depends On
TL;DR
  • Red-black tree: a BST with one color bit per node; five invariants guarantee height ≤ 2 log₂(n+1) and O(log n) worst-case for every operation
  • Insert fixup: at most 2 rotations per insert; Case 1 recolors and climbs, Cases 2 and 3 together terminate the loop immediately
  • Delete fixup: at most 3 rotations per delete; Case 2 propagates double-black up without rotating, Cases 1, 3, and 4 each use one rotation
  • Amortized O(1) rotations: any sequence of m inserts and deletes incurs O(m) total rotations, giving red-black trees the edge over AVL in write-heavy workloads
  • Ordered operations: floor, ceiling, range scan, and successor all run in O(log n); a hash map cannot answer these at any cost
  • In every major stdlib: Java TreeMap, C++ std::map, Linux CFS scheduler, and Java 8 HashMap bucket overflow chains are all red-black trees under the hood

You've already called a red-black tree today. Probably before breakfast. Every map[key] in C++, every TreeMap.get() in Java, every SortedDictionary in C# runs through one. The Linux scheduler uses it to pick which process runs next. Java 8's HashMap converts bucket overflow chains to it. C++'s std::map and std::set are literally wrappers around it. Without much argument, this is the most widely deployed balanced BST in existence.

A red-black tree is a BST where every node carries one extra bit, a color (red or black), and five invariants on that bit force the height to stay at most 2 log₂(n+1). Insert, delete, and search are all O(log n) in the worst case. Not amortized. Not average. The hard, unconditional worst case, every time.

Use one when you need ordered key-value storage with frequent updates, or operations like floor, ceiling, and range iteration. For plain lookup, a hash map is faster. For sorted order, this is your structure.


Thanos meme: "Fun isn't something one considers when balancing binary search trees"

At least it's O(log n). Guaranteed.


Where It Came From

Rudolf Bayer invented a precursor in 1972: the "symmetric binary B-tree," a binary tree encoding of a B-tree of order 4. Every node stored between 1 and 3 keys. The structure was perfectly balanced by construction, but it was not a binary search tree.

Leonidas Guibas and Robert Sedgewick translated this idea into a proper BST in their 1978 FOCS paper "A Dichromatic Framework for Balanced Trees." The coloring scheme they introduced represented the 2-3-4 tree structure in binary form: a red node and its black parent together represent a single multi-key B-tree node.

Why red? Because the color laser printer at Xerox PARC printed red beautifully on paper. That is the whole story. A data structure now running inside every major operating system, every standard library, and roughly a billion lines of production code was named after what looked good on an office printer in 1978.

Sedgewick revisited the structure in 2008 with the Left-Leaning Red-Black Tree (LLRB), a variant that adds one more constraint: red nodes can only be left children. The implementation is shorter but carries subtle bugs that took years to notice. The standard form described here, sometimes called the "symmetric" variant, is what you find in Java's standard library, the Linux kernel, and GCC's STL.


Five Invariants You Cannot Break

A red-black tree is a BST where every node has a color field, and the tree must satisfy all five of these properties at all times:

  1. Every node is either red or black.
  2. The root is black.
  3. Every leaf (the sentinel NIL nodes at the boundary of the tree) is black.
  4. If a node is red, both its children are black. (No two consecutive red nodes on any path.)
  5. For any node, all simple paths from that node to descendant leaves contain the same number of black nodes. This count is called the black-height.

Properties 4 and 5 together are what force balance. Property 5 says every path to a leaf has the same number of black nodes. Property 4 says you can't put two reds in a row. So the longest possible path (alternating red-black) is at most twice the shortest possible path (all black). The tree can never become lopsided enough to degrade toward O(n). One extra bit per node is buying you all of this.

A valid red-black tree: every root-to-NIL path crosses exactly 2 black nodes, no red node has a red child

Every path from root to a NIL leaf crosses exactly 2 black nodes. Count them.


What Is Actually in Memory

Each node stores:

FieldTypeSize (64-bit)
keyvariesvaries
valuevariesvaries
colorbool/enum4 bytes (usually)
leftpointer8 bytes
rightpointer8 bytes
parentpointer8 bytes

That is roughly 32 bytes of overhead per node beyond the key and value, versus 24 bytes for a plain BST (no color field).

The color bit is often free. Because all allocated heap objects are aligned to at least 8 bytes on a 64-bit system, the low 3 bits of any pointer are always zero. The Linux kernel's rbtree.h packs the color into the low bit of the parent pointer. One machine word stores both the parent address and the color. The red-black tree adds literally zero memory overhead over a plain BST.

Most implementations use a sentinel NIL node instead of NULL pointers. There is one shared, statically allocated black node. Every "leaf" pointer and every boundary pointer points to this sentinel. This eliminates null-checks in the rotation code: you can always dereference a pointer because the worst that happens is you read the sentinel's color (which is always black). The sentinel's own left, right, and parent pointers all point back to itself.

Sentinel NIL node memory layout: one shared instance, all three pointers point back to itself, always returns BLACK

The sentinel eliminates every null-check in rotation code. Every "leaf" pointer routes here.


The Rotation: One O(1) Shape Change

A rotation rewires three pointers. It preserves the BST ordering property while changing the tree shape. Every fixup operation calls rotations to repair invariant violations.

Left rotation at node x moves y up, x down. B slides from y's left to x's right. The BST ordering A ≤ x ≤ B ≤ y ≤ C holds before and after.

Left rotation at node x: y rises, x descends, B transfers from y.left to x.right, 5 pointer assignments total

Right rotation is the exact mirror. Both cost O(1): five pointer assignments regardless of tree size.


What Every Operation Costs

OperationTimeSpaceNotes
SearchΘ(log n)O(1)BST walk, always worst case
InsertO(log n)O(log n)≤ 2 rotations, O(log n) recolorings
DeleteO(log n)O(log n)≤ 3 rotations, O(log n) recolorings
Minimum/MaximumO(log n)O(1)Leftmost or rightmost node
Successor/PredecessorO(log n)O(1)In-order neighbors
Floor/CeilingO(log n)O(1)Largest key ≤ query, smallest ≥

All logarithms are base 2. The space column is for recursive implementations (call stack depth); an iterative version uses O(1) auxiliary space.

Why Search Is Always O(log n), Worst Case

This follows from the height bound. CLRS Lemma 13.1 states: a red-black tree with n internal nodes has height at most 2 log₂(n+1).

The proof has two steps.

First, show that any subtree rooted at node x contains at least 2^bh(x) - 1 internal nodes, where bh(x) is the black-height. The base case: a NIL leaf has bh = 0 and 0 internal nodes, and 2^0 - 1 = 0. For the inductive step: each child of x has black-height at least bh(x) - 1 (at least, because if the child is red, its black-height equals x's; only a black child decreases it by one). By induction, each child subtree has at least 2^(bh(x)-1) - 1 internal nodes. The whole subtree has at least 1 + 2*(2^(bh(x)-1) - 1) = 2^bh(x) - 1.

Second, since no two consecutive nodes on a path can be red (property 4), at least half the nodes on any root-to-leaf path are black. So the black-height of the root is at least h/2, where h is the tree height.

Combining: n ≥ 2^(h/2) - 1, which gives h ≤ 2 log₂(n+1).

Search is just a BST walk: at each node, go left if key < node.key, right otherwise. Since height ≤ 2 log₂(n+1), this terminates in O(log n) steps.

Why Insert Is O(log n) With at Most Two Rotations

Insertion has two phases. First, standard BST insertion: walk the tree to find the insertion point in O(log n) steps, insert the new node colored RED. Coloring it red preserves property 5 (black-height unchanged), but may violate property 4 if its parent is also red.

The fixup loop, insertFixup, runs while the new node z's parent is red. There are three cases, distinguished by the color of z's uncle (z's parent's sibling).

Insert fixup three cases: Case 1 recolors and climbs, Case 2 rotates into Case 3, Case 3 rotates and terminates, at most 2 rotations total

The uncle's color is the branch. Red uncle: recolor and repeat. Black uncle: one or two rotations and done.

Case 1: Uncle is RED.

Before:                        After:
      B:G                          R:G   ← z moves here
     /   \                        /   \
   R:P   R:U           →        B:P   B:U
   /                             /
 R:z                           R:z

Recolor parent P and uncle U to black, grandparent G to red. Move z up to G and repeat the loop. Black-heights are unchanged because P and U together absorb the "extra" black that G gives up. The loop may continue for O(log n) iterations climbing up the tree.

Case 2: Uncle is BLACK, z forms a "triangle" with parent and grandparent.

Before:          After Case 2:     → Case 3
    B:G               B:G
   /   \             /   \
 R:P   B:U  →     R:z   B:U
   \               /
   R:z           R:P

Rotate at z's parent (left-rotate here since z is the right child). This converts Case 2 into Case 3. One rotation, no change in loop counter.

Case 3: Uncle is BLACK, z forms a "line" with parent and grandparent.

Before:           After Case 3:
    B:G               B:P
   /   \             /   \
 R:P   B:U  →     R:z   R:G
 /                         \
R:z                        B:U

Rotate at grandparent G (right-rotate here). Swap colors of P and G. The while loop terminates because z's new parent (P) is now black.

The rotation bound: Case 1 does no rotations and moves z up by 2. Cases 2 and 3 together do at most 2 rotations and always terminate the loop. So insertion does at most 2 rotations, period. The O(log n) work is entirely in Case 1 recolorings, which are single-bit writes.

After the loop, force the root to black. This is always safe and restores property 2 if Case 1 propagated all the way up.

The Deletion Dance: at Most Three Rotations

Deletion is where the fun stops. The algorithm has four sub-cases that hinge on the color of your node's sibling and both of the sibling's children. Grit your teeth. The outline:

  1. BST deletion. If the target has two children, replace it with its in-order successor (the minimum of its right subtree) and delete the successor instead. The successor has at most one (right) child.
  2. Splice out the node by connecting its child to its parent.
  3. If the spliced node was RED, nothing breaks. Black-height is unchanged, no consecutive reds introduced. Done.
  4. If the spliced node was BLACK, one path lost a black node. The replacement node (its child, possibly the NIL sentinel) is conceptually "double-black." deleteFixup must restore property 5.

deleteFixup loops while x is double-black and not the root. It handles four cases based on the color of x's sibling w and w's children.

Case 1: Sibling w is RED. Rotate at x's parent, swap parent's and w's colors. Now w is black. Reduces to Cases 2, 3, or 4. One rotation.

Case 2: w is BLACK, both w's children are BLACK. Recolor w to red. The double-black propagates up to x's parent. May repeat the loop. No rotations.

Case 3: w is BLACK, w's near child is RED, far child is BLACK. Rotate at w, swap w's and near child's colors. Reduces to Case 4. One rotation.

Case 4: w is BLACK, w's far child is RED. Rotate at x's parent. Copy parent's color to w, set parent and far child to black. Done. One rotation.

Cases 1, 3, and 4 each contribute at most one rotation. So deletion does at most 3 rotations. Case 2 propagates upward without rotating. The O(log n) work is again in the recolorings.

The Amortized Insight: Rotations Are O(1) per Operation

This is the result that makes the implementation worth the complexity. In any sequence of m insertions and deletions on a red-black tree, the total number of rotations is O(m). Amortized O(1) per operation.

Recolorings cost O(log n) for a single operation in the worst case, but each recoloring is a single-bit write. On modern hardware, bit writes are cheap enough to treat as noise. The structural changes, the pointer rewirings that actually touch cache lines, are bounded by a constant per operation.

By comparison, AVL tree deletion may require O(log n) rotations in a single operation. The balance factor violation can propagate all the way to the root, triggering a rotation at each level. In a write-heavy workload, a red-black tree does O(m) total rotations versus O(m log n) for AVL.

This is why Java, C++, and the Linux kernel all chose red-black over AVL. For read-heavy workloads, AVL is worth considering: its worst-case height is 1.44 log₂(n+2), about 28% shorter than a red-black tree's 2 log₂(n+1), which matters for cache performance. For mixed or write-heavy workloads, the amortized O(1) rotation bound makes red-black the right default.


One Structure, Every Language

All implementations below show insert (with full fixup) and search. Deletion follows the same pattern using the four cases above. The complete algorithm is in CLRS Chapter 13. Nobody writes this from scratch in production, but understanding the structure helps you use the wrapper types correctly.

from enum import Enum, auto class Color(Enum): RED = auto() BLACK = auto() class Node: def __init__(self, key, value=None): self.key = key self.value = value self.color = Color.RED self.left = self.right = self.parent = None class RBTree: def __init__(self): self.NIL = Node(None) self.NIL.color = Color.BLACK self.NIL.left = self.NIL.right = self.NIL.parent = self.NIL self.root = self.NIL def _left_rotate(self, x): y = x.right x.right = y.left if y.left is not self.NIL: y.left.parent = x y.parent = x.parent if x.parent is self.NIL: self.root = y elif x is x.parent.left: x.parent.left = y else: x.parent.right = y y.left = x x.parent = y def _right_rotate(self, y): x = y.left y.left = x.right if x.right is not self.NIL: x.right.parent = y x.parent = y.parent if y.parent is self.NIL: self.root = x elif y is y.parent.right: y.parent.right = x else: y.parent.left = x x.right = y y.parent = x def insert(self, key, value=None): z = Node(key, value) z.left = z.right = z.parent = self.NIL y, x = self.NIL, self.root while x is not self.NIL: y = x x = x.left if z.key < x.key else x.right z.parent = y if y is self.NIL: self.root = z elif z.key < y.key: y.left = z else: y.right = z self._insert_fixup(z) def _insert_fixup(self, z): while z.parent.color == Color.RED: if z.parent is z.parent.parent.left: y = z.parent.parent.right # uncle if y.color == Color.RED: # Case 1 z.parent.color = Color.BLACK y.color = Color.BLACK z.parent.parent.color = Color.RED z = z.parent.parent else: if z is z.parent.right: # Case 2 z = z.parent self._left_rotate(z) z.parent.color = Color.BLACK # Case 3 z.parent.parent.color = Color.RED self._right_rotate(z.parent.parent) else: y = z.parent.parent.left # uncle (mirror) if y.color == Color.RED: # Case 1 z.parent.color = Color.BLACK y.color = Color.BLACK z.parent.parent.color = Color.RED z = z.parent.parent else: if z is z.parent.left: # Case 2 z = z.parent self._right_rotate(z) z.parent.color = Color.BLACK # Case 3 z.parent.parent.color = Color.RED self._left_rotate(z.parent.parent) self.root.color = Color.BLACK def search(self, key): x = self.root while x is not self.NIL: if key == x.key: return x.value x = x.left if key < x.key else x.right return None

What a Red-Black Tree Actually Gets You

Ordered maps and sets with O(log n) updates. This is what std::map, TreeMap, and every other standard ordered-map type actually are. Java's TreeMap and TreeSet, C++'s std::map, std::set, std::multimap, and std::multiset are all red-black trees. They give you sorted iteration, O(log n) insert/delete/lookup, and all the ordered query operations that hash maps cannot provide.

Floor and ceiling queries. Given a key k, find the largest key ≤ k (floor) or smallest key ≥ k (ceiling). A hash map cannot do this at any cost. A red-black tree does it in O(log n) by walking the tree and tracking the best candidate seen so far.

Range scans. Iterate over all keys in [a, b] in sorted order. Walk to the leftmost key ≥ a in O(log n), then in-order traverse until you exceed b. Total cost O(log n + k) where k is the number of results.

The Linux scheduler. The Completely Fair Scheduler stores every runnable task as a node in a red-black tree, keyed by virtual runtime (vruntime). The leftmost node is always the task that has run least. pick_next_task_fair() calls rb_first() to grab it in O(1) (the leftmost node is cached). When a task's vruntime changes, it is repositioned via rb_erase() + rb_insert_color(). The kernel also uses red-black trees for virtual memory areas (VMAs), high-resolution timers, the epoll event subsystem, and network packet scheduling.

Java HashMap collision chains. Since Java 8, when any hash bucket accumulates at least 8 entries (TREEIFY_THRESHOLD), Java converts that bucket's linked list into a red-black tree. Lookup in that bucket drops from O(n) to O(log n), preventing the O(n) worst case that plagued Java 7 under adversarial hash collision attacks.

Augmented BSTs. A red-black tree can be augmented to store extra information per node (CLRS Chapter 14). The interval tree is the canonical example: store the maximum endpoint in any node's subtree, and you can query "which intervals overlap [a, b]?" in O(log n). The Linux kernel does exactly this for VMA management via its maple_tree (a B-tree successor) and previous rbtree implementations.


When Do You Actually Need One?

In practice, you almost never instantiate a red-black tree directly. You call TreeMap, std::map, or SortedDictionary, and the tree sits underneath, invisible. The real question is when to reach for the ordered map over a hash map. You want one when:

  1. You need sorted order preserved across inserts and deletes. If you only need lookup, use a hash map.
  2. You need floor, ceiling, successor, or predecessor queries. These are O(n) for hash maps, O(log n) here.
  3. You need range iteration in O(k + log n), not O(n) plus filtering.
  4. You need a balanced BST with frequent mutations. AVL trees are fine for read-heavy workloads; red-black trees are the right default for mixed or write-heavy ones.
  5. You are augmenting a BST with per-node aggregate information (max, min, sum in subtree). The CLRS augmentation theorem says: if a field is computable from a node and its two children in O(1), you can maintain it through rotations without changing the O(log n) bounds.

Worked Example 1: Count of Smaller Numbers to the Right

Problem: Given an array nums, return an array counts where counts[i] is the number of elements to the right of nums[i] that are smaller than nums[i].

LeetCode 315. The brute force is O(n²). The standard solution uses merge sort, but the red-black tree approach is more direct.

Walk the array right to left. Maintain an order-statistics tree (a red-black tree augmented with subtree size). For each element, query "how many elements currently in the tree are less than nums[i]?" which is the rank of nums[i] minus one. Then insert nums[i].

Each insert and rank query is O(log n). Total: O(n log n). The critical property is that you need both ordered insertion and rank queries. A hash map cannot give you rank. A sorted array can give you rank (binary search) but not O(log n) insertion. A red-black tree gives you both.

Worked Example 2: Interval Point Queries via Floor

Problem: Given a sorted array of intervals, for each query [l, r] find the interval that contains l and report its right endpoint.

Build a TreeMap keyed by interval start. For a query point l, call floorKey(l) to find the last interval that started ≤ l. Check if that interval's endpoint ≥ l. If yes, return it. If no, no interval contains l.

The floor operation is the reason you use an ordered tree here. A hash map requires scanning all intervals. A red-black tree finds the answer in O(log n) per query.

This pattern: "find the largest key that satisfies a monotone condition on keys" is precisely what floor and ceiling operations provide. Every time you see that shape, you want an ordered map backed by a red-black tree.


Before Your Interview: The Short Version

  • A red-black tree is a BST with one extra bit per node. Five invariants on that bit guarantee height ≤ 2 log₂(n+1).
  • The height bound follows from a two-step proof: subtrees contain exponentially many nodes relative to black-height, and no consecutive reds means black-height ≥ height/2.
  • Search is O(log n) worst case. Insert and delete are O(log n) worst case.
  • Insertion does at most 2 rotations. Deletion does at most 3. The O(log n) work is recolorings, which are single-bit writes.
  • In any sequence of m operations, the total rotation count is O(m) (amortized O(1) per operation).
  • AVL trees have shorter height (1.44 log n vs 2 log n) but may need O(log n) rotations per deletion. Red-black trees win in write-heavy workloads.
  • Every major language's ordered map/set is a red-black tree under the hood. Java 8's HashMap falls back to one for large buckets.
  • Signal phrases: "sorted iteration," "floor/ceiling," "range query," "augmented BST."

Knowing the structure is one thing. Explaining why you would use it, on the spot, in a live interview, while someone is watching you think, is a different skill entirely. SpaceComplexity runs voice-based mock interviews where you walk through data structure tradeoffs exactly like this, then get rubric-based feedback on how well your explanation landed.

For more on how red-black trees compare to other self-balancing BSTs, see AVL Tree vs Red-Black Tree: Ordered O(log n), No Exceptions. For the plain BST invariant that red-black trees build on, see The Binary Search Tree: One Invariant, Every Operation. For when to choose a hash map over an ordered tree, see Hash Map Time Complexity: How O(1) Really Works.


Further Reading