What Is a B-Tree? The Data Structure Behind Every Database Index

Nobody knows what the "B" in B-tree stands for. Balanced, bushy, Boeing, Bayer: Rudolf Bayer and Ed McCreight invented the structure in 1970 and never said. Which is, honestly, kind of perfect for a data structure that silently powers every database you have ever touched.

The real question they answered: how do you find a record in a billion-row table in three storage reads or fewer?

The Problem B-Trees Were Built to Solve

A binary search tree gives you O(log n) search. For a billion records, that's about 30 traversals. Sounds fast.

But databases store data on disk, not in RAM. An in-memory lookup takes nanoseconds. A spinning disk read takes 5 to 10 milliseconds. An NVMe SSD is faster at around 70 microseconds per random read. Still thousands of times slower than memory.

Latency Numbers Every Programmer Should Know, disk I/O is four to seven orders of magnitude slower than L1 cache

When people say "disk is slow," this chart is what they mean. That millisecond gap between RAM and rotational disk is why B-trees exist.

In a binary search tree, each node holds one key. A billion-row database needs a billion nodes, and finding a single row means traversing about 30 of them, each a separate storage access. Thirty disk reads per query. Your users will notice.

The fix is to make each node much wider. If one node holds 1,000 keys instead of 1, the tree needs far fewer levels. A billion records collapse into a tree of height 3. Three storage reads instead of thirty.

That's a B-tree.

What a B-Tree Actually Is

A B-tree is a self-balancing search tree where each node can hold multiple keys and point to multiple children. The order of the tree, m, defines the maximum number of children a node can have. Each node therefore holds up to m-1 keys.

An internal node in an order-5 B-tree looks like this:

[14 | 33 | 58 | 82]
  |    |    |    |    |
 c0   c1   c2   c3   c4

Every key in child c0 is less than 14. Every key in c1 falls between 14 and 33. Every key in c2 falls between 33 and 58. And so on. It's the BST invariant stretched across multiple keys per node. Each child pointer leads to another node with the same structure, all the way down to the leaves.

The Four Invariants

A B-tree of order m follows four rules:

Every node has at most m children and m-1 keys.
Every non-root internal node has at least ⌈m/2⌉ children. Nodes must stay at least half full.
The root has at least 2 children, unless it is a leaf.
All leaf nodes sit at the same depth.

That fourth rule is the key to everything. All leaves at the same level means every search path has exactly the same length. The tree stays balanced not through rotations (like AVL trees or red-black trees) but through splits and merges: when a node overflows, it splits and pushes the median key upward; when it underflows, it borrows from a sibling or merges.

The tree grows taller only when the root splits. It shrinks only when the root's two children merge. No rotations. No repainting. Just clean, dignified arithmetic.

Building One: A Concrete Example

Let's build an order-3 B-tree. Order 3 means at most 2 keys and 3 children per node. A node with 3 keys overflows and must split.

Start empty. Insert 10, 20, 30.

After 10 and 20, the root holds [10 | 20]. Valid.

Insert 30: the root now has 3 keys, which overflows. Split it. The median (20) gets promoted to a new root, and the remaining keys become two children.

      [20]
     /    \
  [10]   [30]

Insert 5 and 15. Insert 5 into the left child: [5 | 10]. Fine. Insert 15 into the same node: [5 | 10 | 15]. Overflow. Split, promote the median (10) upward.

    [10 | 20]
   /    |    \
 [5]  [15]  [30]

Insert 25. The root routes us right (25 > 20), landing in [30]. Insert: [25 | 30]. Valid.

    [10 | 20]
   /    |    \
 [5]  [15]  [25 | 30]

Six values. Height 2. Every leaf at the same depth. No rotations anywhere. The tree just rearranged itself by pushing one key up and splitting one node. That's the whole mechanism.

How Search Works

At each node, scan the keys to find where the target falls, then follow the matching child pointer.

def search(node, key):
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1

    if i < len(node.keys) and node.keys[i] == key:
        return (node, i)  # found

    if node.is_leaf:
        return None  # not in the tree

    return search(node.children[i], key)

With m-1 keys per node, you do at most m-1 comparisons per level. For large nodes, use binary search within the node to keep this to O(log m) comparisons per level. Either way, the number of levels visited is O(log_m n).

With order 1000 and a billion records: log_1000(1,000,000,000) = 3. Three node accesses. A balanced BST over the same data needs log_2(1,000,000,000) ≈ 30. The branching factor makes all the difference.

Insert and Delete

Insert pushes a key down to the right leaf, then fixes any overflow upward:

Follow the search path to the correct leaf.
Insert the key (keeping the node sorted).
If the leaf overflows, split it: push the median key up to the parent and divide the remaining keys into two nodes.
If the parent overflows, split it too. Keep going.
If the root splits, create a new root. This is the only time the tree grows taller.

Delete has three cases:

The key is in a leaf and the leaf stays at least half-full after deletion. Just delete it.
The key is in an internal node. Replace it with the in-order successor or predecessor (which lives in a leaf), then delete from that leaf.
Deletion causes underflow. Fix by borrowing a key from a sibling (a rotation), or by merging the node with a sibling and pulling a key down from the parent.

Merges can cascade upward just as splits do during insertion. If the root ends up empty, its only child becomes the new root. Both operations run in O(log n). The tree never gets out of balance because balance is enforced at every step, not repaired afterward.

Time and Space Complexity

Operation	Time Complexity
Search	O(log n)
Insert	O(log n)
Delete	O(log n)
Min / Max	O(log n)
Range scan	O(log n + k)

Space: O(n). Because nodes stay at least half full, space utilization sits between 50% and 100%.

Height: O(log_m n). In MySQL InnoDB's default configuration, each node (page) is 16 KB. With 8-byte integer keys and 8-byte child pointers, a single node holds over 1,000 keys. A three-level B-tree can store roughly 1,000 × 1,000 × 1,000 = one billion records. That's the practical payoff: a constant number of storage reads regardless of table size.

Why Production Databases Use B+ Trees

Almost every production database uses a B+ tree, not a plain B-tree. The difference is where data lives.

In a plain B-tree, every node stores actual row data alongside keys. In a B+ tree, internal nodes hold only routing keys. All actual data lives in the leaf nodes. The leaves link together in a doubly-linked list.

B+ tree structure: internal nodes route with keys only, leaf nodes hold data and chain together for range scans

Because internal nodes carry no data, they fit more routing keys per page, making the tree shallower. Range queries also become a linked-list walk: find the starting key in O(log n), then follow the leaf list to collect the next k records without going back to the root. A query like WHERE age BETWEEN 25 AND 35 is fast because the database finds 25 once and walks forward.

MySQL InnoDB, PostgreSQL, SQLite, and most file systems (NTFS, ext4, HFS+, Btrfs) use B+ trees. When someone says "database index" in a system design interview, they mean a B+ tree.

How it feels writing SQL directly against a well-indexed B+ tree instead of fighting an ORM

Direct SQL hits the B+ tree leaf list exactly once. The ORM generates seven joins and three subqueries.

For a full breakdown of both variants, see the B-Tree and B+ Tree deep dive. If you want the direct comparison with binary search trees, B-tree vs Binary Search Tree covers that specifically.

What You Actually Need for Interviews

In coding rounds, you will almost never implement a B-tree from scratch. The full insert and delete logic is too involved for a 45-minute session, and interviewers know it.

In system design rounds, you will need to explain the tradeoffs:

Why B+ trees over hash maps? Hash maps give O(1) lookup but don't support range queries. B+ trees handle both point lookups and range scans.
Why B+ trees over BSTs? BSTs put one key per node. For a billion records, that's 30 storage accesses per lookup. B+ trees collapse that to 3.
What makes B-trees balanced? All leaves at the same depth. Non-root nodes stay at least half full. Splits on overflow, merges on underflow. No rotations at all.
What's the write cost? Each insert or delete can trigger a chain of splits or merges up the tree, but the number of nodes touched is O(log n) in the worst case. Databases manage write amplification by batching changes and using techniques like copy-on-write.

These questions come up whenever you design anything that touches persistent storage: relational databases, file systems, key-value stores. Database indexing for system design interviews walks through how indexes appear in full design problems.

Explaining B-tree tradeoffs under pressure is a different skill from reading about them. SpaceComplexity runs voice-based mock interviews where you defend architectural decisions in real time and get rubric-based feedback on whether your explanation actually lands.