B-tree vs Binary Search Tree: Why One Node Holds a Thousand Keys

- B-tree nodes hold up to 2t-1 keys each; a balanced BST holds exactly 1, making B-trees 7-10x shallower for billion-row datasets.
- Disk I/O, not comparison count, is what separates them: a billion-row BST needs ~30 page reads per query, a B-tree with t=400 needs ~4.
- Height bound h ≤ log_t((n+1)/2) means four levels cover a billion records at branching factor 400.
- Node splits keep all leaves at the same depth without rotations; the tree grows taller only when the root splits.
- B+ tree adds a linked leaf layer so range scans become one root-to-leaf traversal plus cheap sequential page reads, far better than BST inorder traversal on disk.
- A B-tree with t=2 is a 2-3-4 tree, isomorphic to a red-black BST, showing the two structures are the same family at different branching factors.
- Cache locality explains why Rust's
BTreeMapuses a B-tree with branching factor 6 even for purely in-memory data.
Both structures search ordered data in O(log n). The B-tree vs binary search tree question comes down to which base that log uses, and why that base turns a fast in-memory query into a 300-millisecond embarrassment the moment your data moves to disk.
A binary search tree node holds one key. A B-tree node holds hundreds. That isn't an implementation quirk. It's the whole design, chosen deliberately to match the cost model of the storage layer underneath.
A Balanced BST Makes Thirty Disk Reads
A binary search tree is satisfying to reason about. Each node holds one key and two child pointers: left for values smaller than the key, right for larger ones. Finding a target means comparing and descending, halving the search space at every step. With n nodes and a balanced tree, you do at most log₂ n comparisons.
For a million records, that's about 20 comparisons. For a billion, it's 30.
Thirty sounds small until you think about where those thirty comparisons happen.
The problem shows up when those 30 comparisons happen across 30 different disk pages.
If the tree sits in RAM, 30 comparisons take a few hundred nanoseconds and you never think about it. But the moment your dataset outgrows physical memory, each pointer follow can trigger a disk seek. Hard drives have roughly 10 ms of random seek latency. Thirty of those in sequence is 300 ms per query. The algorithm is correct and the complexity is technically O(log n), but the constant factor has become catastrophic. Your users experienced this as "the app is slow." They're not wrong, and they're not coming back.
One Node, Many Keys
A B-tree trades width for height. Instead of one key per node, a B-tree node holds many keys packed into a sorted array, with one extra child pointer per key plus one.
The structure is defined by a parameter called the minimum degree t. In any B-tree:
- Each node (except the root) holds between t-1 and 2t-1 keys
- Each internal node has between t and 2t children
- Every leaf sits at the same depth
So a node is just a sorted array of keys with matching child pointers on either side. To search, you binary-search the key array within the node, then follow the appropriate child pointer down.
Left: BST node, one key, two children. Right: B-tree node, up to 2t-1 keys and 2t children, each node one disk page.
With t = 400 (close to what InnoDB achieves on its default 16 KB pages with 8-byte integer keys), a single B-tree node holds up to 799 keys and 800 children. InnoDB is not showing off. It is matching node size to disk page size, which is the kind of pragmatic engineering decision that textbooks skip because it requires knowing that storage hardware exists. A three-level tree can index roughly 400³ = 64 million rows. Add a fourth level and you cover 25 billion.
A balanced BST holding 64 million nodes would be roughly 26 levels deep. The B-tree is 3. Same data, a factor of nine fewer node accesses.
For a billion records: BST needs roughly 30 levels, B-tree needs four. Every level is one disk read at 10 ms each.
Why Four Levels Covers a Billion Records
You can derive the bound directly. Ask: for a B-tree of height h, what is the minimum number of keys it must contain?
- The root has at least 1 key
- Level 1 has at least 2 nodes, contributing at least 2(t-1) keys
- Level 2 has at least 2t nodes, contributing at least 2t(t-1) keys
- Level k has at least 2t^(k-1) nodes, contributing at least 2t^(k-1)(t-1) keys
Sum from root to level h:
n ≥ 1 + 2(t-1) × (1 + t + t² + ... + t^(h-1))
= 1 + 2(t-1) × (t^h - 1) / (t - 1)
= 1 + 2(t^h - 1)
= 2t^h - 1
Rearranging gives the height bound:
h ≤ log_t((n + 1) / 2)
This is the result that makes B-trees practical. Plug in t = 400 and n = 10⁹:
h ≤ log_400(5 × 10⁸)
= log(5 × 10⁸) / log(400)
≈ 8.70 / 2.60
≈ 3.3
Four levels cover a billion records. A balanced BST requires 30. The B-tree is 7.5x shallower with this branching factor. You can explain this to a skeptic who has only ever worked in memory and they will not believe it until they benchmark it themselves.
Same Comparisons, Fewer Disk Reads
| Operation | BST (balanced) | B-tree |
|---|---|---|
| Total comparisons | O(log₂ n) | O(log n) (same asymptotically) |
| Disk I/Os | O(log₂ n) | O(log_t n) |
| Balance mechanism | Rotations (AVL/RB) | Node splits and merges |
| Space | O(n) | O(n) |
The comparison counts are asymptotically identical. A B-tree does O(log t) comparisons per node to binary-search the key array, multiplied by O(log_t n) levels, which works out to O(log n) total. The BST does one comparison per level across O(log₂ n) levels.
The difference is in disk I/Os. A BST traversal to a leaf touches O(log₂ n) nodes. A B-tree traversal touches O(log_t n) nodes, each of which is one disk page read. With t = 400, log_t n is about 7.5x smaller than log₂ n. That 7.5x is on disk, where each unit costs 10 ms.
CPU work costs nanoseconds. Disk I/O costs milliseconds. Optimizing disk reads at the expense of extra CPU is always the right trade when the dataset lives on storage. CPU cycles are cheap. Disk seeks are not.
Balance Without Rotations
Self-balancing BSTs like AVL trees and red-black trees stay at O(log n) height by rotating nodes after insertions and deletions. See AVL trees vs. red-black trees for the mechanics. Each rotation fixes a local imbalance and cascades at most O(log n) steps up the tree.
B-trees stay balanced differently. They never rotate. Instead, all leaves always sit at the same depth, guaranteed by the split and merge operations.
When you insert into a B-tree and encounter a full node on the way down, you split it immediately. The median key promotes to the parent, and the two halves become separate nodes. This is CLRS's "proactive split" approach: handle fullness on the descent so the insertion itself never needs to backtrack up the tree.
Node split: the median key rises to the parent. Two half-size children replace the full node. No rotations needed.
The tree grows taller only when the root splits, and when the root splits, every path gains one level simultaneously. No rotations. No cascade. No awkward post-insertion repair that backtracks up the tree. The invariant, all leaves at the same depth, is maintained by construction.
Range Queries Expose the Real Gap
Find all values between 500 and 900. On a BST, you traverse to the node containing 500, then walk the inorder traversal until you exceed 900. Every downward step to a new subtree costs a potential cache miss or disk read.
Real database implementations use B+ trees, a variant where all data lives in the leaf layer and leaves are linked in a sorted doubly-linked list. A range scan becomes two phases:
- Tree traversal from root to the leaf containing 500 (3-4 disk reads, one per level)
- Sequential scan forward along the leaf chain until you pass 900
Sequential reads are far cheaper than random seeks. On an HDD, sequential throughput can be 100x faster than random I/O because the disk head does not need to reposition. On an SSD, the gap is smaller but sequential prefetching still applies. The operating system's read-ahead mechanism will load the next leaf pages before you even ask for them, which is about as close to free as disk I/O gets.
A BST has no linked-leaf optimization. To enumerate a range, it repeatedly descends from intermediate nodes to subtrees. The traversal is correct but wastes the sequential locality that B+ trees exploit deliberately.
The Part Most People Miss
A B-tree with minimum degree t = 2 is exactly a 2-3-4 tree. Nodes hold 1 to 3 keys and have 2 to 4 children. A 2-3-4 tree is, in turn, isomorphic to a red-black BST: every 2-3-4 node maps to 1, 2, or 3 BST nodes connected by red edges.
So the BST isn't a different species from the B-tree. It's the t = 2 case. Someone set the minimum degree to 2 and then gave it a different name and a separate Wikipedia page.
This also explains why Rust's standard library BTreeMap uses a B-tree with branching factor 6 rather than a red-black tree for its ordered map. At in-memory scale, a node with 11 keys fits in a handful of cache lines. Touching fewer cache lines per query beats the asymptotically equivalent but pointer-heavier red-black tree. The argument from disk still holds, just applied to L1 vs L2 vs DRAM instead of RAM vs HDD.
# One level of B-tree descent def search_node(node, target): # Binary search within the sorted key array lo, hi = 0, len(node.keys) - 1 while lo <= hi: mid = (lo + hi) // 2 if node.keys[mid] == target: return node.values[mid] elif node.keys[mid] < target: lo = mid + 1 else: hi = mid - 1 # lo is the index of the correct child if node.is_leaf: return None return search_node(node.children[lo], target)
The inner loop is plain binary search. After resolving the child index, you descend once. A BST descends after a single comparison per level. The B-tree does more work per level but in far fewer levels.
B-tree vs Binary Search Tree: Which One to Reach For
The choice is almost entirely about whether your data lives on disk.
Use a BST (AVL, red-black, or similar) when:
- Data fits comfortably in memory
- Access is mostly random point queries
- You want a simpler, battle-tested structure
- Predecessor, successor, or rank queries matter and range throughput does not
Use a B-tree (or B+ tree variant) when:
- Data lives on disk or NVMe storage
- Range queries are a primary access pattern
- You are building or studying a database index, file system, or on-disk key-value store
- Even in memory, you want better cache behavior and can tune the branching factor
If you are unsure which situation you are in, run a range scan over a few hundred million rows indexed with a balanced BST. You will figure it out quickly.
The binary search tree is the right foundation to understand first. The B-tree and B+ tree post covers the full insertion algorithm and all 14 language implementations if you need the mechanics in depth.
Recap
- BST nodes hold 1 key. B-tree nodes hold up to 2t-1 keys and 2t children.
- Balanced BST height is O(log₂ n). B-tree height is O(log_t n). With t = 400 and n = 10⁹, that is 30 vs 4 levels.
- Both use O(log n) comparisons total. B-trees reduce disk I/Os by a factor of log₂ t, which is 8 or more at database branching factors.
- B-trees maintain perfect balance via node splits, not rotations. All leaves always sit at the same depth.
- A B-tree with t = 2 is a 2-3-4 tree, structurally equivalent to a red-black BST.
- B-trees improve cache locality in memory too, which is why Rust's
BTreeMapuses one.
If you want to practice explaining this kind of tradeoff out loud, SpaceComplexity runs voice-based mock interviews where you work through data structure and system design questions with rubric-based feedback. Knowing the theory is one thing. Explaining it coherently under pressure is different, and that is the skill the interview actually tests.