The Heap Data Structure: A Complete Binary Tree Hiding in an Array

You have a list of a million log events. Every second a new one arrives. You need the ten most critical ones, always, instantly. Sorting the whole list after every insert is obviously out (your users will notice, your CPU will file a complaint). A balanced BST works but burns extra memory on pointers. The right answer is a heap data structure, and once you understand how it works internally, it becomes one of the most satisfying tools to reach for.

The mental model: a heap is a complete binary tree where every parent beats its children on some ordering, stored entirely inside a flat array with no pointers at all.

That flat-array layout is where the practical speed comes from. Let's pull it apart.

The Array That Thinks It's a Tree

A complete binary tree fills each level left-to-right before starting the next. That regularity is what lets you ditch pointers entirely. Given a node at zero-based index i:

Left child lives at 2i + 1
Right child lives at 2i + 2
Parent lives at floor((i - 1) / 2)

No memory allocation per node. No cache misses chasing pointers. The whole tree sits in one contiguous block of memory, which is why heap operations are faster in practice than a BST with the same asymptotic complexity. Your L1 cache does a little dance.

For a min-heap of [1, 3, 5, 7, 4, 8, 9]:

Array index:   0   1   2   3   4   5   6
Value:         1   3   5   7   4   8   9

Tree shape:
              1          (index 0)
           /     \
          3       5      (index 1, 2)
         / \     / \
        7   4   8   9    (index 3, 4, 5, 6)

The array [1,3,5,7,4,8,9] shown as both a flat array (with index labels) and as its equivalent binary tree, with dashed lines separating the two views. The root node carries an amber ring, indicating it holds the minimum. Same data, two views. The index arithmetic is the only "pointer" you need.

Node 1 (value 3) has children at indices 3 (value 7) and 4 (value 4). Its parent is at index 0 (value 1). Every parent is smaller than both its children: that is the min-heap property.

A max-heap flips this. Every parent is larger than both its children. The root is always the maximum.

The last non-leaf node sits at index floor(n/2) - 1 in a zero-based array. This matters enormously for build-heap. Forgetting the - 1 is the kind of off-by-one that gets you mid-interview while your brain screams "IT SHOULD WORK" and the tree says "no."

Sift-Up and Sift-Down: The Two Moves

Everything a heap does reduces to two primitive operations.

Sift-up fixes a violation where a node is smaller (in a min-heap) than its parent. You swap up the tree until the property is restored or you hit the root. Used after inserting a new element.

Insert 2 into [1, 3, 5, 7, 4, 8, 9]:

Append to end: [1, 3, 5, 7, 4, 8, 9, 2]
                                       ^-- index 7, parent at index 3 (value 7)

2 < 7, swap:   [1, 3, 5, 2, 4, 8, 9, 7]
                       ^-- index 3, parent at index 1 (value 3)

2 < 3, swap:   [1, 2, 5, 3, 4, 8, 9, 7]
                    ^-- index 1, parent at index 0 (value 1)

2 >= 1, stop.

Four-step sift-up animation: each row is an 8-cell array. The current node is highlighted in blue, its parent in amber. Step 1 appends 2 at index 7 and compares with parent 7. Step 2 swaps up to index 3, compares with parent 3. Step 3 swaps up to index 1, parent 1 wins. Step 4 shows the settled heap in green. The new element bubbles up until its parent wins. Two comparisons here. In the worst case it's log₂n comparisons.

Sift-down fixes a violation where a node is larger than one or both of its children. You swap down toward leaves, always taking the smaller child (in a min-heap). Used after extracting the minimum.

Extract min from [1, 2, 5, 3, 4, 8, 9, 7]:

Swap root with last element: [7, 2, 5, 3, 4, 8, 9]
Remove last.

Sift-down 7 at index 0:
  Children: index 1 (2), index 2 (5). Min child = 2.
  7 > 2, swap: [2, 7, 5, 3, 4, 8, 9]

  At index 1. Children: index 3 (3), index 4 (4). Min child = 3.
  7 > 3, swap: [2, 3, 5, 7, 4, 8, 9]

  At index 3. Children: index 7 (out of bounds). No children. Stop.

Three-step sift-down animation: each row shows a 7-cell array after extract-min. The demoted root (7) is in blue, children under comparison in amber. Steps 1 and 2 show swaps; step 3 shows 7 landing at a leaf with no children. Always swap with the smaller child. Otherwise you'd promote the larger child and break the property immediately.

Both operations run in O(log n) time, bounded by the tree's height.

Core Operations

Operation	Time	Space	Notes
`peek` (min/max)	O(1)	O(1)	Root is always at index 0
`push` / `insert`	O(log n)	O(1)	Append + sift-up
`pop` / `extract-min`	O(log n)	O(1)	Swap root+last, shrink, sift-down
`heappushpop`	O(log n)	O(1)	Push then pop, one traversal
`heapreplace`	O(log n)	O(1)	Pop then push, one traversal; faster than two ops
`build-heap`	O(n)	O(1)	See below; not O(n log n)
`heapsort`	O(n log n)	O(1)	Build-heap then n extractions
`decrease-key`	O(log n)	O(1)	Needs index tracking; not built in to most stdlib implementations

Peek is O(1) because the min (or max) is always at index 0. No searching. That is the entire point of the heap property.

Insert is O(log n) in the worst case: a new element bubbles from a leaf to the root, traversing the full height. The same bound applies to extraction.

The build-heap operation is the surprising one. Read on.

Why Build-Heap Is O(n), Not O(n log n)

The naive approach: insert each element one by one. That is n insertions at O(log n) each, giving O(n log n) total. Williams published this in 1964. Correct, but not optimal. That same year Floyd published a smarter method and Williams' approach quietly became a footnote.

Floyd's method starts from the last non-leaf node at index floor(n/2) - 1 and sift-downs each node, working backwards to the root. The insight is that most nodes are near the bottom of the tree, where sift-down is cheap.

Here is the accounting. A heap of n nodes has:

About n/2 nodes at height 0 (leaves). Sift-down cost: 0.
About n/4 nodes at height 1. Sift-down cost: 1 each.
About n/8 nodes at height 2. Sift-down cost: 2 each.
...
1 node at height log(n). Sift-down cost: log(n).

Total work:

T(n) = Σ (h=0 to log n) ⌈n / 2^(h+1)⌉ × h
     ≈ n × Σ (h=0 to ∞) h / 2^h

Now apply the identity for the sum of a geometric series. Starting from:

Σ (k=0 to ∞) x^k = 1 / (1 - x)   for |x| < 1

Differentiate both sides and multiply by x:

Σ (k=0 to ∞) k × x^k = x / (1 - x)²

Substitute x = 1/2:

Σ (k=0 to ∞) k / 2^k = (1/2) / (1/2)² = 2

So T(n) ≈ n × 2 = O(n). Build-heap uses at most 2n comparisons.

Two-panel diagram: left shows a 15-node binary tree with nodes color-coded by processing order (blue at height 1, amber at height 2, red at root). Numbered labels show the right-to-left sift-down sequence. Right shows a cost table: n/2 leaves at cost 0, n/4 nodes at cost ≤1, etc., summing to O(n). A contrast box shows the naive O(n log n) approach in red. Most nodes sit at the bottom where sift-down terminates quickly. The math works out to exactly 2n comparisons.

This is not O(n log n), and the gap matters. Initializing from an existing collection is the common case, and Floyd's method cuts the setup cost in half asymptotically.

The Non-Obvious Gotcha: No Decrease-Key

Binary heaps do not support efficient decrease-key. You can find any element only by scanning the entire array (O(n)), because the heap property only guarantees the parent-child relationship. It says nothing about the ordering between siblings. The node at index 3 could be larger or smaller than the node at index 5, and the heap does not care.

This is why Dijkstra's algorithm with a binary heap runs in O((V + E) log V) instead of the theoretical O(E + V log V) achievable with a Fibonacci heap. In practice, binary heaps win anyway because Fibonacci heaps have brutal constant factors and cache behavior that makes your profiler weep.

The standard workaround is "lazy deletion": push a new entry with the updated priority and mark the old one as invalid. Pop until you hit a valid entry. This works fine for most interview problems and real systems.

For cases where you genuinely need decrease-key, the indexed priority queue pattern (Princeton's IndexMinPQ) maintains a position map alongside the heap so you can find any element in O(1) and then sift in O(log n).

One Structure, Every Language

Python 3 (heapq)

import heapq

# Python's heapq is a min-heap operating on plain lists.
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)

print(heap[0])            # 1 -- peek without removing
print(heapq.heappop(heap)) # 1 -- extract min

# Build from existing list in O(n)
data = [5, 3, 1, 4, 2]
heapq.heapify(data)        # mutates in-place

# Max-heap workaround: negate values
max_heap = []
for v in [5, 1, 3]:
    heapq.heappush(max_heap, -v)
print(-heapq.heappop(max_heap))  # 5

# nlargest / nsmallest: efficient for small k
print(heapq.nlargest(2, data))   # [5, 4]
print(heapq.nsmallest(2, data))  # [1, 2]

# Tie-breaking with tuples (priority, counter, item)
import itertools
counter = itertools.count()
task_heap = []
heapq.heappush(task_heap, (1, next(counter), "high priority task"))
heapq.heappush(task_heap, (1, next(counter), "also high priority"))
# counter breaks ties so items are never compared directly

import heapq

# heapq API is identical in Python 2 but no heappushpop_max or heapreplace_max
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)

print heap[0]                    # 1
print heapq.heappop(heap)        # 1

# Build heap in O(n)
data = [5, 3, 1, 4, 2]
heapq.heapify(data)

# Max-heap: negate
max_heap = [-v for v in data]
heapq.heapify(max_heap)
print -heapq.heappop(max_heap)   # 5

# heapq.nlargest and nsmallest work the same
print heapq.nlargest(2, data)    # [5, 4]

What Problems Does a Heap Solve?

A heap earns its place in three situations.

Repeated access to the current minimum or maximum. A sorted structure works too, but a heap updates faster. Every insert or delete is O(log n) with no rebalancing rotations.

Top-k or bottom-k over a stream. You do not need to store all n elements. A fixed-size heap of k elements processes the stream in O(n log k) with O(k) memory.

Scheduling and simulation. Any system where events have priorities or deadlines and arrive dynamically: OS task schedulers, Dijkstra's shortest path, Prim's MST, A*, event-driven simulations.

When to Reach for a Heap Data Structure

Signals that a heap is what you want:

You keep needing the minimum or maximum of a changing set.
You need the k smallest/largest from a large or infinite input.
You are merging sorted streams or ordering events by time/priority.

Example 1: Kth Largest Element in an Array

Problem: Given an unsorted array of n integers, find the kth largest. The array might be very large.

Why a heap fits: sorting costs O(n log n) and is wasteful if you only care about one rank. A min-heap of size k costs O(n log k). Walk the array. If the heap has fewer than k elements, push. If the new element is larger than the heap's minimum, pop and push. At the end, the heap's minimum is the kth largest.

import heapq

def kth_largest(nums: list[int], k: int) -> int:
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
        if len(heap) > k:
            heapq.heappop(heap)
    return heap[0]

This runs in O(n log k) time and O(k) space. If k is small (say, 10 out of a billion), the heap never grows large.

Example 2: Median of a Data Stream

Problem: Design a data structure that supports inserting integers and returning the current median at any time.

Why a heap fits: you need the middle value of a dynamic set. A single sorted structure works, but two heaps answer it more cleanly with O(log n) insert and O(1) median query.

Maintain a max-heap of the lower half and a min-heap of the upper half. Keep their sizes equal (or the lower half larger by at most one). The median is the max-heap's root if sizes differ, or the average of both roots if they are equal.

Two-heap median diagram: the left column is a max-heap holding values 1,3,4,4,5 with 5 at the root (amber); the right column is a min-heap holding 6,7,8,9,11 with 6 at the root (amber). Green arrows from both roots point to a central box showing median = (5+6)/2 = 5.5. Invariant text: max(lower) ≤ min(upper). The max-heap keeps track of the lower half; the min-heap keeps the upper half. The median is always sitting right at their boundary.

import heapq

class MedianFinder:
    def __init__(self):
        self.lower = []  # max-heap (negate values)
        self.upper = []  # min-heap

    def add_num(self, num: int) -> None:
        heapq.heappush(self.lower, -num)
        # Balance: largest in lower must be <= smallest in upper
        if self.upper and -self.lower[0] > self.upper[0]:
            heapq.heappush(self.upper, -heapq.heappop(self.lower))
        # Keep sizes balanced
        if len(self.lower) > len(self.upper) + 1:
            heapq.heappush(self.upper, -heapq.heappop(self.lower))
        elif len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def find_median(self) -> float:
        if len(self.lower) > len(self.upper):
            return float(-self.lower[0])
        return (-self.lower[0] + self.upper[0]) / 2.0

Recap

A heap is a complete binary tree stored in a flat array. No pointers. Parent at (i-1)//2, children at 2i+1 and 2i+2.
Min-heap: every parent is smaller than its children. Max-heap: the reverse.
Insert: append, sift-up. O(log n).
Extract-min: swap root with last, shrink, sift-down. O(log n).
Peek: read index 0. O(1).
Build-heap (Floyd's method) is O(n), not O(n log n). Most nodes are at the bottom and sift cheaply.
No native decrease-key. Use lazy deletion or an indexed priority queue if you need it.
Python's heapq is a min-heap. C++'s priority_queue is a max-heap by default. Getting this backwards in an interview is a fast track to a polite rejection email.
Common patterns: top-k elements, streaming median with two heaps, Dijkstra/A*, event scheduling.

If you want to practice these patterns under realistic interview pressure, SpaceComplexity runs voice-based mock interviews with rubric feedback covering exactly this: recognizing the right structure and explaining your reasoning out loud.

If you're building up to heap problems, the articles on dynamic programming and sliding window cover patterns you will often reach for alongside heaps.