Fibonacci Heap: Lazy Heaps, O(1) Decrease-Key, and Why It Works

Every graph algorithm course eventually mentions the Fibonacci heap and then backs away slowly. The asymptotic bounds are stunning: insert in O(1), decrease-key in O(1), extract-min in O(log n), all amortized. That drops Dijkstra from O(E log V) to O(E + V log V), making it theoretically optimal on dense graphs. Then the professor mutters something about "large constants" and retreats to binary heaps, never to return.

That dismissal is fair in practice but intellectually unsatisfying. The Fibonacci heap is a masterclass in amortized analysis. Its trick is paying for expensive work using potential banked during cheap work. Same idea that powers dynamic arrays, union-find, and splay trees. Understanding it is worth the effort even if you never deploy one.

Interviewer asks for nth Fibonacci number. Programmer writes the closed-form Binet's formula. Interviewer is unimpressed. Theoretically optimal. Practically questionable. A theme.

This is the complete reference: structure, every operation with its proof, all fourteen language implementations, and the exact problems where the theory pays off.

What a Fibonacci Heap Actually Is

A Fibonacci heap is a collection of heap-ordered trees stored in a circular doubly linked root list, with a single pointer to the node holding the minimum key.

That sentence hides a lot. There is no constraint on the shape of the trees. After many inserts, you might have a heap made entirely of single-node trees, each with its own root. Trees only get consolidated (merged together) lazily, when you extract the minimum. This lazy strategy is what makes insert and union free, and it's what makes the amortized analysis work.

The "Fibonacci" name comes from the proof of the degree bound, not from the shape of the trees. Fibonacci numbers appear when you compute the minimum possible size of a subtree rooted at a degree-k node, and this bound is the reason the structure doesn't degenerate.

Inside the Structure

Node Structure

Every node carries seven fields:

key     : the stored value
degree  : number of children
mark    : boolean flag (explained under decrease-key)
parent  : pointer to parent (NULL if root)
child   : pointer to any one child (NULL if leaf)
left    : left sibling in circular doubly linked list
right   : right sibling in circular doubly linked list

Your binary heap node has one field. A Fibonacci heap node has seven. That's six more things to get wrong, which is part of why nobody ships one in production.

The circular doubly linked list is the workhorse. It lets you splice two lists together in O(1), remove a node in O(1), and iterate the root list without a separate header node.

Fibonacci heap node with all 7 fields labeled, showing the 6 pointer fields and the mark bit One node, one cache line, six pointers. Cache-friendly this is not.

Each node in the root list is the root of a min-heap-ordered tree. A tree of degree k has its root with k children, and the root's key is the smallest in that subtree.

Each child also participates in a circular doubly linked list (the child list of its parent). So there are two kinds of circular doubly linked lists in the structure: the root list, and the child list of every internal node.

A Heap After Several Inserts

After inserting 3, 17, 24, 7, 21 with no extract-min ever called, the heap looks like five single-node trees in the root list. Insert is pure laziness. Do nothing, defer everything.

Five single-node trees in the circular doubly linked root list after inserting 3, 7, 17, 21, 24, with min pointer at 3 Five inserts. Zero consolidation. This is what O(1) insert looks like.

This is valid because each single node trivially satisfies the min-heap property. The consolidation work has been punted to whoever calls extract-min next.

After Extract-Min: Consolidate

Extracting 3 promotes its children to the root list, then calls consolidate. Consolidate merges trees of equal degree until every degree appears at most once. The degree array acts as a bucket: slot d holds the unique tree of degree d. If two trees share a degree, the one with the larger root becomes a child of the other (min-heap order), and the degree ticks up.

All that deferred work from five O(1) inserts gets paid for right here, in one amortized O(log n) bill.

Before and after consolidation: four degree-0 trees merge into one degree-2 tree rooted at 7, with 17 and 21 as children and 24 as a grandchild Four lazy trees walk into consolidate. One balanced tree walks out.

Fibonacci Heap Time Complexity: Every Operation

Operation	Amortized	Worst Case
`find-min`	O(1)	O(1)
`insert`	O(1)	O(1)
`union`	O(1)	O(1)
`decrease-key`	O(1)	O(log n)
`extract-min`	O(log n)	O(n)
`delete`	O(log n)	O(n)
Space	O(n)	O(n)

All amortized bounds use the potential function described below. The worst-case column reflects pathological inputs where, for example, every node is in the root list simultaneously.

The Potential Function: Where the Magic Lives

The entire amortized analysis rests on one function:

Φ(H) = t(H) + 2·m(H)

where t(H) is the number of trees in the root list and m(H) is the number of marked nodes. The heap starts empty with Φ = 0.

Think of it as a bank account. Cheap operations deposit potential. Expensive operations spend it. The trick is making sure the ledger never goes negative.

For each operation, amortized cost = actual cost + ΔΦ. If ΔΦ is negative, the operation is cheaper than its actual work suggests. We're drawing down the stored potential. If ΔΦ is positive, we're paying forward for future work.

Table showing amortized cost analysis for insert, extract-min, and decrease-key with the potential function tracking how t and m change per operation t terms cancel in extract-min. c terms cancel in decrease-key. That's the whole trick.

Insert: O(1) amortized

Actual cost: O(1) (create node, splice into root list, update min pointer if needed).

ΔΦ = +1 (one new tree, no new marks).

Amortized = O(1) + 1 = O(1).

Find-Min: O(1) actual

The min pointer is maintained at all times. No potential change.

Union: O(1) amortized

Splice two root lists together in O(1) by rewiring four pointers. Update min pointer if needed.

ΔΦ = 0 (same total trees, same marks).

Amortized = O(1).

Extract-Min: O(log n) amortized

This is where deferred work gets done.

Remove the minimum node z from the root list.
Add all of z's children (at most D(n) of them, where D(n) is the max degree) to the root list.
Consolidate: traverse the root list, merge trees of equal degree. Cost is O(t + D(n)) where t is the number of trees before the extract.
After consolidation, at most D(n)+1 trees remain (one per possible degree value).

ΔΦ = (D(n)+1) - t (trees after minus trees before).

Amortized = O(t + D(n)) + (D(n)+1 - t) = O(D(n)) = O(log n).

The t terms cancel exactly. The actual work scanning the root list is paid for by the decrease in potential when those trees get merged away. Only D(n) = O(log n) remains.

Decrease-Key: O(1) amortized

This operation is what makes the whole structure worth building.

To decrease node x's key:

Set x.key = new value.
If x is a root or the new key doesn't violate min-heap order with its parent, update the min pointer if needed and stop.
Otherwise, call CUT(x, parent) and then CASCADING-CUT(parent).

CUT removes x from its parent's child list, adds x to the root list, clears x.mark (roots are never marked).

CASCADING-CUT(y):

If y's parent is NULL (y is a root), do nothing.
If y.mark is FALSE: set y.mark = TRUE. Stop.
If y.mark is TRUE: CUT(y, y.parent), then CASCADING-CUT(y.parent) recursively.

Let c be the number of cascading cuts (not counting the mandatory first cut of x). Total cuts: c+1.

Actual cost: O(c+1).

ΔΦ:

Trees increase by c+1 (each cut adds a new root). ΔΦ from trees = +(c+1).
Each cascading cut takes a marked node and makes it a root (clears its mark). ΔΦ from marks = -2c.
The stopping ancestor (the first unmarked non-root) gets marked. ΔΦ from marks += +2.
Net ΔΦ = (c+1) + (-2c) + 2 = -c + 3.

Amortized = O(c+1) + (-c+3) = O(1).

The c cascading cuts cost O(c) actual work but reduce potential by c (two units per marked node cleared, partially offset by the one tree added per cut). The potential function accounts for every cascading cut in advance, deposited when the nodes were first marked. Each marked node carries 2 units of stored potential, exactly enough to pay for its own future cut and the new tree it becomes.

Why the Degree Is O(log n)

The maximum degree D(n) controls extract-min's cost. If degrees could grow unboundedly, the guarantee breaks. The Fibonacci numbers are why they can't.

Lemma: If x is any node of degree k, its i-th child y_i (in the order it was added to x's child list) has degree at least i-2.

Proof. When y_i was linked to x during consolidate, x and y_i had the same degree (consolidate only links equal-degree trees). At that point x had degree i-1 (it already had i-1 children), so y_i had degree i-1 at the time of linking. Since then, at most one child could have been cut from y_i before a cascading cut would remove y_i itself. So y_i's current degree is at least (i-1) - 1 = i-2.

Lemma: Let s_k = minimum number of nodes in a Fibonacci heap tree rooted at a degree-k node. Then s_k >= F_{k+2}, where F_j is the j-th Fibonacci number (F_1=1, F_2=1, F_3=2, ...).

Proof by induction. s_0 = 1 = F_2. s_1 = 2 = F_3.

For k >= 2, x has at least 2 children (y_1 with degree >= -1, so >= 0, and y_2 with degree >= 0, and y_i with degree >= i-2 for i >= 3). Counting x itself plus each child's subtree:

s_k >= 1 + 1 + s_0 + s_1 + ... + s_{k-2}
     = 2 + s_0 + s_1 + ... + s_{k-2}

The Fibonacci recurrence gives F_{k+2} = 2 + sum of F_{j+2} for j from 0 to k-2, which matches the recurrence for s_k. By induction s_k >= F_{k+2}.

Corollary: Since F_{k+2} >= phi^k (where phi = (1+sqrt(5))/2 ≈ 1.618) and size(x) >= phi^k, we get n >= phi^{D(n)}, so:

D(n) <= log_phi(n) = ln(n) / ln(phi) ≈ 1.44 * log_2(n)

The Fibonacci numbers appear because the minimum subtree size satisfies the same recurrence. The name of the data structure is a tribute to this degree bound proof, not to any visual resemblance to Fibonacci spirals.

Table of minimum subtree sizes by degree: s_0=1=F_2, s_1=2=F_3, s_2=3=F_4, matching Fibonacci numbers, with diagram of minimum degree-3 tree Minimum subtree sizes: 1, 2, 3, 5, 8, 13... You've seen those numbers before.

The Decrease-Key Trick, Demystified

The mark bit is the non-obvious piece. Here is what it guards against.

Without cascading cuts, you could call decrease-key on every child of a root, cutting them all away and making them new roots. After enough of those, the "tree" of degree k might have zero children, violating the degree bound that consolidate relies on. The degree bound proof assumes degree-k nodes have enough subtree mass. Without it, consolidate might produce a tree of degree 60 when n = 10, and the entire log-n guarantee collapses.

The mark bit enforces this invariant: a non-root node loses at most one child via decrease-key before it is itself cut and promoted to a root. The first child lost marks the node. The second child lost triggers the cascading cut that moves the node to the root list, where it starts fresh with mark = FALSE.

Every non-root node retains all but one of the children it had when it was last linked. That one lost child is the "-2" in the degree bound lemma. The structure is carefully calibrated to allow exactly one failure before resetting.

Cascading Cut in Action

Before decrease-key on node D (new key 1, parent C has key 5):

root list: [3] ──── [7]
                     │
                    [9] (degree 2)
                   /   \
                 [C=5]  [11]
                 / \
              [D=8] [E=6]
              (D is marked: already lost one child)

After decrease-key(D, 1):
  1. CUT(D, C): D moves to root list, C.degree--.
  2. C is now marked (first child lost? No: C.mark was FALSE, so just mark C).

root list: [1] ─── [3] ─── [7]
(D is now root with key 1)
                    [9]
                   /   \
                 [C=5]  [11]
                  |
                 [E=6]
(C is now marked)

If we now decrease-key on E (key goes below 5):
  1. CUT(E, C): E moves to root list, C.degree--.
  2. CASCADING-CUT(C): C.mark is TRUE, so CUT(C, [9]). C moves to root list.
     C.mark = FALSE (roots are not marked).
  3. CASCADING-CUT([9]): [9].mark was FALSE, so mark [9].

root list: [1] ─── [C=5] ─── [E=...] ─── [3] ─── [7]
                    [9] (degree 1, now marked)
                     │
                    [11]

Before and after cascading cut: decreasing key E triggers CUT on E then cascading CUT on marked parent C, both becoming new roots, with 7 getting marked for losing C Marked node loses a second child. Both shoot up to the root list. The potential function already had their bail money ready.

Cascading cuts keep the structure honest by ensuring no node retains a badly degraded subtree.

The Practical Catch

So you have this beautiful theoretical machine. Then you run it on real hardware and things get awkward.

Problem one: six pointers per node. A binary heap node stores one value. A Fibonacci heap node carries six pointers plus a key and a boolean. On 64-bit hardware that's roughly 64 bytes. One full cache line per node. Binary heaps live in flat arrays where the CPU prefetcher loads siblings for free. Fibonacci heaps chase pointers scattered across heap memory, triggering a cache miss every step. The CPU is not impressed by your amortized analysis.

Problem two: high constant factors. The O(1) decrease-key involves pointer surgery on at least four nodes per cut. In practice, for graphs with V = 10^5 and E = 10^6, a well-tuned binary heap beats a Fibonacci heap in wall time, because the log factor you're saving costs less than the pointer chasing you're adding.

Problem three: the crossover point is basically never hit. The asymptotic advantage appears when E >> V log V, meaning decrease-key calls dominate extract-min calls. On road networks (sparse, E ≈ V), both reduce to O(V log V) anyway. On complete graphs, the constant factors wipe out the gain. Real workloads almost never sit in the sweet spot.

Pairing heaps match Fibonacci heap amortized bounds experimentally and are far simpler to implement. Their theoretical decrease-key cost is still an open problem, but benchmarks consistently favor them. This is the real punchline: a simpler structure we can't fully analyze beats the fully analyzed one in practice.

The right framing: Fibonacci heaps matter for theoretical lower bounds and for algorithms that depend on decrease-key in their complexity proofs (network flow, some matching algorithms). In production, reach for a binary heap. If profiling shows decrease-key is your actual bottleneck, reach for a pairing heap. The Fibonacci heap is mostly for understanding the theory, which is itself a good reason to understand it.

One Structure, Every Language

Each implementation covers insert (returning a node handle), find-min, extract-min, decrease-key (using the node handle), and union. The node handle pattern is essential. Decrease-key needs a direct pointer to the node, not a key lookup. If you lose the handle, you've lost your ability to use the one feature that makes this structure worthwhile.

import math

class FibNode:
    __slots__ = ('key', 'degree', 'mark', 'parent', 'child', 'left', 'right')

    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.mark = False
        self.parent = None
        self.child = None
        self.left = self
        self.right = self


class FibHeap:
    def __init__(self):
        self.min_node = None
        self.n = 0

    def insert(self, key):
        node = FibNode(key)
        self._add_to_root_list(node)
        if self.min_node is None or node.key < self.min_node.key:
            self.min_node = node
        self.n += 1
        return node  # caller holds this handle for decrease_key

    def find_min(self):
        return self.min_node.key if self.min_node else None

    def union(self, other):
        """Merge other into self in O(1). other is invalidated."""
        if other.min_node is None:
            return
        if self.min_node is None:
            self.min_node = other.min_node
        else:
            # Splice the two circular root lists
            self_last = self.min_node.left
            other_last = other.min_node.left
            self_last.right = other.min_node
            other.min_node.left = self_last
            other_last.right = self.min_node
            self.min_node.left = other_last
            if other.min_node.key < self.min_node.key:
                self.min_node = other.min_node
        self.n += other.n

    def extract_min(self):
        z = self.min_node
        if z is None:
            return None
        if z.child:
            children = []
            cur = z.child
            while True:
                children.append(cur)
                cur = cur.right
                if cur is z.child:
                    break
            for child in children:
                self._add_to_root_list(child)
                child.parent = None
        self._remove_from_root_list(z)
        if z.right is z:
            self.min_node = None
        else:
            self.min_node = z.right
            self._consolidate()
        self.n -= 1
        return z.key

    def decrease_key(self, node, new_key):
        assert new_key <= node.key
        node.key = new_key
        parent = node.parent
        if parent and node.key < parent.key:
            self._cut(node, parent)
            self._cascading_cut(parent)
        if node.key < self.min_node.key:
            self.min_node = node

    def delete(self, node):
        self.decrease_key(node, float('-inf'))
        self.extract_min()

    # ---- internal helpers ----

    def _add_to_root_list(self, node):
        if self.min_node is None:
            node.left = node.right = node
        else:
            node.right = self.min_node
            node.left = self.min_node.left
            self.min_node.left.right = node
            self.min_node.left = node

    def _remove_from_root_list(self, node):
        node.left.right = node.right
        node.right.left = node.left

    def _consolidate(self):
        max_deg = int(math.log(self.n) / math.log(1.618)) + 2
        A = [None] * (max_deg + 2)

        roots = []
        cur = self.min_node
        while True:
            roots.append(cur)
            cur = cur.right
            if cur is self.min_node:
                break

        for w in roots:
            x = w
            d = x.degree
            while A[d] is not None:
                y = A[d]
                if x.key > y.key:
                    x, y = y, x
                self._link(y, x)
                A[d] = None
                d += 1
            A[d] = x

        self.min_node = None
        for node in A:
            if node is None:
                continue
            node.left = node.right = node
            self._add_to_root_list(node)
            if self.min_node is None or node.key < self.min_node.key:
                self.min_node = node

    def _link(self, y, x):
        """Make y a child of x. y's degree must equal x's degree before call."""
        y.parent = x
        if x.child is None:
            x.child = y
            y.left = y.right = y
        else:
            y.right = x.child
            y.left = x.child.left
            x.child.left.right = y
            x.child.left = y
        x.degree += 1
        y.mark = False

    def _cut(self, x, y):
        if x.right is x:
            y.child = None
        else:
            if y.child is x:
                y.child = x.right
            x.left.right = x.right
            x.right.left = x.left
        y.degree -= 1
        x.left = x.right = x
        self._add_to_root_list(x)
        x.parent = None
        x.mark = False

    def _cascading_cut(self, y):
        z = y.parent
        if z is None:
            return
        if not y.mark:
            y.mark = True
        else:
            self._cut(y, z)
            self._cascading_cut(z)

What Problems a Fibonacci Heap Actually Solves

O(1) amortized decrease-key combined with O(1) union. Those two properties matter in exactly the algorithms where you repeatedly relax edges (Dijkstra, Prim's) or merge priority queues during execution.

Dijkstra's algorithm. With a binary heap: O((E + V) log V). With a Fibonacci heap: O(E + V log V). On dense graphs where E = O(V^2), this is the difference between O(V^2 log V) and O(V^2). On sparse graphs where E = O(V), both reduce to O(V log V). The Fibonacci heap wins whenever E >> V, which in practice means E = Omega(V log V).

Prim's MST algorithm. Same analysis. V extract-mins cost O(V log V). E decrease-keys cost O(E) instead of O(E log V). Total: O(E + V log V).

Network flow (Goldberg-Tarjan). The successive shortest paths algorithm for minimum-cost flow runs in O(VE log(V^2/E)) with Fibonacci heaps, improving from O(VE log V) with binary heaps.

Meldable priority queues. When you need to merge two heaps without rebuilding, the O(1) union operation is uniquely efficient. Binary heaps require O(n) to merge. Binomial heaps take O(log n). Fibonacci heaps do it in O(1).

When to Reach for a Fibonacci Heap

Five Signals

The problem reduces to shortest paths or MST on a graph with E >> V log V.
You need to reprioritize existing queue entries (not just add new ones). This is the decrease-key use case.
You need to merge priority queues during execution.
The algorithm description says "decrease the key of a node" in its pseudocode.
You're implementing network flow, where augmenting paths must be found repeatedly.

When a Binary Heap Wins

If every item is inserted once and extracted once, with no key changes in between, a binary heap is just as good and far faster in practice. Fibonacci heaps earn their complexity only through decrease-key. Without that operation, you've built a seven-field node just to hold a number.

Worked Example: Dijkstra With Dense Graphs

Problem: find shortest paths from source in a graph with V = 10,000 vertices and E = 20,000,000 edges (complete-ish graph).

Binary heap: O(E log V) = 20M * 13 ≈ 260M operations for decrease-key. Fibonacci heap: O(E + V log V) = 20M + 10K * 13 ≈ 20M operations total.

The decrease-key calls (one per edge relaxation) go from O(log V) each to O(1) amortized each. On graphs this dense, this is a 13x reduction in decrease-key cost. In practice, cache effects erode much of this gain, but for graph algorithms running on supercomputers or processing road networks of entire continents, the theoretical bound is meaningful.

Worked Example: Offline Event Simulation

Problem: simulate N events where events can spawn new events or reprioritize existing ones. The priority is event time, and events often get postponed (key increased) or accelerated (key decreased).

A Fibonacci heap handles decrease-key in O(1) amortized. Increase-key requires delete + insert: O(log n) amortized. If most updates are decreases (accelerations), the Fibonacci heap wins by keeping the event loop at O(E + N log N) instead of O((E + N) log N).

The Short Version

A Fibonacci heap is a lazy collection of heap-ordered trees with all structural work deferred to extract-min. Laziness is the design, not a bug.
The potential function Φ = t + 2m (trees plus twice the marked nodes) pays for expensive operations using potential built up during cheap ones. Like a credit card, but the debt is mathematically bounded.
Cascading cuts ensure no non-root node loses more than one child before it gets promoted, bounding the maximum degree to O(log n) via Fibonacci number arguments.
The O(1) amortized decrease-key is the killer feature. It reduces Dijkstra and Prim to O(E + V log V), beating O(E log V) by a factor of log V when E >> V.
In practice, large constant factors and poor cache locality mean binary heaps or pairing heaps are usually faster for typical graph sizes (V under 10^6).
No major language standard library ships a Fibonacci heap. This is probably fine. It is a theoretical tool that matters for algorithms papers, large-scale graph processing, and understanding why amortized analysis is actually interesting.

If you want to practice explaining data structure tradeoffs out loud, under time pressure, with an interviewer waiting, SpaceComplexity runs voice-based DSA mock interviews with rubric-based feedback. The Fibonacci heap is a great test case for trade-off communication. Try it before your next loop.

For more on the amortized analysis techniques underlying Fibonacci heaps, the amortized analysis deep dive covers the potential method and accounting method in full detail. The heap data structure reference covers binary heaps and their array layout, which is the practical alternative to reach for first.