Memoization vs Tabulation: Two Ways to Cache the Same Recursion

May 26, 202611 min read
dsaalgorithmsinterview-prepdynamic-programming
Memoization vs Tabulation: Two Ways to Cache the Same Recursion
TL;DR
  • Memoization (top-down) caches results as recursion unwinds; easy to write from any recursive definition but accumulates call-stack frames
  • Tabulation (bottom-up) fills a pre-allocated table in topological order; faster constant factors and no recursion limit
  • Same asymptotic time complexity: both compute each subproblem once, at O(states × work per state)
  • Rolling arrays only work with tabulation: the memo cache holds all states simultaneously and can't safely discard any of them
  • Memoization wins for sparse state spaces where not all subproblems are needed; tabulation wins when you need memory below O(total states)
  • Python's 1000-frame recursion limit makes tabulation required for n above roughly 900 without explicit overrides

You write the recursive solution. It's exponential. You add a cache. Now it's polynomial. The interviewer smiles. "Can you optimize the space?" You stare at your own code like it belongs to a stranger.

Memoization vs tabulation: they solve the same problems. They have the same asymptotic complexity. They are not interchangeable, though. The difference becomes visible the moment someone asks you to cut memory usage below O(states).

The mental model: both approaches avoid recomputing subproblems by storing results. Memoization goes top-down, caching results on the way back up through the call stack. Tabulation goes bottom-up, filling a table from base cases toward the answer. Which direction you travel matters more than you'd expect.

Before we dig into why, here is what you get if you skip memoization entirely:

Finding the 87th Fibonacci number using recursion with no memoization on a computer capable of 1 billion operations per second. This little maneuver is gonna cost us 51 years.

51 years seems like a lot. It is. fib(87) requires about 2^87 calls without caching.

The Subproblem Graph Is the Same for Both

Every DP problem has a directed acyclic graph underneath it. Nodes are states. Edges say "this state depends on that state." Solving the full problem means visiting nodes in topological order, computing each once, and reusing results whenever an edge points somewhere already computed.

Memoization enters from the root (the answer you want) and demands subproblems recursively, caching results on the way back up. Tabulation enters from the leaves (base cases) and fills every node in order before anything needs it.

fib(5) call tree with CACHE HIT labels on repeated calls vs a left-to-right array being filled with Fibonacci values, showing dependency arrows Left: memoized recursion prunes the repeated sub-calls. Right: tabulation fills left-to-right, no stack involved.

Same total work. Same final answer. Different control flow.

Memoization: Write the Recursion, Annotate It

The implementation pattern is mechanical. Write the natural recursive solution. Add a cache. Stop.

from functools import cache @cache def fib(n: int) -> int: if n <= 1: return n return fib(n - 1) + fib(n - 2)

Python's @cache (from functools, added in 3.9) is a thin wrapper around a dictionary keyed by argument tuples. Each call checks the dict first. If the result is there, return it immediately. If not, recurse, store the result, and return it.

The cost of this convenience: every call that is not a cache hit pushes a frame onto the call stack. For fib(n), the longest unresolved chain runs n frames deep before it hits the base case. Each frame holds the return address, the saved frame pointer, and local variables. The cache shortens the total unique calls from exponential to linear. It does nothing for the depth of the live call stack.

Call stack at peak depth for fib(6): frames stacked vertically from fib(6) at top to fib(1) base case at bottom, annotated with roughly 64 bytes per frame and a brace showing depth O(n) The cache eliminates duplicate work. The stack depth is still proportional to n.

Python's default recursion limit is 1000 frames. fib(1000) raises a RecursionError before it computes a single result. You can raise it with sys.setrecursionlimit, but that is the software equivalent of putting a wet-floor cone next to the problem instead of mopping it up.

Tabulation: Think About Order, Then Fill the Table

No recursion. Work out which states you need, in what order they become computable, and fill them iteratively.

def fib(n: int) -> int: if n <= 1: return n dp = [0] * (n + 1) dp[1] = 1 for i in range(2, n + 1): dp[i] = dp[i - 1] + dp[i - 2] return dp[n]

No stack frames accumulate. No recursion limit applies. The loop fills states in exactly the order their dependencies become available. That is what "topological order" means in practice: you never compute a state before the states it reads.

The upfront cost: you have to reason about the fill order before writing the loop. For simple 1D problems this is obvious. For 2D problems it takes an extra minute of thought, which is an extra minute of thought under interview pressure.

The Asymptotic Complexity Is the Same

CLRS states this precisely, and it is one of the few moments where a textbook sounds almost casual about it: "These two approaches yield algorithms with the same asymptotic running time, except in unusual circumstances where the top-down approach does not actually recurse to examine all possible subproblems."

Both compute each subproblem once. Time is O(states × work per state). The difference between them is constant factors and peak memory, not big-O. See the memoization time complexity breakdown if you want to derive the formula for a specific problem.

Tabulation has two concrete constant-factor advantages when all subproblems are needed. First, no function call overhead: each Python function call costs around one microsecond in frame allocation and teardown, and those costs stack up across thousands of states. Tabulation replaces recursive calls with array writes. Second, a pre-allocated array is faster to index than a dictionary. @cache hashes the argument tuple and handles collisions; dp[i] is a direct memory access.

Memoization Wins When the State Space Is Sparse

Tabulation always fills every cell. If you do not need every cell, you are paying for work nobody requested.

Consider a path-counting problem on a grid where large regions are blocked by obstacles. Memoization visits only reachable cells. Tabulation iterates every cell in topological order regardless.

If the number of subproblems you actually need is much smaller than the full state space, memoization computes only what it needs. This is uncommon in standard interview problems, where the constraints are designed to make the state space dense and bounded. But for problems with early-exit feasibility conditions, or for graph DP where many states are structurally unreachable, the gap is real.

A concrete example: edit distance between two strings has an O(m × n) state space, but a variant that terminates as soon as the minimum possible edit count exceeds a threshold k will only visit a diagonal band of width O(k) across that table. Memoization naturally avoids the rest of the cells. Tabulation fills every cell regardless, including the ones that can never produce a better answer than you already have.

The other case for memoization is ergonomics. For interval DP or tree DP, the recursive structure mirrors how you reason about the problem. You write down the recurrence, add @cache, and move on. Tabulation forces you to separately reason about fill order, which is another potential mistake site.

Tabulation Wins When You Need to Compress Space

This is the asymmetry that trips people up in interviews. Rolling arrays only work with tabulation. Memoization holds all computed states simultaneously and cannot safely discard any of them.

A rolling array works when the DP transition only reads from the previous row. Instead of keeping the full dp[m][n] table in memory, you keep two rows and overwrite one with the other on each outer-loop iteration.

Longest Common Subsequence normally uses O(m × n) space. The transition is:

dp[i][j] = dp[i-1][j-1] + 1             if s[i] == t[j]
           max(dp[i-1][j], dp[i][j-1])  otherwise

It only reads the previous row. Two rows are sufficient:

def lcs(s: str, t: str) -> int: m, n = len(s), len(t) prev = [0] * (n + 1) for i in range(1, m + 1): curr = [0] * (n + 1) for j in range(1, n + 1): if s[i - 1] == t[j - 1]: curr[j] = prev[j - 1] + 1 else: curr[j] = max(prev[j], curr[j - 1]) prev = curr return prev[n]

Space drops from O(m × n) to O(n). The same trick compresses 0/1 knapsack from O(n × W) to O(W), as covered in the bottom-up DP space optimization guide.

LCS table: left shows the full m×n grid with all rows in memory; right shows only two rows labeled prev and curr with a swap arrow at the end of each outer loop pass You only ever look back one row. Two rows is all you need. The rest is just RAM you are holding onto for no reason.

With memoization, the @cache dictionary holds every result simultaneously until the entire recursion finishes. dp[i][j] is still in the cache when you are computing dp[i+1][k], because Python cannot know whether a future recursive call will need it. Discarding it would require a carefully designed eviction strategy. Nobody does this. They convert to tabulation first.

Interval DP: Where Tabulation's Explicit Order Helps

Problems like matrix chain multiplication or burst balloons split an interval [i, j] at some pivot k, combining two sub-intervals whose answers you have already computed. The recurrence is:

dp[i][j] = min over k in (i, j):  dp[i][k] + dp[k+1][j] + cost(i, j, k)

The correct fill order for tabulation is by interval length: solve all intervals of length 1, then length 2, and so on:

for length in range(2, n + 1): # outer loop: interval length for i in range(n - length + 1): # inner loop: starting position j = i + length - 1 dp[i][j] = min( dp[i][k] + dp[k + 1][j] + cost(i, j, k) for k in range(i, j) )

Upper-triangular interval DP table with 5 diagonals color-coded by interval length: green for base cases on the main diagonal, then blue, purple, gold toward the top-right answer cell Fill along the diagonals. Each cell at length L reads only cells at length less than L. The answer sits at dp[0][n-1].

Memoization works here too, and the recursion is often the first version you write when working out the recurrence. The length-based outer loop in tabulation makes the dependency order explicit and harder to mess up. You cannot accidentally access a cell before its dependencies are ready; the loop structure prevents it.

When to Use Which

Neither approach is universally better. A practical decision rule:

  • Start with memoization when the recursion is natural and the state space is unclear. Write the recurrence first. @cache is then a one-line addition.
  • Switch to tabulation when you need space optimization, when n exceeds around 900 in Python, or when the fill order is simple and explicit (linear DP, interval DP by length).
  • Prefer tabulation for interval DP: the length-based sweep is standard and prevents ordering mistakes.
  • Prefer memoization for graph DP or tree DP: when the state space is the structure of a DAG you are traversing, recursion fits the shape of the problem more naturally.

One pattern from competitive programming: write memoization first to get the transition right, then convert to tabulation and apply rolling arrays if the memory limit is tight. The conversion is mechanical once the recurrence is correct.

If you want to understand when DP applies at all before choosing between these two implementations, the when to use dynamic programming guide covers the overlapping-subproblems and optimal-substructure tests in detail.

Recap

  • Memoization: top-down recursion plus a dictionary cache. Easy to write from a recursive definition. Accumulates function call overhead. Python's 1000-frame recursion limit applies.
  • Tabulation: bottom-up iteration plus a pre-allocated table. Faster constant factors when all subproblems are needed. Requires reasoning about fill order upfront.
  • Both have the same asymptotic complexity. Time is O(states × work per state).
  • The decisive asymmetry: memoization holds every computed state simultaneously. Tabulation can discard states once they are no longer needed, which is why space-optimized DP is almost always written bottom-up.
  • Memoization wins for sparse state spaces. Tabulation wins when you need to compress memory below O(total states).

Knowing which to reach for, and being able to explain the rolling-array argument out loud when an interviewer pushes back, is what separates a good answer from a strong-hire answer. If you want to practice that explanation under interview conditions, SpaceComplexity runs voice-based mock DSA interviews with rubric-based feedback on reasoning like this.

Further Reading