Python Recursion Bugs: Your Solution Compiles. It's Still Broken.

Recursion clicks and you feel like a wizard. Ten lines, one mental trace, a leap of faith, the whole thing works. Then you submit and get a list of eight identical empty arrays.

You trace through it again. Looks right. You add print statements inside the function. They print the right thing every time. You stare at the output for two minutes. It is wrong. The code is correct. The output is wrong. You are fine.

These are not the obvious bugs. Not "forgot the base case." You know that one. These are the subtle ones that survive a mental trace, pass your examples, and only surface when you run the actual code. In an interview. In front of someone.

The List Reference That Corrupts Your Entire Output

This one produces the most baffling output of any Python bug. You build a path list as you recurse. When you hit a valid solution, you append it to results. Sensible plan.

def subsets(nums):
    result = []
    path = []

    def dfs(index):
        result.append(path)   # bug: appending the reference
        for i in range(index, len(nums)):
            path.append(nums[i])
            dfs(i + 1)
            path.pop()

    dfs(0)
    return result

Run this. You get [[], [], [], [], [], [], [], []]. Eight empty lists. Print path inside the function and it shows the right thing at every step. The prints lie. The output tells the truth.

When you append path instead of path[:], every entry in result points to the same list object. The list mutates as you recurse. By the time dfs returns, all eight entries in result are pointing at the same empty list, because that's what path looks like after the last backtrack.

The fix is one character: result.append(path[:]). That creates a shallow copy at the current moment in time. One character. Possibly hours of confusion. This is software.

Diagram showing all result entries pointing to the same path object on the left (broken), versus each entry holding its own copy on the right (fixed)

The same bug appears in Java: result.add(path) vs result.add(new ArrayList<>(path)). Same cause, different syntax. The root is Python's object model, covered in depth at pass by value vs reference in Python.

The Return Value You Silently Threw Away

This one is quiet. No crash, no error message. Just wrong answers and a growing sense that something is off.

def max_depth(node):
    if node is None:
        return 0
    max_depth(node.left)    # computed, immediately discarded
    max_depth(node.right)   # computed, immediately discarded
    return 1                # always 1, for every node, forever

The recursive calls run. You just never asked what they returned. This function reports every non-null node as depth 1, regardless of how deep the tree goes. It is technically doing work. Very useless work.

The recursive call ran. You ghosted it.

This happens because preorder problems (print every node, serialize a tree) genuinely don't need return values. You write enough of those, and not capturing returns becomes muscle memory. Then you hit a postorder problem where the result has to bubble up from children, and your muscle memory betrays you.

def max_depth(node):
    if node is None:
        return 0
    left = max_depth(node.left)
    right = max_depth(node.right)
    return max(left, right) + 1

Diagnostic question: does my answer at this node depend on what the subtree returns? If yes, capture the return value.

Your Base Cases Have an Order Problem

Missing a base case is obvious. Having the right base cases in the wrong order is less obvious, and produces identical symptoms: infinite recursion or silently wrong output.

If your function reduces input by 2 each call, you need two base cases. One for even inputs landing on 0, one for odd inputs landing on 1.

def sum_to_zero(n):
    if n == 0:
        return 0
    return n + sum_to_zero(n - 2)  # sum_to_zero(1) -> sum_to_zero(-1) -> ...

This works for even inputs and recurses infinitely on odd ones. Fibonacci has the same structure: both n == 0 and n == 1 are required because fib(1) calls fib(-1) without the second guard. Your function is fine. Your input is not even. These are not the same problem.

Ordering is a separate failure mode. A success check placed after a failure check can silently discard valid results. In subset sum:

def dfs(index, remaining):
    if index == len(nums):   # checked first: exhausted array
        return remaining == 0
    if remaining == 0:       # never reached when hitting target at last element
        return True

Swap those two. The target-hit condition should come first, because you can reach exactly zero on the last element, and checking array exhaustion first means you return False when the answer is True.

The Undo Step Backtracking Requires

Backtracking has a rhythm: mark, recurse, unmark. The mark and recurse are obvious. The unmark is where people slip, and it's where the output turns into complete nonsense.

def permutations(nums):
    result = []
    used = [False] * len(nums)

    def dfs(path):
        if len(path) == len(nums):
            result.append(path[:])
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            used[i] = True
            path.append(nums[i])
            dfs(path)
            # missing: path.pop()
            # missing: used[i] = False

    dfs([])
    return result

Without the undo steps, branch two inherits the state from branch one. Mark before recursing. Unmark after. Every time, without exception. This applies to boolean arrays, sets, path lists, and any external state you're threading through.

used[i] = True
path.append(nums[i])
dfs(path)
path.pop()
used[i] = False

Every mutation has a mirror. If you're writing one side without the other, that's the bug. The cleanup has to happen regardless of what the recursive call did.

Three Memoization Landmines

The first is the mutable default argument. Python evaluates default parameter values once, at function definition time, not at each call.

def fib(n, memo={}):   # same dict object across every call, forever
    if n <= 1:
        return n
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

memo={} persists across separate invocations. In an interview this is usually invisible. In a LeetCode Solution class with multiple test cases, later cases inherit the earlier cache and produce wrong results. Fix: if memo is None: memo = {}, or use @functools.lru_cache and let Python handle it.

The second trap: lru_cache requires all arguments to be hashable. Lists are not. If you write a cached DFS that takes a path list, you get a TypeError on the first call. Convert to tuples before passing: dfs(tuple(path)).

The third, and the sneakiest: passing nums[1:] instead of an index. This allocates a new list at O(n) cost per call. Your "O(n)" solution is secretly O(n²) in memory. Pass an index. dfs(index + 1) costs nothing. dfs(nums[1:]) allocates a new list every frame, and at depth n you've burned 1 + 2 + ... + n bytes of memory.

Your Stack Has a Budget

Python's recursion limit is 1000 frames. Not 100,000. Just 1,000. This is a deliberate software limit set to prevent hitting the OS stack limit, which causes a segfault instead of a Python exception. Python is trying to protect you. It draws the line at 1,000.

What does this mean in practice? A balanced BST with 10,000 nodes has depth roughly 14. Fine. A skewed BST with 10,000 nodes has depth 10,000. Python crashes with RecursionError before you finish the traversal.

LeetCode problems with constraints like 1 <= n <= 10^4 on trees include degenerate test cases. If your recursive solution passes balanced examples and crashes on edge cases, this is why. The input is technically valid. Your solution just assumed the tree would be polite about its shape.

sys.setrecursionlimit(10**6) silences the RecursionError. It does not fix anything. Set it high enough to exhaust the real OS stack and you get a segfault, no error message, just a dead process. The actual fix is iterative DFS with an explicit stack on the heap. The space complexity math is in recursion space complexity.

Python also does not perform tail call optimization. Guido van Rossum documented this explicitly, arguing it destroys stack traces. So the accumulator pattern does not save you:

# Still creates n stack frames in Python, even though it's tail-recursive
def factorial(n, acc=1):
    if n == 0:
        return acc
    return factorial(n - 1, acc * n)

Functions with multiple recursive calls can never be tail-recursive regardless of language, because only one call can be the final action.

Postorder vs. Preorder: The Information Flow Bug

Two recursion models exist. Top-down (preorder): pass state downward through parameters. Bottom-up (postorder): aggregate results upward through return values. They are not interchangeable. Mixing them produces wrong answers and no error to point at the problem.

The tree diameter problem (LeetCode 543) is the canonical case. You need to track two things: the height of the current subtree (returned to the parent) and the diameter through the current node (compared against a global maximum).

def diameter_of_binary_tree(root):
    self.max_diameter = 0

    def height(node):
        if node is None:
            return 0
        left = height(node.left)
        right = height(node.right)
        self.max_diameter = max(self.max_diameter, left + right)
        return max(left, right) + 1

    height(root)
    return self.max_diameter

The function returns height (local value, bubbles up through return) while updating self.max_diameter (global comparison only, never returned). Candidates who try to return diameter instead of height break the contract the parent node relies on, and the whole thing collapses.

If your result depends on what your children return, the answer flows upward through return values, not downward through parameters.

Ask yourself: "Do I need to know something about the subtree before I can answer for the current node?" If yes, you're in postorder territory.

This is also exactly the kind of confusion that surfaces in real time when you're talking through your solution out loud. SpaceComplexity runs voice-based mock interviews that catch this before the actual interview does.

Python Recursion Bug Checklist

Reference copy bug: result.append(path[:]) not result.append(path)
Discarded return value: capture left = f(node.left), always
Base case ordering: success check before failure check; two bases for n-2 recurrences
Missing undo step: every mark has a mirror unmark in backtracking
Mutable default memo: if memo is None: memo = {}, not memo={}
Slice allocation: pass indices, not nums[1:] (O(n²) space otherwise)
Recursion depth: skewed inputs hit the 1000-frame Python limit; know when to go iterative
Postorder vs. preorder: if the result depends on children, it flows upward through return values

Most of these survive a mental trace because you're checking one level at a time. The corruption shows up in aggregate. When your output looks structurally right but the values are wrong, start here.