What Is a Pure Function? The Property Behind Memoization

Your memoized solution is producing wrong answers on test case 7 of 10. The cache is populated. The logic looks right. You've stared at it for twenty minutes. Then you notice: the function reads a global variable. And now you understand everything.

A pure function always returns the same output for the same input and causes no side effects. Two rules. Both required. Miss either one and your caching breaks, your DP solution produces wrong answers, and your complexity analysis stops holding.

This sounds abstract until you see what happens when you violate it. Then it explains an entire class of bugs that are hard to reproduce and even harder to track down.

Same Input, Same Output

Given the same arguments, a pure function returns the same result. Always. No reading global variables, no depending on the current time, no randomness.

# Pure: depends only on its arguments
def add(a, b):
    return a + b

add(2, 3)  # 5, every single time

# Impure: depends on external state
total = 10

def add_to_total(x):
    return total + x

add_to_total(5)  # 15
# total = 99
add_to_total(5)  # 104, same call, different result

The second function is not wrong, but it is not pure. Its output depends on something outside its arguments. That means you cannot cache it, memoize it, or reason about it in isolation. It is, effectively, a random number generator wearing a function costume.

No Side Effects

A side effect is anything the function does beyond returning a value. Modifying a global variable, mutating an argument, writing to a file, printing to the console, making a network request: all side effects.

# Pure: returns a new list, leaves the original untouched
def square_all(nums):
    return [x * x for x in nums]

original = [1, 2, 3]
result = square_all(original)
# original is still [1, 2, 3]
# result is [1, 4, 9]

# Impure: mutates the argument the caller passed in
def square_all_in_place(nums):
    for i in range(len(nums)):
        nums[i] = nums[i] * nums[i]

original = [1, 2, 3]
square_all_in_place(original)
# original is now [1, 4, 9], silently changed

The in-place version is faster (no new list allocation). That is a real tradeoff. It also silently modified your data while you were busy trusting it not to. This is the programming equivalent of borrowing a book and returning it highlighted in red.

The practical test: call the function twice with identical arguments. If the second call behaves differently, something impure is happening.

Why Memoization Requires Pure Functions

This is the coding interview payoff. Memoization works by caching the return value of a function call and replaying it the next time the same arguments appear. That strategy only holds if the cached value is correct the second time, which requires the function to be pure.

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

lru_cache stores fib(10) = 55 and returns that immediately next time. This works because fib is pure: n = 10 always produces 55. If fib read from a counter that changed with each call, you would cache a wrong answer, replay it hundreds of times, and spend an afternoon debugging what looks suspiciously like a compiler bug.

Every dynamic programming solution depends on this. Overlapping subproblems are only worth caching because the subproblem answer does not change between calls. The dynamic programming framework shows how this principle anchors the broader DP pattern. And the reason memoized recursion collapses from O(2^n) to O(n) is that we compute each pure subproblem exactly once and never repeat it.

Referential Transparency: Your Function Call Becomes a Value

The property pure functions give you is called referential transparency. A call is referentially transparent if you can replace it with its return value and the program behaves identically.

result = add(2, 3) + add(2, 3)

# Referentially transparent, so this is equivalent:
result = 5 + 5

Compilers exploit referential transparency to eliminate redundant computations. Testing becomes trivial: same input, same output, nothing external to set up or tear down. You can also explain it clearly in an interview, which is worth something.

# Not referentially transparent
import random
result = random.randint(1, 10) + random.randint(1, 10)
# You cannot replace either call with a constant and get a correct program

Where Purity Bites You in Interviews

You will rarely be asked to define a pure function in an interview. You will often write impure functions by accident and then wonder why your solution fails test case 7 of 10.

The most common trap: mutating a list or dictionary argument. The caller's data changes under them, and the bug only surfaces on specific inputs that happen to share that object.

# Bug: appends to the same default list on every call
def add_item(item, items=[]):
    items.append(item)
    return items

add_item(1)  # [1]
add_item(2)  # [1, 2], the first call's list came back
add_item(3)  # [1, 2, 3]

Python evaluates default mutable arguments once at function definition time. Not once per call. Once, ever. The list is shared across every invocation that uses the default, and it quietly accumulates state across calls like a variable with a grudge. This surprises roughly 100% of developers the first time they see it, including, historically, the person who wrote the tutorial that taught them Python.

The fix is to make it pure:

def add_item(item, items=None):
    if items is None:
        items = []
    return items + [item]  # new list every time, no mutation

Backtracking is another common site of accidental mutation. Append to a path and forget to remove it before exploring the next branch, and you are leaking shared state across recursive calls. Lists and dicts in Python are passed by reference, so mutations inside a function affect the caller's object. For the broader picture on when mutation is expected versus when it is a bug, see mutable vs immutable data structures.

Pure function thinking also helps you recognize cacheable subproblems. Before you apply memoization, ask: given the same arguments, will this function always return the same value? If yes, cache it. If no, redesign before the cache makes things worse.

When Impure Is the Right Call

Most useful programs need side effects. You need I/O to produce output. You need mutation for in-place algorithms that trade purity for O(1) space. You need randomness for shuffle algorithms and randomized data structures.

The goal is not to eliminate all impurity. Isolate it. Keep computation logic in pure functions and push I/O, mutation, and randomness to the outer edges. In interviews, in-place array sort is deliberately impure to save space. That is a valid design decision, and saying so out loud signals that you actually understand the tradeoff rather than just reciting the algorithm.

One Structure, Every Language

Same concept across all fourteen major interview languages. Each example shows a pure function and an impure version that violates either the determinism rule or the no-side-effects rule.

# Pure
def multiply(x: int, y: int) -> int:
    return x * y

# Impure (reads and writes global state)
counter = 0

def multiply_and_count(x: int, y: int) -> int:
    global counter
    counter += 1
    return x * y

Key Takeaways

A pure function has exactly two properties: same inputs always produce the same output, and the function causes no side effects
Memoization requires pure functions because it replays cached return values without re-running the function
Every dynamic programming optimization relies on subproblems being pure
Common impurity traps in interviews: mutating a list argument, reading global state, Python's mutable default arguments, Go slice aliasing
Impure functions are not wrong. Isolate them to the edges of your code and keep computation logic pure

Explaining tradeoffs like these out loud, under pressure, is a different skill from understanding them on paper. SpaceComplexity runs voice-based mock interviews with rubric-scored feedback on how clearly you reason through design decisions, complexity tradeoffs, and edge cases.