Deep Copy vs Shallow Copy: The Bug Every Interview Hides

You write a backtracking solution. It runs. It produces output. You look at the results array and every single entry is identical. Same list, repeated. You stare at your logic. The logic is fine. Your algorithm is correct. Your variable names are even meaningful, which is already better than most.

The bug is in a concept from your first week of CS that you thought you'd fully absorbed. Turns out you hadn't. Neither had most of us.

This is the deep copy vs shallow copy trap. Understanding what copy operations actually do at the memory level is what separates a 20-minute debug session from a 3-hour one where you start questioning your career.

b = a Is Not a Copy. It's Two Names for One Thing.

This confusion is the root of everything else.

a = [1, 2, 3]
b = a
b.append(4)
print(a)  # [1, 2, 3, 4]

You named a variable b. You felt organized. And then a changed anyway.

Think of it like this: you have one cat. You call it Whiskers. Your roommate calls it Mittens. Your roommate shaves the cat. Whiskers is now bald. You have been sharing a cat this whole time. There was never a second cat.

When you write b = a, you didn't copy anything. You created a second name pointing at the same object.

Both a and b are aliases for one list in memory. Mutate through either name and you see the change through both. This is not a Python quirk. It's how reference semantics work in nearly every modern language. The variable holds an address, not a value.

This becomes dangerous the moment you start passing lists into functions or collecting results in loops. See pass by value vs pass by reference for a deeper look at how this plays out across function boundaries.

Shallow Copy: New Box, Borrowed Contents

A shallow copy creates a new container object, but the elements inside it are still the same references.

import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)

shallow[0].append(99)
print(original)  # [[1, 2, 99], [3, 4]]
print(shallow)   # [[1, 2, 99], [3, 4]]

The outer list is different. shallow is not original evaluates to True. You got a new box. But the inner lists are shared. Mutate one and you mutate both, because you never actually made copies of those inner lists.

Shallow copy is one level deep. The container is new. Everything inside is borrowed.

Common shallow copy operations in Python: list.copy(), a[:], copy.copy(), and the list constructor list(a). They all behave identically for this purpose. a[:] looks like it's doing something thorough. It is not.

Shallow vs Deep Copy memory layout

Shallow copy gets you a new outer list but the inner objects are still shared. Mutate them and both sides feel it.

Deep Copy Actually Does What You Think Copy Does

A deep copy walks the entire object graph recursively and duplicates every object it finds. The result is completely independent of the original.

import copy

original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)

deep[0].append(99)
print(original)  # [[1, 2], [3, 4]]  -- untouched
print(deep)      # [[1, 2, 99], [3, 4]]

No shared references anywhere in the tree. copy.deepcopy() is the only standard Python call that guarantees full independence for arbitrarily nested structures.

The danger lives entirely in mutable objects. You can't mutate an integer or a string, so sharing references to those is fine. The copy distinction only matters when inner objects can change.

The One-Byte Bug That Kills Your Backtracking Solution

This is the interview landmine. Interviewers have seen it hundreds of times. They watch for it specifically. Here it is:

def subsets(nums):
    result = []
    path = []

    def backtrack(start):
        result.append(path)  # BUG: appends the same list object every time
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return result

Run this and result ends up as a list of identical empty lists. Every entry points to the same path object. By the time backtracking finishes, path has been mutated back to empty, and all your stored references reflect that final state.

Your algorithm was correct. Your copy was wrong.

The fix is two characters: result.append(path[:]). That [:] creates a shallow copy of path at the moment you append. Since path contains integers (immutable), shallow is enough here.

def backtrack(start):
    result.append(path[:])  # correct: snapshot of current state
    for i in range(start, len(nums)):
        path.append(nums[i])
        backtrack(i + 1)
        path.pop()

This pattern shows up in every backtracking problem: subsets, permutations, combinations, word search. See backtracking algorithm for the full template, and backtracking bugs for a taxonomy of the most common mistakes beyond this one.

When You Don't Need the Nuclear Option

Deep copy isn't always the right call. It's slower and uses more memory. You need it only when inner objects are mutable and you plan to mutate them after copying.

If your data structure holds only primitives (integers, strings, booleans), shallow copy is fine. The references are shared, but you can't mutate an integer, so there's no problem. Most backtracking problems fall into this category because paths contain integers or characters.

Use shallow copy when elements are immutable. Use deep copy when elements are mutable objects you'll modify independently after the copy.

A 1D list of numbers? Shallow is fine. A list of lists where you'll modify the inner lists separately? Deep copy. A graph node with nested children? Deep copy.

What This Actually Costs You

Shallow copy of a container with n top-level elements is O(n) time and O(n) space. You copy exactly the top-level references.

Deep copy of a structure with N total objects (across all nesting levels) is O(N) time and O(N) space. Every object in the graph gets duplicated.

In backtracking, you take a snapshot at each node of the recursion tree, so the total copy cost sums over all O(2^n) or O(n!) states, which is already dominated by the algorithm's own complexity. The copy cost doesn't change your asymptotic complexity for backtracking, but it adds a constant multiplier. For very deep or wide nested structures, copy.deepcopy() can be surprisingly slow because it handles cycles and complex object graphs with a memo dict internally.

One Problem, Every Language's Opinion

Python

import copy
shallow = copy.copy(nested_list)   # or nested_list[:]
deep    = copy.deepcopy(nested_list)

a[:] is shallow. Always. The slice syntax looks thorough. It is not.

JavaScript

// Shallow
const shallow = { ...original };
const shallow2 = Object.assign({}, original);

// Deep
const deep = structuredClone(original);  // modern, handles cycles
const deep2 = JSON.parse(JSON.stringify(original));  // breaks on Date, undefined, circular refs

structuredClone() is the right tool for deep copies in JS now. JSON.parse(JSON.stringify()) works until it doesn't. It silently drops undefined, explodes on circular references, and turns Date objects into strings. It's the duct-tape approach: fine until someone finds the duct tape.

Java

// 1D primitive array: deep copy
int[] copy = Arrays.copyOf(original, original.length);

// 2D array: must copy row by row
int[][] deep = new int[original.length][];
for (int i = 0; i < original.length; i++) {
    deep[i] = Arrays.copyOf(original[i], original[i].length);
}

Java has no general-purpose deep copy. The Cloneable interface is notoriously broken for complex objects. It's been broken since Java 1.1 and remains broken to this day as a monument to early API design decisions. For anything beyond primitive arrays, most Java developers use serialization or copy constructors.

C++

// std::vector copy constructor does deep copy for value types
std::vector<int> v2 = v1;  // completely independent

// But raw pointers in structs are shallow-copied by default
struct Node { int val; Node* next; };
Node a = {1, nullptr};
Node b = a;  // b.next and a.next point to the same Node

std::vector and most STL containers do the right thing for value types. Raw pointers require a custom copy constructor. If you forget, the compiler won't warn you, and the bug will appear at runtime in the most inconvenient place possible.

// Slice header copy: shares backing array
b := a  // b and a share the same underlying array

// Independent copy using the copy builtin
b := make([]int, len(a))
copy(b, a)  // now b is independent

In Go, a slice is a header (pointer, length, capacity) over a backing array. Assigning a slice copies the header, not the array. Always use copy() when you need an independent slice in Go. The distinction trips up nearly every developer coming from Python or JavaScript.

The Three Places This Ruins Your Interview

Backtracking result collection

Already covered above. Always path[:] when appending to results. The bug is so common it has its own section in most backtracking guides.

DP with memoization on mutable state

If your memo key is a mutable object (like a list), you'll get aliasing bugs in your memo table. The fix is converting to a tuple before using it as a dict key:

memo = {}
def dp(state):
    key = tuple(state)  # immutable, hashable, safe as dict key
    if key in memo:
        return memo[key]
    # ... compute result ...

Graph clone (LeetCode 133)

Cloning a graph requires a visited map to handle cycles. The map doubles as your deep copy mechanism:

def cloneGraph(node):
    if not node:
        return None
    visited = {}

    def clone(n):
        if n in visited:
            return visited[n]
        copy = Node(n.val)
        visited[n] = copy
        for neighbor in n.neighbors:
            copy.neighbors.append(clone(neighbor))
        return copy

    return clone(node)

The visited dict is both cycle detection and your "already cloned" cache. Without it, cyclic graphs recurse forever.

If you want to practice catching this bug under time pressure, SpaceComplexity runs voice-based mock interviews where you have to both spot the error and explain why it happens out loud. That explanation step is where the real understanding shows. Anyone can fix path to path[:]. Fewer people can say why without hesitation.