Debugging in a Coding Interview: Your Approach Was Right. Your Code Has a Bug. Now What?

You laid out the algorithm perfectly. The interviewer nodded. You started coding, ran a trace in your head, and submitted. Wrong answer.

Now your brain does what it always does under pressure: panic, immediately followed by activity that looks like debugging but is actually just thrashing. Your hand hovers over the keyboard, itching to change a < to a <= and pray to whatever deity watches over off-by-one errors. Do not do this. Random edits are the fastest way to turn a fixable bug into a mess you can no longer explain, in front of someone watching to see how you think under pressure.

Debugging in a coding interview is a procedure, not a stroke of luck. You can execute it calmly, out loud, in under five minutes.

The Wrong Instinct Is to Edit

When your code produces wrong output, the instinct is to modify something and re-run mentally. This is a greedy strategy, and it fails for the same reason greedy algorithms often fail: the local move (change one line) does not guarantee the global goal (a correct program). You might stumble onto the fix, but the interviewer saw you thrash, not diagnose.

We have all done this at 2am on a personal project with no consequences. In an interview, the consequences are the interviewer's entire mental model of how you work.

What interviewers at companies like Google and Microsoft are generally looking for is structured investigation, not raw speed. A "lucky fix" that the candidate cannot explain tends to count against them. A slow, focused debugger usually comes across better than a fast, noisy one.

[Image: a tweet that reads "Am I testing my code or is it testing me". Caption: The feeling at minute three of an interview when the output still doesn't match.]

Stop. Pick a Small Input. Run Your Code by Hand.

Before changing a single character, pick a small test case: three to five elements. Not the one in the problem statement, which is designed to make the approach obvious. Pick one that captures the edge of the problem: odd length, duplicates, all-same values, or whatever structural property your algorithm actually touches.

Then execute your code line by line like a dumb machine. Not like someone reading, which fills in assumptions and skips steps. Like a machine, which does exactly what the instruction says.

The physical trick that works: put your cursor (or finger on paper) on the current line. Do not move it until you have written down every variable's new value. Then move to the next line.

Write a simple trace table. Columns are your variables, rows are execution steps. Three columns for a two-pointer problem: left, right, s. Four rows for a four-element array. That is all it takes to catch the bug below.

# Bug: off-by-one on the condition
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left <= right:           # should be left < right
        s = nums[left] + nums[right]
        if s == target:
            return [left + 1, right + 1]
        elif s < target:
            left += 1
        else:
            right -= 1
    return []

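The problem statement's example, [2, 7, 11, 15] with target = 9, does not expose the bug: the loop finds 2 + 7 on the third iteration and returns [1, 2] while left is still less than right. Duplicates such as [3, 3] with target = 6 are also fine, because the match is found before the pointers meet. The bug needs an input with no valid pair where one element doubled equals the target. Run the code on [1, 3, 4, 9], target = 6. Step one: left = 0, right = 3, s = 10, greater than 6, so right becomes 2. Step two: left = 0, right = 2, s = 5, less than 6, so left becomes 1. Step three: left = 1, right = 2, s = 7, greater than 6, so right becomes 1. Step four: left = right = 1. The condition left <= right is True, so you enter the loop. nums[1] + nums[1] = 6, which equals the target, so you return [2, 2], claiming you used the element at index 1 twice. The correct answer is [].

In the trace table, the bug is obvious at step four: left and right hold the same index, and the condition lets you continue. Without the table, you read left <= right as "while we have room to check" and miss the equal case entirely.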
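The trace table can also be produced mechanically. The following is a minimal sketch, not part of the original solution: two_sum_sorted_traced is a hypothetical instrumented copy of the buggy function that records one row per loop iteration, and the input [1, 3, 4, 9] with target 6 is an assumed test case chosen because no two distinct elements sum to the target.

```python
def two_sum_sorted_traced(nums, target):
    """Buggy two-pointer search (left <= right), instrumented to record a trace."""
    rows = []  # one (step, left, right, s) tuple per loop iteration
    left, right = 0, len(nums) - 1
    step = 0
    while left <= right:  # the bug under investigation: should be left < right
        step += 1
        s = nums[left] + nums[right]
        rows.append((step, left, right, s))
        if s == target:
            return [left + 1, right + 1], rows
        elif s < target:
            left += 1
        else:
            right -= 1
    return [], rows


# Hypothetical input: no two distinct elements sum to 6, but 3 + 3 does.
result, rows = two_sum_sorted_traced([1, 3, 4, 9], 6)
print("step left right  s")
for step, left, right, s in rows:
    print(f"{step:>4} {left:>4} {right:>5} {s:>2}")
print("returned:", result)
```

The last row has left equal to right, and the function returns [2, 2]. In an interview you would write the same rows by hand; the point of the sketch is only to show what a complete table looks like.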

The Coding Interview Debugging Checklist

When the dry run surfaces a wrong value, you know the bug is at or before that step. Now you narrow it down. These are the categories to check, roughly in order of frequency:

Off-by-one and loop bounds. Did you use < where you needed <=, or vice versa? Is the initial value one too high or too low? The fencepost problem is everywhere: ten fence segments need eleven posts. Check your loop's first and last iteration explicitly, not just the middle.

Wrong initial state. Is your accumulator initialized correctly? Is a pointer at zero when it should be at one? Is a running max initialized to 0 when inputs could all be negative? This one is especially common in "maximum subarray" style problems where a zero-initialized max silently corrupts the answer.
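To make the zero-initialized max concrete, here is a small sketch using Kadane's algorithm for maximum subarray. Both function names are hypothetical, and the second version assumes a non-empty input.

```python
def max_subarray_zero_init(nums):
    """Kadane's algorithm with the initial-state bug: best starts at 0."""
    best = 0      # BUG: wrong when every element is negative
    current = 0
    for x in nums:
        current = max(x, current + x)
        best = max(best, current)
    return best


def max_subarray(nums):
    """Same loop, seeded from the first element (assumes nums is non-empty)."""
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


all_negative = [-3, -1, -2]
print(max_subarray_zero_init(all_negative))  # 0, not the sum of any non-empty subarray
print(max_subarray(all_negative))            # -1
```

On inputs with at least one non-negative element the two versions agree, which is why the bug survives the usual sample tests.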

Update order. Did you read a variable and then update it, or update it first and then read the new value when you needed the old one? In a sliding window this bites constantly: if you advance right before reading nums[right], you process a different element than intended.
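A minimal sketch of the update-order bug, using a hypothetical helper that sums the first k elements of a window. The buggy version assumes k is smaller than len(nums); with k equal to the length it would raise an IndexError instead of returning a wrong sum.

```python
def first_window_sum_buggy(nums, k):
    """Sum of the first k elements, with the update-order bug."""
    total, right = 0, 0
    while right < k:
        right += 1             # BUG: right advances before nums[right] is read
        total += nums[right]   # reads nums[1..k], skipping nums[0]
    return total


def first_window_sum(nums, k):
    """Same loop with the read first and the update second."""
    total, right = 0, 0
    while right < k:
        total += nums[right]   # reads nums[0..k-1]
        right += 1
    return total


nums = [5, 1, 1, 1]
print(first_window_sum_buggy(nums, 2))  # 2
print(first_window_sum(nums, 2))        # 6
```

The two loops contain the same statements. Only their order differs, which is why this category is easy to miss when reading and easy to catch in a trace.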

Edge cases you did not test. Empty input. Single element. Two elements (especially important for two-pointer problems). All duplicates, where every value is the same. The condition that triggers your edge-case branch should be tested explicitly, not assumed to work.

Integer overflow. In Java the sum silently wraps around; in C and C++ signed overflow is undefined behavior. Either way the result is catastrophic. The canonical example: mid = (low + high) / 2 in binary search. If low = 1_000_000_000 and high = 1_500_000_000, their sum exceeds the signed 32-bit maximum of 2,147,483,647. This bug sat in the binary search in Jon Bentley's Programming Pearls for about twenty years. Unnoticed. In a book about programming. The same bug lived in the JDK's binary search for roughly nine years before it was reported and fixed in 2006. The fix is mid = low + (high - low) / 2. Python integers do not overflow, but Java and C++ candidates hit this constantly.
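Python will not reproduce the overflow on its own, so the sketch below simulates it. to_int32 is a hypothetical helper that wraps a value to signed 32 bits the way Java int arithmetic does; it does not model C or C++, where signed overflow is undefined rather than wrapping.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value, as Java int arithmetic would."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x


low, high = 1_000_000_000, 1_500_000_000

wrapped_sum = to_int32(low + high)   # what a 32-bit int would hold after low + high
naive_mid = wrapped_sum // 2         # mid = (low + high) / 2 on the wrapped sum
safe_mid = low + (high - low) // 2   # no intermediate value exceeds high

print(wrapped_sum)  # -1794967296
print(naive_mid)    # -897483648
print(safe_mid)     # 1250000000
```

Java's / truncates toward zero while Python's // floors; the two agree here only because the wrapped sum is even. The negative midpoint is what turns into an out-of-bounds array index in the real bug.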

Wrong comparator. If you passed a custom sort comparator, does it implement a strict weak ordering? Using <= instead of < violates irreflexivity. In C++, passing such a comparator to std::sort is undefined behavior. In Java, an inconsistent Comparator can produce wrong output or an IllegalArgumentException. Check that cmp(a, a) returns false (or 0 for a three-way comparator) and that the relation is transitive.
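The comparator checks can be spot-tested on a handful of sample values. The sketch below is an illustration only: strict_weak_ordering_violations is a hypothetical helper, it takes a boolean "less than" predicate in the C++ style, and it checks irreflexivity, asymmetry, and transitivity but not the remaining requirement (transitivity of equivalence).

```python
from itertools import product


def strict_weak_ordering_violations(less, samples):
    """Return descriptions of ordering violations found on the sample values."""
    problems = []
    for a in samples:
        if less(a, a):
            problems.append(f"irreflexivity: less({a!r}, {a!r}) is True")
    for a, b in product(samples, repeat=2):
        if less(a, b) and less(b, a):
            problems.append(f"asymmetry: {a!r} and {b!r} each precede the other")
    for a, b, c in product(samples, repeat=3):
        if less(a, b) and less(b, c) and not less(a, c):
            problems.append(f"transitivity: {a!r}, {b!r}, {c!r}")
    return problems


def strict_less(a, b):
    return a < b


def non_strict_less(a, b):
    return a <= b  # the <= mistake described above


print(strict_weak_ordering_violations(strict_less, [1, 2, 2, 3]))  # []
for problem in strict_weak_ordering_violations(non_strict_less, [1, 2, 3]):
    print(problem)
```

A passing check on a few samples does not prove the comparator is valid, but a failing one pinpoints the violated property in seconds.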

Mutation and aliasing. The hardest to spot without a trace.

The Bug That Hides in Plain Sight: Mutation

A classic backtracking bug:

def subsets(nums):
    result = []
    path = []

    def backtrack(start):
        result.append(path)          # BUG: appends the reference, not a copy
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return result

This looks correct. The path grows and shrinks as expected. But result.append(path) appends the same list object every time. By the time backtrack returns all the way, path is empty, so every entry in result is [].

You run your algorithm, it completes, and you inspect the output: a list of empty lists, uniform as a stack of resumes for a job that got cancelled. Dry run this with nums = [1, 2] and write the object identity of path at each append step. You see it immediately: every row in result points to the same address. The fix is result.append(path[:]) or result.append(list(path)), creating a snapshot instead of a reference.
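The identity check and the fix can be seen side by side in the sketch below. subsets_buggy is a renamed copy of the function above, kept only so the block runs on its own.

```python
def subsets_buggy(nums):
    result, path = [], []

    def backtrack(start):
        result.append(path)        # BUG: the same list object every time
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return result


def subsets(nums):
    result, path = [], []

    def backtrack(start):
        result.append(path[:])     # snapshot: a new list per entry
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1)
            path.pop()

    backtrack(0)
    return result


buggy = subsets_buggy([1, 2])
print(buggy)                                 # [[], [], [], []]
print(len({id(entry) for entry in buggy}))   # 1: every entry is the same object
print(subsets([1, 2]))                       # [[], [1], [1, 2], [2]]
```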

This class of bug appears in backtracking, in graph algorithms where you pass a visited set and mutate it across recursive calls, and in dynamic programming when you try to cache a mutable container. The tell: all your collected results look the same at the end, or your memoization is returning stale values.
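The memoization variant can be sketched the same way. Everything here is hypothetical: path_to_buggy caches a list and hands the cached object to the caller, while path_to stores tuples, which is one possible fix (returning a copy would be another).

```python
_cache = {}


def path_to_buggy(n):
    """Return [0, 1, ..., n], memoized. BUG: hands out the cached list itself."""
    if n not in _cache:
        _cache[n] = [0] if n == 0 else path_to_buggy(n - 1) + [n]
    return _cache[n]


first = path_to_buggy(2)
first.append(99)            # the caller mutates what it believes is its own list
print(path_to_buggy(2))     # [0, 1, 2, 99]: the cache entry changed too


_safe_cache = {}


def path_to(n):
    """Same memoization, but cached values are immutable tuples."""
    if n not in _safe_cache:
        _safe_cache[n] = (0,) if n == 0 else path_to(n - 1) + (n,)
    return _safe_cache[n]


print(path_to(2))           # (0, 1, 2)
```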

Narrate the Trace Out Loud

While you run the dry run, say it out loud. Not every micro-step, but the logical checkpoints: "At the top of iteration two, left is one and right is three. The sum is eleven, less than the target, so left moves to two. I expected right to stay at three. It did."

This is not performance for the interviewer. The act of linearizing your thoughts into sentences forces you to execute literally rather than read optimistically. It is rubber duck debugging where the duck happens to be the person deciding whether to hire you. The duck, in this case, is taking notes.

When you reach the first point where actual differs from expected, you have found the region of the bug. Say that too: "I expected left to be two here but it is three. Something went wrong in the previous update. Let me look at the condition again." The interviewer now knows you are reasoning from evidence, not guessing.

[Image: anime meme of devs completely unable to debug their own code, but sprinting at full speed the moment a fellow dev says they can't debug theirs either. Caption: Narrating your bug out loud works the same way. Something about an audience makes the problem obvious.]

The Loop That Ends the Search

Once you find the first wrong value in your trace, apply this three-step loop:

Form one hypothesis about what caused it. "I think the condition lets the pointers overlap."
Check only the code that relates to that hypothesis. Do not re-read the whole function.
Either confirm and fix, or rule it out and form the next hypothesis.

Do not change more than one thing at a time. If you patch two lines simultaneously and the output becomes correct, you do not know which fix did it. You cannot explain it. The interviewer asks why it works now, and you have to guess. That is a different kind of failure than the original bug.

One fix. One verify. Move on.
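Applied to the two-pointer example, one pass through the loop might look like the sketch below. The failing input [1, 3, 4, 9] with target 6 is an assumed test case, not one from a problem statement; the only edit to the function is the loop condition.

```python
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left < right:            # the single change: was left <= right
        s = nums[left] + nums[right]
        if s == target:
            return [left + 1, right + 1]
        elif s < target:
            left += 1
        else:
            right -= 1
    return []


# Verify the hypothesis on the input that failed: no element may pair with itself.
assert two_sum_sorted([1, 3, 4, 9], 6) == []

# Re-check inputs that already passed, to confirm the fix broke nothing.
assert two_sum_sorted([2, 7, 11, 15], 9) == [1, 2]
assert two_sum_sorted([3, 3], 6) == [1, 2]

print("one change, verified")
```

Because only one line changed, the explanation for the interviewer is a single sentence: the pointers must never refer to the same index.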

Before You Change Anything

The wrong instinct under pressure is to edit randomly. Resist it.
Pick a small, representative input and execute line by line with a trace table.
Check the usual suspects in order: loop bounds, initial state, update order, edge cases, integer overflow, wrong comparator, mutation and aliasing.
Narrate the trace out loud. The first point where actual diverges from expected is the bug's location.
Fix one thing at a time and verify. An unexplained fix is a different kind of failure.

Debugging is the skill that separates candidates who can code from candidates who can engineer. The algorithm being correct was the baseline. Fixing the implementation under pressure, calmly and systematically, is the part that gets you the offer.

If you want to practice this loop until it is automatic, try a few sessions on SpaceComplexity. The rubric-based feedback will tell you exactly where your debugging narration breaks down.