Python Interview Cheat Sheet: Built-ins, Patterns, Gotchas

Python is the interview language of choice for a reason. It's not that Python developers are smarter. It's that they're done ten minutes before the Java people finish typing ArrayList<Integer> list = new ArrayList<Integer>();.

But that advantage evaporates if you're reimplementing things Python already gives you. Know which built-in to reach for and your implementation time drops in half, with code readable enough to explain out loud. This cheat sheet covers every stdlib tool that shows up in DSA interviews, the patterns that make Python fast to write, and the gotchas that quietly wreck otherwise correct solutions.

The List Is Already a Dynamic Array

Python's list is a dynamic array. Amortized O(1) append, O(1) random access. You get a stack for free.

stack = []
stack.append(x)    # push, O(1)
top = stack.pop()  # pop, O(1)
top = stack[-1]    # peek without removing

list.pop(0) and list.insert(0, x) are both O(n) because every element shifts. Anything queue-shaped needs deque.

A few patterns that come up constantly:

sorted(nums)        # returns new sorted list, stable, O(n log n)
nums.sort()         # in-place, same complexity
nums[::-1]          # reversed copy, O(n)
nums.reverse()      # in-place, O(n)
[0] * n             # list of n zeros, O(n)

One trap bites 2D grid problems every few weeks. [[0] * n] * m looks like m independent rows. It isn't. It's m pointers to the same row. Mutate one cell and you've mutated the whole column. The bug is invisible until you run your grid code on a 3x3 example and half the board is wrong for no apparent reason.

# Wrong: all rows are the same object
grid = [[0] * 3] * 3
grid[0][0] = 1  # grid[1][0] and grid[2][0] are also 1 now

# Right
grid = [[0] * 3 for _ in range(3)]

collections: The Four That Matter

deque

collections.deque is a doubly-linked list with O(1) operations at both ends. Use it everywhere you'd use a list as a queue.

from collections import deque

q = deque()
q.append(x)      # enqueue right, O(1)
q.appendleft(x)  # push left, O(1)
q.pop()          # dequeue right, O(1)
q.popleft()      # dequeue left, O(1)

BFS with list.pop(0) is a silent O(n²) bug. You won't notice it on a 10-node graph. Your interviewer will notice it on a 10,000-node graph. With deque.popleft() it's O(V+E) as intended. See BFS vs DFS for how this plays out across graph problems.

The other useful feature: maxlen. A deque with a fixed max length evicts from the opposite end automatically, making fixed-size sliding windows trivial without manual index math.

window = deque(maxlen=k)  # oldest element dropped when full

That's the backbone of the monotonic deque approach from sliding window problems.

Counter

Counter is a dict subclass that counts hashable objects. It eliminates most frequency-map boilerplate.

from collections import Counter

freq = Counter("abracadabra")   # {'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1}
freq = Counter([1, 2, 2, 3])    # {2: 2, 1: 1, 3: 1}

freq.most_common(k)   # top-k elements as [(element, count), ...]
freq[missing_key]     # returns 0, never KeyError
freq.total()          # sum of all counts (Python 3.10+)

Counters support arithmetic. The - operator drops zero and negative counts, which makes "minimum steps to make two strings anagrams" a two-liner. Two lines. Interviewers remember two-liners.

c1 = Counter(s)
c2 = Counter(t)
steps = sum((c1 - c2).values()) + sum((c2 - c1).values())

defaultdict

defaultdict(factory) returns a default value instead of raising KeyError. The main win is building adjacency lists without the existence check.

from collections import defaultdict

graph = defaultdict(list)
for u, v in edges:
    graph[u].append(v)
    graph[v].append(u)

Common factories: list, set, int, lambda: 0, lambda: float('inf'). For counting without Counter: defaultdict(int) means d[key] += 1 just works.

OrderedDict

Plain dict in Python 3.7+ preserves insertion order, so OrderedDict is mostly obsolete. The one remaining interview use case is LRU cache. Two methods make it work in O(1): move_to_end(key) marks a key as most recently used, and popitem(last=False) evicts the least recently used.

from collections import OrderedDict

cache = OrderedDict()
cache[key] = value
cache.move_to_end(key)          # O(1), mark as recently used
cache.popitem(last=False)       # O(1), evict LRU entry

Screenshot of a coding video comment section where viewers are outraged at a brute-force O(n²) nested-loop solution, with comments reading "there's no way this answer is acceptable, it's literally just brute forcing the problem"

The comment section when you use list.pop(0) in BFS.

heapq Is Min-Heap Only. Negate for Max.

heapq operates on a plain list. No heap class. Push is O(log n), pop is O(log n), peek is O(1) via heap[0].

import heapq

heap = []
heapq.heappush(heap, val)      # O(log n)
min_val = heapq.heappop(heap)  # O(log n)
min_val = heap[0]              # peek, O(1)
heapq.heapify(nums)            # convert list to heap in O(n), in-place
heapq.nsmallest(k, nums)       # k smallest, O(n log k)
heapq.nlargest(k, nums)        # k largest, O(n log k)

Push -val, pop and negate the result. That's your max-heap. Yes, this feels wrong. Do it anyway.

heapq.heappush(heap, -val)
max_val = -heapq.heappop(heap)

For custom objects, push tuples. Tuples compare lexicographically, so the first element acts as priority. If two priorities are equal and the second element isn't comparable, you'll get a TypeError. Add a monotone counter as a tie-breaker.

import itertools
counter = itertools.count()
heapq.heappush(heap, (priority, next(counter), item))

The heap data structure article covers the array representation and sift operations if you want the full picture.

bisect: Binary Search, Already Written

bisect gives you O(log n) binary search on any sorted list.

import bisect

a = [1, 3, 5, 7, 9]

bisect.bisect_left(a, 5)   # 2: index of 5, or where it would go
bisect.bisect_right(a, 5)  # 3: one past the rightmost 5
bisect.insort(a, 6)        # insert 6 maintaining sorted order

bisect_left returns the index of the element itself (if present). bisect_right returns one past it. They differ only when the value already exists in the list.

To check if a value exists:

idx = bisect.bisect_left(a, x)
exists = idx < len(a) and a[idx] == x

To count elements strictly less than x: bisect.bisect_left(a, x). To find the first element >= x: a[bisect.bisect_left(a, x)].

bisect.insort maintains sorted order but still does O(n) shifting on insert. For frequent inserts and searches both, you need a balanced BST or a sorted container. The invariant reasoning is the same loop-invariant logic in binary search. If you find yourself reimplementing binary search by hand in an interview, reach for bisect first.

@lru_cache Turns Recursion Into DP

@functools.lru_cache(maxsize=None) memoizes any function on its arguments. Use @cache (Python 3.9+) as a shorthand. A single decorator converts any recursive solution into top-down DP with no extra data structure needed.

from functools import lru_cache

@lru_cache(maxsize=None)
def dp(i, remaining):
    # base cases and recursive relation
    ...

lru_cache hashes arguments. Lists aren't hashable. Convert them to tuples before passing. If you pass a list and get a TypeError, this is why.

@lru_cache(maxsize=None)
def dp(i, path):  # path must be a tuple, not a list
    ...

Total time is O(unique states * work per state). Memoization time complexity has the full derivation.

functools.cmp_to_key converts an old-style comparator into a key function. Useful when sort order involves a non-trivial relationship between elements.

from functools import cmp_to_key

def compare(a, b):
    return int(a + b) - int(b + a)  # largest number concatenation order

nums.sort(key=cmp_to_key(compare))

Python's default recursion limit is 1000 frames. Any tree or graph problem with depth beyond a few hundred will hit RecursionError. Add this once at the top of your solution file and forget about it:

import sys
sys.setrecursionlimit(10**6)

Gotchas That Kill Candidates

Forum post titled "Found Bug in Sorted Built in Function of Python" from an ML Engineer who is sorting strings like "9%", "83%", "25%" and confused that Python returns lexicographic order instead of numeric order

Not a Python bug. A you bug. This section is for the you bugs.

Modulo on negatives is different from Java and C++. In Python, -13 % 3 is 2. In Java, C++, and Go, it's -1. Python always returns a non-negative result when the divisor is positive. Switch languages mid-prep and this will catch you exactly once. If you need Java-style behavior: ((x % m) + m) % m.

Integers don't overflow in Python. They're arbitrary precision. Write low + (high - low) // 2 anyway. It signals that you know the history.

String concatenation in a loop is O(n²). Strings are immutable. Every result += ch allocates a new string. Use ''.join(parts) instead.

# Slow: O(n²) allocations
result = ""
for ch in chars:
    result += ch

# Fast: O(n)
result = "".join(chars)

is vs == for integers. CPython caches integers from -5 to 256. For values in that range, a is b may return True even for independent variables. For values outside it, a is b is False even when a == b. Always use == for value comparison. is is for identity checks and is None only.

Mutable default arguments are shared across calls. This one ruins your afternoon.

# Wrong: list persists between calls
def append_to(val, lst=[]):
    lst.append(val)
    return lst

# Right
def append_to(val, lst=None):
    if lst is None:
        lst = []
    lst.append(val)
    return lst

Shallow copy vs deep copy in 2D structures. grid[:] copies the outer list but leaves inner lists as references. For a true independent copy: [row[:] for row in grid] or copy.deepcopy(grid).

in on a list is O(n). For repeated membership checks, convert to a set first. The difference matters at n = 10^5 with a nested loop.

Python Interview Cheat Sheet: Quick Reference

Need	Tool	Complexity
Stack	`list` + `.append()` / `.pop()`	O(1)
Queue	`deque` + `.append()` / `.popleft()`	O(1)
Fixed-size window	`deque(maxlen=k)`	O(1)
Frequency map	`Counter`	O(n) build
Graph adjacency list	`defaultdict(list)`	O(1) add
LRU eviction	`OrderedDict` + `move_to_end`	O(1)
Min-heap	`heapq.heappush` / `heappop`	O(log n)
Max-heap	`heapq` with negated values	O(log n)
Top-k elements	`heapq.nlargest(k, iterable)`	O(n log k)
Binary search (sorted list)	`bisect.bisect_left` / `bisect_right`	O(log n)
Sorted insert	`bisect.insort`	O(log n) + O(n) shift
Memoization	`@functools.lru_cache(maxsize=None)`	O(1) per cached call
Custom sort key	`functools.cmp_to_key(compare)`	O(n log n) sort
Combinations / permutations	`itertools.combinations`, `permutations`	O(C(n,k)) or O(n!)

When Python Actually Hurts

Python is the right call for most interviews. The builtins are expressive, the code is readable, and interviewers aren't benchmarking wall-clock time. Python runs about 30x slower than C++, but that almost never matters at n = 10^5 with a reasonable algorithm.

Two places where it genuinely costs you.

Sorted multisets. Java has TreeMap. C++ has std::multiset. Python's stdlib has nothing equivalent. If a problem needs ordered insertions, rank queries, or "kth smallest in a stream," you either reach for sortedcontainers.SortedList from PyPI (which may not be available on the interview platform) or you spend 10 minutes implementing a segment tree or Fenwick tree from memory while silently screaming. Python candidates lose time here that Java or C++ candidates never spend.

Extremely tight time limits. Standard interviews almost never hit this. But for competitive-programming-adjacent companies (Jane Street, Citadel, some Google rounds), an O(n log n) Python solution can TLE where C++ passes. Know your environment before you commit to a language.

For everything else: Python's brevity means working code faster, which means more time on trade-offs and follow-ups. That's where offers are won.

If you want to practice both the code and the narration at the same time, SpaceComplexity runs voice-based DSA mock interviews with rubric feedback across coding, communication, and problem-solving. The gap between solving something quietly and explaining it clearly under pressure is where most candidates lose points.