Data Structures Complexity Cheat Sheet: Every Big-O You Need

June 6, 202610 min read
dsaalgorithmsinterview-prepdata-structures
Data Structures Complexity Cheat Sheet: Every Big-O You Need
TL;DR
  • Heap build (heapify) is O(n), not O(n log n); most nodes sit near leaves where the sift-down work is trivial.
  • Hash map worst case is O(n) per operation when all keys collide; Python, Java, and Go randomize hash seeds to block HashDoS attacks.
  • Union-Find with both optimizations runs in O(α(n)) amortized, effectively O(1) for any input that can physically exist.
  • Unbalanced BSTs degrade to O(n); inserting a sorted sequence into a vanilla BST produces a linked list, not a tree.
  • Trie operations are O(k) where k is key length, independent of how many keys are stored, making prefix lookups effectively O(1).
  • Segment tree arrays need 4n slots, not 2n, when n is not a power of two.
  • Amortized O(1) and average O(1) are different guarantees; dynamic array append is amortized, hash map lookup is average.

You have a coding interview in two hours. You've been grinding for weeks. And your brain just decided that now is the perfect time to forget whether a heap insert is O(log n) or O(1). Classic move.

This is your cheat sheet. Every major data structure, every operation, the non-obvious traps that interviewers specifically love to probe. Skim it, internalize the surprises, go ace the interview.


Data Structures Complexity: The Master Reference Table

Data StructureOperationAverageWorstSpace
Dynamic ArrayAccessO(1)O(1)O(n)
SearchO(n)O(n)
Insert/Delete at endO(1) amortizedO(n)
Insert/Delete at middleO(n)O(n)
Singly Linked ListAccess / SearchO(n)O(n)O(n)
Insert/Delete at headO(1)O(1)
Insert/Delete at tailO(n)O(n)
Insert/Delete at tail (tail ptr)O(1)O(1)
Stack / Queue / DequePush / Pop / Enqueue / DequeueO(1)O(1)O(n)
SearchO(n)O(n)
Hash Map / Hash SetInsert / Delete / SearchO(1)O(n)O(n)
BST (unbalanced)Insert / Delete / SearchO(log n)O(n)O(n)
Balanced BST (AVL / Red-Black)Insert / Delete / SearchO(log n)O(log n)O(n)
Binary HeapInsertO(log n)O(log n)O(n)
Delete min/maxO(log n)O(log n)
Peek min/maxO(1)O(1)
Build from array (heapify)O(n)O(n)
TrieInsert / Search / DeleteO(k)O(k)O(n * ALPHABET_SIZE * k)
Graph (adjacency list)BFS / DFSO(V + E)O(V + E)O(V + E)
Graph (adjacency matrix)Edge lookupO(1)O(1)O(V²)
BFS / DFSO(V²)O(V²)
Union-FindFind / UnionO(α(n))O(α(n))O(n)
Segment TreeBuildO(n)O(n)O(n)
Range query / Point updateO(log n)O(log n)
Fenwick Tree (BIT)BuildO(n log n)O(n log n)O(n)
Prefix query / Point updateO(log n)O(log n)

Sorting Quick Reference

AlgorithmBestAverageWorstSpaceStable?
Merge SortO(n log n)O(n log n)O(n log n)O(n)Yes
Quick SortO(n log n)O(n log n)O(n²)O(log n) avgNo
Heap SortO(n log n)O(n log n)O(n log n)O(1)No
Counting SortO(n + k)O(n + k)O(n + k)O(k)Yes
Radix SortO(d(n+b))O(d(n+b))O(d(n+b))O(n+b)Yes
Tim SortO(n)O(n log n)O(n log n)O(n)Yes

Tim Sort is the default in Python (list.sort()) and Java (Arrays.sort for objects). Radix Sort's variables: d = number of digits, b = base (usually 10 or 256).


Six Things That Actually Trip People Up

The table is the easy part. These are the subtleties that separate candidates who passed the round from the ones who thought they did.

Heap Build Is O(n). Yes, Really.

Most people hear "heap" and think O(n log n). That's because inserting n elements one at a time costs O(log n) per insert, which gets you O(n log n) total. Totally reasonable guess. Also wrong for heapify.

Heapify, the algorithm that builds a heap from an existing unsorted array in-place, runs in O(n). The trick: heapify-down starts at index n/2 and works toward the root, and most nodes sit near the leaves where the sift-down is cheap. Only the root ever travels all log n levels. Half the nodes are leaves and do zero work. When you sum the series, it converges to O(n). Python's heapq.heapify() exploits this. If you use heapify in your solution, say "this is O(n), not O(n log n)" out loud. Interviewers notice.

For a deeper look, see Heap Data Structure.

Hash Map O(1) Is a Useful Fiction

The O(1) average is real and it's what you depend on every day. But the theoretical worst case is O(n), because every key can hash to the same bucket and turn into a linked list.

Modern runtimes defend against this: Python 3.3+ randomizes hash seeds per-process (PYTHONHASHSEED), Java HashMap adds extra bit mixing, and Go uses a per-map random seed. Without that randomization, an adversary who knows your hash function can craft inputs that all collide. This actually happened. The "HashDoS" attack in 2011 hit PHP and Ruby after researchers at 28C3 demonstrated that 65,536 crafted parameters could hang a server for 30 seconds. Know the worst case. Know why modern runtimes defend against it. That's the complete answer to any follow-up.

Read more on the internals in Hash Map Time Complexity.

The Inverse Ackermann Function Is Your New Party Trick

Union-Find with just union by rank gets O(log n). With just path compression, O(log n) amortized. Apply both and you get O(α(n)) amortized, where α is the inverse Ackermann function.

α(n) is less than 5 for every input you will ever encounter in the real world, including inputs with more atoms than exist in the observable universe. For interview purposes, say "essentially O(1) amortized." If an interviewer pushes on the exact bound, say "O(α(n)), inverse Ackermann, which is practically constant." Most candidates have never heard the name. Saying it earns points.

See Union-Find Algorithm for the full treatment with path compression and union by rank.

Your Unbalanced BST Is a Linked List

This catches people who learned BST complexity from a textbook diagram showing a perfectly symmetric tree. That tree is a lie. The O(log n) average assumes the tree is reasonably balanced, and an unbalanced BST makes no such guarantee.

Insert [1, 2, 3, 4, 5] into an unbalanced BST in sorted order and you get a linked list. Every node has exactly one right child. Search, insert, and delete all degrade to O(n). When you reach for a BST in an interview, say "BST gives O(log n) average but degrades to O(n) without balancing, so I'd use AVL or Red-Black here." That's the answer. Don't leave it implied.

See Binary Search Tree for the full breakdown.

Trie Lookup Gets Faster As You Add More Keys

The complexity of a trie operation depends on k, the key length, not n, the number of keys. For most practical uses, like English words capped at 20-30 characters, k is bounded by a constant.

That makes trie insert and search effectively O(1) relative to the number of stored keys, which is counterintuitive. A hash map also gives O(1) average lookup, but a trie gives you prefix operations for free: autocomplete, prefix search, longest prefix match. If your problem involves prefixes or variable-length strings, a trie often beats a hash map. Just be ready to talk about the space tradeoff. O(n * ALPHABET_SIZE * k) can balloon fast if your alphabet is large.

"O(n) Space" for Segment Trees Actually Means 4n

The theoretical space for a segment tree is O(n). Fine. But if you implement it as a flat array (the standard interview approach), you need to allocate 4n nodes, not 2n.

A perfectly balanced tree on n leaves fits in 2n slots, but n is rarely a power of two. The next power of two above n can be up to 2n. Double it for the full array, and you're at 4n. In code, 4 * n is the safe allocation. Allocate 2 * n and n is, say, 5. You get an index-out-of-bounds bug under interview pressure. Finding that bug on the board is a rough way to spend your last ten minutes.


Adjacency List vs. Matrix: Almost Always List

Most graph problems use an adjacency list. BFS and DFS run in O(V + E), space is O(V + E). For a sparse graph (which most real-world graphs are), this is far better than the O(V²) space and O(V²) traversal cost of an adjacency matrix.

Use an adjacency matrix when you need O(1) edge existence checks and the graph is dense. Floyd-Warshall requires a matrix. For anything else, default to the list and know why you made that call.


Amortized vs. Guaranteed: They Are Not the Same

Three entries in the table look similar but fail differently. Mixing them up under follow-up questioning is the expensive mistake.

Dynamic array insert at end is O(1) amortized. Occasionally the array doubles and you pay O(n) for that single insert. Over n inserts the total work is O(n), so the average per insert is O(1). No single insert is guaranteed O(1).

Hash map insert and search are O(1) average, not amortized. Average means "over random inputs." Amortized means "over a sequence of operations." An adversarial input can force O(n) per operation on a naive hash map. Those are different failure modes and interviewers know the difference.

Union-Find Find and Union are O(α(n)) amortized. The cost is spread over a sequence of operations. Any single Find call is not guaranteed O(α(n)).

For a full breakdown of what amortized actually means and how to derive it, see Amortized Time Complexity.


Practice This Out Loud Before Your Interview

Knowing these facts on paper and knowing them under pressure are different skills. The complexity question shows up as a follow-up: "What's the time complexity of your solution?" or "Can you do better?" You need to answer immediately, without the pause.

SpaceComplexity runs you through full mock interviews with follow-up questions on complexity, so you practice explaining Big-O tradeoffs out loud, under time pressure, to a simulated interviewer. The gap between reading a cheat sheet and explaining your reasoning clearly on the spot is larger than most people expect.


Further Reading