C++ Built-In Data Structures for Coding Interviews: The STL Reference

May 29, 202612 min read
dsaalgorithmsinterview-prepdata-structures
C++ Built-In Data Structures for Coding Interviews: The STL Reference
TL;DR
  • vector<T> is your default: O(1) amortized push_back, O(n) mid-insert; use reserve(n) when you know the final size.
  • priority_queue is a max-heap by default: write greater<> every time you need a min-heap for Dijkstra or top-K-smallest problems.
  • map and set give O(log n) sorted operations; always use the member lower_bound/upper_bound, not the free-function versions from <algorithm>.
  • unordered_map[] creates an entry when the key is missing: use find() or count() to check existence without mutating the map.
  • vector<bool> is not a real vector: it packs bits and breaks reference semantics; use vector<int> or vector<char> instead.
  • Iterator invalidation after push_back is silent: a reallocation invalidates all iterators and pointers into the vector.
  • multiset::erase(val) removes every matching element; pass an iterator from find() to delete exactly one copy.

C++ gives you more containers than you'll ever use in an interview. That's the problem. Most engineers default to vector for everything, reach for map when they want fast lookup, and then spend the next 90 seconds confidently explaining O(1) lookup while the interviewer silently watches an O(log n) operation happen right there on the screen. The wrong container doesn't just add log factors. It kills your complexity argument before you finish making it.

This guide covers every container you'll actually reach for in a 45-minute session: what it does, what it costs, and the quiet ways it bites you. No ceremony, no history lesson about Bjarne Stroustrup. Just the STL.


One Rule Covers 80% of Problems

If you need ordered data or range queries, use set or map. If you need fast lookup without ordering, use the unordered variants. For everything else, start with vector.

That covers most interviews. The rest of this guide is for the cases where you need to actually think.


Sequences: vector First, Then Justify Switching

vector<T> Is Your Default

vector is a dynamic array: contiguous memory, O(1) random access, O(1) amortized push_back. It's cache-friendly and fast in practice. You pay O(n) for insert or erase anywhere except the back. It's boring and it's correct. That's the whole pitch.

vector<int> v = {1, 2, 3}; v.push_back(4); // O(1) amortized v.insert(v.begin(), 0); // O(n), shifts everything v.pop_back(); // O(1) int x = v[2]; // O(1)

Reserve capacity upfront with v.reserve(n) when you know the final size. It prevents repeated reallocations and keeps your constant factor honest.

string Appends in O(1), Concatenates in O(n²)

Use += to append, not +. String concatenation with + in a loop creates a new allocation every iteration, turning an O(n) task into O(n²). This is one of those bugs that looks right and runs wrong, and the test cases are usually too small to catch it. See why string concatenation in a loop is quadratic.

string s = "hello"; s += " world"; // O(n) amortized, good // s = s + " world"; // O(n²) if done in a loop, avoid

Useful: s.substr(pos, len), s.find("sub"), stoi(s), to_string(n).

Use deque<T> When You Need O(1) on Both Ends

deque gives O(1) push and pop at both front and back. The tradeoff: it's not contiguous in memory, so random access carries a higher constant than vector. Reach for it with sliding-window patterns that need fast front removal, or when implementing a monotonic deque.

deque<int> dq; dq.push_back(1); dq.push_front(0); dq.pop_front(); // O(1)

array<T, N> Lives on the Stack

Stack-allocated, no heap overhead, knows its size at compile time. Use it for lookup tables, fixed-dimension grids, or anywhere you'd write int arr[N] but want .size() and iterator support.

array<int, 4> dirs = {0, 1, -1, 0};

Container Adaptors: Less Is More

These three adapt other containers and restrict the interface down to the minimum you actually need. They don't support iteration. They exist to make intent explicit, not to unlock new complexity. When you reach for stack, your interviewer knows exactly what you're building. That communication has value.

stack<T> and queue<T>

Both wrap deque by default. stack is LIFO (push, pop, top). queue is FIFO (push, pop, front, back).

stack<int> stk; stk.push(1); int top = stk.top(); stk.pop(); // always check !stk.empty() first queue<int> q; q.push(1); int front = q.front(); q.pop();

For a broader comparison, see priority queue vs queue.

priority_queue<T> Is a Max-Heap, and It Will Ruin Your Dijkstra

This gets everyone. At least once. priority_queue gives you the largest element first. You implement Dijkstra, you run it, it returns wrong answers, you stare at your relaxation logic for ten minutes. The bug is one line above where you're looking. For Dijkstra or any top-k-smallest problem, you need a min-heap. The syntax is ugly but fixed:

// Max-heap (default) priority_queue<int> maxpq; // Min-heap priority_queue<int, vector<int>, greater<int>> minpq; // Min-heap of pairs: (distance, node) priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;

Never call top() or pop() on an empty queue. Guard with if (!pq.empty()) or you get undefined behavior. In an interview, undefined behavior usually means a segfault at the worst possible moment. For the underlying mechanics, see heap data structure.


Ordered Containers: You're Paying O(log n) for Sorted Order

Both set and map are backed by red-black trees. Every operation is O(log n). They keep keys sorted, which is the only reason to use them. If you don't need sorted order, you're overpaying by a log factor and your interviewer will notice. For the tree mechanics, see AVL vs red-black tree.

set<T> and multiset<T>

set stores unique sorted elements. multiset allows duplicates. Both support O(log n) insert, erase, and find. Use the member lower_bound and upper_bound, not the free-function versions. The free functions from <algorithm> degrade to O(n) on tree iterators.

set<int> s = {3, 1, 4, 1, 5}; // {1, 3, 4, 5}, deduped s.insert(2); s.erase(3); auto it = s.lower_bound(4); // iterator to first element >= 4

For multiset, erasing by value removes ALL matching elements. To remove exactly one copy, pass an iterator: ms.erase(ms.find(val)).

map<K, V> and multimap<K, V>

map is an ordered key-value store. Iteration goes in sorted key order, which is useful when you need both fast lookup and ordered traversal.

map<string, int> freq; freq["apple"]++; // creates entry if missing, initializes to 0 freq.count("banana"); // 1 if key exists, 0 otherwise freq.find("apple"); // iterator, or freq.end() if missing for (auto& [k, v] : freq) { /* sorted by k */ }

Unordered Containers: Fast on Average, Embarrassing in the Worst Case

unordered_map and unordered_set use hash tables. Average O(1) for insert, find, and erase. Worst case is O(n) on adversarial inputs. In interview settings, average case is what matters, and these are almost always the right call when you don't need ordering. The "adversarial inputs" caveat is real in competitive programming but unlikely in a 45-minute interview.

unordered_map<int, int> memo; memo[5] = 10; if (memo.count(5)) { /* key exists */ } memo.erase(5);

Accessing a missing key with operator[] creates it with a zero-initialized value. Use count() or find() to check existence without mutating the map. For the full analysis, see hash map time complexity.


Algorithms You'll Reach For Every Session (Memorize These)

#include <algorithm> #include <numeric> sort(v.begin(), v.end()); // O(n log n) introsort sort(v.begin(), v.end(), greater<int>()); // descending stable_sort(v.begin(), v.end()); // preserves equal order lower_bound(v.begin(), v.end(), x); // first >= x upper_bound(v.begin(), v.end(), x); // first > x max_element(v.begin(), v.end()); // iterator to max accumulate(v.begin(), v.end(), 0); // sum fill(v.begin(), v.end(), 0); // zero out reverse(v.begin(), v.end());

lower_bound and upper_bound are O(log n) only on random-access iterators like vector. On set or map, call the member function versions. The free-function version degrades to O(n) on tree iterators, which defeats the whole point.

pair<> and Structured Bindings Show Up Everywhere

Nearly every code example in this guide uses pair<int,int>. It's the standard way to group two values in priority queues, adjacency lists, and return types when a full struct would be overkill.

pair<int,int> p = {1, 2}; auto [a, b] = p; // C++17 structured binding

Use structured bindings (auto& [k, v]) when iterating over maps. They work in range-for loops and keep the code readable. They also signal C++17 fluency to your interviewer, which costs you nothing and might help slightly.


Five Traps That Bite Candidates Who Think They Know C++

vector<bool> Is Not a Real Vector

vector<bool> is a space-optimization specialization that packs bits. Someone in the C++98 committee thought this was clever. It is not. It doesn't return bool& references, so taking a reference or pointer to an element silently breaks in ways that are genuinely confusing to debug. Use vector<int> or vector<char> instead.

vector<bool> vb = {true, false}; // auto& ref = vb[0]; // this doesn't give a real reference vector<int> vi = {1, 0}; // use this instead

map[] Creates an Entry (Yes, Even When You're Just Checking)

Calling map[key] when key doesn't exist inserts it with a zero-initialized value. Then your size is wrong, your iteration has phantom keys, and you spend five minutes certain the algorithm is broken when the data structure is lying to you. Use map.find(key) != map.end() or map.count(key) to check first. This one bites experienced C++ developers just as often as beginners.

Iterator Invalidation After push_back

A vector reallocates when it exceeds capacity, invalidating all iterators, pointers, and references. Storing a vector iterator then pushing back is undefined behavior. Sometimes it works anyway, which is even worse, because it teaches you the wrong lesson.

vector<int> v = {1, 2, 3}; auto it = v.begin(); v.push_back(4); // may reallocate, 'it' is now invalid

set, map, list, and the unordered containers don't have this problem on insert. Their iterators to existing elements stay valid.

Erasing from unordered_map While Iterating

Erasing a key during range-based iteration invalidates the current iterator. Undefined behavior again. This pattern shows up when you're cleaning up a frequency map after processing, and the fix is always the same: collect first, erase after.

// Wrong for (auto& [k, v] : m) { if (v == 0) m.erase(k); // undefined behavior } // Right vector<int> to_erase; for (auto& [k, v] : m) { if (v == 0) to_erase.push_back(k); } for (int k : to_erase) m.erase(k);

multiset::erase(val) Removes All Copies, Not One

If you're simulating a sliding-window sorted bag and you call ms.erase(5) when you want to drop one 5, every 5 in the multiset disappears. Quietly. Your window is now wrong. Pass an iterator to erase exactly one copy: ms.erase(ms.find(val)). This is one of those mistakes that corrupts your output in a way that looks like a logic bug, not a data structure bug.


Quick Reference: C++ Data Structures for Coding Interviews

ContainerInsertFindEraseOrdered?Notes
vector<T>O(1) amortized backO(n)O(n)norandom access
deque<T>O(1) front/backO(n)O(n) front/backnono contiguous memory
stack<T>O(1)n/aO(1)noLIFO only
queue<T>O(1)n/aO(1)noFIFO only
priority_queue<T>O(log n)n/aO(log n)partialmax-heap default
set<T>O(log n)O(log n)O(log n)yesno duplicates
multiset<T>O(log n)O(log n)O(log n)yesduplicates OK
map<K,V>O(log n)O(log n)O(log n)yessorted by key
unordered_map<K,V>O(1) avgO(1) avgO(1) avgnoO(n) worst case
unordered_set<T>O(1) avgO(1) avgO(1) avgnoO(n) worst case

What a Real Problem Looks Like (So You Stop Dreading It)

Dijkstra's needs a min-heap for the frontier and fast distance lookup. Here's the exact pattern. Read it once, recognize it in interviews, type it from memory:

#include <vector> #include <queue> using namespace std; vector<int> dijkstra(int src, int n, vector<vector<pair<int,int>>>& adj) { vector<int> dist(n, INT_MAX); priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq; dist[src] = 0; pq.push({0, src}); while (!pq.empty()) { auto [d, u] = pq.top(); pq.pop(); if (d > dist[u]) continue; // stale entry for (auto [w, v] : adj[u]) { if (dist[u] + w < dist[v]) { dist[v] = dist[u] + w; pq.push({dist[v], v}); } } } return dist; }

Min-heap via greater<>, stale-entry skip, INT_MAX as infinity. These three details together separate a working C++ Dijkstra from one that silently produces wrong answers. One more: never initialize distances to plain INT_MAX and then add to them. INT_MAX + w overflows. Use 1e9 or INT_MAX / 2 when edges have positive weight. You will forget this under pressure. Write it in the margin of your brain now.

Knowing the right container is half the job. The other half is narrating the choice clearly while the clock is running. That second part is harder than it sounds, and no amount of LeetCode prepares you for it. SpaceComplexity runs voice-based mock interviews with rubric-scored feedback on exactly this: not just whether your code is correct, but whether you said "I'll use unordered_map here for O(1) lookup since we don't need ordering" before you typed the first line.


The Short Version You'll Actually Remember

  • vector is the default. Switch away only when you have a concrete reason. "It felt right" is not a reason.
  • priority_queue is a max-heap. Write greater<> for min-heap every time. Tattoo it somewhere if you have to.
  • map[] and unordered_map[] create entries on access. Use find() or count() to check existence.
  • Use the member lower_bound/upper_bound on set/map, not the free function versions from <algorithm>.
  • vector<bool> is a packed-bit specialization, not a real vector. Use vector<int> instead and move on.
  • multiset::erase(val) removes all matching elements. Pass an iterator to erase exactly one copy.
  • Iterator invalidation after push_back on vector is real and silent. Don't store iterators across mutations.

Further Reading