Top 12 Hash Map Interview Problems: Every Pattern, One List

May 29, 202612 min read
dsaalgorithmsinterview-prepleetcode
Top 12 Hash Map Interview Problems: Every Pattern, One List
TL;DR
  • Complement lookup stores target - x instead of x, turning pairwise O(n²) comparison into O(n); Two Sum and 4Sum II both follow this shape
  • Frequency counting has two modes: absolute count (Group Anagrams, Top K Frequent) and relative diff (Valid Anagram), each requiring a different solution structure
  • Bijection check requires two maps, not one; a single s→t map misses many-to-one failures in t (Isomorphic Strings)
  • Sliding window + hash map uses the map to decide when to shrink; store the last-seen index, not just presence, or the guard last_seen[c] >= left breaks on cases like "abba"
  • Prefix sum + hash map handles non-monotonic subarrays where sliding window fails; initialize {0: 1} to catch subarrays starting at index 0 (Subarray Sum Equals K, Contiguous Array)
  • Streak detection skips O(n²) counting by only starting at streak minimums; the n-1 not in set guard is what makes Longest Consecutive Sequence O(n)

You can get pretty far knowing that hash maps are O(1) lookup. You cannot get a job offer that way.

The data structure is simple. The interviewer is not testing whether you know what a hash map is. They're testing whether you recognize when a problem is secretly asking for one, and which of the six patterns to reach for. Miss that recognition and your two-loop brute force compiles fine and does absolutely nothing for your career.

These 12 problems exhaust the six major patterns. Work through them in order and you'll start seeing the shape of a hash map question before you've finished reading the problem statement.

Hash Map Interview Problems: Six Patterns

ProblemDifficultyPattern
Two SumEasyComplement lookup
4Sum IIMediumTwo-array complement
Valid AnagramEasyFrequency counting
Majority ElementEasyFrequency counting
Group AnagramsMediumCanonical key bucketing
Top K Frequent ElementsMediumFrequency + bucket sort
Isomorphic StringsEasyBijection check
Longest Substring Without Repeating CharactersMediumSliding window + last-seen index
Minimum Window SubstringHardTwo-frequency-map sliding window
Subarray Sum Equals KMediumPrefix sum + complement count
Contiguous ArrayMediumBalance transform (0 to -1)
Longest Consecutive SequenceMediumSet membership for O(n) streaks

Don't Ask Everyone. Ask the One Who Knows.

Instead of checking every pair, precompute one side. By the time you reach the second element, you already know whether the first exists.

1. Two Sum (LC 1) | Easy

Your first instinct is two loops. No shame in it. You scan every pair, check if they sum to target, done. O(n²) and correct.

Then you notice the waste. For each element x, you don't need to scan everyone. You need exactly one number: target - x. If you already stored that number somewhere, the search is O(1) instead of O(n). The loop drops.

Store what you're looking for, not what you have. Check the map first. If the complement is there, return. If not, store x as the thing some future element will be looking for.

def twoSum(nums: list[int], target: int) -> list[int]: seen = {} for i, x in enumerate(nums): complement = target - x if complement in seen: return [seen[complement], i] seen[x] = i

This is the canonical hash map template. A nested loop does pairwise comparisons. A hash map lets you ask "does my counterpart exist?" in O(1). When brute force compares every pair, ask what you'd need to precompute to drop one of those loops.

2. 4Sum II (LC 454) | Medium

Four arrays. Find all tuples (a, b, c, d) where a + b + c + d = 0. Brute force is O(n⁴). Nobody is writing that in an interview and living it down.

Split the arrays in half, precompute one half's sums, query the other. Build a frequency map of every a + b sum from arrays one and two. Then for every c + d from arrays three and four, add freq[-(c + d)] to the count.

def fourSumCount(nums1, nums2, nums3, nums4): ab_sums = {} for a in nums1: for b in nums2: ab_sums[a + b] = ab_sums.get(a + b, 0) + 1 count = 0 for c in nums3: for d in nums4: count += ab_sums.get(-(c + d), 0) return count

O(n²). The same split-precompute-query move applies whenever you need to combine elements from multiple sources.


Frequency Counting Has Two Modes

Absolute count (how many times does X appear?) versus relative difference (is the count of X in A equal to X in B?). The mode determines the solution shape.

3. Valid Anagram (LC 242) | Easy

One map, one pass to increment, one pass to decrement. Fail as soon as any count goes negative.

def isAnagram(s: str, t: str) -> bool: if len(s) != len(t): return False freq = {} for c in s: freq[c] = freq.get(c, 0) + 1 for c in t: freq[c] = freq.get(c, 0) - 1 if freq[c] < 0: return False return True

Early exit on the decrement pass is the key. You don't need to finish scanning t once you know it has a character that s doesn't have enough of. Stop there.

4. Majority Element (LC 169) | Easy

The frequency map solution is O(n) time and O(n) space. Correct, and you should feel good about it.

The follow-up is the real interview. Know the hash map solution cold, then know that Boyer-Moore solves it in O(1) space. Interviewers almost always ask for the space optimization. Prepare only one and you'll get caught staring at the screen.

See the Boyer-Moore voting algorithm for the follow-up pattern.

5. Group Anagrams (LC 49) | Medium

Grouping requires a canonical key: all anagrams must produce the same one. Sorting the characters works (O(k log k) per string). A 26-element frequency tuple is faster (O(k) per string).

def groupAnagrams(strs: list[str]) -> list[list[str]]: groups = {} for s in strs: key = tuple(sorted(s)) groups.setdefault(key, []).append(s) return list(groups.values())

You're not keying by the string itself. You're keying by a normalized form that's identical for every member of the group. "eat", "tea", and "ate" are different strings but the same sorted tuple. Whenever you need to bucket objects by equivalence class, ask what canonical key identifies that class.

6. Top K Frequent Elements (LC 347) | Medium

Naive: count frequencies, sort by frequency, take the top k. That's O(n log n) and it's what most people write first.

The O(n) approach uses bucket sort. Create n+1 buckets where bucket i holds all elements with frequency i. Scan from the back.

def topKFrequent(nums: list[int], k: int) -> list[int]: freq = {} for n in nums: freq[n] = freq.get(n, 0) + 1 buckets = [[] for _ in range(len(nums) + 1)] for num, count in freq.items(): buckets[count].append(num) result = [] for i in range(len(buckets) - 1, 0, -1): result.extend(buckets[i]) if len(result) >= k: return result[:k]

Frequency ranges from 1 to n, which makes it a valid array index. That bounded range is what makes bucket sort applicable. Whenever your sort key has a known bounded integer range, bucket sort beats comparison-based sorting.

For the heap-based approach, see Top-K heap pattern.


One Map Leaves a Backdoor Open

7. Isomorphic Strings (LC 205) | Easy

"Cat" maps to "dog" (c→d, a→o, t→g). Valid. "Foo" cannot map to "bar" because both o's would need different targets.

You build one map, s→t, and it works. For a minute. Then you hit a test case where two different characters in t map to the same character in s, and the single map doesn't catch it.

def isIsomorphic(s: str, t: str) -> bool: s_to_t, t_to_s = {}, {} for cs, ct in zip(s, t): if s_to_t.get(cs, ct) != ct or t_to_s.get(ct, cs) != cs: return False s_to_t[cs] = ct t_to_s[ct] = cs return True

A single s→t map catches many-to-one failures in s but misses many-to-one failures in t. Checking both directions costs nothing extra and catches everything.

The Godfather - "Just when I thought I was out, they pull me back in" You and one-map isomorphism checking, every single time.

This bijection pattern recurs in Word Pattern (LC 290) and encode/decode problems. Every time. Both directions.


The Map Tells You When to Shrink

The hash map tracks what's inside the window. The window moves based on what the map says.

8. Longest Substring Without Repeating Characters (LC 3) | Medium

Store the last-seen index of each character, not just whether you've seen it. That distinction matters.

def lengthOfLongestSubstring(s: str) -> int: last_seen = {} left = 0 best = 0 for right, c in enumerate(s): if c in last_seen and last_seen[c] >= left: left = last_seen[c] + 1 last_seen[c] = right best = max(best, right - left + 1) return best

The guard last_seen[c] >= left is non-negotiable. A character that appeared before the window's left boundary is irrelevant. Without it, "abba" returns 1 instead of 2 when the second a forces left back past b. Store the index. Always check it's inside the window.

9. Minimum Window Substring (LC 76) | Hard

Two frequency maps: need for target counts, window for the current window. A formed counter tracks how many distinct characters have hit their required frequency.

def minWindow(s: str, t: str) -> str: need = {} for c in t: need[c] = need.get(c, 0) + 1 required = len(need) left = formed = 0 window = {} best = float("inf"), 0, 0 for right, c in enumerate(s): window[c] = window.get(c, 0) + 1 if c in need and window[c] == need[c]: formed += 1 while formed == required: if right - left + 1 < best[0]: best = (right - left + 1, left, right) lc = s[left] window[lc] -= 1 if lc in need and window[lc] < need[lc]: formed -= 1 left += 1 return "" if best[0] == float("inf") else s[best[1]:best[2] + 1]

The naive approach compares two frequency maps on every step. That's O(|t|) per right-pointer move. Fine for a first draft, embarrassing in an interview once the interviewer asks about complexity.

The formed counter is the whole trick. It increments only when a character exactly hits its target, which keeps the constraint check O(1). This "completion counter" pattern shows up in any problem where you need to track whether a set of constraints are all simultaneously satisfied.


Your Past Self Knew the Answer

Running prefix sums let you deduce subarray properties. The map tells you whether a particular running total appeared earlier.

10. Subarray Sum Equals K (LC 560) | Medium

Sliding window fails here because negative numbers make the sum non-monotonic. Prefix sums fix it.

If prefix[j] - prefix[i] = k, the subarray from i+1 to j sums to k. Rearranged: you need prefix[j] - k to have appeared before. Check the map, then store.

def subarraySum(nums: list[int], k: int) -> int: count = 0 prefix = 0 freq = {0: 1} for n in nums: prefix += n count += freq.get(prefix - k, 0) freq[prefix] = freq.get(prefix, 0) + 1 return count

Initialize with {0: 1}. This handles subarrays starting at index 0, where prefix[j] - k = 0 and the "earlier prefix sum" is the empty prefix. Forgetting the initialization is the most common bug in this family. Not forgetting it is the difference between "mostly working" and "actually correct."

See prefix sum patterns for the full family.

11. Contiguous Array (LC 525) | Medium

Find the longest subarray with equal 0s and 1s. Replace every 0 with -1. Now you need the longest subarray with sum 0. Same prefix sum pattern.

def findMaxLength(nums: list[int]) -> int: first_seen = {0: -1} balance = 0 best = 0 for i, n in enumerate(nums): balance += 1 if n == 1 else -1 if balance in first_seen: best = max(best, i - first_seen[balance]) else: first_seen[balance] = i return best

Store only the first occurrence of each balance value. When you see the same balance again, the subarray between those two indices has net change zero, meaning equal 1s and 0s. Maximum length means earliest first occurrence. This is why you store and skip on revisit, rather than updating.


Start the Streak Where It Actually Starts

12. Longest Consecutive Sequence (LC 128) | Medium

Sorting costs O(n log n). The hash set approach is O(n) by refusing to repeat work someone else will do.

Only start counting a streak when n - 1 is not in the set. Each streak gets counted exactly once, from its minimum element.

def longestConsecutive(nums: list[int]) -> int: num_set = set(nums) best = 0 for n in num_set: if n - 1 not in num_set: length = 1 while n + length in num_set: length += 1 best = max(best, length) return best

The if n - 1 not in num_set guard is what makes this O(n). Without it, every element triggers a full count. You're back at O(n²) with a hash set wrapper. With it, the inner while loop only runs for streak minimums. Each element gets touched at most twice across all iterations: once to skip, once to count.


Pattern recognition is what gets harder under pressure. You can have all six of these memorized and still freeze when a problem is framed sideways. The variable name is different, the array is sorted, the input is a string instead of integers. The pattern is the same. Your brain, three minutes in, is not.

SpaceComplexity runs rubric-graded voice interviews across all six of these hash map families, with feedback on both your solution and how you explained it. Pattern recognition under pressure is a skill. It responds to practice.


Further Reading

External Resources