Ruby Built-In Data Structures for Coding Interviews

May 29, 202612 min read
dsaalgorithmsinterview-prepruby
TL;DR
  • array.shift is O(n): using it inside a BFS loop turns the traversal O(n²); use a two-stack queue for any real-sized graph
  • Hash.new([]) is a shared-state trap: every missing key returns the same array object; use Hash.new { |h, k| h[k] = [] } for mutable defaults
  • SortedSet was removed in Ruby 3.0: replace it with a sorted array plus bsearch_index for O(log n) search
  • Ruby has no built-in heap: implement MinHeap yourself and have the ~30-line class memorized before any interview
  • Set#include? is O(1) vs Array#include?'s O(n); swap any visited-nodes array in a graph traversal for a Set
  • tally (Ruby 2.7+) builds a frequency map in one line: replaces the Hash.new(0) counting loop entirely

Ruby covers most coding interview patterns out of the box. The array and hash do most of the heavy lifting. There are a few gaps you need to fill yourself, one O(n) trap that absolutely will wreck a BFS solution if you forget about it, and one classic structure that was quietly deleted in Ruby 3.0 with basically no fanfare. Engineers who hadn't read the release notes found out the hard way.

This is the interview-focused cheat sheet: what exists, the real time complexity, the gotchas that won't show up until your interviewer is watching, and what to build when the built-ins fall short. If you're still deciding whether Ruby is worth using for interviews at all, the language comparison has the full breakdown.


What Ruby Actually Gives You (and What It Doesn't)

StructureRuby ClassRequire?Notes
Dynamic arrayArrayNoO(1) push/pop, O(n) shift/unshift
Hash mapHashNoInsertion-ordered since 1.9, O(1) lookup
Hash setSetRuby < 3.2 onlyO(1) membership, ships in stdlib
StackArrayNopush/pop
QueueArray or manualNoshift is O(n), see workaround below
Priority queueManualNoNo built-in heap
Sorted mapNone (removed)N/ASortedSet gone in 3.0

Array: The Workhorse With One Trap That Will Haunt You

Ruby's Array is the most flexible structure in the language, but it has an O(n) trap that will silently turn a perfectly good BFS into O(n²) while you watch.

a = [] a.push(1) # O(1) amortized, append to end a.pop # O(1), remove from end a.unshift(0) # O(n), prepend -- shifts every element right a.shift # O(n), remove from front -- shifts every element left a[3] # O(1), indexed access a.include?(5) # O(n), linear scan

push and pop are fast. shift and unshift are not. Ruby's Array has no ring buffer, so every shift physically moves every remaining element left by one slot. If you implement BFS with queue.shift, every dequeue is O(n). On a 10,000-node graph, that makes the whole traversal O(n²). Your code will look totally correct, run fine on small test cases, and then quietly fall apart the moment you hit anything big. This is the trap.

Array methods worth knowing cold

a.sort # O(n log n), stable in MRI Ruby a.sort_by { |x| x.length } # sort by computed key a.uniq # O(n), remove duplicates (preserves order) a.flatten # O(n), recursively flatten nested arrays a.flatten(1) # flatten one level only a.zip([1, 2, 3]) # pair elements from two arrays a.each_slice(3).to_a # split into chunks of 3 a.sum # O(n) sum, takes a block too a.tally # Ruby 2.7+, frequency count → Hash

tally deserves a moment. It is the fastest path to a frequency map in the language:

["a", "b", "a", "c", "b", "a"].tally # => {"a"=>3, "b"=>2, "c"=>1}

That one line replaces the classic hash = Hash.new(0); arr.each { |x| hash[x] += 1 } ceremony. Use it.


Hash: Fast, Ordered, and Has a Default-Value Bug That Will Confuse You at the Worst Possible Moment

Ruby's Hash has been insertion-ordered since version 1.9, which matters whenever a problem asks you to return elements in the order they were first seen. The underlying mechanics are the same as any open-addressing hash table: O(1) average, O(n) worst case if someone crafts adversarial keys (not your problem in an interview).

h = {} h[:a] = 1 # O(1) write h[:a] # O(1) read, returns nil if missing h.key?(:a) # O(1) membership check h.delete(:a) # O(1) delete h.each { |k, v| } # O(n), iteration in insertion order

The default value gotcha that everyone hits once

This is the most common Ruby interview bug, and it looks completely innocent until you inspect the output:

# WRONG: all keys share the same Array object bad = Hash.new([]) bad[:x] << 1 bad[:y] << 2 bad[:x] # => [1, 2], not [1] # RIGHT: each key gets its own Array good = Hash.new { |h, k| h[k] = [] } good[:x] << 1 good[:y] << 2 good[:x] # => [1]

When you pass a literal object to Hash.new, every missing key returns the exact same object. Mutations to one key mutate all of them. For Hash.new(0) this is fine because integers are immutable. For arrays or strings, it is a silent shared-state bug that produces wrong answers with no error. Use a default proc whenever the default value is mutable. This one sentence is worth memorizing.

Other Hash methods you'll actually reach for

h.merge(other) # returns new hash, other wins on conflicts h.merge!(other) # mutates h in place (also: update) h.fetch(:key, default) # raises KeyError if missing and no default given h.dig(:a, :b, :c) # safe nested access, returns nil instead of raising h.group_by { |k, v| v } # inherited from Enumerable h.min_by { |k, v| v } # returns [key, value] pair with min value h.transform_values { |v| v * 2 } # new hash with values mapped h.select { |k, v| v > 0 } # filter to new hash

fetch is a great production habit, but in interview code [] with a sensible default is usually cleaner and faster to write.


Set: O(1) Membership, One Require Line, Zero Excuses

For Ruby versions before 3.2, Set ships in the standard library but isn't loaded by default. One line fixes that. In 3.2 and later it just works.

require "set" # skip in Ruby >= 3.2 s = Set.new([1, 2, 3]) s.add(4) # O(1) s.include?(4) # O(1), this is the whole point s.delete(2) # O(1) s.size # O(1)

The set operations do exactly what you'd expect from a discrete math class:

a = Set.new([1, 2, 3]) b = Set.new([2, 3, 4]) a | b # union: {1, 2, 3, 4} a & b # intersection: {2, 3} a - b # difference: {1} a ^ b # symmetric difference: {1, 4}

Any time you check membership more than once, use Set over Array. array.include?(x) is O(n) because it scans the whole array. set.include?(x) is O(1). If your visited-nodes collection is an Array in a graph traversal, you are making the traversal slower than it has to be. Swap it.

When you need to store a value alongside each key (count, index, distance, anything), reach for Hash instead. A Set is just a Hash that doesn't bother keeping the values.


Stack, Queue, and Heap: One Built-in, Two Workarounds

Stack

Array has you covered. push appends, pop removes from the same end, both O(1). Nothing to write.

stack = [] stack.push(1) stack.push(2) stack.pop # => 2

Queue

Here is where it gets annoying. array.shift is O(n). Every call physically moves every remaining element left. For BFS on any real-sized graph, that makes the traversal O(n²). The code looks fine, the algorithm is correct in theory, and the complexity is terrible.

The two-stack approach gives you amortized O(1) per dequeue without any fancy data structures:

class Queue def initialize @inbox = [] @outbox = [] end def enqueue(val) @inbox.push(val) end def dequeue @outbox = @inbox.reverse.tap { @inbox.clear } if @outbox.empty? @outbox.pop end def empty? @inbox.empty? && @outbox.empty? end end

Each element moves from inbox to outbox exactly once, regardless of how many times you call dequeue. Amortized O(1).

For interviews where the input is small and you just need something working, array.shift is acceptable. Know the tradeoff and mention it to your interviewer so it doesn't look like you missed it.

Priority Queue (Heap): The Gap That Actually Hurts

Ruby has no built-in heap. No PriorityQueue, no MinHeap, no heapq equivalent, nothing. Java has PriorityQueue. Python has heapq. C++ has priority_queue. Ruby shrugs and hands you an empty Array. This is the single biggest structural disadvantage of using Ruby in interviews, because heap-based solutions come up constantly. The heap data structure guide explains why the property works if you want to understand it, not just use it.

Here is the minimum viable min-heap you need to have memorized before any interview:

class MinHeap def initialize @data = [] end def push(val) @data << val bubble_up(@data.size - 1) end def pop return nil if @data.empty? swap(0, @data.size - 1) min = @data.pop bubble_down(0) min end def peek = @data[0] def size = @data.size def empty? = @data.empty? private def bubble_up(i) parent = (i - 1) / 2 return if i.zero? || @data[parent] <= @data[i] swap(i, parent) bubble_up(parent) end def bubble_down(i) left = 2 * i + 1 right = 2 * i + 2 smallest = i smallest = left if left < @data.size && @data[left] < @data[smallest] smallest = right if right < @data.size && @data[right] < @data[smallest] return if smallest == i swap(i, smallest) bubble_down(smallest) end def swap(i, j) @data[i], @data[j] = @data[j], @data[i] end end

Both push and pop are O(log n). Write this out a few times until you can produce it from memory in under five minutes. If the problem says "K largest", "Kth smallest", or "merge K sorted lists", you are reaching for this.


SortedSet Is Gone and No One Warned You

SortedSet was removed in Ruby 3.0. Not deprecated with a warning. Not moved to a gem. Just gone. Try to use it and you get a runtime NameError pointing you at a third-party gem you obviously cannot install mid-interview. If you prepped on Ruby 2.7 and your interview environment runs 3.0 or later, you find out at the worst possible moment.

Here is what you do instead:

# Sorted iteration over a hash: just sort it sorted = hash.sort_by { |k, v| k } # Binary search on a sorted array (Ruby 2.0+) idx = arr.bsearch_index { |x| x >= target } # Keep an array sorted after inserts sorted_arr.insert(sorted_arr.bsearch_index { |x| x >= val } || -1, val)

bsearch and bsearch_index run in O(log n), which is what you wanted from SortedSet for lookups. The insert is still O(n) because of the element shift that follows it. For most interview problem sizes this holds up fine. If you need repeated O(log n) inserts and O(log n) rank queries, this is genuinely where Ruby is weaker than Java's TreeMap or C++'s std::map. That is just the reality. Know it and plan around it.


Enumerable Is Ruby's Secret Weapon

Every Array, Hash, and Set includes Enumerable. That module contains a lot of the methods other language users write by hand. These come up constantly:

# Frequency count (Ruby 2.7+) arr.tally # => {"a"=>3, "b"=>2} # Group by computed key arr.group_by { |x| x % 2 == 0 ? :even : :odd } # => {:odd=>[1,3], :even=>[2,4]} # Find element by computed property arr.min_by { |x| x.length } arr.max_by { |x| x.length } # Build a hash while iterating arr.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 } # Flatten one level after map arr.flat_map { |x| [x, x * 2] } # Accumulate arr.reduce(0) { |sum, x| sum + x } arr.reduce(:+) # same thing, shorter

each_with_object is worth memorizing specifically. It is the clean pattern for building any data structure from an enumerable without fighting inject when your accumulator is mutable. The accumulator is the second block parameter, it gets passed through every iteration, and you mutate it in place. No awkward return value juggling.


The Decision Table

You needReach for
Fast key-value lookupHash
Membership test without valuesSet
LIFO (DFS, undo, parens)Array with push/pop
FIFO (BFS, sliding window)Two-stack Queue or Array for small inputs
Sorted iterationarray.sort, hash.sort_by
Frequency countarray.tally
Top-K or K smallestManual MinHeap
Ordered mapNone built-in, sorted array + bsearch

The Things That Will Actually Trip You Up

Let's just list them, because these are the ones that show up mid-interview when it's too late to look them up:

  • push/pop are O(1). shift/unshift are O(n). Using shift inside a BFS loop means your traversal is O(n²). It will not throw an error. It will just be slow.
  • Hash.new([]) shares one array object across every missing key. Mutations to one key mutate all of them. Use a block instead: Hash.new { |h, k| h[k] = [] }.
  • SortedSet was removed in Ruby 3.0. It is not deprecated. It is gone. Sorted arrays with bsearch are your substitute.
  • Ruby has no built-in heap. Write the MinHeap from the section above until you can produce it in under five minutes without looking.
  • Ruby integers have arbitrary precision. No overflow, ever. That is one thing you genuinely do not have to worry about.

For a broader look at Ruby idioms that save time in interviews, see Ruby for coding interviews.

If you want to find out whether you actually have these patterns down or just think you do, SpaceComplexity runs voice-based mock interviews with rubric-based feedback. The Hash default bug and the missing heap are exactly the kind of thing that costs points in real interviews and is easy to catch before it matters.


Further Reading