Bitmask Subset Enumeration: How Integers Map to Every Subset

You have three elements. You need every possible subset. Your first instinct is probably to write a recursive function, feel smart about it, and then stare at a stack overflow for twenty minutes.

Nobody mentions this in school, but an integer from 0 to 2^n minus 1 maps to exactly one subset of an n-element set. You already have the subsets. They are sitting right there in your CPU, wearing integer costumes. Counting from 0 to 7 visits all eight subsets of a three-element set. No recursion tree. No "oops I forgot the base case." Just counting.

Once this bijection clicks, you can read the complexity of any subset-exhausting algorithm at a glance, recognize n ≤ 20 as a problem setter whispering "use a bitmask," and explain bitmask DP in an interview without watching the interviewer's face slowly fall.

Bit i Means Element i (And This Is Not a Riddle)

Take {A, B, C}. Assign indices: A is 0, B is 1, C is 2. A subset is just a three-bit number where bit i is 1 if element i made the cut.

Integer	Binary	Subset
0	`000`	`{}`
1	`001`	`{A}`
2	`010`	`{B}`
3	`011`	`{A, B}`
4	`100`	`{C}`
5	`101`	`{A, C}`
6	`110`	`{B, C}`
7	`111`	`{A, B, C}`

The integers are not an encoding of the subsets. They are the subsets. Each bit position is independent, each has two states, and n independent bits give 2^n states total. That is the power set of n elements. Binary was doing this the whole time. You just weren't looking.

Counting Is the Algorithm

The encoding makes enumeration trivially boring, in the best way. To visit every subset of an n-element collection, count integers:

elements = ['A', 'B', 'C']
n = len(elements)

for mask in range(1 << n):          # 0 through 2^n - 1
    subset = []
    for i in range(n):
        if mask & (1 << i):         # bit i is set
            subset.append(elements[i])
    print(f"{mask:0{n}b}: {subset}")

No recursion. No stack. No visited set. No opportunity to accidentally pass path when you meant path[:]. Each value of mask is a unique subset, and you hit all 2^n of them exactly once.

TypeScript, for the people who insist on semicolons:

const elements = ['A', 'B', 'C'];
const n = elements.length;

for (let mask = 0; mask < (1 << n); mask++) {
  const subset: string[] = [];
  for (let i = 0; i < n; i++) {
    if (mask & (1 << i)) subset.push(elements[i]);
  }
  console.log(mask.toString(2).padStart(n, '0'), subset);
}

Four Operations You Will Write a Hundred Times

Once you have a mask, every subset manipulation you will ever need fits in six lines. Memorize these. Tattoo them on your forearm. At minimum, put them in a comment somewhere you will find at 2 AM.

# Is element i in the subset?
included = bool(mask & (1 << i))

# Add element i
new_mask = mask | (1 << i)

# Remove element i
new_mask = mask & ~(1 << i)

# Is subset_a a subset of subset_b?
is_subset = (subset_a & subset_b) == subset_a

# Merge two subsets (union)
union = mask_a | mask_b

# Intersection
intersection = mask_a & mask_b

The check mask & (1 << i) is the one you will use inside every inner loop. It tests bit i in constant time. If you walk out of an interview remembering only one line from this post, make it that one.

The Bill Always Comes Due: Complexity

Visiting all 2^n subsets takes O(2^n) time by definition. This is not a bug. This is not something you can optimize away by being clever. Any algorithm that must examine every subset of an n-element input is exponential, full stop.

If you also do O(n) work per subset (say, scanning each element to compute a sum), the total is O(n * 2^n). A brute-force subset-sum counter demonstrates this generously:

def count_subsets_summing_to(arr: list[int], target: int) -> int:
    n = len(arr)
    count = 0
    for mask in range(1 << n):
        total = sum(arr[i] for i in range(n) if mask & (1 << i))
        if total == target:
            count += 1
    return count

Time: O(n * 2^n). Space: O(1). For n = 20, roughly 20 million operations. For n = 30, 30 billion operations and a very warm laptop.

The cliff is not gradual. n = 20 is fine; n = 30 is a campfire; n = 40 is "you have found a bug in your problem formulation."

The subset sum problem has a pseudopolynomial DP solution at O(n * target), which is usually far better. The bitmask brute-force is the baseline to beat. Knowing it lets you justify why the DP upgrade matters instead of just asserting that DP is "faster" while waving your hands.

See exponential time complexity for a fuller treatment of why exponents eat everything.

Bitmask DP: When the Subset Is the State

Here is where bitmasks stop being a cute trick and start being genuinely powerful.

Two paths through n cities might visit the same set of cities in a different order. For the purpose of extending the tour, they are identical. The bitmask captures which cities you have visited. The bitmask DP insight is that (subset, current position) completely determines what optimal future choices look like. Order of arrival is irrelevant.

This cuts O(n!) enumeration down to O(n^2 * 2^n). For n = 20:

Brute-force permutations: roughly 2.4 * 10^18 operations. Your grandchildren will not see this finish.
Bitmask DP: about 400 million operations. Done in under a second.

The Traveling Salesman Problem is the canonical example. The Held-Karp algorithm defines dp[mask][i] as the minimum cost to visit every city in mask and end at city i:

def tsp(dist: list[list[int]]) -> int:
    n = len(dist)
    INF = float('inf')
    dp = [[INF] * n for _ in range(1 << n)]
    dp[1 << 0][0] = 0

    for mask in range(1 << n):
        for last in range(n):
            if dp[mask][last] == INF:
                continue
            if not (mask & (1 << last)):
                continue
            for nxt in range(n):
                if mask & (1 << nxt):
                    continue
                new_mask = mask | (1 << nxt)
                cost = dp[mask][last] + dist[last][nxt]
                dp[new_mask][nxt] = min(dp[new_mask][nxt], cost)

    full = (1 << n) - 1
    return min(dp[full][i] + dist[i][0] for i in range(1, n))

Time: O(n^2 * 2^n). Space: O(n * 2^n). The bitmask DP collapses the state space from n! to n * 2^n by treating any two paths through the same subset as the same state. It does not reduce the number of unique states. It prevents recomputing the same state on a hundred different paths, which is the same trick described in memoized recursion vs naive backtracking.

The O(3^n) Thing That Sounds Made Up

Iterating over every submask of a given mask looks deceptively simple:

sub = mask
while sub > 0:
    process(sub)
    sub = (sub - 1) & mask

The (sub - 1) & mask pattern is one of those bitwise tricks that feels like dark magic the first time, and then becomes second nature. Subtracting 1 flips the lowest set bit and sets all lower bits; ANDing with mask clears any bits not in mask. The result cycles through every non-zero submask of mask in decreasing order.

The total work when you run this for every mask of size n is O(3^n). Each element independently falls into one of three categories: not in the outer mask at all, in the outer mask but not in the submask, or in both. Three choices per element gives 3^n total (mask, submask) pairs across all outer masks.

O(3^n) sounds worse than O(2^n) until you realize O(4^n) is what you would get if submask enumeration were naive, and the combinatorial argument saving you a factor of (4/3)^n is genuinely satisfying. You see submask enumeration in Sum over Subsets DP problems like Maximum AND Sum.

n ≤ 20 Is a Wink, Not a Warning

When a competitive programming problem or LeetCode constraint says n ≤ 20, the problem setter is not arbitrarily limiting the input. They have verified that 2^20 = 1,048,576 fits within the time limit. They are communicating a solution approach.

n bound	Bitmask viable?	2^n
n ≤ 15	Easily	~32K
n ≤ 20	Yes, the target	~1M
n ≤ 25	Possible with O(1) per subset	~33M
n ≤ 30	Borderline	~1B
n > 30	Need a different approach	Too large

A small n alongside a problem asking about sets, assignments, or paths is the bitmask DP signal. Interview problems that follow this pattern: minimum incompatibility (n ≤ 16), smallest sufficient team (skills as bitmasks), partition into k equal sum subsets. When you see n ≤ 20 and nothing in the problem suggests a greedy or polynomial-time approach, your default hypothesis should be bitmask.

What to Say When the Interviewer Asks Why

Proposing a bitmask solution without justifying the complexity is like ordering in a restaurant and refusing to say what you want. You need the numbers ready.

For brute-force enumeration: "There are 2^n subsets. I visit each once and do O(n) work per subset, so the total is O(n * 2^n). Space is O(1) since I only track the current mask."

For bitmask DP: "There are n * 2^n states, each a (city, visited-set) pair. I process each state once with O(n) work to extend to the next city: O(n^2 * 2^n) time, O(n * 2^n) space."

For submask enumeration: "Each element has three outcomes across all masks, so total work is O(3^n)."

Know those numbers cold. If the interviewer asks why bitmask DP beats backtracking, the answer is not "DP is more efficient." Both explore the same states. Backtracking without memoization recomputes the same (city, visited-set) combination on multiple paths; bitmask DP hits each state exactly once. The state space is not smaller. The work per state is. See dynamic programming framework for the broader principle.

The gap between knowing the answer and explaining it under pressure in real time is larger than most people expect before they try it. SpaceComplexity runs voice-based mock interviews that ask exactly these follow-up questions in real time. Explaining why O(n^2 * 2^n) is fast enough when n = 20 is a different skill from reading the formula off a table.

Quick Reference

n elements, all subsets: integers 0 through 2^n minus 1
Element i in mask: mask & (1 << i)
Add element i: mask | (1 << i)
Remove element i: mask & ~(1 << i)
Full set of n elements: (1 << n) - 1
All subsets enumeration: O(2^n)
Enumeration plus O(n) work per subset: O(n * 2^n)
Bitmask DP over (subset, element) states: O(n^2 * 2^n)
Submask enumeration over all masks: O(3^n)
Interview signal: n ≤ 20