The Fenwick Tree: One Bit Trick to Rule Your Prefix Sums

You have an array of numbers. You need prefix sums. You also need to update individual elements. Do both fast. This is the entire problem, and it shows up constantly in technical interviews.

The naive approach: recompute the prefix sum from scratch on every query. That's O(n) per query. Not great. A prefix sum array gets you O(1) queries but O(n) updates, because every element to the right has to shift. Also not great. You've just moved the O(n) from one column to the other. What you want is O(log n) for both, simultaneously. The Fenwick tree, also called the Binary Indexed Tree (BIT), delivers exactly that with a single array and a handful of bit operations.

The mental model: a flat array that secretly encodes a tree. Each slot stores the sum of a specific range, and the boundaries of that range come entirely from the binary representation of the index.

The Low-Bit Trick

Before anything else, you need to understand one operation. Given an integer i, the expression i & (-i) returns the lowest set bit of i. That's it. That one operation is the heart of the entire structure.

Why does it work? Two's complement negation flips all bits and adds 1. If i ends in ...1000 in binary, then -i ends in ...1000 as well (all the trailing zeros flip to 1, the 1 flips to 0, and then adding 1 carries back). AND-ing them keeps only that last set bit.

Some concrete examples:

i = 12 = 1100₂      → i & -i = 0100₂ = 4
i = 6  = 0110₂      → i & -i = 0010₂ = 2
i = 8  = 1000₂      → i & -i = 1000₂ = 8
i = 7  = 0111₂      → i & -i = 0001₂ = 1

The pattern: a power of two gives itself. A number ending in k zeros returns 2^k.

How the Implicit Tree Works

A Fenwick tree is a 1-indexed array bit[] of the same length as your input array. Each cell bit[i] stores the sum of exactly lowbit(i) consecutive elements ending at position i. Not a prefix sum, not a single element. A range whose length is the lowest set bit of the index.

Visually, for n = 16, the ranges look like this:

Index  Binary   lowbit   Range stored in bit[i]
  1    0001       1      [1, 1]
  2    0010       2      [1, 2]
  3    0011       1      [3, 3]
  4    0100       4      [1, 4]
  5    0101       1      [5, 5]
  6    0110       2      [5, 6]
  7    0111       1      [7, 7]
  8    1000       8      [1, 8]
  9    1001       1      [9, 9]
 10    1010       2      [9,10]
 11    1011       1      [11,11]
 12    1100       4      [9,12]
 13    1101       1      [13,13]
 14    1110       2      [13,14]
 15    1111       1      [15,15]
 16    10000      16     [1,16]

Notice the structure. Indices that are powers of two store increasingly large prefix sums. Index 8 stores the sum of the entire first half. Index 4 stores the sum of the first quarter. Single-bit indices (1, 3, 5, 7...) store individual elements.

The Implicit Tree Structure

Fenwick tree implicit structure for n=8 showing each node's index, its covered range in amber, and update parent edges

Each node stores the sum of its labeled range. Edges follow the update parent direction: a point update at i propagates to i + lowbit(i), then keeps climbing until it exceeds n.

Each parent's range is the union of its children's ranges. The tree is implicit: no pointers, no nodes. Just the array.

Core Operations

Operation	Time	Space
Build from array	O(n)	O(n)
Point update	O(log n)	O(1)
Prefix sum query	O(log n)	O(1)
Range sum query [l, r]	O(log n)	O(1)
Range update + point query	O(log n)	O(n)
Range update + range query	O(log n)	O(n)

Why prefix queries are O(log n)

To compute the prefix sum [1..i], you decompose i into its set bits. Each set bit corresponds to a range that a bit[] cell already covers. You sum those cells.

For i = 13 = 1101₂:

Strip lowest bit (1): add bit[13], move to i = 12
Strip lowest bit (4): add bit[12], move to i = 8
Strip lowest bit (8): add bit[8], move to i = 0
Done

Three steps. Because 13 has 3 set bits. The worst case is when i has the most set bits, which is at most floor(log₂ n) bits. So every prefix query touches at most log₂ n cells.

def query(bit, i):
    total = 0
    while i > 0:
        total += bit[i]
        i -= i & (-i)   # strip lowest set bit
    return total

Step-by-step query walkthrough for query(13): hopping from bit[13] to bit[12] to bit[8] by stripping the lowest set bit at each step

query(13) touches three cells because 13 = 1101₂ has three set bits. Strip one bit per hop, done in O(log n).

Why updates are O(log n)

When you update element at index i, every bit[j] whose stored range includes position i must be updated. Those indices are exactly the ones you reach by repeatedly adding the lowest set bit: i, i + lowbit(i), i + lowbit(i + lowbit(i)), ...

Each step sets a new, higher bit, so after at most log₂ n steps you exceed n and stop. That's the whole proof.

def update(bit, i, delta):
    while i < len(bit):
        bit[i] += delta
        i += i & (-i)   # climb to next ancestor

Step-by-step update walkthrough showing update at index 3 propagating to bit[3], bit[4], and bit[8] before exceeding n=8

update(3, Δ) climbs 3 → 4 → 8 → 16 in three steps. Adding lowbit at each step sets a strictly higher bit, guaranteeing termination.

Why O(n) build is possible

The naive build calls update() once per element: O(n log n). You can cut that to O(n), and in an interview you should. For each index i, add bit[i] to bit[i + lowbit(i)]. This propagates each element's contribution exactly once to its immediate parent, and since every parent/child relationship is visited once, the total work is O(n).

def build(arr):
    n = len(arr)
    bit = [0] * (n + 1)
    for i in range(1, n + 1):
        bit[i] += arr[i - 1]
        parent = i + (i & (-i))
        if parent <= n:
            bit[parent] += bit[i]
    return bit

The Non-Obvious Traps

Min and max don't work. The structure relies on the operation being invertible. Sum is invertible (subtraction). Min is not. min(a, b) can't be "un-applied" to recover what the original contribution was. Fenwick trees cannot support arbitrary range minimum queries. Full stop. For range min or max, you need a segment tree, which is also about 60 more lines of code.

1-indexed everywhere. Index 0 is unusable because 0 & (-0) = 0, producing an infinite loop. Every tutorial that uses 0-based indexing is quietly translating internally. Just accept 1-based. Your muscle memory will recover in about a week.

Range update, range query needs two trees. To support add(l, r, v) with query(l, r), you need two BITs (call them B1 and B2). The math works out: prefix_sum(i) = B1[i] * i - B2[i]. You update both trees on every range update. This is the kind of bug that returns wrong numbers with no exception, no stack trace, and complete confidence. Use two trees. No exceptions.

Off-by-ones cascade. The range sum [l, r] is query(r) - query(l - 1). Using query(l) instead of query(l - 1) is the most common BIT bug in interview code. You will write it. Write a test first.

Fenwick vs. Segment Tree

The honest comparison:

	Fenwick Tree	Segment Tree
Space	O(n)	O(4n)
Point update	O(log n)	O(log n)
Prefix/range sum	O(log n)	O(log n)
Range update (lazy)	Two trees needed	Native via lazy prop
Range min/max	Not supported	Supported
Implementation length	~10 lines	~50-80 lines
Cache behavior	Generally better	Can be worse (pointer chasing in recursive version)

The rule of thumb: use a Fenwick tree when you need prefix/range sums with point updates. Use a segment tree when you need range updates, min/max, or arbitrary associative operations.

Fenwick trees run 2-4x faster in practice: no recursion, a tight loop, and a flat array that L1 cache loves. For sum queries, they're the right default. If you genuinely enjoy 80 lines of recursive segment tree, both options are on the table.

Implementations

class FenwickTree:
    def __init__(self, n: int):
        self.n = n
        self.bit = [0] * (n + 1)

    def update(self, i: int, delta: int) -> None:
        while i <= self.n:
            self.bit[i] += delta
            i += i & (-i)

    def query(self, i: int) -> int:
        total = 0
        while i > 0:
            total += self.bit[i]
            i -= i & (-i)
        return total

    def range_query(self, l: int, r: int) -> int:
        return self.query(r) - self.query(l - 1)

    @classmethod
    def build(cls, arr: list[int]) -> "FenwickTree":
        tree = cls(len(arr))
        for i, value in enumerate(arr, start=1):
            tree.bit[i] += value
            parent = i + (i & (-i))
            if parent <= tree.n:
                tree.bit[parent] += tree.bit[i]
        return tree

What Problems the Fenwick Tree Solves

Fenwick trees show up wherever you have a sequence of values that change over time and you need to aggregate prefix information quickly.

Frequency counting with cumulative queries. You have a stream of numbers. After each insertion, answer "how many elements so far are less than or equal to X?" Maintain a frequency array, use the Fenwick tree over it. Each insertion is O(log n), each count query is O(log n).

Inversion counting. An inversion in an array is a pair (i, j) where i < j but arr[i] > arr[j]. Counting all inversions naively is O(n²). With a Fenwick tree and coordinate compression, it's O(n log n).

Dynamic order statistics. You need the k-th smallest element in a collection that supports insertions. With coordinate compression and a Fenwick tree, binary search over the BIT gives you the k-th element in O(log² n) or O(log n) with a smarter descent.

Range sum with updates. The textbook case. Any problem that reduces to "sum of arr[l..r], then arr[i] += delta" is a Fenwick tree problem.

2D prefix sums with updates. Extend to a 2D BIT for matrix problems: both update and query become O(log n · log m). This shows up in problems involving rectangle sum queries over a grid that mutates.

Recognizing the Pattern

Two signals. Either one is enough to know you're looking at a Fenwick tree problem.

Signal 1: You need prefix sums, but the array changes. If the problem asks you to maintain running totals while individual values update, and no problem-specific constraint prevents you from iterating prefix sums in O(log n), it's a Fenwick tree.

Signal 2: You're counting "how many elements satisfy a condition up to position X." This is frequency-table territory, which maps exactly to Fenwick tree queries after coordinate compression.

Example 1: Count of Range Sum (LeetCode 327)

Problem: Given an array, count the number of subarrays whose sum lies in [lower, upper].

Why Fenwick tree: The sum of subarray [i, j] is prefix[j] - prefix[i-1]. You want to count pairs where lower <= prefix[j] - prefix[i-1] <= upper, i.e., prefix[j] - upper <= prefix[i-1] <= prefix[j] - lower. As you scan j left to right, you're asking how many previous prefix sums fall in a range. Coordinate-compress the prefix values, maintain a Fenwick tree of "how many prefix sums have I inserted so far at each coordinate," and each query is bit.rangeQuery(lower_coord, upper_coord). O(n log n) total.

Example 2: Count Inversions

Problem: Count the number of pairs (i, j) where i < j and arr[i] > arr[j].

Why Fenwick tree: Scan the array left to right. For each element arr[j], count how many previously inserted elements are greater than arr[j]. That count is the number of inversions ending at j. Coordinate-compress the values (so they sit in range [1, n]). When you process arr[j], query bit.rangeQuery(arr[j] + 1, n) to count larger elements already seen, then call bit.update(arr[j], 1) to record this element. Total inversions accumulate across all j. The whole pass is O(n log n).

def count_inversions(arr: list[int]) -> int:
    # coordinate compress
    sorted_unique = sorted(set(arr))
    rank = {value: i + 1 for i, value in enumerate(sorted_unique)}
    n = len(sorted_unique)

    tree = FenwickTree(n)
    inversions = 0

    for value in arr:
        r = rank[value]
        # count elements already inserted that are strictly greater
        inversions += tree.query(n) - tree.query(r)
        tree.update(r, 1)

    return inversions

Diagram: Inversion Count Walk-through

arr = [3, 1, 2]
sorted_unique = [1, 2, 3]  →  rank: {1:1, 2:2, 3:3}

Step 1: value=3 (rank 3)
  inversions += query(3) - query(3) = 0 - 0 = 0
  update(3, 1)
  BIT state: [0, 0, 0, 1]

Step 2: value=1 (rank 1)
  inversions += query(3) - query(1) = 1 - 0 = 1
  update(1, 1)
  BIT state: [0, 1, 0, 1]

Step 3: value=2 (rank 2)
  inversions += query(3) - query(2) = 2 - 1 = 1
  update(2, 1)

Total inversions = 0 + 1 + 1 = 2
Pairs: (3,1) and (3,2). Correct.

Quick Recap

The Fenwick tree stores a flat array where bit[i] covers a range of length lowbit(i) = i & (-i) ending at i.
Prefix query: strip the lowest bit repeatedly to hop through covering ranges. O(log n) because an integer has at most log n set bits.
Point update: add the lowest bit repeatedly to propagate upward. O(log n) by the same argument.
O(n) build: propagate each cell to its immediate parent once, in forward order.
Range sum is query(r) - query(l - 1). Use l - 1, not l.
Min/max queries are not supported. Use a segment tree for those.
Range update with range query needs two trees.
Index from 1, not 0.
Core use cases: prefix sums with point updates, inversion counting, dynamic frequency tables, order statistics.

SpaceComplexity runs voice-based mock interviews where the interviewer pushes on exactly the edge cases covered here: why query(l-1) and not query(l)? What if you need range min? Can you build in O(n)? Getting comfortable with these live is a different skill from reading about them.

If you're shoring up your foundation before tackling Fenwick trees, start with dynamic programming fundamentals or the sliding window technique, both of which pair naturally with the prefix-sum mindset.