What Is Fast Exponentiation? The O(log n) Algorithm Explained

June 5, 20268 min read
dsaalgorithmsinterview-prepdata-structures
What Is Fast Exponentiation? The O(log n) Algorithm Explained
TL;DR
  • Fast exponentiation computes x^n in O(log n) steps by repeatedly squaring the base, versus O(n) for the naive loop.
  • The binary representation of the exponent is the key insight: you multiply together only the powers where the binary bit is 1.
  • The iterative version runs in O(1) space; the recursive version uses O(log n) stack space — always write the iterative version in an interview.
  • The INT_MIN overflow trap bites Java and C++ when n = -2147483648; promote n to long before negating to avoid it.
  • Modular exponentiation applies % mod at every multiplication step, not at the end, so intermediate values never overflow.
  • Matrix exponentiation extends the same squaring trick to compute the n-th Fibonacci number in O(log n) when n reaches 10^18.
  • LeetCode 50 (Pow(x, n)) is the canonical interview problem; the naive O(n) loop will time out on large inputs.

You need to compute x^1000000000. A for-loop that multiplies x by itself one billion times will time out. Fast exponentiation solves it in 30 steps. Not 30 seconds. 30 multiplications. This post explains exactly why that works.

Fast exponentiation (also called binary exponentiation or exponentiation by squaring) is one of those algorithms where the idea fits in one sentence, but the consequences show up everywhere: LeetCode 50, modular arithmetic problems, Fibonacci in O(log n), and cryptographic primitives. It's also one of those ideas that makes you feel slightly cheated that nobody showed it to you sooner.

The Naive Approach Has a Fundamental Flaw

The obvious implementation multiplies x by itself n times:

def naive_power(x, n): result = 1 for _ in range(n): result *= x return result

This is O(n). For n = 10, fine. For n = 10^9, your laptop starts sounding like a jet engine and the answer never comes.

The flaw isn't that the loop is slow in some constant-factor way. The flaw is that the algorithm ignores the structure of exponentiation entirely. It treats x^1000000000 as a billion separate tasks when it's actually one task you can subdivide. Like driving from New York to Los Angeles by walking. Correct, technically. Practical, no.

Squaring Is Free

Say you already know x^4. How long does it take to compute x^8?

One multiplication. x^8 = (x^4) * (x^4).

You didn't need to start from scratch. Every time you square a result, you double the exponent at the cost of a single multiply. That's the whole trick. It's almost offensive how simple it is.

More precisely:

  • If n is even: x^n = (x^(n/2))^2
  • If n is odd: x^n = x * (x^((n-1)/2))^2

Each case halves the exponent. You go from n to n/2 to n/4 to n/8. After floor(log2(n)) steps, the exponent hits 0. For n = 10^9, that's 30 steps. The time complexity drops from O(n) to O(log n), and that is not a rounding error.

Reading the Exponent in Binary

The cleanest way to see why this works: look at the exponent in binary.

Take 2^13. The number 13 in binary is 1101:

13 = 8 + 4 + 1
   = 2^3 + 2^2 + 2^0

So:

2^13 = 2^8 * 2^4 * 2^1
     = 256 * 16 * 2
     = 8192

You only multiply together the powers where the binary bit is 1. The iterative algorithm builds each power automatically by squaring the current base at every step. When the current bit is 1, it folds that power into the running result.

This is why the algorithm needs exactly floor(log2(n)) iterations. One per bit. The binary representation isn't a cute observation, it's the actual mechanism.

Binary exponentiation step-by-step trace showing 2^13 computed in 4 iterations

The Fast Exponentiation Algorithm, Written Out

Iterative version, O(log n) time and O(1) space:

def fast_power(x: float, n: int) -> float: if n < 0: x = 1 / x n = -n result = 1.0 while n > 0: if n % 2 == 1: result *= x x *= x n //= 2 return result

Trace through 2^13:

nbitresultbase (after squaring)
1312.04.0
602.016.0
3132.0256.0
118192.065536.0
0,8192.0done

Four iterations instead of thirteen. The pattern matches the binary representation of 13 (1101) read right-to-left.

The recursive version expresses the same logic more cleanly but uses O(log n) stack space:

def fast_power_recursive(x: float, n: int) -> float: if n == 0: return 1.0 if n < 0: return fast_power_recursive(1 / x, -n) half = fast_power_recursive(x, n // 2) if n % 2 == 0: return half * half return half * half * x

In an interview, write the iterative version. It doesn't grow the call stack, and it's easier to trace through with a concrete example. Interviewers like watching you trace. Give them something to watch.

Time and Space, Precisely

Time: O(log n). The while loop runs floor(log2(n)) iterations, one per bit position. Each iteration does at most two multiplications. Total work is O(log n).

Space: O(1) iterative, O(log n) recursive. The iterative version uses a fixed handful of variables. The recursive version allocates one stack frame per level, and there are floor(log2(n)) + 1 levels. For a deep dive on how call-stack depth maps to space complexity, see Recursion Time Complexity: The Tree, the Recurrence, and the Master Theorem.

For n = 10^9, that's 30 iterations and 30 stack frames. For n = 10^18, it's 60. The algorithm scales by adding a single iteration for each factor of two. That gentle growth is what logarithmic complexity means: see What Is Logarithmic Time Complexity?.

Where This Shows Up in Interviews

LeetCode 50: Pow(x, n)

This is the standard problem. The naive O(n) loop times out. The expected solution is fast exponentiation.

The problem accepts negative exponents. x^(-n) = 1 / x^n. Handle this by flipping x to 1/x and negating n before entering the loop, as shown above.

The INT_MIN Overflow Trap

This is the bug that kills real submissions, and it will kill yours too if you're not careful. In Java and C++, int holds values from -2147483648 to 2147483647. If n = -2147483648 (INT_MIN) and you write n = -n, you overflow because 2147483648 exceeds INT_MAX by exactly one. The kind of off-by-one that makes senior engineers stare at walls.

Fix it by promoting n to a long before negating:

public double myPow(double x, int n) { long exp = n; if (exp < 0) { x = 1 / x; exp = -exp; } double result = 1.0; while (exp > 0) { if ((exp & 1) == 1) result *= x; x *= x; exp >>= 1; } return result; }

Python skips this problem entirely because Python integers are arbitrary-precision. But if you write in Java, C++, or Go and skip the cast, the edge case n = -2147483648 will fail every time. LeetCode's test suite includes it. The grader does not care about your feelings.

Modular Exponentiation

Many problems ask for the result modulo 10^9 + 7, especially combinatorics and counting problems. You can't just compute x^n and then take the remainder, because x^n overflows long before you finish. Apply the modulo at each step instead.

This works because (a * b) % m = ((a % m) * (b % m)) % m. The intermediate values stay bounded by mod^2, which fits in a 64-bit integer when mod is around 10^9.

def mod_power(base: int, exp: int, mod: int) -> int: result = 1 base %= mod while exp > 0: if exp % 2 == 1: result = (result * base) % mod base = (base * base) % mod exp //= 2 return result

For the full mechanics of why modulo distributes, What Is Modular Arithmetic? has the complete explanation.

Matrix Exponentiation

Fast exponentiation isn't limited to numbers. Any associative operation (any monoid) supports the same squaring trick. The most useful interview application is matrix exponentiation.

Computing the n-th Fibonacci number in O(log n) relies on this identity:

[[F(n+1), F(n)  ],   =  [[1, 1],  ^ n
 [F(n),   F(n-1)]]       [1, 0]]

Raise a 2x2 matrix to the n-th power using the same algorithm. Each "multiply" now means a matrix multiplication. Total: O(log n) matrix multiplications.

This becomes relevant when n is up to 10^18 and a linear DP solution cannot run fast enough. If a problem asks for the n-th term of a linear recurrence at large n, matrix exponentiation is often the intended solution. It is also the kind of thing that makes other people in the interview room nervous when you mention it.

For all fourteen language implementations of binary exponentiation, including the modular variant and the matrix extension, see Binary Exponentiation: Why Computing x^1000000000 Takes 30 Steps.

Explain It While You Code

Fast exponentiation is one of those problems where the solution clicks quickly but narrating it under interview pressure is harder than it looks. The binary representation angle, the squaring invariant, why modulo distributes, the INT_MIN edge case: all of it needs to come out of your mouth while you're coding. If you go silent and just write code, an interviewer with nothing to quote in the write-up is not a happy interviewer. SpaceComplexity runs voice-based mock interviews where you practice exactly that, with rubric-based feedback on how clearly you explained each piece.

Further Reading