What Is Fermat's Little Theorem? Modular Inverse in O(log p)

Pierre de Fermat wrote the theorem in a letter in 1640 and noted he had a proof too elegant to fit in the margin. The proof part took 358 more years. The theorem itself held up fine. Classic Fermat.

What the theorem actually does: it gives you a way out when a LeetCode problem ends with "return the answer modulo 10^9+7" and your formula has a division in it. Which, under modular arithmetic, is not supposed to work. And then, thanks to Fermat, suddenly does.

The Theorem in One Line

If p is prime and a is not divisible by p:

a^(p-1) ≡ 1 (mod p)

Raise a to the power p-1, divide by p, take the remainder. The remainder is always exactly 1. Not sometimes. Not usually. Always, for every prime p and every a that doesn't share a factor with p.

Why "little"? To distinguish it from Fermat's Last Theorem, the one that held up professional mathematicians for 358 years before Andrew Wiles solved it in 1995. This one Euler wrapped up 95 years after the original letter. By contrast, basically instant.

There's a second form: a^p ≡ a (mod p) for all integers a, including multiples of p. When a is a multiple of p, both sides collapse to 0 mod p, so it's trivially true. The first form is the one you'll actually derive useful things from.

The theorem is also a special case of Euler's: a^φ(n) ≡ 1 (mod n) for any n coprime to a, where φ(n) is Euler's totient function. When n is prime, φ(n) = n-1. You get Fermat back directly.

A Concrete Example

Take p = 7, a = 3.

The theorem says 3^6 should equal 1 (mod 7). Let's verify rather than taking the word of a 17th-century French mathematician:

3^1 = 3       → 3 mod 7 = 3
3^2 = 9       → 9 mod 7 = 2
3^3 = 27      → 27 mod 7 = 6
3^4 = 81      → 81 mod 7 = 4
3^5 = 243     → 243 mod 7 = 5
3^6 = 729     → 729 mod 7 = 1 ✓

Notice the sequence of remainders: 3, 2, 6, 4, 5, 1. That's every nonzero residue mod 7, each exactly once. This is not a coincidence and it's not luck. The powers of a cycle through all nonzero residues before landing at 1, and the cycle length must divide p-1. The theorem falls directly out of this structure.

Try p = 13, a = 5: you need 5^12 mod 13. Work it out and you get 1. Try p = 11, a = 2: 2^10 = 1024, and 1024 mod 11 = 1. Every prime, every valid a. Exactly 1. Every time.

Why Division Breaks Under Modulo

Addition and multiplication distribute cleanly over modular arithmetic:

(a + b) mod p = ((a mod p) + (b mod p)) mod p
(a * b) mod p = ((a mod p) * (b mod p)) mod p

Division does not. Consider (6 / 2) mod 5 = 3. But (6 mod 5) / (2 mod 5) = 1 / 2. Not an integer. Math, apparently, does not care about your LeetCode submission deadline.

To divide by a under modulo p, you multiply by the modular inverse of a instead.

The modular inverse of a (written a^(-1)) is the integer x such that a * x ≡ 1 (mod p). It plays the same role as 1/a in regular arithmetic, except it stays inside the integers. For p = 7, the inverse of 3 is 5, because 3 * 5 = 15 and 15 mod 7 = 1.

The question is how you find this inverse for large primes, without spending the rest of your life guessing.

For the full ground rules on what modular arithmetic allows and forbids, this post on modular arithmetic in coding interviews covers them in detail.

Fermat Hands You the Modular Inverse

Start from a^(p-1) ≡ 1 (mod p). Pull out one copy of a:

a * a^(p-2) ≡ 1 (mod p)

That's the definition of a modular inverse. So:

a^(-1) mod p = a^(p-2) mod p

The modular inverse of a is just a^(p-2) mod p. No extended Euclidean algorithm. No matrix operations. No guessing. One exponentiation and you're done.

For p = 7, a = 3: the inverse is 3^5 mod 7 = 243 mod 7 = 5. We already verified 3 * 5 ≡ 1 (mod 7) above.

For p = 10^9+7 (the standard LeetCode prime) and some large a: the inverse is a^(10^9+5) mod (10^9+7). The exponent is about a billion. That sounds very slow. It is not.

Computing It: O(log p) With Binary Exponentiation

Computing a^(p-2) naively means p-2 multiplications. For p = 10^9+7, that's roughly a billion operations. Runtime estimate: somewhere between "you've already failed the interview" and "heat death of the universe."

Binary exponentiation computes a^n mod p in O(log n) time by squaring repeatedly and reading the bits of the exponent. For every set bit, multiply the current base into the result. For every zero bit, square and move on.

def mod_pow(base: int, exp: int, mod: int) -> int:
    result = 1
    base %= mod
    while exp > 0:
        if exp % 2 == 1:
            result = result * base % mod
        base = base * base % mod
        exp //= 2
    return result

def mod_inverse(a: int, p: int) -> int:
    return mod_pow(a, p - 2, p)

The diagram below shows how this works for a^13. The exponent 13 = 1101₂ has four bits, so you need four squarings. Three bits are set, so you multiply into the result three times. Done.

Binary exponentiation: computing a^13 in four steps

For p = 10^9+7, the exponent has about 30 bits. Thirty squarings replaces a billion multiplications. That is genuinely the whole trick.

In Python, use pow(a, p - 2, p) directly. The three-argument form of pow applies fast exponentiation automatically. In Java, BigInteger.modPow does the same. In C++, write your own loop or use a trusted template. Space is O(1) throughout since you compute the result iteratively without recursion.

The nCr Pattern That Trips People Up

The most common place this theorem shows up in real problems is binomial coefficients under a prime modulus. The formula everyone knows:

nCr = n! / (r! * (n-r)!)

Under mod p, that division is illegal. You replace it:

nCr mod p = fact[n] * inv(fact[r]) * inv(fact[n-r]) mod p

where inv(x) = pow(x, p-2, p).

For a single query: three calls to mod_inverse. Fine. For many queries (which is common in harder problems), precompute all factorials and their inverses up front and answer every query in O(1):

MOD = 10**9 + 7
MAX_N = 10**5 + 1

fact = [1] * MAX_N
for i in range(1, MAX_N):
    fact[i] = fact[i-1] * i % MOD

inv_fact = [1] * MAX_N
inv_fact[MAX_N - 1] = pow(fact[MAX_N - 1], MOD - 2, MOD)
for i in range(MAX_N - 2, -1, -1):
    inv_fact[i] = inv_fact[i+1] * (i+1) % MOD

def nCr(n: int, r: int) -> int:
    if r < 0 or r > n:
        return 0
    return fact[n] * inv_fact[r] % MOD * inv_fact[n-r] % MOD

One O(n) precomputation step, then every nCr query runs in O(1).

The backward pass is the smart part. You call pow with modular exponentiation exactly once, for the largest factorial. Then you recover every smaller inverse factorial using the identity inv_fact[i] = inv_fact[i+1] * (i+1) % MOD. This follows from cancellation in (i+1)! = (i+1) * i!, so (i!)^(-1) = ((i+1)!)^(-1) * (i+1). No extra mod_pow calls.

This precomputation pattern shows up in grid path counts, subset counts, and combinatorial identities under a large prime. The post on why 10^9+7 appears everywhere explains why that specific prime is beloved by problem setters.

When This Shows Up in Interviews

Fermat's theorem almost never shows up as a stated topic. It ambushes you inside counting problems. The signals:

"Return the answer modulo 10^9+7" in a problem involving combinations, permutations, or arrangements
Counting distinct orderings with repeated elements (multinomial coefficients)
Grid path counts (grid paths = nCr(m+n-2, m-1))
Probability expressed as a fraction of integers
Any formula with division where the modulus is prime

LeetCode 1569 (Number of Ways to Reorder Array to Get Same BST), LeetCode 1220 (Count Vowels Permutation), and most problems tagged "combinatorics" will need this somewhere in the solution.

You do not need to recognize the phrase "Fermat's little theorem." You need to recognize "I have a division inside a modular expression" as the trigger, and know the fix is a^(p-2) mod p. Name the technique, derive it out loud from a^(p-1) ≡ 1, and explain why it works. That's what the interviewer is waiting to hear.

Talking through this chain of reasoning under time pressure, in real time, is exactly what separates a hire from a no-hire. If you want to practice explaining modular inverse derivations in an actual interview format, SpaceComplexity runs live voice-based mock interviews with rubric-based feedback on exactly these kinds of derivations.

What About Non-Prime Moduli?

Fermat only works when p is prime. If a problem hands you a non-prime modulus:

Modular inverses may not even exist. The inverse of a mod m exists only when gcd(a, m) = 1.
Use the Extended Euclidean Algorithm. It finds gcd(a, m) and the inverse simultaneously in O(log m).
For prime powers or other structured moduli, Euler's theorem generalizes the approach: a^φ(n) ≡ 1 (mod n) when gcd(a, n) = 1.

In interview problems, the modulus is almost always 10^9+7 or 998244353, both prime. If you see something like 10^6 or 2^32, check primality before reaching for Fermat.

Non-prime modulus means Extended Euclidean. Full stop.