What Is a Heuristic? The Shortcut That Lets Computers Guess Smart

- Heuristic definition: an estimated cost function h(n) that trades guaranteed correctness for speed, used when exact computation is too slow
- Admissibility rule: h(n) must never overestimate the true cost; violate it and A* returns a suboptimal answer with no warning
- Manhattan distance is the go-to admissible heuristic on four-directional grids because it lower-bounds actual move count and is tighter than Euclidean distance
- Greedy algorithms are heuristics at the strategy level, not just the evaluation function; they work only when the greedy choice property holds
- NP-hard problems make heuristics essential: nearest-neighbor TSP runs in O(N²) and produces routes ~20-25% over optimal, far better than exact O(N!) search
- Interview signal: name your heuristic, justify its admissibility, and know that weighted A* buys speed at a bounded suboptimality cost of (1 + epsilon)
Your GPS finds you a route in under a second across a city with millions of possible turns. It doesn't check every route. It can't. It guesses, cleverly, using a heuristic. And that educated guess is the reason you're not still sitting in the parking lot waiting for your app to finish thinking.
You've seen the word "heuristic" scattered across algorithm textbooks and interview prep guides, usually dropped without ceremony. Here's what it actually means, why it matters, and what to say when an interviewer asks about it.
Your Brain Already Does This
Imagine you're trying to find your way across a city grid. You could explore every possible route. That's BFS or Dijkstra's: thorough, principled, expanding outward in every direction like a slow explosion. Or you could say "the destination looks north-east of me, so I'll prioritize streets going north-east." You might miss the single fastest path if there's a detour, but you'll find a good path fast.
That "looks north-east" instinct is a heuristic. It uses information you already have to make a smarter guess about which direction is worth exploring.
Formally: a heuristic is a function h(n) that estimates the cost of reaching the goal from node n, using available information about the problem structure. It doesn't compute the exact cost. It guesses, cleverly. The word comes from the Greek "heuriskein," meaning "to find," which is also where "eureka" comes from. You're welcome for the etymology, that's free.
A* Is the Canonical Example
A* search is the cleanest demonstration of a heuristic in action. The algorithm evaluates each candidate node using:
f(n) = g(n) + h(n)
Where g(n) is the exact cost from start to node n (already computed), and h(n) is the heuristic estimate of the cost from n to the goal (guessed). A* expands whichever node has the lowest f(n) first.
The heuristic tells A* where to look. Without it, you get Dijkstra's algorithm: correct, but expanding nodes in concentric rings outward from the start regardless of where the goal is. Dijkstra is that person at a party who shakes every hand in the room before finding the one person they actually came to meet.
A* knows where the goal is, because h(n) encodes that knowledge.

Dijkstra explores everything. A reads the room first.*
The Manhattan Distance Example
On a grid where you can only move up, down, left, right, the Manhattan distance from (x1, y1) to (x2, y2) is:
def manhattan_distance(x1, y1, x2, y2): return abs(x1 - x2) + abs(y1 - y2)
This works as h(n) because it answers: "ignoring walls, what's the minimum number of moves from here to the goal?" It never overcounts. Even in a maze packed with walls, you still need at least this many moves. That's the key property, and it has a name.
The One Rule That Makes A* Correct: Admissibility
A heuristic is admissible if h(n) never overestimates the true cost to reach the goal. Formally: h(n) <= h*(n) for every node n, where h*(n) is the actual optimal cost.
Manhattan distance is admissible on a four-directional grid. You might need more moves than the Manhattan distance (walls exist), but you'll never need fewer. The estimate is always a lower bound on reality.
With an admissible heuristic, A* is guaranteed to find the optimal path. This is not magic. If the heuristic underestimates, A* keeps the true optimal path as a candidate until it's actually explored. It might visit a few extra nodes along the way, but it won't terminate early with something suboptimal.
Euclidean distance is also admissible on a grid. It always underestimates the true path cost through discrete cells. But it's less tight than Manhattan distance for axis-aligned movement, so A* expands slightly more nodes. Admissible, yes. Lazy, a bit.
What Happens When You Break Admissibility
Say your heuristic overestimates. A* might now expand the wrong node first, one that looks cheap but whose h(n) is inflated. That node gets selected, turns out to be suboptimal, A* terminates anyway. Wrong answer.
This isn't always bad by design. Weighted A* intentionally inflates the heuristic by a factor of 1 + epsilon:
f(n) = g(n) + (1 + epsilon) * h(n)
The result is faster search (fewer nodes expanded) with a bounded suboptimality guarantee: the returned path costs at most (1 + epsilon) times the optimal. You're explicitly choosing speed over perfect optimality, and you know exactly how much you're giving up. That's a principled trade.
The dangerous case is when you violate admissibility accidentally and don't notice your solution might be wrong. That's not a trade, that's a bug wearing a lab coat.
Greedy Algorithms Are Heuristics Too
A greedy algorithm makes locally optimal choices at each step without reconsidering earlier decisions. That is a heuristic strategy applied to the whole algorithm, not just a single evaluation function.
Greedy works when the local optimum at each step actually leads to a global optimum. When it doesn't, you get a fast answer that might be embarrassingly wrong.
Activity selection (pick the interval ending earliest, repeat) is a case where greedy gives you the exact optimal. Coin change with arbitrary denominations is where greedy fails. Start with $6 worth of coins using denominations of 4 and 3. Greedy grabs one 4-cent coin, then can't make change for the remaining 2. Correct greedy move, broken result.
The difference matters in interviews. If you propose a greedy solution, you should be able to say why the greedy choice property holds, meaning why picking the locally best option now doesn't lock you out of a globally better option later. If you can't articulate that, the interviewer will probe until you can or can't.
Where Heuristics Become Non-Negotiable: NP-Hard Problems
NP-hard problems are where heuristics stop being an optimization and start being the only option you have.
The Traveling Salesman Problem (TSP) asks: given N cities and distances between them, what's the shortest route that visits each city exactly once and returns home? The exact answer requires checking O(N!) routes. For 20 cities, that's about 2.4 quadrillion routes. You could wait. Thermodynamics says no.
The nearest-neighbor heuristic: start at any city, always go to the closest unvisited city, return home when done.
def nearest_neighbor_tsp(cities, distances): unvisited = set(cities) current = cities[0] tour = [current] unvisited.remove(current) while unvisited: next_city = min(unvisited, key=lambda c: distances[current][c]) tour.append(next_city) unvisited.remove(next_city) current = next_city tour.append(cities[0]) return tour
This runs in O(N^2) and produces a route that's typically 20-25% longer than optimal on random instances. Not perfect. Completely usable. Ships.
That's the heuristic value proposition for NP-hard problems: sacrifice provable optimality, gain a polynomial-time algorithm that gives you something you can actually use today.
Time and Space Implications
Without a heuristic, uninformed search like BFS expands every node at branching factor b per depth level: O(b^d) time and space, where d is the solution depth. Every direction is equally interesting. None of them are.
With a good admissible heuristic, A* prunes branches that h(n) identifies as unpromising. In the best case, A* expands only the nodes along the optimal path. In practice you get something between the two extremes depending on how tight your heuristic is.
A tighter heuristic (one closer to h(n) without overestimating) means fewer nodes expanded and faster search.* Manhattan distance is tighter than Euclidean for axis-aligned movement, so it performs better in practice. The theoretical worst case for both is still O(b^d), but the constant matters enormously when your graph has millions of nodes and your users are impatient.
When to Mention Heuristics in Interviews
Three predictable places this comes up:
Scenario 1: You're implementing A or best-first search.* If the interviewer asks you to find a path on a grid and you propose A*, you need to name your heuristic and justify its admissibility. "I'll use Manhattan distance because movement is restricted to four directions, so it never overestimates" is the answer. "I'll use a heuristic function" is not an answer.
Scenario 2: The interviewer asks "can you do better?" after you give a correct but slow solution. If the problem is NP-hard or involves large-scale search, the honest answer is: "We can't do asymptotically better and guarantee optimality, but we can use a heuristic to get a good approximation fast." Naming the approximation ratio shows you know the terrain.
Scenario 3: Graph problems with follow-up questions. Route optimization, game pathfinding, and map search all escalate into "what if the graph is huge?" The expected answer involves A* with a domain-appropriate heuristic, not "Dijkstra but on more servers."
The thing that trips candidates up is not knowing the formal term. You might intuitively use a heuristic without calling it one. Practice stating it explicitly: "this is an admissible heuristic because it lower-bounds the true cost." The formality signals you understand why it works, not just that it works.
SpaceComplexity is designed for exactly this kind of verbal pressure. When an interviewer says "how does your heuristic affect the optimality guarantee?" out loud, mid-session, you need to answer without staring at a blank editor. The voice interview format forces you to articulate this reasoning in real time, which is very different from writing it quietly in comments.
Key Takeaways
- A heuristic is admissible if it never overestimates the true cost:
h(n) <= h*(n). Admissibility guarantees A* finds the optimal solution. Violate it accidentally and you have a bug. Violate it on purpose (weighted A*) and you have a trade-off. - Manhattan distance is admissible on four-directional grids. Euclidean distance is admissible but looser, so A* expands more nodes with it.
- Greedy algorithms are heuristics applied at the strategy level. They're optimal when the greedy choice property holds, and quietly wrong when it doesn't.
- For NP-hard problems, heuristics are the only practical tool. Polynomial time with a known approximation ratio beats waiting forever for exact.
- In interviews: name your heuristic, justify its admissibility, and know the cost of violating it.