Geo Proximity Service System Design Interview: Geohash Is the Answer

- Lat/lng range scans fail at scale: MySQL can't efficiently combine two B-tree indexes for 2D proximity queries.
- Geohash encodes any coordinate pair as a string where shared prefix means geographic proximity, enabling fast indexed equality lookups.
- 9-cell query (center cell plus 8 neighbors) prevents boundary misses when a user sits at a geohash cell edge.
- Haversine filtering after the geohash pre-filter removes rectangle corners and returns the exact circular result set.
- Cache
geohash → [business_ids]in Redis: geographic query clustering means cache hit rates are very high. - Separate the Location-Based Service (reads) from the Business Service (writes) to keep LBS stateless and horizontally scalable.
- Quadtree adapts cell granularity to population density, making it better than geohash for uneven distributions like real-time driver locations.
You know the requirements. You have the API. Then the interviewer asks: "How do you actually find businesses within 5 km of the user?" Your first instinct is WHERE latitude BETWEEN X AND Y AND longitude BETWEEN A AND B.
That instinct is wrong. And it's wrong in exactly the way that reveals whether you've thought seriously about spatial data.
This covers the full Yelp-style geo proximity service system design interview: requirements, architecture, geohash spatial indexing, and the scaling discussion.
Set the Scope First
Five minutes on requirements pays off later.
Functional:
- Users search for businesses by location, radius, and category
- Business owners create, update, and close listings
- Users can read business details: name, address, rating, photos
Non-functional:
- Search returns results in under 100 ms at the 99th percentile
- Read-heavy: searches are roughly 100x more frequent than writes
- High availability; a few seconds of stale data in search results is acceptable
Scale estimates: Assume 100M DAU, each running 5 searches per day. That's 500M searches per day, roughly 6,000 QPS at baseline. With a 3x peak multiplier, plan for 18,000 QPS. Write QPS is much lower: 5M businesses updated infrequently.
Two Services, One Key Decision
Write and read traffic have completely different patterns. A business listing update competing for resources with 18,000 search queries per second is not a design you want. So you split.
Business Service handles write traffic. Business owners call this to create and update listings. It writes to a business database and triggers an async job to update the geospatial index. Latency here can be 200-300 ms and nobody cares.
Location-Based Service (LBS) handles read traffic. Every search goes through the LBS. It must be stateless, horizontally scalable, and backed by fast caches. Target p99 under 100 ms. This is the service your interviewer actually wants to talk about.
The two services share nothing except a database. That is the point.
Why Range Scans on Lat/Lng Fail
The naive approach:
SELECT * FROM businesses WHERE lat BETWEEN 37.72 AND 37.83 AND lng BETWEEN -122.50 AND -122.35;
MySQL cannot efficiently combine two separate B-tree indexes for simultaneous range conditions on different columns. In practice, the query planner picks one column's index, reduces 200M rows to maybe 10M, then does a sequential scan on the second column. That's still millions of comparisons per query.
You will get this wrong once in production, spend three hours looking at slow query logs, and then you will never forget it.
The root problem is that latitude and longitude are two independent 1D values, but location is a 2D concept. You need an index that reasons about 2D proximity. The interview-standard answer is geohash.
Geohash: Folding 2D Into 1D
Geohash encodes any (lat, lng) pair as a short alphanumeric string such that strings sharing a prefix are geographically close.
The encoding works by recursive bisection:
- Take the longitude range [-180, +180]. If the point is in the upper half, record bit 1; lower half, bit 0. Halve the range and repeat.
- Alternate: the next bit comes from latitude, the one after from longitude, and so on.
- Every 5 bits of the interleaved sequence become one character in a base-32 alphabet (
0-9andbcdefghjkmnpqrstuvwxyz, witha,i,l,oomitted to avoid ambiguity).
Keep halving the map until you get a cell the size you need. Shared prefix = geographically close.
The string length controls precision:
| Length | Cell size |
|---|---|
| 4 | ~39 km x 20 km |
| 5 | ~4.9 km x 4.9 km |
| 6 | ~1.2 km x 0.6 km |
| 7 | ~153 m x 153 m |
The rule: choose the precision level where the cell size is smaller than your search radius. For a "within 5 km" query, precision 5 (4.9 km cells) is the right choice. For "within 500 m," use precision 7.
Store a geohash column in the businesses table at precision 6 and index it:
ALTER TABLE businesses ADD COLUMN geohash VARCHAR(8); CREATE INDEX idx_geohash ON businesses (geohash); SELECT * FROM businesses WHERE geohash = '9q8yy9';
That's an equality lookup on an indexed string column. Fast at any scale.
The 9-Cell Query: Where Most Designs Break
Here's the part most candidates miss.
A user standing at the edge of cell 9q8yy might be closest to a business in the adjacent cell 9q8yz. Query only the user's own cell and you miss it entirely. Your app tells them "no restaurants nearby" while they're standing 40 meters from a taqueria. This is bad.
Always query the center cell plus all 8 neighbors, forming a 3x3 grid of 9 cells.
User at cell edge. Business 40 meters away in the next cell. Query just the center? Missed. Always use 9 cells.
import geohash # python-geohash library user_hash = geohash.encode(lat, lng, precision=5) neighbors = list(geohash.neighbors(user_hash).values()) # 8 neighbors cells = [user_hash] + neighbors # 9 total # SQL cursor.execute( "SELECT business_id, lat, lng FROM businesses WHERE geohash IN %s", (tuple(cells),) )
But geohash cells are rectangles, not circles. The 9-cell grid includes the corners of the bounding rectangle, which extend past your intended radius. You need one more step.
Haversine: The Exact Filter
After the geohash query returns maybe a few hundred candidates, apply the Haversine formula to compute exact great-circle distance and discard anything beyond the radius:
from math import radians, sin, cos, sqrt, atan2 def haversine_km(lat1, lng1, lat2, lng2): R = 6371 # Earth radius in km dlat = radians(lat2 - lat1) dlng = radians(lng2 - lng1) a = (sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2) return 2 * R * atan2(sqrt(a), sqrt(1 - a)) results = [ b for b in candidates if haversine_km(user_lat, user_lng, b.lat, b.lng) <= radius_km ]
The rectangle gives you speed. The circle gives you accuracy. You need both.
Geohash cuts 200 million rows to a few hundred candidates. Haversine cuts those to the exact right set. Both steps are necessary: geohash alone returns wrong results; Haversine alone against the full table is too slow.
The Full Search Path
A user searches for restaurants within 5 km. End to end:
- LBS receives
GET /search?lat=37.77&lng=-122.42&radius=5000&category=restaurant - Map 5 km radius to precision 5. Compute user's geohash:
9q8yy. - Compute 8 neighbors. Build the 9-cell list.
- Pipeline 9 Redis lookups in parallel:
SMEMBERS geohash:5:9q8yy, etc. Each key holds a set of business_ids. - Cache miss? Query the geospatial index table in the database for the missing cells.
- Fetch business details (name, rating, address) from a second Redis cache keyed by business_id.
- Apply Haversine. Sort by distance or rating. Return top 20.
Two Redis hops. Everything else is the exception. Target p99 is 100 ms; this hits around 10-12 ms on the hot path.
The two Redis hops handle the vast majority of traffic. Database reads are the exception, not the norm.
Geohash vs Quadtree
Geohash uses a fixed grid. A precision-5 cell in Manhattan and a precision-5 cell in rural Kansas cover identical geographic area, even though one contains thousands of businesses and the other contains zero. Hot cells in dense areas require more work per query and are harder to cache uniformly.
A quadtree solves this by splitting cells only when they accumulate more businesses than a threshold (say, 100 per leaf node). Manhattan gets subdivided 10+ levels deep. A rural cell stays near the root. Every leaf ends up with roughly the same number of businesses, so every query does roughly the same amount of work.
The interview tradeoff: geohash is simpler to implement, cache, and explain; quadtree handles uneven density better. In most interviews, geohash is the right answer. Mention quadtree as the density-adaptive alternative that systems like Uber use for real-time driver locations, where rides cluster heavily by time and neighborhood.
Geohash wins interviews. Quadtree wins cities. Know which one you're in.
Caching Is Unusually Effective Here
Location queries have a property most systems don't: they cluster geographically. Thousands of users in downtown San Francisco send queries that resolve to the same 9 geohash cells. Cache the geohash -> [business_id_list] mapping in Redis with a 10-minute TTL and you serve most searches entirely from memory.
This is the part interviewers want to see you get excited about. It's a genuinely elegant property.
Cache layout:
- Key:
geohash:{precision}:{cell}(e.g.,geohash:5:9q8yy) - Value: Redis Set of business_ids
- TTL: 10 minutes for hot cells, 1 hour for cold cells
Business details go into a second Redis cache keyed by business_id. Two Redis lookups cover both hops in the search path for nearly all traffic.
For the database tier, read replicas handle the read load while the primary handles only writes. The LBS routes all reads to replicas. For a deeper look at cache invalidation strategies beyond TTL, the caching guide covers each pattern.
The Write Path: Async, Eventually Consistent
A business owner updates their address. The flow:
- Business Service validates and writes to the business DB primary.
- An async worker recomputes the geohash for the updated coordinates.
- Worker deletes the old
(geohash, business_id)row from the geospatial index and inserts the new one. - Cache entries for the affected cells get invalidated (delete-on-write; next read repopulates).
Eventual consistency is explicitly correct here. A restaurant updating its hours at 2 PM appearing in search results at 2:05 PM is fine. This eliminates distributed transactions from the design entirely. If your interviewer pushes on consistency, the right move is to ask: "Is there a business requirement that requires synchronous update visibility?" There usually isn't.
The 45-Minute Clock
| Phase | Time | Focus |
|---|---|---|
| Requirements + estimates | 5 min | Two services, 18K QPS, read-heavy |
| API + data model | 5 min | GET /search params, business table, geospatial_index table |
| High-level design | 10 min | LBS, Business Service, Redis cluster, DB |
| Spatial index deep dive | 10 min | Geohash, 9-cell query, Haversine, quadtree tradeoff |
| Scaling + caching | 8 min | Cell cache strategy, read replicas, write path |
| Wrap-up | 2 min | Consistency model, extensions (ratings, photos) |
The most common failure mode: spending 15 minutes on geohash bit interleaving and never reaching caching or scaling. The interviewer cares about why range scans fail, how you handle cell boundaries, and how caching works at scale. Get through the encoding mechanics in 3 minutes. Your interviewer does not need you to manually demonstrate base-32 conversion; they need to know you understand why the string prefix property enables fast lookups.
Most candidates design only the search path and get caught when the write questions come. Know the async worker pattern and be able to justify eventual consistency in one sentence.
If you want to practice this under real time pressure, SpaceComplexity runs voice-based system design interviews with rubric feedback on exactly this kind of problem.
Recap
- Two-column lat/lng range scans don't scale. You need a spatial index.
- Geohash encodes 2D location as a short string where shared prefix means geographic proximity.
- Choose precision by search radius: 5 km maps to precision 5, roughly 4.9 km cells.
- Always query 9 cells (center plus 8 neighbors) to avoid boundary misses.
- Apply Haversine after the geohash pre-filter to get the exact result set.
- Cache the
geohash -> [business_ids]mapping in Redis. Location queries cluster; cache hit rate is very high. - Separate Business Service (writes) from LBS (reads). Keep LBS stateless.
- Eventual consistency is correct for business data. No distributed transactions needed.
- Quadtree is the density-adaptive alternative for uneven geographic distributions.