Geohash and Quadtree (Proximity Search)
What it is
- Geohash encodes latitude/longitude into a short string where prefix similarity approximates spatial proximity (with caveats near boundaries). Good for Redis GEO, simple prefix queries in sorted sets or key-value stores.
- Quadtree (and R-tree family) partition 2D space recursively into quadrants (or bounding boxes) containing points or polygon references. Good for range and nearest queries with spatial indexes.
Uber-style systems combine map tiles, supply positioning, matching riders to drivers using distance/time estimates—not only raw Geo structures.
When to use
| Approach | Fit |
|---|---|
| Geohash + sorted structure | “Nearby in rough cell,” grid dispatch, simple scaling with strings |
| Quadtree / R-tree | Dense maps, complex polygons, accurate spatial joins |
| Pure brute distance | Small candidate sets only |
Loading diagram…
Proximity and pitfalls
- Geohash edge case: two close points can differ last bits across cell boundaries—queries use neighbor cells around the central prefix.
- Dynamic drivers: index updates as vehicles move; balance update rate vs index freshness (see latency-throughput).
Alternatives
- PostGIS / BigQuery geography: declarative SQL spatial queries.
- S2 / H3 (hexagonal grid): often preferred for uniform coverage and neighbor logic at planet scale.
Failure modes
- Hot cells in dense cities—same theme as sharding hot keys; shard dispatch queues by region.
- Stale locations: driver moved; match uses old coord—short TTL and refresh.
- Great-circle distance vs road distance: ETA uses routing engines, not Euclidean alone.
Interview talking points
- Candidate generation (grid/geohash neighbors) then rank by ETA (graph/routing service).
- Scale with back-of-envelope: drivers online per city, match QPS, index update rate.
- Mention surge and supply/demand as product layers beyond pure geo indexing.
Related fundamentals
Last updated on
Spotted something unclear or wrong on this page?