Geo Indexing — Detailed#
The five contenders#
flowchart TB
subgraph QT[Quadtree]
QT1[Recursive 4-way split]
QT2[Adaptive density]
QT3[Unbalanced if hot spots]
end
subgraph GH[Geohash]
GH1[Z-order curve]
GH2[String prefix = region]
GH3[Edge cases at poles + 180 deg]
end
subgraph S2[S2 cells]
S2A[Sphere-aware]
S2B[Hilbert curve]
S2C[Levels 0-30]
end
subgraph H3[H3 hexagons]
H3A[Hexes - 6 equidistant neighbours]
H3B[Resolutions 0-15]
H3C[No pole edge cases]
end
subgraph RT[R-tree]
RT1[Bounding boxes]
RT2[Best for polygons]
RT3[Hard to shard]
end
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class QT1,QT2,QT3,GH1,GH2,GH3,S2A,S2B,S2C,H3A,H3B,H3C,RT1,RT2,RT3 service;
Cheat sheet#
| Shape | Sphere-aware | Neighbour math | Shard-friendly | Best for | |
|---|---|---|---|---|---|
| Quadtree | square | no | trivial | yes | dynamic density (cities) |
| Geohash | square (approx) | no | string prefix | very (string range scan) | KV stores by prefix |
| S2 | spherical square | yes | 4-neighbours | yes (cell-id range) | global apps, Google |
| H3 | hexagon | yes | 6 equidistant | yes | uber-style indexes |
| R-tree | bounding rect | no | tree traversal | hard | polygon containment, Postgres GiST |
"Find nearby" with cells#
flowchart LR
Q[Query: point + radius]
C[Cover with cells]
S[Scan cells]
F[Distance filter]
R[Top-k]
Q --> C --> S --> F --> R
classDef p fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef s fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
class Q,R p;
class C,S,F s;
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class Q,C,S,F,R service;
- Convert (lat, lng) to cell id at the right level (cell width ≈ radius).
- Compute cells covering the search ring.
- Range-scan posting list per cell.
- Haversine-filter candidates to exact radius.
- Sort by distance.
Geohash example#
9q8yyk (San Francisco). First chars = bigger region:
- 9 — North-west quarter of the western hemisphere.
- 9q — California-ish.
- 9q8yyk — block-level.
KV store sharding by geohash prefix → all data for a city lands on adjacent shards.
S2 vs H3#
- S2: Google. Hilbert curve preserves locality on integer cell ids → range scans cheap.
- H3: Uber. Hexagons are great for "fair" neighbour distances + visualisation.
Many systems pick H3 for indexing + querying, S2 for static partitioning.
R-tree#
- Bounding-box hierarchy.
- Best for "does polygon contain point?" / "intersect rectangles?".
- Postgres / PostGIS GiST index is an R-tree variant.
- Hard to shard horizontally because boxes can cross node boundaries.
Polygon queries#
- For "is point in polygon?" — bound polygon by an R-tree or cover-set of S2/H3 cells.
- For "what polygons contain this point?" — same; cell-coverage filters candidates, exact point-in-polygon test confirms.
Where to start#
| Scenario | Pick |
|---|---|
| New service, global, simple | S2 or H3 |
| Polygon-heavy (geofences, regions) | R-tree + cell pre-filter |
| String-keyed KV (DynamoDB, Redis) | Geohash |
| Static dataset, in-memory | Quadtree |
Glossary & fundamentals#
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
HLD |
Sharding | horizontal partitioning across nodes | database-sharding |
HLD |
Geo indexing | Geohash, Quadtree, S2, H3, R-tree | geo-indexing |