Geo Indexing — Notes
Cell levels you'll actually use
| Provider |
Level |
Edge length |
| S2 |
12 |
~1.3 km |
| S2 |
16 |
~155 m |
| H3 |
8 |
~460 m edge |
| H3 |
10 |
~65 m edge |
| Geohash |
6 chars |
~1.2 km |
| Geohash |
8 chars |
~38 m |
Choosing a level
- Pick a level whose cell edge ≈ your typical query radius.
- For variable radius, store the centroid at the finest level and aggregate up at query time.
Sharding by cell
- Use the higher-order bits of the cell id as the shard key.
- This co-locates a neighbourhood on one shard → range scans + writes are local.
- Beware hot cells (Times Square, Tokyo Station). Add a sub-shard by user-id or time bucket.
Polygon containment math
- Ray casting (count edge crossings) — O(N) per polygon.
- Winding number — same.
- For high QPS, pre-tessellate complex polygons into convex pieces.
Production tips
- Pre-compute cell coverings for static polygons (country borders, delivery zones).
- Re-index when the dataset density changes radically.
- Cache the "nearby" results for popular query points (city centers).
Refs
- Google S2 docs (s2geometry.io).
- Uber H3 docs + paper.
- PostGIS book (Regina Obe).
- Geohash.org (the original Niemeyer 2008 invention).