Skip to content

Geo Indexing — Detailed#

The five contenders#

flowchart TB
  subgraph QT[Quadtree]
    QT1[Recursive 4-way split]
    QT2[Adaptive density]
    QT3[Unbalanced if hot spots]
  end
  subgraph GH[Geohash]
    GH1[Z-order curve]
    GH2[String prefix = region]
    GH3[Edge cases at poles + 180 deg]
  end
  subgraph S2[S2 cells]
    S2A[Sphere-aware]
    S2B[Hilbert curve]
    S2C[Levels 0-30]
  end
  subgraph H3[H3 hexagons]
    H3A[Hexes - 6 equidistant neighbours]
    H3B[Resolutions 0-15]
    H3C[No pole edge cases]
  end
  subgraph RT[R-tree]
    RT1[Bounding boxes]
    RT2[Best for polygons]
    RT3[Hard to shard]
  end

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class QT1,QT2,QT3,GH1,GH2,GH3,S2A,S2B,S2C,H3A,H3B,H3C,RT1,RT2,RT3 service;

Cheat sheet#

Shape Sphere-aware Neighbour math Shard-friendly Best for
Quadtree square no trivial yes dynamic density (cities)
Geohash square (approx) no string prefix very (string range scan) KV stores by prefix
S2 spherical square yes 4-neighbours yes (cell-id range) global apps, Google
H3 hexagon yes 6 equidistant yes uber-style indexes
R-tree bounding rect no tree traversal hard polygon containment, Postgres GiST

"Find nearby" with cells#

flowchart LR
  Q[Query: point + radius]
  C[Cover with cells]
  S[Scan cells]
  F[Distance filter]
  R[Top-k]
  Q --> C --> S --> F --> R

  classDef p fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
  classDef s fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
  class Q,R p;
  class C,S,F s;

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class Q,C,S,F,R service;
  1. Convert (lat, lng) to cell id at the right level (cell width ≈ radius).
  2. Compute cells covering the search ring.
  3. Range-scan posting list per cell.
  4. Haversine-filter candidates to exact radius.
  5. Sort by distance.

Geohash example#

9q8yyk (San Francisco). First chars = bigger region: - 9 — North-west quarter of the western hemisphere. - 9q — California-ish. - 9q8yyk — block-level.

KV store sharding by geohash prefix → all data for a city lands on adjacent shards.

S2 vs H3#

  • S2: Google. Hilbert curve preserves locality on integer cell ids → range scans cheap.
  • H3: Uber. Hexagons are great for "fair" neighbour distances + visualisation.

Many systems pick H3 for indexing + querying, S2 for static partitioning.

R-tree#

  • Bounding-box hierarchy.
  • Best for "does polygon contain point?" / "intersect rectangles?".
  • Postgres / PostGIS GiST index is an R-tree variant.
  • Hard to shard horizontally because boxes can cross node boundaries.

Polygon queries#

  • For "is point in polygon?" — bound polygon by an R-tree or cover-set of S2/H3 cells.
  • For "what polygons contain this point?" — same; cell-coverage filters candidates, exact point-in-polygon test confirms.

Where to start#

Scenario Pick
New service, global, simple S2 or H3
Polygon-heavy (geofences, regions) R-tree + cell pre-filter
String-keyed KV (DynamoDB, Redis) Geohash
Static dataset, in-memory Quadtree

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

Tag Concept What it is Page
HLD Sharding horizontal partitioning across nodes database-sharding
HLD Geo indexing Geohash, Quadtree, S2, H3, R-tree geo-indexing