Skip to content

Search Autocomplete — Notes#

Functional#

  • For a typed prefix, return top-k suggestions ordered by score.
  • Multi-language, diacritic-insensitive, fuzzy tolerant.
  • Personalize by user/locale.
  • Reflect trending queries.

Non-functional#

  • p99 latency < 100 ms incl. network.
  • 10× read amplification of search QPS (1 query = 5–10 suggest calls).
  • Update freshness: 1 hr for popularity, seconds for trending.

Capacity#

  • 100M DAU × 50 searches/day × 8 suggest calls = 40B/day = 460k/s avg, 2M/s peak.
  • Trie footprint: 100M queries × 30 B → 3 GB raw, ~1 GB compressed.
  • Per-prefix top-k (5): a few hundred MB; fits per shard in RAM.

API#

GET /suggest?q=ne&user=...&locale=en-US -> [{text, score}, ...]

Data model#

  • Offline: (query, frequency) from logs, decayed by time.
  • Per-prefix top-k materialized.
  • Optional: per-user query history.

Trade-offs#

  • Trie is fast but memory-heavy; works great when materialized top-k.
  • Elasticsearch edge_ngram simpler, slower.
  • Real-time updates vs offline batch: hybrid is common.
  • Personalization can leak privacy (typing) — anonymize and minimize.

Refs#

  • Twitter typeahead, Etsy autocomplete, Pinterest autocomplete blog, Lucene completion suggester, ByteByteGo "Design typeahead".