Search Autocomplete — Notes
Functional
- For a typed prefix, return top-k suggestions ordered by score.
- Multi-language, diacritic-insensitive, fuzzy tolerant.
- Personalize by user/locale.
- Reflect trending queries.
Non-functional
- p99 latency < 100 ms incl. network.
- 10× read amplification of search QPS (1 query = 5–10 suggest calls).
- Update freshness: 1 hr for popularity, seconds for trending.
Capacity
- 100M DAU × 50 searches/day × 8 suggest calls = 40B/day = 460k/s avg, 2M/s peak.
- Trie footprint: 100M queries × 30 B → 3 GB raw, ~1 GB compressed.
- Per-prefix top-k (5): a few hundred MB; fits per shard in RAM.
API
GET /suggest?q=ne&user=...&locale=en-US -> [{text, score}, ...]
Data model
- Offline:
(query, frequency) from logs, decayed by time.
- Per-prefix top-k materialized.
- Optional: per-user query history.
Trade-offs
- Trie is fast but memory-heavy; works great when materialized top-k.
- Elasticsearch edge_ngram simpler, slower.
- Real-time updates vs offline batch: hybrid is common.
- Personalization can leak privacy (typing) — anonymize and minimize.
Refs
- Twitter typeahead, Etsy autocomplete, Pinterest autocomplete blog,
Lucene completion suggester, ByteByteGo "Design typeahead".