Search Autocomplete — Detailed#
flowchart TB
subgraph Client[Client]
BR([Browser / App])
DBN[Debounce 100ms<br/>cancel on new key]
CACHE[Client cache<br/>per session]
end
subgraph Edge
GW[API Gateway / CDN edge]
RL([Rate limit per user])
end
subgraph Serve[Suggest Service]
PARSE[Normalize: lowercase,<br/>diacritics, language]
SPL[Speller / fuzzy match<br/>edit distance ≤ 2]
LOOK[Top-k Lookup]
PERS([Personalize<br/>user history + locale])
RANK([Ranker])
AB[A/B exp]
end
subgraph Store[Index Store]
TRIE[(Compressed Trie / DAWG<br/>per-prefix top-k materialized)]
REDIS[(Redis Sorted Sets<br/>key = prefix, score = freq)]
ES[(Elasticsearch<br/>edge_ngram fallback)]
end
subgraph Build[Offline Build Pipeline]
QL[[(Query logs - Kafka)]]
BATCH([Spark / Flink job<br/>n-gram counts, daily])
DECAY[Time decay + spike detection]
TRENDS[[Trending topics signal]]
FILT[Safety / profanity filter]
DEPLOY[Blue/green index swap]
end
subgraph Personal[Personalization]
HIST([User history KV])
LOC[Geo / language]
SES[Session intent]
end
BR --> DBN --> CACHE
CACHE -. miss .-> GW --> RL --> Serve
Serve --> PARSE --> SPL --> LOOK
LOOK --> TRIE
LOOK --> REDIS
LOOK -. fallback .-> ES
PERS --> RANK
HIST --> PERS
LOC --> PERS
SES --> PERS
LOOK --> RANK --> AB --> BR
BR -.queries+clicks.-> QL
QL --> BATCH --> DECAY --> FILT --> DEPLOY
TRENDS --> DEPLOY
DEPLOY --> TRIE
DEPLOY --> REDIS
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class BR,RL,PERS,HIST client;
class GW edge;
class DBN,PARSE,SPL,LOOK,AB,DECAY,FILT,DEPLOY,LOC,SES service;
class TRIE,ES,QL datastore;
class CACHE,REDIS cache;
class TRENDS queue;
class RANK,BATCH compute;
Trie design#
- Each node: char + top-k results (id + score) materialized.
- Lookup = walk N chars (N ≈ query length), return top-k at terminal node.
- Update via offline batch (recompute) — cheap reads, daily/hourly refresh.
Latency target#
- < 100 ms end-to-end (UI feels instant).
- Debounce ≥ 80 ms in client; cache on client too.
Spell correction#
- Symspell / BK-tree precomputed; runtime fuzzy match within edit distance ≤ 2.
Trending boost#
- Real-time stream finds spikes (e.g. breaking news) → boost score immediately via Redis overlay.
Index swap#
- Build new trie / Redis snapshot offline; atomic swap by flipping a pointer / namespace.
Glossary & fundamentals#
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
HLD |
CDN | edge caching for static assets | cdn |
HLD |
API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
HLD |
Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
HLD |
Batch & stream processing | Lambda vs Kappa, watermarks, windows | batch-stream-processing |
LLD |
Data structures & complexity | Big-O, common DS, latency numbers | data-structures-complexity |
LLD |
Concurrency primitives | mutex, semaphore, RW lock, atomic, CAS | concurrency-primitives |