Recommendation System: Detailed

flowchart TB
  subgraph Sig[Signals]
    CLK[Clicks / watches / buys]
    DWELL[Dwell time / completion]
    EXPLICIT[Ratings / likes]
    DEMO([Demographics / device])
  end

  subgraph Ingest
    KAFKA[[Kafka events]]
    FS_RT[Realtime feature store]
    LAKE[Data lake]
    EMB_TRAIN([Embedding training])
  end

  subgraph Models[Two-stage architecture]
    CG([Candidate Gen<br/>two-tower / collaborative / heuristic])
    RANK([Ranker<br/>GBDT / DNN multi-task])
    RR([Reranker<br/>diversity, freshness, business])
    FS_BATCH[Batch features]
    EMB([Embeddings store + ANN<br/>FAISS / ScaNN])
  end

  subgraph Serve
    GW[Recs API]
    CACHE([Per-user candidate cache])
    AB[A/B experiments]
    POLICY[Policy / safety filter]
  end

  subgraph Offline[Offline]
    TRAIN[Training pipelines]
    BACKTEST[Backtests]
    METRICS[Offline metrics<br/>recall@K, NDCG]
  end

  Sig --> KAFKA --> FS_RT
  KAFKA --> LAKE --> EMB_TRAIN --> EMB
  GW --> CG --> RANK --> RR --> GW
  FS_RT --> CG
  FS_BATCH --> RANK
  EMB --> CG
  AB --- GW
  POLICY --- GW
  Offline --- Models

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class DEMO,CACHE client;
    class CLK,DWELL,EXPLICIT,LAKE,FS_BATCH,GW,AB,POLICY,TRAIN,BACKTEST service;
    class FS_RT datastore;
    class KAFKA queue;
    class EMB_TRAIN,CG,RANK,RR,EMB compute;
    class METRICS obs;

Two-stage architecture

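  • Candidate Gen: cheap, high-recall retrieval that narrows millions of items down to a few hundred candidates.
  • Ranker: expensive model (GBDT or multi-task DNN) that scores those few hundred candidates within a budget of about 10 ms.
  • Optional Reranker: reorders the ranked list for diversity, freshness, and business rules.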
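
The stages above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the design's actual implementation: candidate generation is shown as a brute-force dot product over item embeddings (standing in for the ANN lookup in the diagram), the ranker is any caller-supplied scoring function, and the reranker is a simple per-category cap. All function names, parameters, and the category cap rule are hypothetical.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec: Vector, item_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1: cheap, high-recall retrieval.

    Brute-force top-k by embedding similarity; a production system would
    query an ANN index instead (assumption: dot-product similarity).
    """
    return heapq.nlargest(k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id]))


def rank(candidates: List[str], score_fn: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Stage 2: score only the small candidate set with the expensive model."""
    scored = [(item_id, score_fn(item_id)) for item_id in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def rerank(ranked: List[Tuple[str, float]], category_of: Dict[str, str],
           max_per_category: int, n: int) -> List[str]:
    """Stage 3 (optional): greedy diversity rule capping items per category."""
    counts: Dict[str, int] = {}
    result: List[str] = []
    for item_id, _score in ranked:
        category = category_of[item_id]
        if counts.get(category, 0) >= max_per_category:
            continue  # skip items that would over-represent a category
        counts[category] = counts.get(category, 0) + 1
        result.append(item_id)
        if len(result) == n:
            break
    return result


def recommend(user_vec: Vector, item_vecs: Dict[str, Vector],
              score_fn: Callable[[str], float], category_of: Dict[str, str],
              k: int = 200, n: int = 10, max_per_category: int = 2) -> List[str]:
    """Run candidate generation, ranking, and reranking in sequence."""
    candidates = generate_candidates(user_vec, item_vecs, k)
    ranked = rank(candidates, score_fn)
    return rerank(ranked, category_of, max_per_category, n)
```

The point of the split is cost: the expensive `score_fn` runs on `k` items rather than on the full catalog, so the ranker's latency budget depends on `k`, not on catalog size.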

Cold start

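  • New user: no interaction history yet, so serve items that are popular in the user's region or demographic segment.
  • New item: no interaction signal yet, so derive its embedding from content and route it through an exploration bucket to collect initial feedback.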
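
Both fallbacks can be illustrated with a short sketch. It assumes a precomputed popularity list per region, a content embedding built by averaging hypothetical per-tag vectors, and an exploration bucket implemented as a few reserved slots at the tail of the list. None of these choices come from the design itself; they are one plausible reading of the bullets above.

```python
import random
from typing import Dict, List, Sequence


def new_user_recs(region: str, popular_by_region: Dict[str, List[str]],
                  global_popular: List[str], n: int) -> List[str]:
    """New user: regional popularity first, then global popularity as backfill."""
    merged = popular_by_region.get(region, []) + global_popular
    seen = set()
    result: List[str] = []
    for item_id in merged:
        if item_id not in seen:  # de-duplicate while preserving order
            seen.add(item_id)
            result.append(item_id)
    return result[:n]


def content_embedding(tags: Sequence[str], tag_vecs: Dict[str, Sequence[float]],
                      dim: int) -> List[float]:
    """New item: average the vectors of its known tags (assumed content features)."""
    known = [tag_vecs[t] for t in tags if t in tag_vecs]
    if not known:
        return [0.0] * dim  # no usable content signal
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def inject_exploration(ranked: List[str], new_items: List[str], slots: int,
                       rng: random.Random) -> List[str]:
    """Exploration bucket: replace up to `slots` tail positions with random new items."""
    fresh = [item_id for item_id in new_items if item_id not in ranked]
    picks = rng.sample(fresh, min(slots, len(fresh)))
    keep = ranked[: max(0, len(ranked) - len(picks))]
    return keep + picks
```

Placing exploration items at the tail keeps the top of the list stable while still giving new items impressions; a real system might instead randomize positions or size the bucket per experiment.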

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |