Pinterest: Detailed
flowchart TB
subgraph Clients
Web
Mobile
end
subgraph Edge
CDN
GW
end
subgraph Create[Pin / Board]
PIN[Pin Service]
BRD[Board Service]
UP[Image Upload]
OBJ[(S3)]
THUMB[Thumbnailer<br/>multiple sizes]
OCR([OCR + text extraction])
META[(Pin metadata)]
end
subgraph Visual[Visual Search Pipeline]
DETECT[Object detection]
EMB([Embedding model<br/>image -> vector])
OBJEMB([Per-object embeddings])
ANN[(ANN index<br/>HNSW / ScaNN)]
LENS[Pinterest Lens]
end
subgraph Text[Text indexing]
INV[(Inverted index<br/>title, desc, hashtags)]
EMBT([Text embeddings])
end
subgraph Discover
HOME[Home Feed]
SEARCH[Search]
REL[Related Pins]
SHOP[Shopping]
end
subgraph Reco[Recommendation]
CG([Candidate Generation<br/>PinSage / two-tower])
RANK([Ranker])
RR([Reranker - diversity])
FRESH[Freshness mixer]
end
subgraph Graph[Graph Signals]
PG[Pin-Board graph]
UB[User-Board graph]
EDGE[Engagement edges]
PINSAGE[PinSage GNN model]
end
subgraph Stores
PDB[(Pins SQL/KV sharded)]
BDB[(Boards)]
USRDB[(Users)]
EMBSTORE[(Embedding store)]
LIKES[(Saves/Reactions)]
end
subgraph Ads
ADSVR[Ads server]
BID[Bidding / RTB]
end
Clients --> CDN --> GW
GW --> PIN --> META
PIN --> UP --> OBJ --> CDN
UP --> THUMB
UP --> OCR
PIN --> Visual
PIN --> Text
GW --> HOME --> Reco
GW --> SEARCH --> Text
Visual --> ANN
Text --> INV
Reco --> CG --> ANN
Reco --> RANK
Reco --> RR
Graph --> PINSAGE --> EMBSTORE
EMBSTORE --> CG
ADSVR --- RANK
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class Web,Mobile client;
class CDN,GW edge;
class PIN,BRD,UP,THUMB,DETECT,LENS,HOME,SEARCH,REL,SHOP,FRESH,PG,UB,EDGE,PINSAGE,ADSVR,BID service;
class META,ANN,INV,PDB,BDB,USRDB,EMBSTORE,LIKES datastore;
class OCR,EMB,OBJEMB,EMBT,CG,RANK,RR compute;
class OBJ storage;
Visual search (Lens)
- Object detection → per-object embedding → ANN lookup → return matching pins.
- Combines visual + text + co-occurrence features.
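The detection → embedding → lookup flow above can be sketched in a few lines. This is a toy illustration, not Pinterest's implementation: the index is a plain dictionary searched exhaustively with cosine similarity (a stand-in for an ANN index such as HNSW or ScaNN), and the names `nearest_pins`, `lens_lookup`, and `PIN_INDEX` as well as the 3-d vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_pins(query: Vector, index: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Exact top-k search. A real system would query an ANN index instead."""
    scored = [(pin_id, cosine(query, vec)) for pin_id, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def lens_lookup(object_embeddings: List[Vector], index: Dict[str, Vector], k: int = 2) -> List[Tuple[str, float]]:
    """Look up neighbours for every detected object and keep each pin's best score."""
    best: Dict[str, float] = {}
    for embedding in object_embeddings:
        for pin_id, score in nearest_pins(embedding, index, k):
            best[pin_id] = max(score, best.get(pin_id, -1.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pin embeddings (3-d only to keep the example readable).
PIN_INDEX: Dict[str, Vector] = {
    "pin_lamp": [1.0, 0.0, 0.0],
    "pin_sofa": [0.0, 1.0, 0.0],
    "pin_rug": [0.0, 0.0, 1.0],
    "pin_floor_lamp": [0.9, 0.1, 0.0],
}

# Pretend the detector found a lamp-like and a rug-like object in the photo.
detected_objects = [[1.0, 0.05, 0.0], [0.0, 0.1, 1.0]]
results = lens_lookup(detected_objects, PIN_INDEX, k=2)
```

Here each detected object contributes its own neighbours, so a photo of a living room can return both lamp and rug pins. Text and co-occurrence features would enter in a later scoring step that this sketch leaves out.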
PinSage (graph neural network)
- Random walks on the pin-board bipartite graph select each pin's most important neighbors and help generate training examples (such as hard negatives).
- GraphSAGE-style aggregation produces 1024-d embeddings.
- Used for candidate generation across Home Feed, Related Pins, and Search.
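A toy sketch of how random walks and neighbor aggregation can fit together: short walks alternate pin → board → pin, visit counts rank the neighbors, and the neighbor features are averaged with those counts as weights. This assumes a tiny hand-made graph and replaces the learned GraphSAGE-style layers with a plain weighted mean; `top_neighbors`, `aggregate`, and all ids are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def top_neighbors(
    pin: str,
    pin_to_boards: Dict[str, List[str]],
    board_to_pins: Dict[str, List[str]],
    num_walks: int,
    walk_len: int,
    top_t: int,
    rng: random.Random,
) -> List[Tuple[str, int]]:
    """Rank other pins by how often short random walks from `pin` visit them."""
    visits: Counter = Counter()
    for _ in range(num_walks):
        current = pin
        for _ in range(walk_len):
            board = rng.choice(pin_to_boards[current])  # pin -> board hop
            current = rng.choice(board_to_pins[board])  # board -> pin hop
            if current != pin:
                visits[current] += 1
    return visits.most_common(top_t)


def aggregate(neighbors: List[Tuple[str, int]], features: Dict[str, List[float]]) -> List[float]:
    """Visit-count-weighted mean of neighbor features (no learned weights here)."""
    dim = len(next(iter(features.values())))
    total = sum(count for _, count in neighbors)
    pooled = [0.0] * dim
    if total == 0:
        return pooled
    for pin_id, count in neighbors:
        for i, value in enumerate(features[pin_id]):
            pooled[i] += (count / total) * value
    return pooled


# Hypothetical bipartite graph: p4 sits on its own board, unreachable from p1.
pin_to_boards = {"p1": ["b1"], "p2": ["b1", "b2"], "p3": ["b2"], "p4": ["b3"]}
board_to_pins = {"b1": ["p1", "p2"], "b2": ["p2", "p3"], "b3": ["p4"]}
# One-hot features make the pooled vector easy to read as neighbor weights.
features = {
    "p1": [1.0, 0.0, 0.0, 0.0],
    "p2": [0.0, 1.0, 0.0, 0.0],
    "p3": [0.0, 0.0, 1.0, 0.0],
    "p4": [0.0, 0.0, 0.0, 1.0],
}

rng = random.Random(0)
neighbors = top_neighbors("p1", pin_to_boards, board_to_pins, 200, 3, 2, rng)
pooled = aggregate(neighbors, features)
```

In a full model the pooled vector would be concatenated with the pin's own features and passed through trained layers; only the sampling and weighting idea is shown here.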
Boards & feeds
- Board = user's curated collection of pins.
- Home feed candidate generation: followed boards + recommendations + collaborative filtering.
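One simple way to merge several candidate sources is a round-robin interleave with de-duplication, sketched below. The source does not say how the sources are actually blended, so the policy, the function name `blend_candidates`, and the sample pin ids are all assumptions for illustration.

```python
from itertools import zip_longest
from typing import Dict, List


def blend_candidates(sources: Dict[str, List[str]], limit: int) -> List[str]:
    """Interleave ranked candidate lists, skipping pins already taken."""
    if limit <= 0:
        return []
    seen = set()
    blended: List[str] = []
    # zip_longest pads shorter lists with None so longer sources keep contributing.
    for round_of_pins in zip_longest(*sources.values()):
        for pin_id in round_of_pins:
            if pin_id is None or pin_id in seen:
                continue
            seen.add(pin_id)
            blended.append(pin_id)
            if len(blended) == limit:
                return blended
    return blended


# Hypothetical per-source candidates, each already ordered best-first.
sources = {
    "followed_boards": ["a", "b", "c"],
    "recommendations": ["b", "d"],
    "collaborative_filtering": ["e"],
}
top_four = blend_candidates(sources, limit=4)
```

Round-robin gives every source a share of the top slots regardless of how its scores are calibrated; the ranker downstream then decides the final order.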
Shopping
- Pins linked to product catalog → visual product search.
- Buyable-pin metadata: SKU, price, availability.
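The buyable-pin metadata could be modeled roughly as follows. Only SKU, price, and availability come from the notes above; the field names, the integer minor-unit price, the currency field, and the `shoppable` helper are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Availability(Enum):
    IN_STOCK = "in_stock"
    OUT_OF_STOCK = "out_of_stock"


@dataclass(frozen=True)
class BuyablePin:
    pin_id: str
    sku: str
    price_cents: int  # assumed: price stored in minor units to avoid float rounding
    currency: str     # assumed: ISO 4217 code such as "USD"
    availability: Availability


def shoppable(pins: List[BuyablePin]) -> List[BuyablePin]:
    """Keep only pins whose linked product can currently be bought."""
    return [pin for pin in pins if pin.availability is Availability.IN_STOCK]


# Hypothetical catalog entries.
catalog = [
    BuyablePin("pin_1", "SKU-LAMP-01", 4999, "USD", Availability.IN_STOCK),
    BuyablePin("pin_2", "SKU-RUG-07", 12900, "USD", Availability.OUT_OF_STOCK),
]
in_stock = shoppable(catalog)
```

Filtering on availability before visual product search results are shown avoids surfacing products that cannot be purchased.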
Trade-offs
- GNN embeddings are powerful but expensive to retrain (weekly retraining is typical).
- Two-tower retrieval is simpler and supports faster updates.
- A diversity reranker is effectively mandatory; without it the feed becomes monotonous.
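To make the diversity point concrete, here is a minimal sketch using maximal marginal relevance (MMR), one common reranking technique. The notes do not say which method the reranker uses, so MMR, the category-equality similarity, the `lam` weight, and the sample scores are all assumptions.

```python
from typing import Dict, List


def mmr_rerank(scores: Dict[str, float], category: Dict[str, str], k: int, lam: float = 0.7) -> List[str]:
    """Greedy MMR: trade relevance against similarity to pins already picked."""
    selected: List[str] = []
    remaining = set(scores)
    while remaining and len(selected) < k:

        def mmr(pin_id: str) -> float:
            # Toy similarity: 1.0 if an already selected pin shares the category, else 0.0.
            redundancy = max(
                (1.0 if category[pin_id] == category[chosen] else 0.0 for chosen in selected),
                default=0.0,
            )
            return lam * scores[pin_id] - (1.0 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical ranker output: three near-identical sofa pins dominate by score.
scores = {"sofa_1": 0.95, "sofa_2": 0.93, "sofa_3": 0.90, "lamp_1": 0.80, "rug_1": 0.70}
category = {"sofa_1": "sofa", "sofa_2": "sofa", "sofa_3": "sofa", "lamp_1": "lamp", "rug_1": "rug"}

feed = mmr_rerank(scores, category, k=4)
by_score_only = sorted(scores, key=scores.get, reverse=True)[:4]
```

Sorting by score alone would fill three of the four slots with sofas, while the MMR pass pulls the lamp and rug pins ahead of the second sofa. A production reranker would use embedding similarity rather than category equality.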
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | CDN | edge caching for static assets | cdn |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |