Skip to content

Pinterest — Detailed#

flowchart TB
  subgraph Clients
    Web
    Mobile
  end

  subgraph Edge
    CDN
    GW
  end

  subgraph Create[Pin / Board]
    PIN[Pin Service]
    BRD[Board Service]
    UP[Image Upload]
    OBJ[(S3)]
    THUMB[Thumbnailer<br/>multiple sizes]
    OCR([OCR + text extraction])
    META[(Pin metadata)]
  end

  subgraph Visual[Visual Search Pipeline]
    DETECT[Object detection]
    EMB([Embedding model<br/>image -> vector])
    OBJEMB([Per-object embeddings])
    ANN[(ANN index<br/>HNSW / ScaNN)]
    LENS[Pinterest Lens]
  end

  subgraph Text[Text indexing]
    INV[(Inverted index<br/>title, desc, hashtags)]
    EMBT([Text embeddings])
  end

  subgraph Discover
    HOME[Home Feed]
    SEARCH[Search]
    REL[Related Pins]
    SHOP[Shopping]
  end

  subgraph Reco[Recommendation]
    CG([Candidate Generation<br/>PinSage / two-tower])
    RANK([Ranker])
    RR([Reranker - diversity])
    FRESH[Freshness mixer]
  end

  subgraph Graph[Graph Signals]
    PG[Pin-Board graph]
    UB([User-Board graph])
    EDGE[Engagement edges]
    PINSAGE[PinSage GNN model]
  end

  subgraph Stores
    PDB[(Pins SQL/KV sharded)]
    BDB[(Boards)]
    USRDB[(Users)]
    EMBSTORE([(Embedding store)])
    LIKES[(Saves/Reactions)]
  end

  subgraph Ads
    ADSVR[Ads server]
    BID[Bidding / RTB]
  end

  Clients --> CDN --> GW
  GW --> PIN --> META
  PIN --> UP --> OBJ --> CDN
  UP --> THUMB
  UP --> OCR
  PIN --> Visual
  PIN --> Text
  GW --> HOME --> Reco
  GW --> SEARCH --> Text
  Visual --> ANN
  Text --> INV
  Reco --> CG --> ANN
  Reco --> RANK
  Reco --> RR
  Graph --> PINSAGE --> EMBSTORE
  EMBSTORE --> CG
  ADSVR --- RANK

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class UB client;
    class PIN,BRD,UP,THUMB,DETECT,LENS,HOME,SEARCH,REL,SHOP,FRESH,PG,EDGE,PINSAGE,ADSVR,BID service;
    class META,ANN,INV,PDB,BDB,USRDB,EMBSTORE,LIKES datastore;
    class OCR,EMB,OBJEMB,EMBT,CG,RANK,RR compute;
    class OBJ storage;

Visual search (Lens)#

  • Object detection → per-object embedding → ANN lookup → return matching pins.
  • Combines visual + text + co-occurrence features.

PinSage (graph neural network)#

  • Random walks on pin-board bipartite graph generate training data.
  • GraphSAGE-style aggregation produces 1024-d embeddings.
  • Used for candidate generation across home, related, search.

Boards & feeds#

  • Board = user's curated collection of pins.
  • Home feed candidate gen: from followed boards + recommendations + collaborative.

Shopping#

  • Pins linked to product catalog → visual product search.
  • Buyable-pin metadata: SKU, price, availability.

Trade-offs#

  • GNN embeddings are powerful but expensive to retrain (weekly typical).
  • Two-tower retrieval simpler, faster updates.
  • Diversity reranker mandatory or feed becomes monotonous.

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

Tag Concept What it is Page
HLD CDN edge caching for static assets cdn
HLD Search internals inverted index, BM25, embeddings, ANN search-internals