
Instagram: Detailed

flowchart TB
  subgraph Clients
    iOS
    AND[Android]
    WEB
  end

  subgraph Edge
    DNS[DNS]
    CDN[Akamai / Facebook CDN]
    LB[L7 LB]
    GW[GraphQL Gateway]
  end

  subgraph Upload[Upload Pipeline]
    PRE[Pre-signed URL service]
    OBJ[(Origin S3 / Haystack)]
    META[Metadata Service]
    TRANS([Transcoder<br/>resize 320/640/1080,<br/>HEIC->JPEG, HEVC->H264])
    THUMB[Thumbnailer + dominant color]
    ML([ML pipeline<br/>NSFW, OCR, object tags, face])
    HASH[Perceptual hash<br/>dedup]
  end

  subgraph Storage
    PMETA[(Posts metadata<br/>MySQL / TAO)]
    USERS[(Users)]
    GRAPH[(Follow graph)]
    LIKES[(Likes/Comments)]
    STORIES[(Stories - TTL 24h)]
    REELS[(Reels metadata)]
    DM[(DMs - encrypted)]
  end

  subgraph Feed[Feed Tier]
    HOME[Home Feed]
    EXP[Explore]
    REEL[Reels Feed]
    HYD([Hydrator])
    RANK([Ranker - ML])
    CG([Candidate Gen<br/>ANN embeddings])
  end

  subgraph FanOut
    FW[[Fanout workers]]
    HOMECACHE[(Home cache per user<br/>Redis ZSET)]
    CELEB[Celeb pull]
  end

  subgraph Realtime
    PR[Presence]
    WS[WebSocket gateway]
    DMS[DM Service]
    PUSH[Push notif]
  end

  subgraph Search
    INV[Inverted index<br/>hashtags / names]
    GEO[Geo index]
  end

  subgraph ML2[ML Platform]
    EMB([Embedding store])
    REC([Recommendation])
    RANK2[Ranking models]
    SAFE[Safety / Spam]
  end

  Clients --> DNS --> CDN
  Clients --> LB --> GW
  GW --> PRE --> OBJ
  GW --> META --> PMETA
  META --> TRANS --> THUMB --> CDN
  TRANS --> ML --> HASH
  ML --> SAFE
  META --> FW
  GRAPH --> FW
  FW --> HOMECACHE
  FW --> CELEB
  GW --> HOME --> HOMECACHE
  HOME --> HYD --> RANK --> Clients
  GW --> EXP --> CG --> RANK
  GW --> REEL --> CG
  GW --> Search --> INV
  Search --> GEO
  CG --> EMB
  RANK --> EMB
  DMS --> WS
  WS --> Clients
  DMS --> DM
  PUSH --> Clients

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class DNS,CDN,LB,GW,WS edge;
    class iOS,AND,WEB client;
    class PRE,META,THUMB,HASH,HOME,EXP,REEL,CELEB,PR,DMS,PUSH,GEO,RANK2,SAFE service;
    class PMETA,USERS,GRAPH,LIKES,STORIES,REELS,DM,INV datastore;
    class HOMECACHE cache;
    class FW queue;
    class TRANS,ML,HYD,RANK,CG,EMB,REC compute;
    class OBJ storage;

Upload pipeline

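  1. The client requests a pre-signed upload URL.
  2. The client uploads the media with a direct PUT to the origin object store (S3 / Haystack), bypassing the application servers.
  3. An upload notification triggers the transcoder, which produces multiple image sizes (320/640/1080 in the diagram) and HLS renditions for video.
  4. The ML pipeline extracts object tags and OCR text and runs NSFW classification.
  5. The post metadata is committed, and fan-out to followers' home caches begins.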
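
The fan-out step can be sketched as a hybrid of push and pull, following the diagram's fanout workers, per-user home cache, and celeb pull path. This is a minimal in-memory illustration, not the production design: the dictionaries stand in for the follow graph and the Redis ZSET, and `CELEB_THRESHOLD`, `HOME_CACHE_LIMIT`, and all function names are hypothetical.

```python
from collections import defaultdict

CELEB_THRESHOLD = 3     # hypothetical cutoff; a real system would use a far larger value
HOME_CACHE_LIMIT = 500  # hypothetical cap on cached entries per user

followers = defaultdict(set)      # author -> follower ids (stand-in for the follow graph)
following = defaultdict(set)      # user -> authors the user follows
home_cache = defaultdict(dict)    # user -> {post_id: timestamp} (stand-in for a Redis ZSET)
celeb_posts = defaultdict(list)   # author -> [(timestamp, post_id)], read at feed time


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celeb(author):
    return len(followers[author]) >= CELEB_THRESHOLD


def publish(author, post_id, ts):
    """Push to each follower's cache, unless the author is a celebrity."""
    if is_celeb(author):
        # Fan-out on read: store once, let followers pull it later.
        celeb_posts[author].append((ts, post_id))
        return
    for user in followers[author]:
        cache = home_cache[user]
        cache[post_id] = ts  # in Redis this would be a ZADD with ts as the score
        if len(cache) > HOME_CACHE_LIMIT:
            del cache[min(cache, key=cache.get)]  # drop the oldest entry


def home_feed(user, limit=10):
    """Merge the pushed cache with posts pulled from followed celebrities."""
    merged = [(ts, pid) for pid, ts in home_cache[user].items()]
    for author in following[user]:
        if is_celeb(author):
            merged.extend(celeb_posts[author])
    merged.sort(reverse=True)  # newest first
    return [pid for _, pid in merged[:limit]]
```

The split keeps writes cheap for accounts with very large audiences, at the cost of an extra merge on every feed read. The hydrator and ranker shown in the diagram would run on the returned post IDs.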

Stories

  • 24-hour TTL; served from a separate hot store (Redis), with cold copies in S3.
  • View receipts are recorded per (viewer, story) tuple.
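
The two points above can be illustrated with a small in-memory sketch. It assumes an injectable clock so expiry can be exercised without waiting; `StoryStore` and its methods are hypothetical names, and in Redis the expiry would more likely be delegated to key TTLs than checked by hand.

```python
import time

STORY_TTL_SECONDS = 24 * 60 * 60  # the 24h TTL from the design


class StoryStore:
    """In-memory stand-in for the hot store; names here are illustrative."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._stories = {}  # story_id -> (author, expires_at)
        self._views = {}    # (viewer, story_id) -> timestamp of first view

    def post(self, story_id, author):
        self._stories[story_id] = (author, self._clock() + STORY_TTL_SECONDS)

    def is_live(self, story_id):
        entry = self._stories.get(story_id)
        return entry is not None and self._clock() < entry[1]

    def record_view(self, viewer, story_id):
        """Store one receipt per (viewer, story) tuple; repeat views are no-ops."""
        if not self.is_live(story_id):
            return False
        self._views.setdefault((viewer, story_id), self._clock())
        return True

    def viewers(self, story_id):
        return sorted(v for (v, s) in self._views if s == story_id)
```

Keying receipts on the (viewer, story) tuple makes recording a view idempotent, so client retries do not inflate view counts.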

Reels

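  • Reuses the video ingest pipeline; the feed is driven by recommendations rather than by the follow graph.
  • Candidate generation uses embeddings with approximate nearest-neighbor (ANN) search (e.g., FAISS or ScaNN).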
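
To make the candidate-generation step concrete, here is an exact brute-force nearest-neighbor search over embeddings. It is a baseline only: an ANN library such as FAISS or ScaNN would replace the linear scan with an approximate index at scale. The function names, the `seen` filter, and the toy vectors in the usage note are assumptions for illustration.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_candidates(user_vec, reel_vecs, k=2, seen=frozenset()):
    """Return the k reel ids most similar to the user embedding.

    Exact linear scan: O(number of reels) per query. An ANN index trades
    a little recall for much lower latency on large catalogs.
    """
    scored = (
        (cosine(user_vec, vec), reel_id)
        for reel_id, vec in reel_vecs.items()
        if reel_id not in seen  # skip reels the user has already watched
    )
    return [reel_id for _, reel_id in heapq.nlargest(k, scored)]
```

For example, with `reels = {"cats": [1.0, 0.0], "dogs": [0.9, 0.1], "cars": [0.0, 1.0]}`, calling `top_k_candidates([1.0, 0.0], reels, k=2)` returns `["cats", "dogs"]`. The ranker would then reorder these candidates.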

Direct Messaging

  • WebSocket delivery with a per-thread message queue.
  • End-to-end encryption is optional (Messenger-style).
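
One way to picture the per-thread queue is as an ordered log with per-user cursors, so a client that reconnects over WebSocket can resume from its last acknowledged message. The sketch below assumes that model; `ThreadQueue`, `DMService`, and the ack protocol are hypothetical, and the message body is treated as opaque so it could equally be ciphertext when end-to-end encryption is enabled.

```python
from collections import defaultdict


class ThreadQueue:
    """Ordered message log for a single thread."""

    def __init__(self):
        self._messages = []  # list of (seq, sender, body), seq starts at 1

    def append(self, sender, body):
        seq = len(self._messages) + 1
        self._messages.append((seq, sender, body))
        return seq

    def after(self, last_seen_seq):
        # seq n lives at index n - 1, so slicing from last_seen_seq skips seen messages
        return self._messages[last_seen_seq:]


class DMService:
    def __init__(self):
        self._threads = defaultdict(ThreadQueue)
        self._cursors = defaultdict(int)  # (user, thread_id) -> last acked seq

    def send(self, thread_id, sender, body):
        return self._threads[thread_id].append(sender, body)

    def pull(self, user, thread_id):
        """Messages a gateway would push when the user's socket (re)connects."""
        return self._threads[thread_id].after(self._cursors[(user, thread_id)])

    def ack(self, user, thread_id, seq):
        key = (user, thread_id)
        self._cursors[key] = max(self._cursors[key], seq)  # cursors never move backward
```

Sequencing per thread rather than globally keeps ordering guarantees local to a conversation, which is the property readers of a thread actually observe.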

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |