Instagram — Detailed
flowchart TB
subgraph Clients
iOS
AND[Android]
WEB
end
subgraph Edge
DNS[DNS]
CDN[Akamai / Facebook CDN]
LB[L7 LB]
GW[GraphQL Gateway]
end
subgraph Upload[Upload Pipeline]
PRE[Pre-signed URL service]
OBJ[(Origin S3 / Haystack)]
META[Metadata Service]
TRANS([Transcoder<br/>resize 320/640/1080,<br/>HEIC->JPEG, HEVC->H264])
THUMB[Thumbnailer + dominant color]
ML([ML pipeline<br/>NSFW, OCR, object tags, face])
HASH[Perceptual hash<br/>dedup]
end
subgraph Storage
PMETA[(Posts metadata<br/>MySQL / TAO)]
USERS[(Users)]
GRAPH[(Follow graph)]
LIKES[(Likes/Comments)]
STORIES[(Stories - TTL 24h)]
REELS[(Reels metadata)]
DM[(DMs - encrypted)]
end
subgraph Feed[Feed Tier]
HOME[Home Feed]
EXP[Explore]
REEL[Reels Feed]
HYD([Hydrator])
RANK([Ranker - ML])
CG([Candidate Gen<br/>ANN embeddings])
end
subgraph FanOut
FW[[Fanout workers]]
HOMECACHE[(Home cache per user<br/>Redis ZSET)]
CELEB[Celeb pull]
end
subgraph Realtime
PR[Presence]
WS[WebSocket gateway]
DMS[DM Service]
PUSH[Push notif]
end
subgraph Search
INV[Inverted index<br/>hashtags / names]
GEO[Geo index]
end
subgraph ML2[ML Platform]
EMB([Embedding store])
REC([Recommendation])
RANK2[Ranking models]
SAFE[Safety / Spam]
end
Clients --> DNS --> CDN
Clients --> LB --> GW
GW --> PRE --> OBJ
GW --> META --> PMETA
META --> TRANS --> THUMB --> CDN
TRANS --> ML --> HASH
ML --> SAFE
META --> FW
GRAPH --> FW
FW --> HOMECACHE
FW --> CELEB
GW --> HOME --> HOMECACHE
HOME --> HYD --> RANK --> Clients
GW --> EXP --> CG --> RANK
GW --> REEL --> CG
GW --> Search --> INV
Search --> GEO
CG --> EMB
RANK --> EMB
DMS --> WS
WS --> Clients
DMS --> DM
PUSH --> Clients
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class DNS,CDN,LB,GW,WS edge;
class iOS,AND,WEB client;
class PRE,META,THUMB,HASH,HOME,EXP,REEL,CELEB,PR,DMS,PUSH,GEO,RANK2,SAFE service;
class PMETA,USERS,GRAPH,LIKES,STORIES,REELS,DM,INV datastore;
class HOMECACHE cache;
class FW queue;
class TRANS,ML,HYD,RANK,CG,EMB,REC compute;
class OBJ storage;
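The FanOut subgraph in the diagram pairs fan-out workers and a per-user home cache (a Redis ZSET) with a separate "celeb pull" path. A minimal in-memory sketch of that hybrid push/pull idea follows. The `FeedStore` class, the `CELEB_THRESHOLD` value, and the use of plain dicts in place of Redis are all assumptions made for illustration, not details taken from the design.

```python
from collections import defaultdict

# Hypothetical cutoff: authors with at least this many followers are "celebs".
# Kept tiny here so the behaviour is easy to exercise by hand.
CELEB_THRESHOLD = 3


class FeedStore:
    """In-memory stand-in for the fan-out workers, home cache, and celeb pull."""

    def __init__(self):
        self.followers = defaultdict(set)     # author -> follower ids
        self.following = defaultdict(set)     # user -> author ids
        self.home_cache = defaultdict(dict)   # user -> {post_id: ts}, ZSET stand-in
        self.celeb_posts = defaultdict(list)  # author -> [(ts, post_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def publish(self, author, post_id, ts):
        if len(self.followers[author]) >= CELEB_THRESHOLD:
            # Celeb path: store once, let readers pull at read time.
            self.celeb_posts[author].append((ts, post_id))
            return
        # Regular path: push into every follower's home cache
        # (comparable to ZADD home:<follower> <ts> <post_id> in Redis).
        for follower in self.followers[author]:
            self.home_cache[follower][post_id] = ts

    def home_feed(self, user, limit=10):
        # Merge pushed entries with posts pulled from followed celebs.
        entries = [(ts, pid) for pid, ts in self.home_cache[user].items()]
        for author in self.following[user]:
            entries.extend(self.celeb_posts[author])
        entries.sort(reverse=True)  # newest first
        return [pid for _, pid in entries[:limit]]
```

The trade-off this illustrates: pushing on write keeps reads cheap for ordinary accounts, while pulling on read avoids writing one post into millions of home caches for very large accounts.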
Upload pipeline
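- The client requests a pre-signed upload URL.
- The client PUTs the media directly to the origin object store (S3 / Haystack), bypassing the application servers.
- An upload notification triggers the transcoder, which builds multiple image sizes (320/640/1080) and HLS renditions for video.
- The ML pipeline extracts object tags, NSFW signals, and OCR text.
- Once the metadata is committed, fan-out to followers begins.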
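The first two steps depend on a URL that authorizes one specific upload for a limited time. The sketch below shows that idea with a plain HMAC over the method, object key, and expiry. It is not the AWS SigV4 scheme that real S3 pre-signed URLs use. The host `uploads.example.com`, the `SECRET` key, and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key, shared by signer and verifier


def _signature(object_key, expires):
    payload = f"PUT\n{object_key}\n{expires}".encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def presign_put(object_key, expires_in=300, now=None):
    """Return a URL that authorizes a PUT of object_key until it expires."""
    issued = int(now if now is not None else time.time())
    expires = issued + expires_in
    query = urlencode({"expires": expires, "sig": _signature(object_key, expires)})
    return f"https://uploads.example.com/{object_key}?{query}"


def verify_put(url, now=None):
    """Check the signature and expiry the way the storage edge might."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    object_key = parsed.path.lstrip("/")
    current = now if now is not None else time.time()
    # compare_digest avoids leaking the signature through timing differences.
    return hmac.compare_digest(sig, _signature(object_key, expires)) and current <= expires
```

Because the object key is part of the signed payload, a client cannot reuse a URL issued for one object to overwrite another.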
Stories
- Stories expire after a 24-hour TTL and live in a separate store: Redis for hot data, S3 for cold data.
- View receipts are recorded per (viewer, story) pair.
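A small sketch of the two Stories properties above: expiry after 24 hours and one view receipt per (viewer, story) pair. The `StoryStore` class and its in-memory dict and set are assumptions standing in for the Redis hot store. In Redis itself, key expiry would typically handle the TTL.

```python
import time

STORY_TTL_SECONDS = 24 * 60 * 60  # the 24h TTL from the design


class StoryStore:
    """In-memory stand-in for the hot story store (hypothetical)."""

    def __init__(self):
        self._stories = {}  # story_id -> (author, created_at)
        self._views = set()  # (viewer_id, story_id) pairs

    def post(self, story_id, author, now=None):
        created = now if now is not None else time.time()
        self._stories[story_id] = (author, created)

    def is_live(self, story_id, now=None):
        record = self._stories.get(story_id)
        if record is None:
            return False
        current = now if now is not None else time.time()
        return current - record[1] < STORY_TTL_SECONDS

    def record_view(self, viewer, story_id, now=None):
        """Return True only for the first view of a live story by this viewer."""
        if not self.is_live(story_id, now):
            return False
        key = (viewer, story_id)
        if key in self._views:
            return False
        self._views.add(key)
        return True

    def viewers(self, story_id):
        return sorted(v for v, s in self._views if s == story_id)
```

Storing the pair in a set makes repeated views idempotent, so a viewer who replays a story still counts once.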
Reels
- Same ingest pipeline as video; recommendation-first feed.
- Candidate gen via embeddings + ANN (FAISS / ScaNN).
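To make the candidate-generation step concrete, the sketch below does exact brute-force top-k retrieval by cosine similarity. This is the baseline that ANN libraries such as FAISS or ScaNN approximate at scale. It is not itself an approximate index. The function names and the toy two-dimensional vectors in the usage example are assumptions.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_candidates(user_vec, reel_vecs, k=2):
    """Return the ids of the k reels whose embeddings are closest to user_vec.

    reel_vecs maps reel id -> embedding. This scans every reel (O(n) per
    query); an ANN index trades a little recall for sub-linear lookups.
    """
    scored = ((cosine(user_vec, vec), reel_id) for reel_id, vec in reel_vecs.items())
    return [reel_id for _, reel_id in heapq.nlargest(k, scored)]
```

For example, with `reels = {"cats": [1.0, 0.0], "dogs": [0.9, 0.1], "cars": [0.0, 1.0]}`, calling `top_k_candidates([1.0, 0.05], reels, k=2)` returns the two reels pointing in nearly the same direction as the user vector. In the design, the ranker would then score these candidates.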
Direct Messaging
- WebSocket + per-thread queue.
- E2E encryption optional (Messenger-style).
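The "WebSocket + per-thread queue" bullet can be sketched as follows: each thread assigns monotonically increasing sequence numbers, and messages for offline recipients wait in a queue until they reconnect. The `DMService` class here is hypothetical. A plain callable stands in for a WebSocket send, and encryption is omitted entirely.

```python
from collections import defaultdict, deque


class DMService:
    """In-memory sketch of per-thread ordered delivery (hypothetical)."""

    def __init__(self):
        self._next_seq = defaultdict(int)   # thread_id -> next sequence number
        self._pending = defaultdict(deque)  # (thread_id, user) -> queued messages
        self._sockets = {}                  # user -> callable standing in for ws.send

    def connect(self, user, send):
        """Register a connection and flush anything queued while offline."""
        self._sockets[user] = send
        for (thread_id, queued_user), queue in list(self._pending.items()):
            if queued_user == user:
                while queue:
                    send(queue.popleft())  # oldest first, preserving thread order

    def disconnect(self, user):
        self._sockets.pop(user, None)

    def send(self, thread_id, sender, recipients, body):
        seq = self._next_seq[thread_id]
        self._next_seq[thread_id] += 1
        message = {"thread": thread_id, "seq": seq, "from": sender, "body": body}
        for user in recipients:
            deliver = self._sockets.get(user)
            if deliver is not None:
                deliver(message)  # online: push over the socket
            else:
                self._pending[(thread_id, user)].append(message)  # offline: queue
        return seq
```

The per-thread sequence number gives clients a way to detect gaps or reordering after a reconnect. A production system would also need persistence and acknowledgements, which this sketch leaves out.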
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |