Twitter / X: Detailed

flowchart TB
  subgraph Clients
    iOS[iOS]
    AND[Android]
    WEB([Web])
  end

  subgraph Edge
    DNS[DNS]
    CDN[CDN media + assets]
    LB[L7 LB]
    GW[API Gateway / GraphQL]
    WAF[WAF + Bot]
  end

  subgraph Write[Write Path]
    TWS[Tweet Service]
    MS[Media Service]
    OBJ[(S3 / blob)]
    ID[Snowflake ID]
    INGEST[[Kafka tweet stream]]
  end

  subgraph Graph[Social Graph]
    FG[Follow Service<br/>FlockDB / TAO-like]
    GDB[(Graph store)]
  end

  subgraph FanOut[Fanout - Hybrid]
    FANW[[Fanout workers]]
    CELEB([Celebrity classifier])
    PUSH[[Push fanout<br/>writes per follower]]
    PULL[Pull on read<br/>for celeb followees]
    HOMECACHE[(Home Timeline cache<br/>Redis ZSET per user)]
  end

  subgraph Storage
    TDB[(Tweets store<br/>Manhattan / Cassandra)]
    USERDB[(User store)]
    LIKES[(Likes / Retweets KV)]
    TRENDS[(Trends KV)]
  end

  subgraph Read[Read Path]
    HOME[Home Timeline API]
    USER([User Timeline API])
    HYD([Hydrator<br/>tweet + author + counts])
    RANK([ML Ranker<br/>For-You])
    DEDUP[Dedup + filter]
  end

  subgraph Search[Search / Trends]
    EARLY[Earlybird<br/>real-time inverted index]
    SPARK([Spark / Heron streams])
    QPARSE([Query parser])
  end

  subgraph Media
    TRANS([Transcoder])
    THUMB[Thumbnailer]
    LIVE[Live streaming<br/>Periscope-style]
  end

  subgraph Notif
    NS[Notification Service]
    PUSHN((APNS / FCM))
  end

  subgraph ML
    REC([Recommendation<br/>Who-to-follow])
    SCORE[Scoring service]
    SAFE[Trust & Safety<br/>spam / abuse models]
  end

  subgraph Obs
    MET[Metrics]
    TRC[Trace]
    LOG[Logs]
  end

  Clients --> DNS --> CDN --> LB --> WAF --> GW
  GW --> TWS
  TWS --> ID
  TWS --> TDB
  TWS --> INGEST
  TWS --> MS --> OBJ
  MS --> TRANS --> THUMB
  INGEST --> FANW
  FG --> FANW
  CELEB --> FANW
  FANW -->|normal user| PUSH --> HOMECACHE
  FANW -->|celeb author| PULL
  GW --> HOME --> HOMECACHE
  HOME --> PULL --> TDB
  HOME --> HYD --> RANK --> DEDUP --> Clients
  GW --> USER --> TDB
  INGEST --> EARLY
  GW --> QPARSE --> EARLY
  INGEST --> SPARK --> TRENDS
  TWS --> LIKES
  INGEST --> NS --> PUSHN
  SAFE -.filter.-> RANK
  REC -.suggestions.-> Clients

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class iOS,AND,WEB client;
    class DNS,CDN,LB,GW,WAF edge;
    class TWS,MS,ID,FG,PULL,HOME,USER,DEDUP,THUMB,LIVE,NS,SCORE,SAFE service;
    class GDB,TDB,USERDB,LIKES,TRENDS,EARLY datastore;
    class HOMECACHE cache;
    class INGEST,FANW,PUSH queue;
    class CELEB,HYD,RANK,SPARK,QPARSE,TRANS,REC compute;
    class OBJ storage;
    class PUSHN external;
    class MET,TRC,LOG obs;

Hybrid fan-out

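  • Normal user (<10k followers): push fan-out. Each new tweet ID is written into every follower's home timeline cache at write time.
  • Celebrity (>1M followers): no fan-out. Followers pull that author's recent tweets from the tweet store at read time.
  • Merge: the Home Timeline API interleaves the push-cached entries with the pulled celebrity tweets by timestamp.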
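
The push, pull, and merge steps above can be sketched in a few lines of Python. This is a toy in-memory model, not Twitter's implementation: the class `HybridTimeline`, its method names, and the single `celebrity_threshold` cutoff are assumptions made for illustration (the text does not say how accounts between 10k and 1M followers are handled, so the sketch pushes for everyone at or below the cutoff). Plain dicts stand in for the graph store, the tweet store, and the per-user Redis ZSETs.

```python
import heapq
import itertools
from collections import defaultdict


class HybridTimeline:
    """Toy hybrid fan-out: push for normal authors, pull for celebrities."""

    def __init__(self, celebrity_threshold=1_000_000):
        # Assumption: one cutoff; authors with more followers are "celebrities".
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)        # author -> follower ids
        self.following = defaultdict(set)        # user -> followee ids
        self.author_tweets = defaultdict(list)   # author -> [(ts, tweet_id)]
        self.home_cache = defaultdict(list)      # user -> [(ts, tweet_id)] pushed

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.celebrity_threshold

    def post(self, author, tweet_id, ts):
        # Every tweet is stored once under its author (the tweet store).
        self.author_tweets[author].append((ts, tweet_id))
        if self.is_celebrity(author):
            return  # no fan-out; followers pull at read time
        # Push path: one cache write per follower.
        for follower in self.followers[author]:
            self.home_cache[follower].append((ts, tweet_id))

    def home_timeline(self, user, limit=20):
        # Pushed entries, newest first.
        pushed = sorted(self.home_cache[user], reverse=True)
        # Pull path: recent tweets of each celebrity the user follows.
        pulled = [
            sorted(self.author_tweets[author], reverse=True)[:limit]
            for author in self.following[user]
            if self.is_celebrity(author)
        ]
        # Interleave all newest-first streams by timestamp.
        merged = heapq.merge(pushed, *pulled, reverse=True)
        return [tweet_id for _, tweet_id in itertools.islice(merged, limit)]
```

The trade-off the sketch makes visible: a normal author's post costs one write per follower, while a celebrity's post costs one write in total and shifts the work to each follower's read. A production merger would also need to deduplicate entries for authors who cross the threshold after earlier tweets were already pushed.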

Storage

  • Tweets: Manhattan (Twitter's distributed KV store), keyed by tweet_id, with denormalized counts.
  • Likes/RTs: counters (TAO-style); Redis HyperLogLog for approximate view counts.
  • Search: Earlybird, a Lucene-based real-time inverted index, segmented by time.
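
To make "inverted index, segmented by time" concrete, here is a minimal sketch in plain Python. It is not Earlybird or Lucene: the class `SegmentedIndex`, the whitespace tokenizer, and the one-hour default segment size are all assumptions chosen for illustration. The point it shows is that each time bucket owns its own term-to-postings map, so searches can walk the newest segments first and old segments can be dropped whole.

```python
from collections import defaultdict


class SegmentedIndex:
    """Toy inverted index partitioned into fixed-width time segments."""

    def __init__(self, segment_seconds=3600):
        self.segment_seconds = segment_seconds
        # segment start time -> term -> [(ts, tweet_id)] in arrival order
        self.segments = defaultdict(lambda: defaultdict(list))

    def add(self, tweet_id, ts, text):
        # Bucket the tweet by the start of its time segment.
        seg = ts - ts % self.segment_seconds
        # Assumption: lowercase whitespace tokens; one posting per term per tweet.
        for term in set(text.lower().split()):
            self.segments[seg][term].append((ts, tweet_id))

    def search(self, term, limit=10):
        hits = []
        # Newest segment first, stopping once enough hits are collected.
        for seg in sorted(self.segments, reverse=True):
            postings = self.segments[seg].get(term.lower(), [])
            hits.extend(tid for _, tid in sorted(postings, reverse=True))
            if len(hits) >= limit:
                break
        return hits[:limit]

    def drop_older_than(self, cutoff_ts):
        # Retire whole segments that end at or before the cutoff.
        expired = [s for s in self.segments if s + self.segment_seconds <= cutoff_ts]
        for seg in expired:
            del self.segments[seg]
```

Because recency dominates real-time search, most queries in this layout are answered from the newest segment or two, and retention becomes a cheap delete of an entire segment rather than a per-document cleanup.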

Ranking

  • For-You feed: candidates from the follow network plus recommendations are ranked by a heavy DNN scorer.
  • Latency budget: under 10 ms per scorer.
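
A rough sketch of the candidate-then-score flow follows. Several things here are assumptions rather than anything stated above: the function name `rank_for_you`, the `(tweet_id, ts)` candidate shape, reading the 10 ms figure as a wall-clock cap on one scoring pass, and the fallback of appending unscored candidates in recency order when the cap is hit. The `scorer` callable is a stand-in for the DNN.

```python
import time


def rank_for_you(in_network, recommended, scorer, budget_ms=10.0,
                 clock=time.perf_counter):
    """Merge two candidate lists of (tweet_id, ts), score, and order them.

    scorer(tweet_id) -> float is a placeholder for the model call.
    """
    # Merge sources and deduplicate by tweet_id, keeping first-seen order.
    candidates = {}
    for tweet_id, ts in list(in_network) + list(recommended):
        candidates.setdefault(tweet_id, ts)

    scored, unscored = [], []
    start = clock()
    for tweet_id, ts in candidates.items():
        elapsed_ms = (clock() - start) * 1000.0
        if elapsed_ms >= budget_ms:
            # Budget exhausted: keep the candidate but skip the model call.
            unscored.append((tweet_id, ts))
            continue
        scored.append((scorer(tweet_id), ts, tweet_id))

    # Highest score first; leftovers follow, newest first (an assumed fallback).
    scored.sort(reverse=True)
    unscored.sort(key=lambda c: c[1], reverse=True)
    return [t for _, _, t in scored] + [t for t, _ in unscored]
```

Passing the clock in as a parameter keeps the budget logic testable without real sleeps. A real ranker would more likely batch candidates into a single model call and enforce the deadline at the RPC layer.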

Live tweets

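  • Read-after-write: after a user posts, route that user's subsequent reads to the coordinator that committed the write, so the new tweet is visible immediately.
  • Push new tweets to active sessions over the Streaming API (Server-Sent Events or WebSocket).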
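
One way to picture the read-after-write rule is a router that remembers which coordinator committed each user's latest write and pins that user's reads to it for a short window. The sketch below is hypothetical: `ReadAfterWriteRouter`, the 5-second sticky window, and the round-robin fallback are illustrative choices, not details from this design.

```python
import time


class ReadAfterWriteRouter:
    """Pin a user's reads to the node that committed their last write."""

    def __init__(self, replicas, sticky_seconds=5.0, clock=time.monotonic):
        self.replicas = list(replicas)
        self.sticky_seconds = sticky_seconds  # assumed replication-lag bound
        self.clock = clock
        self.last_write = {}                  # user_id -> (coordinator, committed_at)
        self._next = 0                        # round-robin cursor

    def record_write(self, user_id, coordinator):
        # Called once the coordinator has acknowledged the commit.
        self.last_write[user_id] = (coordinator, self.clock())

    def route_read(self, user_id):
        entry = self.last_write.get(user_id)
        if entry is not None:
            coordinator, committed_at = entry
            if self.clock() - committed_at < self.sticky_seconds:
                return coordinator            # still inside the sticky window
            del self.last_write[user_id]      # window expired; forget the pin
        # Default path: spread reads across replicas.
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica
```

The sticky window only needs to outlast replication lag. Once it expires, reads go back to being load-balanced, so a user who just posted does not permanently concentrate traffic on one node.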

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Probabilistic data structures | Bloom, HLL, Count-Min, MinHash, t-digest | probabilistic-data-structures |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |