WhatsApp / Messenger — Notes

Functional

  • 1:1 and group chat (text, voice notes, media, location).
  • Delivery + read receipts, typing indicators, presence.
  • Voice + video calls.
  • Multi-device, E2E encryption.
  • Push notifications when offline.

Non-functional

  • p99 send latency < 500 ms.
  • 2B users; 100B msgs/day.
  • 99.99% uptime; survive regional outage.

Capacity

  • 100B msgs/day → ~1.2M/s avg, 5M/s peak.
  • Avg msg + envelope ~1 KB → ~100 TB/day of message traffic, before E2E encryption overhead.
  • Long-lived WS connections: 2B users / 100k connections per box = 20k+ gateway boxes, assuming one open connection per user.
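
The capacity figures above can be re-derived with a few lines of arithmetic. This is a back-of-envelope sketch only: the inputs come from the notes, the variable names are illustrative, and 1 KB is taken as 1,000 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Inputs taken from the notes.
msgs_per_day = 100e9        # 100B msgs/day
avg_msg_bytes = 1_000       # ~1 KB per message + envelope (decimal KB assumed)
users = 2e9                 # 2B users
conns_per_box = 100_000     # long-lived WS connections per gateway box
peak_msgs_per_sec = 5e6     # stated peak rate

avg_msgs_per_sec = msgs_per_day / SECONDS_PER_DAY
daily_bytes = msgs_per_day * avg_msg_bytes
gateway_boxes = users / conns_per_box           # one connection per user assumed
peak_to_avg = peak_msgs_per_sec / avg_msgs_per_sec

print(f"avg rate:      {avg_msgs_per_sec / 1e6:.2f}M msgs/s")  # 1.16M msgs/s
print(f"daily volume:  {daily_bytes / 1e12:.0f} TB")           # 100 TB
print(f"gateway boxes: {gateway_boxes:,.0f}")                  # 20,000
print(f"peak / avg:    {peak_to_avg:.1f}x")                    # 4.3x
```

The stated 5M/s peak therefore corresponds to roughly 4.3x the average rate, and the 20k box count is a floor if users keep more than one device connected.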

API (simplified)

WS up: hello + auth token + device id
WS msg: send(chat_id, payload, msg_id, prekey?)
WS evt: delivered(msg_id) / read(msg_id) / typing(chat_id)
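
One way to picture these frames is as small typed records serialized to JSON. The sketch below is an assumption, not the actual wire format of either product: the `type` discriminator, the JSON encoding, the base64 payload, and the helper names `encode` and `make_send` are all hypothetical.

```python
import base64
import json
import uuid
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Hello:
    auth_token: str
    device_id: str
    type: str = "hello"


@dataclass
class Send:
    chat_id: str
    payload: str                  # base64 of the already-encrypted message body
    msg_id: str                   # generated by the client
    prekey: Optional[str] = None  # only present on the first message of a session
    type: str = "send"


@dataclass
class Event:
    type: str                     # "delivered", "read", or "typing"
    msg_id: Optional[str] = None
    chat_id: Optional[str] = None


def encode(frame) -> str:
    """Serialize a frame to JSON, omitting fields that are None."""
    return json.dumps({k: v for k, v in asdict(frame).items() if v is not None})


def make_send(chat_id: str, ciphertext: bytes, prekey: Optional[str] = None) -> Send:
    """Build a send frame with a fresh client-side msg_id."""
    return Send(
        chat_id=chat_id,
        payload=base64.b64encode(ciphertext).decode("ascii"),
        msg_id=str(uuid.uuid4()),
        prekey=prekey,
    )


frame = make_send("chat-1", b"\x00\x01")
print(encode(frame))
print(encode(Event(type="delivered", msg_id=frame.msg_id)))
```

Letting the client pick `msg_id` is one plausible reason it appears in `send(...)`: a retried frame carries the same id, so the server can deduplicate it.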

Schema

  • users(phone, id, public_key)
  • devices(user_id, device_id, push_token, key_bundle)
  • groups(id, [member_id...])
  • inbox(user_id, msg_id, payload, ts), stored in Cassandra with a TTL applied after delivery
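
The inbox table behaves like a store-and-forward queue: rows accumulate while the recipient is offline and disappear once delivery is acknowledged. The following in-memory sketch illustrates that lifecycle under simplifying assumptions. It stands in for Cassandra with a dict, deletes on acknowledgement instead of setting a TTL, and the class and method names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InboxRow:
    msg_id: str
    payload: bytes   # opaque ciphertext; the server never inspects it
    ts: float


class Inbox:
    """In-memory stand-in for inbox(user_id, msg_id, payload, ts)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, InboxRow]] = {}

    def store(self, user_id: str, msg_id: str, payload: bytes,
              ts: Optional[float] = None) -> None:
        # Keyed by msg_id, so a retried send does not create a duplicate row.
        row = InboxRow(msg_id, payload, time.time() if ts is None else ts)
        self._rows.setdefault(user_id, {}).setdefault(msg_id, row)

    def pending(self, user_id: str) -> List[InboxRow]:
        # Rows to push when the user reconnects, oldest first.
        return sorted(self._rows.get(user_id, {}).values(), key=lambda r: r.ts)

    def ack(self, user_id: str, msg_id: str) -> bool:
        # A delivered(msg_id) event removes the row; returns False if already gone.
        return self._rows.get(user_id, {}).pop(msg_id, None) is not None


inbox = Inbox()
inbox.store("u1", "m2", b"second", ts=2.0)
inbox.store("u1", "m1", b"first", ts=1.0)
print([r.msg_id for r in inbox.pending("u1")])  # ['m1', 'm2']
inbox.ack("u1", "m1")
print([r.msg_id for r in inbox.pending("u1")])  # ['m2']
```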

Trade-offs

  • Sticky WS gateway simplifies routing, but failover forces affected clients to reconnect.
  • Server-deletes-after-delivery (WhatsApp) vs full server history (Messenger): privacy vs sync convenience.
  • E2E prevents server-side moderation; flag-based reporting compensates.
  • Multi-device pairing: device tree (one master, others linked) vs full mesh.
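
The sticky-gateway trade-off can be made concrete with a small session registry that records which gateway holds each device's connection. This is a minimal sketch under assumed names (`SessionRegistry`, `gateway_failed`, and so on); a real deployment would keep this mapping in a shared store, which the notes do not specify.

```python
from typing import Dict, List, Optional, Tuple

DeviceKey = Tuple[str, str]  # (user_id, device_id)


class SessionRegistry:
    """Maps each connected device to the gateway holding its WS connection."""

    def __init__(self) -> None:
        self._gateway_of: Dict[DeviceKey, str] = {}

    def connect(self, user_id: str, device_id: str, gateway_id: str) -> None:
        # A reconnect simply overwrites the previous gateway for that device.
        self._gateway_of[(user_id, device_id)] = gateway_id

    def disconnect(self, user_id: str, device_id: str) -> None:
        self._gateway_of.pop((user_id, device_id), None)

    def route(self, user_id: str, device_id: str) -> Optional[str]:
        # None means the device is offline: store in the inbox and send a push.
        return self._gateway_of.get((user_id, device_id))

    def gateway_failed(self, gateway_id: str) -> List[DeviceKey]:
        # Drop every session pinned to the dead gateway; those devices
        # stay unroutable until they reconnect somewhere else.
        lost = [k for k, g in self._gateway_of.items() if g == gateway_id]
        for key in lost:
            del self._gateway_of[key]
        return lost


registry = SessionRegistry()
registry.connect("alice", "phone", "gw-1")
registry.connect("alice", "laptop", "gw-2")
registry.gateway_failed("gw-1")
print(registry.route("alice", "phone"))   # None
print(registry.route("alice", "laptop"))  # gw-2
```

Routing is a single lookup, which is the simplification; the cost shows up in `gateway_failed`, where every pinned session is lost at once.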

Refs

  • WhatsApp engineering talks (Erlang scaling stories).
  • Signal Protocol papers (X3DH, Double Ratchet).
  • Facebook Messenger architecture blog posts.
  • Alex Xu Vol 2 "Design a chat system."