Zoom / Google Meet - Detailed
flowchart TB
subgraph Clients
APP([Native client])
WEB([Web - WebRTC])
PSTN[Dial-in PSTN]
end
subgraph Signaling
SIG[Signaling Service<br/>WebSocket]
JOIN[Join / Meeting service]
AUTH[Auth / SSO]
DIR[Directory / Calendar]
end
subgraph NAT[NAT traversal]
STUN[STUN servers]
TURN([TURN relay servers])
ICE[ICE candidate gathering]
end
subgraph Media[Media plane]
SFU[SFU Selective Forwarding Unit]
MCU[MCU Multipoint Control Unit<br/>for big rooms / mix-down]
SIM[Simulcast / SVC ladders]
SR[Server-side recording]
LIVE[[Live stream out RTMP]]
BG[Virtual background / blur ML]
NS[Noise suppression ML]
JIT[Jitter buffer / FEC / RED]
BW[Bandwidth estimation - GCC / TWCC]
end
subgraph Webinar[Webinar / Large events]
PRES([Presenter set])
AUD([Audience receive-only])
CDN([HLS fallback for huge audiences])
end
subgraph Features
SHARE[Screen share]
CHAT[In-meeting chat]
REACT[Reactions]
POLL[Polls / breakout rooms]
WB[Whiteboard]
CAP[Live captions / translation]
REC[Cloud recording + transcript]
end
subgraph Storage
REC_S3[(Recording S3)]
TRANS[(Transcripts)]
META[(Meetings metadata)]
end
subgraph Crypto
E2E[E2E option<br/>MLS / SFrame]
DTLS[DTLS-SRTP per hop]
end
Clients --> AUTH --> SIG --> JOIN
Clients --> ICE
ICE --> STUN
ICE -. relay .-> TURN
Clients --> SFU
SFU --> BG
SFU --> NS
SFU --> SIM
SFU --> JIT
SFU --> BW
SFU --> SR --> REC_S3
SR --> TRANS
Webinar --- SFU
Webinar --- CDN
Features --- SFU
PSTN --> MCU
Crypto --- Media
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class APP,WEB,PRES,AUD client;
class CDN edge;
class PSTN,SIG,JOIN,AUTH,DIR,STUN,ICE,SFU,MCU,SIM,SR,BG,NS,JIT,BW,SHARE,CHAT,REACT,POLL,WB,CAP,REC,E2E,DTLS service;
class TRANS,META datastore;
class LIVE queue;
class TURN compute;
class REC_S3 storage;
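SFU vs MCU
- SFU (Selective Forwarding Unit): forwards each participant's stream to the others without decoding it; each client renders the mosaic itself. Server CPU cost is low, but each participant's downlink grows as O(N) with room size, and total server egress as O(N²).
- MCU (Multipoint Control Unit): the server decodes and mixes all streams into one outgoing stream, so each client's bandwidth stays fixed, at the cost of heavy server CPU and added latency.
- Modern stacks default to SFU; MCU is used for PSTN bridging or very large webinars.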
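The bandwidth trade-off above can be made concrete with a small back-of-the-envelope sketch. It assumes every participant sends one stream at the same bitrate, that the SFU forwards every stream to every other participant (no simulcast, no cap on visible tiles), and that the MCU's mixed stream costs the same bitrate as a single input stream. The function names and the 500 kbps figure are illustrative, not taken from any real product.

```python
def sfu_load(n: int, stream_kbps: float) -> dict:
    """Bandwidth when the server forwards every stream to every other participant."""
    return {
        "client_up_kbps": stream_kbps,                    # each client uploads one stream
        "client_down_kbps": (n - 1) * stream_kbps,        # and downloads one per other participant
        "server_egress_kbps": n * (n - 1) * stream_kbps,  # grows quadratically with room size
    }


def mcu_load(n: int, stream_kbps: float) -> dict:
    """Bandwidth when the server mixes everything into one stream per receiver."""
    return {
        "client_up_kbps": stream_kbps,
        "client_down_kbps": stream_kbps,                  # one mixed stream, independent of n
        "server_egress_kbps": n * stream_kbps,
    }


if __name__ == "__main__":
    for n in (4, 10, 50):
        print(n, sfu_load(n, 500), mcu_load(n, 500))
```

The quadratic server egress in the SFU case is one reason real deployments combine an SFU with simulcast and with limits on how many streams each receiver gets.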
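Simulcast / SVC
- Simulcast: the sender encodes 2-3 independent versions of the same video at different resolutions (e.g., 1080p, 720p, 360p) and uploads all of them.
- The SFU picks the right layer for each receiver.
- SVC (Scalable Video Coding): the sender encodes a single stream built from layers, and the SFU forwards only the layers each receiver can handle.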
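A minimal sketch of the per-receiver layer choice, assuming the SFU has a bandwidth estimate for each receiver. The ladder bitrates, the `headroom` factor, and the `pick_layer` name are hypothetical values chosen for illustration; a real SFU would also weigh the receiver's tile size and react to changing estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kbps: int


# Hypothetical simulcast ladder, highest quality first; bitrates are illustrative only.
LADDER = [Layer("1080p", 2500), Layer("720p", 1200), Layer("360p", 400)]


def pick_layer(estimated_kbps: float, ladder=LADDER, headroom: float = 0.8) -> Layer:
    """Return the best layer that fits within a fraction of the receiver's estimate."""
    budget = estimated_kbps * headroom  # keep some slack for audio and retransmissions
    for layer in ladder:
        if layer.kbps <= budget:
            return layer
    return ladder[-1]  # nothing fits: fall back to the lowest layer


if __name__ == "__main__":
    for estimate in (5000, 2000, 300):
        print(estimate, "kbps ->", pick_layer(estimate).name)
```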
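NAT traversal
- Each client gathers ICE candidates: host, server-reflexive (via STUN), and relayed (via TURN).
- Clients exchange candidates over the signaling channel.
- ICE runs connectivity checks on candidate pairs in priority order and selects the highest-priority pair that succeeds; direct paths rank above relayed ones.
- If no direct pair works (e.g., behind a symmetric NAT), media falls back to the TURN relay.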
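The ordering of candidate pairs can be sketched with the priority formulas from RFC 8445. The type preferences below are the values that RFC recommends; the default `local_pref` of 65535 and the `order_pairs` helper are assumptions made for this example, and the sketch only ranks pairs (it performs no actual connectivity checks).

```python
# Type preferences recommended by RFC 8445; higher means more preferred.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_pref: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling: int, controlled: int) -> int:
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def order_pairs(local_types, remote_types):
    """Return (local, remote) type pairs, highest pair priority first.

    Assumes the local agent is the controlling one.
    """
    pairs = [(loc, rem) for loc in local_types for rem in remote_types]
    return sorted(
        pairs,
        key=lambda p: pair_priority(candidate_priority(p[0]), candidate_priority(p[1])),
        reverse=True,
    )


if __name__ == "__main__":
    types = ["host", "srflx", "relay"]
    for local, remote in order_pairs(types, types):
        print(local, "->", remote)
```

Because the relay type preference is 0, any pair involving a TURN candidate sorts below the direct pairs, which is why TURN acts as the fallback rather than the default path.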
E2E encryption
- A per-meeting key is established and ratcheted via MLS or Sender Keys.
- With SFrame, media frames are encrypted end-to-end while the SFU still routes them without decrypting the payload.
- Trade-off: server-side features are limited, because recording and captions need plaintext media.
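To illustrate what "ratcheted" means, here is a minimal one-way hash ratchet built on HMAC-SHA256. It is not the MLS key schedule and not SFrame's key derivation; the `b"ratchet"` label and the function names are made up for this sketch. It only shows the core property: later keys can be computed from earlier ones, but not the reverse.

```python
import hashlib
import hmac


def ratchet(key: bytes) -> bytes:
    """Derive the next epoch key one-way from the current one (illustrative label)."""
    return hmac.new(key, b"ratchet", hashlib.sha256).digest()


def key_for_epoch(initial_key: bytes, epoch: int) -> bytes:
    """Advance the ratchet `epoch` times starting from the initial meeting key."""
    key = initial_key
    for _ in range(epoch):
        key = ratchet(key)
    return key


if __name__ == "__main__":
    meeting_key = hashlib.sha256(b"example meeting secret").digest()
    for epoch in range(3):
        print(epoch, key_for_epoch(meeting_key, epoch).hex()[:16])
```

A plain hash ratchet like this does not lock out a participant who has left, since they can keep advancing the chain themselves. Protocols such as MLS handle that case by mixing fresh key material into the group state when membership changes.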
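Recording / captioning
- Server-side recording: the SFU feeds the recording service, which writes the raw frames plus a mixed-down version to S3.
- Captions: real-time ASR runs on the audio stream, with translation for multilingual captions.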
Capacity
- Each SFU node handles roughly 1k–5k participants, depending on machine size and stream bitrates.
- 1M concurrent meetings require on the order of thousands of SFU nodes globally.
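The node count above can be checked with simple arithmetic. The average of 4 participants per meeting is an assumption introduced here (the text does not give one), and the estimate ignores headroom, regional placement, and meetings that span several SFUs.

```python
import math


def sfu_nodes_needed(concurrent_meetings: int, avg_participants: float, participants_per_sfu: int) -> int:
    """Lower bound on SFU nodes: total participants divided by per-node capacity."""
    total_participants = concurrent_meetings * avg_participants
    return math.ceil(total_participants / participants_per_sfu)


if __name__ == "__main__":
    meetings = 1_000_000
    avg = 4  # assumed average meeting size, for illustration only
    for capacity in (1_000, 5_000):
        print(f"{capacity} participants/node -> {sfu_nodes_needed(meetings, avg, capacity)} nodes")
```

Under these assumptions the range is 800 to 4,000 nodes before any safety margin, which is consistent with "thousands of SFU nodes".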
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | CDN | edge caching for static assets | cdn |
| HLD | CAP / PACELC | C vs A under partition; L vs C otherwise | cap-pacelc |
| HLD | Idempotency & retries | safe re-execution, backoff + jitter | idempotency-retries |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |