Zoom / Google Meet — Notes
Functional
- Multi-party real-time A/V.
- Screen share, chat, reactions, polls.
- Cloud recording + transcripts.
- PSTN dial-in / dial-out.
- Webinar mode (many viewers, few presenters).
- Optional E2E encryption.
Non-functional
- Glass-to-glass latency < 150 ms (ideal).
- Tolerate 5–10% packet loss without disconnect.
- 99.99% signaling availability.
- Global presence: host each meeting in a region close to its participants.
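One way to read the "close to its participants" goal is to pick the region that minimizes the worst participant's round-trip time. The sketch below assumes a hypothetical RTT table (participant -> region -> ms) gathered by client probes. The region names and the min-max rule are illustrative assumptions, not something these notes prescribe.

```python
from typing import Dict


def pick_region(rtt_ms: Dict[str, Dict[str, float]]) -> str:
    """Return the region with the smallest worst-case RTT across participants.

    rtt_ms maps participant -> {region: measured RTT in ms} (hypothetical input).
    """
    if not rtt_ms:
        raise ValueError("need at least one participant")
    # Only consider regions that every participant has measured.
    common = set.intersection(*(set(per_region) for per_region in rtt_ms.values()))
    if not common:
        raise ValueError("no region measured by all participants")
    # sorted() makes tie-breaking deterministic.
    return min(
        sorted(common),
        key=lambda region: max(per_region[region] for per_region in rtt_ms.values()),
    )


measurements = {
    "alice": {"us-east": 20, "eu-west": 90, "ap-south": 210},
    "bob": {"us-east": 95, "eu-west": 25, "ap-south": 140},
}
print(pick_region(measurements))  # eu-west (worst case 90 ms vs 95 ms for us-east)
```

Minimizing the maximum RTT favors fairness. Minimizing the mean would favor the majority in large calls.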
Capacity
- Peak: 30M concurrent participants (Zoom 2020).
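- SFU per box: 1–5k participants, so roughly 6k–30k boxes at the 30M peak.
- Per-participant bandwidth: ~1.5 Mbps HD upstream; downstream grows with the number of streams received (up to N − 1 in an N-party call).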
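The capacity figures above can be checked with quick arithmetic. This is a back-of-the-envelope sketch using only the numbers in these notes. The function names are made up for illustration, and the "everyone sends and receives full HD" case is an upper bound, not a measured figure.

```python
import math

PEAK_PARTICIPANTS = 30_000_000   # peak concurrent participants (from the notes)
HD_UPLINK_MBPS = 1.5             # HD upstream per participant (from the notes)


def boxes_needed(participants: int, per_box: int) -> int:
    """SFU boxes required if each box holds `per_box` participants."""
    return math.ceil(participants / per_box)


def downstream_mbps(n_participants: int, per_stream_mbps: float = HD_UPLINK_MBPS) -> float:
    """Upper bound: receive every other participant's stream at full HD."""
    return (n_participants - 1) * per_stream_mbps


def peak_ingress_tbps(participants: int = PEAK_PARTICIPANTS) -> float:
    """Upper bound on aggregate SFU ingress if everyone sends HD (Mbps -> Tbps)."""
    return participants * HD_UPLINK_MBPS / 1_000_000


print(boxes_needed(PEAK_PARTICIPANTS, 5_000))  # 6000
print(boxes_needed(PEAK_PARTICIPANTS, 1_000))  # 30000
print(downstream_mbps(5))                      # 6.0
print(peak_ingress_tbps())                     # 45.0
```

The N − 1 downstream growth is why large calls usually forward only a few active speakers at reduced quality rather than every stream at HD.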
Schema
meetings(id, host_id, start, end, region, codec, e2e_flag)
participants(meeting_id, user_id, join_ts, leave_ts, device)
recordings(meeting_id, segments[], transcript_ref)
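As a rough illustration, the three tables could map to typed records like the ones below. The field types, defaults, and the `active_participants` helper are assumptions added for the sketch. The notes only give field names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Meeting:
    id: str
    host_id: str
    start: float              # epoch seconds (assumed representation)
    end: Optional[float]      # None while the meeting is still running
    region: str
    codec: str
    e2e_flag: bool = False


@dataclass
class Participant:
    meeting_id: str
    user_id: str
    join_ts: float
    leave_ts: Optional[float] = None   # None while still connected
    device: str = "unknown"


@dataclass
class Recording:
    meeting_id: str
    segments: List[str] = field(default_factory=list)  # e.g. object-store keys
    transcript_ref: Optional[str] = None


def active_participants(rows: List[Participant], at_ts: float) -> List[str]:
    """User IDs connected at time `at_ts` (hypothetical helper)."""
    return [
        p.user_id
        for p in rows
        if p.join_ts <= at_ts and (p.leave_ts is None or at_ts < p.leave_ts)
    ]


rows = [Participant("m1", "a", 0.0, 10.0), Participant("m1", "b", 5.0)]
print(active_participants(rows, 7.0))   # ['a', 'b']
print(active_participants(rows, 12.0))  # ['b']
```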
Trade-offs
- SFU is the default; balances cost and quality.
- Server-side recording requires plaintext → tension with E2E.
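- Simulcast adds sender encode and uplink cost (one encode per quality layer) but enables per-receiver quality.
- TURN relaying is expensive in bandwidth: use it only when a direct path via STUN-discovered candidates cannot be established.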
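The per-receiver quality that simulcast buys can be sketched as the SFU forwarding, for each receiver, the highest layer that fits its estimated downlink. The layer names and the 0.15 / 0.5 Mbps bitrates are illustrative assumptions. Only the 1.5 Mbps HD figure comes from these notes.

```python
from typing import List, Tuple

# (layer name, bitrate in Mbps), ascending. Low and mid values are assumed.
LAYERS: List[Tuple[str, float]] = [("low", 0.15), ("mid", 0.5), ("high", 1.5)]


def pick_layer(available_mbps: float, layers: List[Tuple[str, float]] = LAYERS) -> str:
    """Return the highest layer whose bitrate fits the receiver's bandwidth.

    Falls back to the lowest layer when nothing fits, so the receiver still
    gets video rather than none.
    """
    chosen = layers[0][0]
    for name, rate in layers:
        if rate <= available_mbps:
            chosen = name
    return chosen


print(pick_layer(2.0))   # high
print(pick_layer(0.6))   # mid
print(pick_layer(0.05))  # low
```

A real SFU would drive `available_mbps` from a bandwidth estimator and add hysteresis to avoid flapping between layers. Both are omitted here.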
Refs
- Zoom architecture blog & talks; Google Meet QUIC migration.
- WebRTC standards (RFC 8825 overview and the RFCs it references).
- Jitsi Videobridge (open-source SFU).
- Messaging Layer Security (MLS), RFC 9420.
- ByteByteGo "Design Zoom", Alex Xu Vol 2.