Google Docs — Detailed#
flowchart TB
subgraph Clients
EDIT[Editor app]
PRES[Presence]
CUR[Cursor / selection]
end
subgraph Edge
LB[Load balancer]
WS[WebSocket gateway<br/>pinned per doc]
end
subgraph Collab[Collaboration engine]
OT[OT engine<br/>transform op against concurrent ops]
CRDT_ENG[CRDT engine<br/>Yjs / Automerge style]
SESS[Session manager]
REV[Revision log per doc]
SNAP[Snapshots]
LOCK[Per-doc serialization]
end
subgraph Storage
DOC[(Doc store)]
OPLOG[(Op log)]
META[(Permissions, comments, suggestions)]
end
subgraph Features
PRES_S[Presence service]
COMM[Comments / Threads]
SUGG[Suggesting mode]
VER[Version history]
EXPORT[Export PDF / Word]
OFF[Offline support]
end
subgraph Share
ACL[ACL / link sharing]
AUTH[Auth + tenant]
end
Clients --> LB --> WS --> Collab
Collab --> Storage
Features --- Collab
Share --- Storage
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class EDIT,PRES,CUR client;
class LB,WS edge;
class OT,CRDT_ENG,SESS,REV,SNAP,LOCK,PRES_S,COMM,SUGG,VER,EXPORT,OFF,ACL,AUTH service;
class DOC,OPLOG,META datastore;
OT vs CRDT#
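- Operational Transformation (OT, the approach Google Docs has historically used): the server assigns a total order to ops and transforms each incoming op against the concurrent ops that were committed after the client's base revision.
- CRDT (Yjs, Automerge): each op carries a unique ID plus its causal context, so replicas that have applied the same set of ops converge by construction, with no central sequencer required.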
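The server-side OT step described above can be sketched as follows. This is a minimal illustration, not Google Docs' actual algorithm: it handles only text inserts, and the names `Insert`, `transform`, `DocServer`, and the `site` tie-breaker field are hypothetical. A client submits an op together with the revision it was based on; the server transforms it against every op committed since that revision, then appends it to the log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str
    site: int   # hypothetical client id, used only to break position ties


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if shifts:
        # `against` inserted text at or before op's position: shift right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


class DocServer:
    """Single sequencer for one doc (assumed to be the doc's only owner)."""

    def __init__(self, text: str = "") -> None:
        self.text = text
        self.log: list[Insert] = []  # committed ops; list index == revision

    def submit(self, op: Insert, base_rev: int) -> tuple[Insert, int]:
        # Transform against every op committed after the client's base revision.
        for committed in self.log[base_rev:]:
            op = transform(op, committed)
        self.text = apply_op(self.text, op)
        self.log.append(op)
        return op, len(self.log)


server = DocServer("abc")
a = Insert(1, "X", site=1)
b = Insert(1, "Y", site=2)
server.submit(a, base_rev=0)
b_committed, rev = server.submit(b, base_rev=0)  # concurrent with `a`
```

Both clients started from revision 0 and inserted at the same position. The tie-break on `site` makes the outcome deterministic: the client that applied `b` locally first transforms `a` against `b` (no shift) and reaches the same text as the server. A real engine would also need delete ops and client-side transformation of pending ops.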
Per-doc serializer#
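- Exactly one server owns a given doc at any time, so every op for that doc is applied in a single, well-defined order.
- On failover, the new owner rebuilds the doc state by replaying the op log before it accepts new ops.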
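A rough sketch of the ownership and failover behaviour above, under several assumptions: `OpLog` is an in-memory stand-in for the durable op log, ops are simple `(position, text)` inserts, and the class and method names are hypothetical. The owner takes a per-doc lock, writes the op to the log first, and only then applies it; a replacement owner reconstructs state by replaying the log.

```python
import threading


class OpLog:
    """In-memory stand-in for the durable op log (assumption for this sketch)."""

    def __init__(self) -> None:
        self._entries: list[tuple[int, tuple[int, str]]] = []

    def append(self, seq: int, op: tuple[int, str]) -> None:
        # Reject out-of-order sequence numbers, e.g. from a stale owner.
        if seq != len(self._entries) + 1:
            raise ValueError(f"expected seq {len(self._entries) + 1}, got {seq}")
        self._entries.append((seq, op))

    def read_all(self) -> list[tuple[int, tuple[int, str]]]:
        return list(self._entries)


class DocOwner:
    """The single server process that currently owns one doc."""

    def __init__(self, doc_id: str, oplog: OpLog) -> None:
        self.doc_id = doc_id
        self._oplog = oplog
        self._lock = threading.Lock()
        self.text = ""
        self.seq = 0
        # Failover path: rebuild state by replaying every logged op in order.
        for seq, op in oplog.read_all():
            self._apply(op)
            self.seq = seq

    def _apply(self, op: tuple[int, str]) -> None:
        pos, s = op
        self.text = self.text[:pos] + s + self.text[pos:]

    def submit(self, op: tuple[int, str]) -> int:
        with self._lock:                     # one op at a time per doc
            seq = self.seq + 1
            self._oplog.append(seq, op)      # log first, then apply
            self._apply(op)
            self.seq = seq
            return seq


log = OpLog()
primary = DocOwner("doc-1", log)
primary.submit((0, "hello"))
primary.submit((5, " world"))

# Simulated failover: a new owner starts from the same op log.
standby = DocOwner("doc-1", log)
```

The sequence check in `OpLog.append` is one possible way to keep a stale owner from writing after a new owner has taken over; the source does not say how ownership is actually fenced.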
Presence#
- Presence is lightweight state fanned out over a per-doc pub/sub channel.
- Cursor and selection updates are throttled to at most 10 Hz (one update per 100 ms).
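One way to implement the 10 Hz cap is a trailing-edge throttle on the client: send immediately if the last send was at least 100 ms ago, otherwise keep only the newest cursor and send it on the next timer tick. The sketch below assumes integer millisecond timestamps supplied by the caller; `CursorThrottle`, `offer`, and `flush` are hypothetical names.

```python
class CursorThrottle:
    """Limits cursor/selection publishes to `max_hz` per second."""

    def __init__(self, max_hz: int = 10) -> None:
        self.min_interval_ms = 1000 // max_hz
        self._last_sent_ms = None   # time of the last publish, if any
        self._pending = None        # newest cursor that was held back

    def _allowed(self, now_ms: int) -> bool:
        return (
            self._last_sent_ms is None
            or now_ms - self._last_sent_ms >= self.min_interval_ms
        )

    def offer(self, now_ms: int, cursor):
        """Return the cursor to publish now, or None if it was held back."""
        if self._allowed(now_ms):
            self._last_sent_ms = now_ms
            self._pending = None
            return cursor
        self._pending = cursor      # overwrite: only the latest matters
        return None

    def flush(self, now_ms: int):
        """Call from a timer; emits the held-back cursor once allowed."""
        if self._pending is not None and self._allowed(now_ms):
            cursor, self._pending = self._pending, None
            self._last_sent_ms = now_ms
            return cursor
        return None


throttle = CursorThrottle(max_hz=10)
sent = [
    throttle.offer(0, 5),     # first update goes out immediately
    throttle.offer(30, 6),    # too soon: held back
    throttle.offer(60, 7),    # too soon: replaces the held-back value
    throttle.flush(90),       # still inside the 100 ms window
    throttle.flush(100),      # window elapsed: latest held-back value goes out
]
```

Dropping intermediate positions is acceptable here because only the most recent cursor or selection is meaningful to other viewers.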
Glossary & fundamentals#
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | CRDTs | conflict-free replicated data types | crdts |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| LLD | Async models | futures / async-await / coroutines / actors | async-models |