
Google Docs — Detailed

flowchart TB
  subgraph Clients
    EDIT[Editor app]
    PRES[Presence]
    CUR[Cursor / selection]
  end

  subgraph Edge
    LB[Load balancer]
    WS[WebSocket gateway<br/>pinned per doc]
  end

  subgraph Collab[Collaboration engine]
    OT[OT engine<br/>transform op against concurrent ops]
    CRDT_ENG[CRDT engine<br/>Yjs / Automerge style]
    SESS[Session manager]
    REV[Revision log per doc]
    SNAP[Snapshots]
    LOCK[Per-doc serialization]
  end

  subgraph Storage
    DOC[(Doc store)]
    OPLOG[(Op log)]
    META[(Permissions, comments, suggestions)]
  end

  subgraph Features
    PRES_S[Presence service]
    COMM[Comments / Threads]
    SUGG[Suggesting mode]
    VER[Version history]
    EXPORT[Export PDF / Word]
    OFF[Offline support]
  end

  subgraph Share
    ACL[ACL / link sharing]
    AUTH[Auth + tenant]
  end

  Clients --> LB --> WS --> Collab
  Collab --> Storage
  Features --- Collab
  Share --- Storage

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class EDIT,PRES,CUR client;
    class LB,WS edge;
    class OT,CRDT_ENG,SESS,REV,SNAP,LOCK,PRES_S,COMM,SUGG,VER,EXPORT,OFF,ACL,AUTH service;
    class DOC,OPLOG,META datastore;

OT vs CRDT

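  • Operational Transform (OT; what Google Docs has historically used): the server assigns a total order to ops and transforms each incoming op against the concurrent ops already committed, so every client converges on the same document.
  • CRDT (Yjs, Automerge): each op carries a unique ID and its causal context; concurrent ops commute, so replicas that have applied the same set of ops converge without a central ordering server.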
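
To make the OT bullet concrete, here is a minimal sketch of transforming one insert against a concurrent insert. It is an illustration only, not Google Docs' actual algorithm: the `Insert` type, the `site` tie-breaker, and the `transform` and `apply_op` functions are hypothetical names, and only insert ops are handled (no deletes or formatting).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text goes
    text: str  # text to insert
    site: int  # hypothetical client id, used only to break position ties


def apply_op(doc: str, op: Insert) -> str:
    """Apply an insert to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied.

    `op` shifts right when `against` inserted earlier in the document, or at
    the same position with a lower site id (the assumed tie-break rule).
    """
    if against.pos < op.pos or (against.pos == op.pos and against.site < op.site):
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


# Two clients edit "abc" concurrently at the same position.
base = "abc"
a = Insert(pos=1, text="X", site=1)
b = Insert(pos=1, text="Y", site=2)

# Order 1: a is committed first, b is transformed against it.
doc_ab = apply_op(apply_op(base, a), transform(b, a))
# Order 2: b is committed first, a is transformed against it.
doc_ba = apply_op(apply_op(base, b), transform(a, b))
```

Both application orders produce the same string, which is the convergence property the transform exists to provide. A real engine would also need delete ops and transformation against a whole run of concurrent ops, not just one.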

Per-doc serializer

  • Exactly one server owns a given doc at any time, so it can serialize that doc's ops into a single definite order.
  • On failover, the new owner rebuilds the doc's state by replaying the op log before it accepts new ops.
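
The two bullets above can be sketched as a single-owner object that assigns revision numbers under a lock and rebuilds its state by replaying the log. This is a rough illustration under stated assumptions: `DocOwner` is a hypothetical name, a plain Python list stands in for the durable op log, and ops are reduced to `(rev, pos, text)` inserts.

```python
import threading


class DocOwner:
    """Hypothetical single owner of one document's op stream."""

    def __init__(self, oplog: list):
        self.oplog = oplog              # stand-in for the durable op log
        self.text = ""
        self._lock = threading.Lock()   # serializes commits for this doc
        # Startup and failover share one path: replay every logged op.
        for _rev, pos, s in oplog:
            self._apply(pos, s)

    def _apply(self, pos: int, s: str) -> None:
        self.text = self.text[:pos] + s + self.text[pos:]

    def submit(self, pos: int, s: str) -> int:
        """Commit one insert and return the revision number it was assigned."""
        with self._lock:
            rev = len(self.oplog) + 1
            self.oplog.append((rev, pos, s))  # log first, then apply in memory
            self._apply(pos, s)
            return rev


log = []
primary = DocOwner(log)
primary.submit(0, "hello")
primary.submit(5, " world")

# Simulated failover: a new owner is built from the same log.
standby = DocOwner(log)
```

After the simulated failover, `standby.text` matches what the primary held, and the next `submit` continues the revision sequence. In practice the replay would more likely start from a snapshot rather than from the first op, but that is outside this sketch.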

Presence

  • Presence is lightweight: each doc has its own pub/sub channel.
  • Cursor and selection updates are throttled to at most 10 Hz.
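
One way the 10 Hz cap could be enforced is a small throttle that sends an update immediately when the interval has elapsed and otherwise keeps only the latest pending one. The sketch below is an assumption about the mechanism, not a description of the real client: `Throttle`, `offer`, and `flush` are hypothetical names, and the clock is injectable so the behavior can be checked without sleeping.

```python
import time


class Throttle:
    """Hypothetical latest-value throttle for cursor and selection updates."""

    def __init__(self, max_hz: float = 10.0, clock=time.monotonic):
        self.min_interval = 1.0 / max_hz
        self.clock = clock
        self.last_sent = None   # time of the last update that was let through
        self.pending = None     # most recent update that was held back

    def offer(self, update):
        """Return the update if it may be sent now; otherwise hold it and return None."""
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.last_sent = now
            self.pending = None
            return update
        self.pending = update   # overwrite: only the newest position matters
        return None

    def flush(self):
        """Call from a timer; returns the held update once the interval has elapsed."""
        now = self.clock()
        if self.pending is not None and now - self.last_sent >= self.min_interval:
            update, self.pending = self.pending, None
            self.last_sent = now
            return update
        return None


# Drive the throttle with a fake clock instead of real time.
fake_now = [0.0]
throttle = Throttle(max_hz=10.0, clock=lambda: fake_now[0])
first = throttle.offer({"cursor": 1})       # sent immediately
fake_now[0] = 0.05
second = throttle.offer({"cursor": 2})      # too soon, held
third = throttle.offer({"cursor": 3})       # too soon, replaces the held update
early = throttle.flush()                    # still too soon
fake_now[0] = 0.2
late = throttle.flush()                     # interval elapsed, newest update goes out
```

Dropping intermediate cursor positions is acceptable here because presence is ephemeral: other collaborators only need the most recent position, not every one in between.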

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
| --- | --- | --- | --- |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | CRDTs | conflict-free replicated data types | crdts |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| LLD | Async models | futures / async-await / coroutines / actors | async-models |