Distributed Transactions: Detailed

flowchart TB
  subgraph TwoPC[Two-Phase Commit - 2PC]
    direction TB
    COORD[Coordinator / TM]
    P1[Prepare phase<br/>vote yes/no]
    P2[Commit phase<br/>commit/abort]
    RM1[RM: DB]
    RM2[[RM: MQ]]
    RM3[RM: cache]
    REC[Recovery log<br/>presumed abort/commit]
    HEUR[Heuristic decisions /<br/>in-doubt windows]
  end

  subgraph TCC[Try-Confirm-Cancel]
    T[Try: reserve / lock]
    CO[Confirm: finalize]
    CN[Cancel: release]
  end

  subgraph Saga[Saga - long-running]
    direction TB
    S1[Step 1 - local tx]
    S2[Step 2 - local tx]
    S3[Step 3 - local tx]
    CSO[Compensation S1]
    CST[Compensation S2]
    CST3[Compensation S3]
    CHO[Choreography<br/>events drive next step]
    ORC[Orchestration<br/>central process manager]
  end

  subgraph Outbox[Outbox / Transactional messaging]
    APP[App write]
    DB[(Domain table<br/>+ outbox table same tx)]
    REL([Relay / Debezium])
    BROKER[[Kafka / Pub-Sub]]
    CONS[Consumers]
  end

  subgraph Inbox[Inbox - dedup at consumer]
    INB[(Inbox table:<br/>processed msg_id)]
    HND[Handler]
  end

  subgraph Linear[Linearizable global]
    SPANNER[Spanner/CRDB<br/>Paxos or Raft per range +<br/>2PC across ranges<br/>Calvin: deterministic ordering]
    TT[TrueTime / HLC<br/>commit wait]
  end

  COORD --> P1
  P1 --> RM1
  P1 --> RM2
  P1 --> RM3
  P1 --> P2
  P2 --> RM1
  P2 --> RM2
  P2 --> RM3
  COORD --> REC
  REC --> HEUR

  S1 --> S2 --> S3
  S3 -. fail .-> CST3 -. fail of S2.-> CST -. fail of S1 .-> CSO

  APP --> DB
  DB --> REL --> BROKER --> CONS
  CONS --> INB --> HND

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class COORD,P1,P2,REC,HEUR,T,CO,CN,S1,S2,S3,CSO,CST,CST3,CHO,ORC,APP,CONS,HND,TT service;
    class RM1,DB,INB,SPANNER datastore;
    class RM2,BROKER queue;
    class RM3 cache;
    class REL compute;
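
The Try-Confirm-Cancel subgraph in the diagram can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `Inventory`, `run_tcc`, and the stock-reservation scenario are hypothetical names chosen for the example, and durability, timeouts, and network failures are left out.

```python
class Inventory:
    """Hypothetical TCC participant that holds stock per transaction."""

    def __init__(self, stock: int) -> None:
        self.available = stock
        self.reserved: dict[str, int] = {}  # tx_id -> quantity held by Try

    def try_reserve(self, tx_id: str, qty: int) -> bool:
        # Try: reserve / lock. A repeated Try for an open hold is a no-op.
        if tx_id in self.reserved:
            return True
        if qty > self.available:
            return False
        self.available -= qty
        self.reserved[tx_id] = qty
        return True

    def confirm(self, tx_id: str) -> None:
        # Confirm: finalize. Stock was deducted by Try, so just drop the hold.
        self.reserved.pop(tx_id, None)

    def cancel(self, tx_id: str) -> None:
        # Cancel: release. A second Cancel finds no hold and changes nothing.
        self.available += self.reserved.pop(tx_id, 0)


def run_tcc(tx_id: str, requests: list[tuple[Inventory, int]]) -> bool:
    """Try every participant; confirm all on success, cancel the tried ones on failure."""
    tried: list[Inventory] = []
    for inventory, qty in requests:
        if not inventory.try_reserve(tx_id, qty):
            for done in tried:
                done.cancel(tx_id)
            return False
        tried.append(inventory)
    for inventory in tried:
        inventory.confirm(tx_id)
    return True
```

The reservation made by Try is the "app-level lock": other transactions see reduced availability until Confirm or Cancel arrives, which is why both must be safe to retry.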

When to use what

| Pattern | Tolerates partition | Operational pain | Typical use |
| --- | --- | --- | --- |
| 2PC (XA) | no (blocks on coordinator loss) | high | within a DC, financial cores |
| TCC | yes (app-level locks) | medium | booking, reservation |
| Saga (orchestration) | yes | medium | e-commerce checkout, payments |
| Saga (choreography) | yes | low complexity, but observability is hard | event-driven systems |
| Outbox + inbox | yes | low | publish reliably from DB writes |
| Spanner-style | yes (CP) | hidden by the DB | strict global consistency |

Saga design checklist

  • Each step is idempotent.
  • Each step has a safe compensation (a semantic undo, not a physical rollback).
  • Compensations are also idempotent.
  • The process manager tracks saga_id, current step, and status.
  • Persist state before each external call (for example via the outbox).
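
The checklist above can be sketched as a tiny orchestrator. This is a minimal illustration under stated assumptions: `Step`, `SagaState`, `run_saga`, and the `save` callback are hypothetical names, `save` stands in for a durable write, and only the steps that completed before the failure are compensated.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]      # local transaction; must be idempotent
    compensate: Callable[[dict], None]  # semantic undo; must be idempotent


@dataclass
class SagaState:
    saga_id: str
    current_step: int = 0
    status: str = "RUNNING"  # RUNNING | COMPLETED | COMPENSATED


def run_saga(state: SagaState, steps: list[Step], ctx: dict, save) -> SagaState:
    for index, step in enumerate(steps):
        state.current_step = index
        save(state)  # persist progress before the external call
        try:
            step.action(ctx)
        except Exception:
            # Undo the steps that already completed, newest first.
            for completed in reversed(steps[:index]):
                completed.compensate(ctx)
            state.status = "COMPENSATED"
            save(state)
            return state
    state.status = "COMPLETED"
    save(state)
    return state
```

Because state is saved before each call, a restarted process manager can resume from `current_step`; it may then repeat a step or a compensation, which is the reason both have to be idempotent.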

Pitfalls

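  • 2PC: the presumed-abort and presumed-commit variants differ subtly in what must be logged and what recovery assumes; a coordinator failure after the prepare phase leaves participants in doubt, with resources locked until it recovers.
  • A saga without compensations is only "best effort"; be explicit about that.
  • The eventual-consistency window must be explained to the product team.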
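
The 2PC pitfall can be made concrete with a toy presumed-abort coordinator. This is a sketch under simplifying assumptions: everything runs in one process, a dict stands in for the durable recovery log, and `Participant`, `Coordinator`, and the `crash_before_phase2` flag are hypothetical names used only to simulate a coordinator failure.

```python
class Participant:
    def __init__(self, name: str, votes_yes: bool = True) -> None:
        self.name = name
        self.votes_yes = votes_yes
        self.state = "INIT"  # INIT | PREPARED | COMMITTED | ABORTED

    def prepare(self) -> bool:
        if self.votes_yes:
            self.state = "PREPARED"  # locks are held from here until a decision arrives
            return True
        self.state = "ABORTED"
        return False


class Coordinator:
    def __init__(self) -> None:
        self.log: dict[str, str] = {}  # stand-in for a durable log: tx_id -> "COMMIT"

    def run(self, tx_id: str, participants: list[Participant],
            crash_before_phase2: bool = False):
        votes = [p.prepare() for p in participants]  # phase 1: collect votes
        if all(votes):
            self.log[tx_id] = "COMMIT"  # record the decision before announcing it
            decision = "COMMIT"
        else:
            decision = "ABORT"  # presumed abort: no log record is written
        if crash_before_phase2:
            return None  # prepared participants are now in doubt
        self._broadcast(decision, participants)
        return decision

    def recover(self, tx_id: str, participants: list[Participant]) -> str:
        # Presumed abort: a missing COMMIT record means the outcome is ABORT.
        decision = self.log.get(tx_id, "ABORT")
        self._broadcast(decision, participants)
        return decision

    def _broadcast(self, decision: str, participants: list[Participant]) -> None:
        for p in participants:  # phase 2: only prepared participants need the outcome
            if p.state == "PREPARED":
                p.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"
```

Between the simulated crash and `recover`, every prepared participant sits in the `PREPARED` state and cannot release its locks on its own; that interval is the in-doubt window the diagram refers to.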

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
| --- | --- | --- | --- |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | CAP / PACELC | C vs A under partition; L vs C otherwise | cap-pacelc |
| HLD | Raft / Paxos consensus | replicated state machine via majority quorum | consensus-raft-paxos |
| HLD | Distributed transactions | 2PC, TCC, sagas, outbox/inbox | distributed-transactions |
| HLD | Logical clocks | Lamport, vector clocks, HLC, TrueTime | logical-clocks |
| HLD | Change Data Capture | WAL/binlog tailing, outbox publishing | change-data-capture |