Skip to content

MVCC & Isolation Levels — Detailed#

flowchart TB
  subgraph MVCC[Multi-Version Concurrency Control]
    direction TB
    XID[Tx id / commit seqno]
    SS[Snapshot = set of visible XIDs]
    ROW[(Row chain<br/>v1 → v2 → v3)]
    HOTUP[HOT updates / undo log]
    GC[VACUUM / GC<br/>old tuples removed]
  end

  subgraph Levels[Isolation Levels - ANSI + extensions]
    direction TB
    RU[Read Uncommitted<br/>allows dirty reads]
    RC[Read Committed<br/>no dirty reads]
    RR[Repeatable Read<br/>same row -> same value]
    SI[Snapshot Isolation<br/>own snapshot, write skew possible]
    SR([Serializable<br/>equivalent to serial schedule])
    SSI[Serializable Snapshot Isolation<br/>SI + detect dangerous structures]
  end

  subgraph Anomalies[Anomalies]
    DR[Dirty read]
    NRR[Non-repeatable read]
    PR[Phantom read]
    LU[Lost update]
    WS[Write skew]
    CR[Cycle / serializability]
  end

  subgraph Mechanisms[Mechanisms]
    LOCK[2PL pessimistic locks]
    OCC[OCC: validate at commit]
    TS[Timestamp ordering]
    LK[Predicate locks]
  end

  subgraph Engines
    PG[Postgres<br/>MVCC undo via dead tuples + VACUUM]
    INNO[InnoDB<br/>MVCC via undo log]
    SPN[Spanner / CRDB<br/>TS via TrueTime / HLC]
    ORA[Oracle<br/>MVCC via UNDO segments]
  end

  RU -. fixes DR .-> RC
  RC -. fixes NRR .-> RR
  RR -. fixes PR / WS .-> SR
  SR -. impl as SI+detect .-> SSI
  MVCC --- Engines
  Mechanisms --- Levels
  Anomalies --- Levels

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class XID,SS,HOTUP,GC,RU,RC,SI,SSI,DR,PR,LU,WS,CR,LOCK,OCC,TS,LK,INNO,ORA service;
    class ROW,RR,NRR,PG,SPN datastore;
    class SR compute;

Anomalies matrix (ANSI + Berenson)#

Level Dirty read Non-rep read Phantom Write skew
RU
RC
RR depends
SI ✓ for snapshot reads
Serializable

(✓ = prevented)

How MVCC works in Postgres#

  • Each row tuple has xmin (creating tx) and xmax (deleting tx).
  • A snapshot is { xmin, xmax, xip[] } — visible if xmin committed and xmax not committed/in snapshot.
  • Updates write a new tuple; old becomes dead → VACUUM removes.
  • Long-running tx → bloat → autovacuum cannot reclaim → table grows.

Choosing a level#

  • Most apps: Read Committed is fine.
  • Bank transfer / inventory: Serializable or SSI.
  • Reports: Repeatable Read (consistent snapshot).
  • Beware lost updates in RC — use SELECT ... FOR UPDATE or compare-and-set.

Optimistic vs pessimistic#

  • OCC: validate at commit (Postgres SSI, CRDB). Great for low contention.
  • 2PL: lock rows. Predictable but deadlocks. MySQL default.

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

Tag Concept What it is Page
HLD LSM vs B-Tree engines WAL, memtable, SSTables, compaction storage-engines-lsm-btree
HLD MVCC & isolation levels snapshot isolation, serializability, vacuum mvcc-isolation-levels
HLD Logical clocks Lamport, vector clocks, HLC, TrueTime logical-clocks