
Distributed Lock Service: Detailed

flowchart TB
  subgraph Backends[Backend choices]
    ETCD[etcd - Raft]
    ZK[ZooKeeper - Zab]
    CONSUL[Consul - Raft]
    REDIS[Redis SET NX EX + Redlock]
    DB[DB row lock SELECT FOR UPDATE]
  end

  subgraph API[Lock API]
    ACQ[Acquire timeout]
    RENEW[Renew lease]
    REL[Release]
    FENCE[Fencing token monotonic]
  end

  subgraph Failures
    FAIL1([Client GC pause -> stale lease])
    FAIL2[Network partition -> two locks]
    FAIL3[Clock skew with redlock]
    FAIL4([Lease expired but client still doing work])
  end

  subgraph Mitig[Mitigations]
    TOK[Fencing token validated by resource]
    SHORT[Short leases + autorenew]
    HEART[Heartbeats]
    SAFE[Operation idempotency]
  end

  subgraph Use[Typical uses]
    LEADER[Leader election]
    SINGLETON[Singleton job]
    CRIT[Critical section access]
    GUARD[Resource reservation]
  end

  API --- Backends
  Failures -. mitigate .-> Mitig
  Use --- API

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class FAIL1,FAIL4 client;
    class ETCD,ZK,CONSUL,ACQ,RENEW,REL,FENCE,FAIL2,FAIL3,TOK,SHORT,HEART,SAFE,LEADER,SINGLETON,CRIT,GUARD service;
    class DB datastore;
    class REDIS cache;

Why fencing tokens

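  • A lock alone is not enough: a client can pause (for example, during a long GC), have its lease expire while it is paused, then resume and act as if it still held the lock.
  • Each successful acquire returns a monotonically increasing token.
  • The protected resource tracks the highest token it has seen and rejects any operation carrying a lower one, so a stale holder cannot overwrite newer work.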

Redlock & critique

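  • Redlock (the Redis multi-node locking algorithm) is widely debated; Kleppmann's critique shows that its safety depends on timing assumptions such as roughly synchronized clocks and bounded process pauses and network delays.
  • For correctness-critical locks, prefer a consensus-backed store such as etcd or ZooKeeper, combined with fencing tokens validated by the resource.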

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Raft / Paxos consensus | replicated state machine via majority quorum | consensus-raft-paxos |
| HLD | Idempotency & retries | safe re-execution, backoff + jitter | idempotency-retries |
| LLD | Creational patterns | Singleton, Factory, Builder, Prototype | creational-patterns |