Distributed Lock Service — Detailed
flowchart TB
subgraph Backends[Backend choices]
ETCD[etcd - Raft]
ZK[ZooKeeper - Zab]
CONSUL[Consul - Raft]
REDIS[Redis SET NX EX + Redlock]
DB[DB row lock SELECT FOR UPDATE]
end
subgraph API[Lock API]
ACQ[Acquire timeout]
RENEW[Renew lease]
REL[Release]
FENCE[Fencing token monotonic]
end
subgraph Failures
FAIL1([Client GC pause -> stale lease])
FAIL2[Network partition -> two locks]
FAIL3[Clock skew with redlock]
FAIL4([Lease expired but client still doing work])
end
subgraph Mitig[Mitigations]
TOK[Fencing token validated by resource]
SHORT[Short leases + autorenew]
HEART[Heartbeats]
SAFE[Operation idempotency]
end
subgraph Use[Typical uses]
LEADER[Leader election]
SINGLETON[Singleton job]
CRIT[Critical section access]
GUARD[Resource reservation]
end
API --- Backends
Failures -. mitigate .-> Mitig
Use --- API
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class FAIL1,FAIL4 client;
class ETCD,ZK,CONSUL,ACQ,RENEW,REL,FENCE,FAIL2,FAIL3,TOK,SHORT,HEART,SAFE,LEADER,SINGLETON,CRIT,GUARD service;
class DB datastore;
class REDIS cache;
Why fencing tokens
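- A lock alone is not enough: a client can pause (for example, during GC), lose its lease while paused, then resume and act as if it still held the lock.
- Each successful acquire returns a monotonically increasing fencing token.
- The protected resource rejects any operation whose token is lower than the latest one it has seen, so a stale holder cannot overwrite a newer holder's work.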
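A minimal sketch of the token check, assuming an in-memory resource and a toy lock service. `LockService`, `FencedStore`, and `StaleTokenError` are hypothetical names for illustration, not part of any backend listed above; a real service would issue tokens from its consensus log (for example, a revision or sequence number).

```python
class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class LockService:
    """Toy lock service (hypothetical): every acquire returns the next token."""

    def __init__(self):
        self._next_token = 0

    def acquire(self):
        # Tokens only ever increase, so a later holder always has a larger one.
        self._next_token += 1
        return self._next_token


class FencedStore:
    """Hypothetical protected resource that validates fencing tokens."""

    def __init__(self):
        self.highest_token = 0  # largest token accepted so far
        self.data = {}

    def write(self, token, key, value):
        # Refuse any token lower than the latest seen: the caller's lease
        # has expired and a newer holder has already written.
        if token < self.highest_token:
            raise StaleTokenError(
                f"token {token} < latest seen {self.highest_token}"
            )
        self.highest_token = token
        self.data[key] = value


locks = LockService()
store = FencedStore()

token_a = locks.acquire()  # client A acquires, then pauses (e.g. GC)
token_b = locks.acquire()  # A's lease expires; client B acquires
store.write(token_b, "job", "written by B")

try:
    store.write(token_a, "job", "written by A")  # A resumes with a stale token
    stale_write_rejected = False
except StaleTokenError:
    stale_write_rejected = True
```

The check lives in the resource, not in the client: the paused client cannot know its lease expired, so only the resource is in a position to reject it.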
Redlock & critique
- Redlock (the Redis multi-node locking algorithm) is widely debated; Kleppmann's critique shows the pitfalls of its dependence on clocks and timing assumptions.
- For correctness-critical locks, prefer etcd or ZooKeeper with proper fencing.
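The clock dependence shows up in Redlock's validity arithmetic. The sketch below is a simplified illustration of that calculation, not a Redlock client: the function names are hypothetical, and the drift allowance (1% of the TTL plus 2 ms) is an assumed value of the kind some client libraries use.

```python
def redlock_validity_ms(ttl_ms, elapsed_ms, drift_factor=0.01):
    """Remaining time the client may treat the lock as held.

    elapsed_ms is measured on the client's local clock while it contacts
    the Redis nodes; drift_factor is an assumed clock-drift allowance.
    """
    drift_ms = ttl_ms * drift_factor + 2
    return ttl_ms - elapsed_ms - drift_ms


def redlock_acquired(n_nodes, n_locked, ttl_ms, elapsed_ms):
    """The lock counts as acquired only with a majority and time left."""
    quorum = n_nodes // 2 + 1
    return n_locked >= quorum and redlock_validity_ms(ttl_ms, elapsed_ms) > 0


# 5 nodes, 3 locked quickly: majority reached with validity remaining.
fast_majority = redlock_acquired(5, 3, ttl_ms=10_000, elapsed_ms=200)

# Only 2 of 5 nodes locked: no majority.
no_majority = redlock_acquired(5, 2, ttl_ms=10_000, elapsed_ms=200)

# Majority reached, but acquisition took almost the whole TTL.
too_slow = redlock_acquired(5, 3, ttl_ms=10_000, elapsed_ms=9_950)
```

Every quantity here is a time measurement: if a node's clock jumps, or the client pauses after the check passes, the computed validity no longer reflects reality, which is why the resource-side fencing check is still needed.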
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Raft / Paxos consensus | replicated state machine via majority quorum | consensus-raft-paxos |
| HLD | Idempotency & retries | safe re-execution, backoff + jitter | idempotency-retries |
| LLD | Creational patterns | Singleton, Factory, Builder, Prototype | creational-patterns |