Consistent Hashing — Detailed#
flowchart TB
subgraph Hashing[Hash Ring 0..2^32-1]
direction LR
H[hash function<br/>MD5 / Murmur3 / xxHash]
VN[Virtual Nodes /<br/>vnodes per server]
end
subgraph Ring[Logical Ring]
direction LR
A((vn_A1))
B((vn_B1))
C((vn_C1))
D((vn_A2))
E((vn_B2))
F((vn_C2))
G((vn_A3))
H1((vn_B3))
I((vn_C3))
A --> B --> C --> D --> E --> F --> G --> H1 --> I --> A
end
subgraph Servers[Physical Servers]
SA[Server A]
SB[Server B]
SC[Server C]
end
subgraph Replication
REP[Walk N successors<br/>for N replicas]
Q[Quorum: R+W>N]
end
subgraph Membership
GOSSIP[Gossip protocol<br/>SWIM]
HC[Failure detector<br/>phi-accrual]
REBAL[Bootstrap /<br/>handoff]
end
Key[key k]
Key --> H
H --> Ring
Ring -. owner = next vnode .-> SA
Ring -. owner = next vnode .-> SB
Ring -. owner = next vnode .-> SC
REP --> SA
REP --> SB
REP --> SC
GOSSIP -.membership.-> Ring
HC -.detect down.-> REBAL
REBAL -.move ranges.-> Servers
VN -.~100-256 vnodes/server.-> Ring
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class H,VN,SA,SB,SC,REP,Q,GOSSIP,HC,REBAL,Key service;
class A,B,C,D,E,F,G,H1,I queue;
Why virtual nodes#
- Without vnodes, ring is uneven; heterogeneous servers can't be weighted.
- ~100–256 vnodes/server smooths distribution; resize easily.
Replication strategy#
- Walk
Nsuccessor positions on the ring (skip same physical host / rack). - Read/write quorum:
R + W > Nfor consistency.
Variants#
- Jump consistent hash (Lamping & Veach): no ring, O(1) memory; only bucket count, no removal of arbitrary nodes.
- Rendezvous (HRW) hashing: pick max(hash(key,node)); easy weights.
- Maglev hashing (Google): lookup table, near-uniform, ECMP-friendly.
Where it's used#
- Memcached client (Ketama), Redis Cluster (slot 0–16383 via CRC16), Cassandra & Dynamo (token ring), Amazon DynamoDB, Riak, Akamai CDN, Google Maglev LB.
Glossary & fundamentals#
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
HLD |
CDN | edge caching for static assets | cdn |
HLD |
Consistent hashing | key placement with minimal remap | consistent-hashing |
HLD |
Raft / Paxos consensus | replicated state machine via majority quorum | consensus-raft-paxos |
HLD |
Leader/follower replication | sync/semi-sync/async replication, failover | replication-leader-follower |
LLD |
Behavioural patterns | Strategy, Observer, State, Command, Chain | behavioral-patterns |