Skip to content

Consistent Hashing — Detailed#

flowchart TB
  subgraph Hashing[Hash Ring 0..2^32-1]
    direction LR
    H[hash function<br/>MD5 / Murmur3 / xxHash]
    VN[Virtual Nodes /<br/>vnodes per server]
  end

  subgraph Ring[Logical Ring]
    direction LR
    A((vn_A1))
    B((vn_B1))
    C((vn_C1))
    D((vn_A2))
    E((vn_B2))
    F((vn_C2))
    G((vn_A3))
    H1((vn_B3))
    I((vn_C3))
    A --> B --> C --> D --> E --> F --> G --> H1 --> I --> A
  end

  subgraph Servers[Physical Servers]
    SA[Server A]
    SB[Server B]
    SC[Server C]
  end

  subgraph Replication
    REP[Walk N successors<br/>for N replicas]
    Q[Quorum: R+W>N]
  end

  subgraph Membership
    GOSSIP[Gossip protocol<br/>SWIM]
    HC[Failure detector<br/>phi-accrual]
    REBAL[Bootstrap /<br/>handoff]
  end

  Key[key k]
  Key --> H
  H --> Ring
  Ring -. owner = next vnode .-> SA
  Ring -. owner = next vnode .-> SB
  Ring -. owner = next vnode .-> SC
  REP --> SA
  REP --> SB
  REP --> SC
  GOSSIP -.membership.-> Ring
  HC -.detect down.-> REBAL
  REBAL -.move ranges.-> Servers
  VN -.~100-256 vnodes/server.-> Ring

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class H,VN,SA,SB,SC,REP,Q,GOSSIP,HC,REBAL,Key service;
    class A,B,C,D,E,F,G,H1,I queue;

Why virtual nodes#

  • Without vnodes, ring is uneven; heterogeneous servers can't be weighted.
  • ~100–256 vnodes/server smooths distribution; resize easily.

Replication strategy#

  • Walk N successor positions on the ring (skip same physical host / rack).
  • Read/write quorum: R + W > N for consistency.

Variants#

  • Jump consistent hash (Lamping & Veach): no ring, O(1) memory; only bucket count, no removal of arbitrary nodes.
  • Rendezvous (HRW) hashing: pick max(hash(key,node)); easy weights.
  • Maglev hashing (Google): lookup table, near-uniform, ECMP-friendly.

Where it's used#

  • Memcached client (Ketama), Redis Cluster (slot 0–16383 via CRC16), Cassandra & Dynamo (token ring), Amazon DynamoDB, Riak, Akamai CDN, Google Maglev LB.

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

Tag Concept What it is Page
HLD CDN edge caching for static assets cdn
HLD Consistent hashing key placement with minimal remap consistent-hashing
HLD Raft / Paxos consensus replicated state machine via majority quorum consensus-raft-paxos
HLD Leader/follower replication sync/semi-sync/async replication, failover replication-leader-follower
LLD Behavioural patterns Strategy, Observer, State, Command, Chain behavioral-patterns