Database Sharding — Simple

Problem statement (interviewer prompt)

A single primary database is saturated on writes at 10 TB of data and 50k QPS. Design a sharding strategy that distributes data across N nodes and supports online resharding without downtime, and explain how cross-shard queries and transactions are handled.

Split one logical dataset across N independent DB nodes so that reads and writes for keys on different shards are served by different machines.

flowchart LR
  App[Application]
  R[Shard Router]
  S1[(Shard 1<br/>users A-H)]
  S2[(Shard 2<br/>users I-P)]
  S3[(Shard 3<br/>users Q-Z)]
  App --> R
  R --> S1
  R --> S2
  R --> S3

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class App,R service;
    class S1,S2,S3 datastore;
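
The router in the diagram can be sketched in a few lines. This is a minimal illustration of range-based routing, assuming the three letter ranges shown above (A-H, I-P, Q-Z) and keying on the first letter of the username. The names `RANGE_STARTS`, `SHARD_NAMES`, and `shard_for` are hypothetical and chosen only for this example; a real router would load its shard map from configuration or a metadata service rather than hard-coding it.

```python
from bisect import bisect_right

# Hypothetical shard map mirroring the diagram: the first letter each shard
# owns, in sorted order. Shard i owns [RANGE_STARTS[i], RANGE_STARTS[i+1]).
RANGE_STARTS = ["A", "I", "Q"]
SHARD_NAMES = ["shard-1", "shard-2", "shard-3"]


def shard_for(username: str) -> str:
    """Return the shard that owns `username` under range partitioning."""
    if not username or not (username[0].isascii() and username[0].isalpha()):
        # Assumption for this sketch: keys always start with an ASCII letter.
        raise ValueError(f"cannot route key: {username!r}")
    first = username[0].upper()
    # bisect_right finds the first range start greater than `first`;
    # the owning shard is the one just before that position.
    idx = bisect_right(RANGE_STARTS, first) - 1
    return SHARD_NAMES[idx]


if __name__ == "__main__":
    for name in ["alice", "henry", "ivan", "peter", "quinn", "zoe"]:
        print(name, "->", shard_for(name))
```

Keeping the boundaries in a sorted list means a range can be split by inserting one new boundary and one new shard name, which is the kind of change a resharding step would make to the shard map.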