Capacity Planning — Detailed#

Numbers every developer should know#

| Operation | Time |
| --- | --- |
| L1 cache reference | 0.5 ns |
| Branch mispredict | 5 ns |
| L2 cache reference | 7 ns |
| Mutex lock/unlock | 25 ns |
| Main memory reference | 100 ns |
| Compress 1 KB with Zippy | 3,000 ns (3 µs) |
| Send 1 KB over 1 Gbps network | 10,000 ns (10 µs) |
| Read 4 KB random from SSD | 150,000 ns (150 µs) |
| Read 1 MB sequential from RAM | 250,000 ns (250 µs) |
| Round trip in same DC | 500,000 ns (0.5 ms) |
| Read 1 MB sequential from SSD | 1,000,000 ns (1 ms) |
| HDD seek | 10,000,000 ns (10 ms) |
| Cross-continent round trip | 150,000,000 ns (150 ms) |

(Memorise the orders of magnitude. Interviewers often probe the gap between RAM and disk: a main memory reference is ~100 ns, a random SSD read is ~150 µs, and an HDD seek is ~10 ms.)
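
To make the gaps concrete, the sketch below computes how many times slower one operation is than another. The dictionary and the `ratio` helper are illustrative names, and only a few rows of the table are copied in.

```python
# Latencies in nanoseconds, copied from a few rows of the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "Read 4 KB random from SSD": 150_000,
    "Round trip in same DC": 500_000,
    "HDD seek": 10_000_000,
    "Cross-continent round trip": 150_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """Return how many times slower `slow` is than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    # Express every row as a multiple of an L1 cache reference.
    base = LATENCY_NS["L1 cache reference"]
    for name, ns in LATENCY_NS.items():
        print(f"{name:<28} {ns / base:>14,.0f}x L1")
```

For instance, `ratio("HDD seek", "Main memory reference")` evaluates to 100,000: five orders of magnitude between RAM and a spinning disk.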

Common throughput envelopes#

| Resource | Order of magnitude |
| --- | --- |
| One commodity server CPU | ~100k simple req/s |
| 10 GbE NIC | ~1 GB/s (8 Gbps usable) |
| NVMe SSD | 1–7 GB/s, 1M IOPS |
| Postgres single primary | 5–20k writes/s, much more reads via replicas |
| Redis single node | 100k–1M ops/s |
| Kafka partition | 10 MB/s comfortably |
| One WS gateway | 100k concurrent conns |

Little's Law#

L = λ · W
  • L = concurrent items in system.
  • λ = arrival rate (req/s).
  • W = average time in system (seconds).

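If average latency W = 200 ms and the target is 1,000 RPS, then L = 1,000 × 0.2 = 200 requests in flight. Little's Law is stated for the mean, so plugging in p99 = 200 ms instead gives a conservative upper estimate rather than the true average concurrency.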
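
A minimal sketch of the same arithmetic, plus the rearranged form that answers "what rate can a fixed pool sustain?". The function names are hypothetical.

```python
def concurrency(arrival_rate_rps: float, avg_time_s: float) -> float:
    """Little's Law: L = lambda * W (items in the system on average)."""
    return arrival_rate_rps * avg_time_s


def max_throughput(concurrency_limit: float, avg_time_s: float) -> float:
    """Rearranged: lambda = L / W, the rate a pool of fixed size can sustain."""
    return concurrency_limit / avg_time_s


if __name__ == "__main__":
    print(concurrency(1_000, 0.2))     # in-flight requests at 1,000 RPS, 200 ms
    print(max_throughput(200, 0.2))    # RPS that 200 workers/connections support
```

The second form is handy for sizing thread pools or connection pools: a pool of 200 with a 200 ms average hold time tops out near 1,000 req/s.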

The 5-step back-of-envelope#

flowchart LR
  U[1 - User volume]
  Q[2 - QPS - avg + peak]
  S[3 - Storage / yr]
  B[4 - Bandwidth]
  C[5 - Servers needed]
  U --> Q --> S --> B --> C

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class U client;
    class Q,S,B,C service;

Worked example — design a chat app#

  1. Users: 200M MAU. ~25M DAU (≈ 12.5% of MAU).
  2. Activity: 30 msgs / user / day → 750M msgs/day.
  3. Avg QPS = 750M / 86,400 ≈ 8,700/s.
  4. Peak ≈ 3–5× avg; taking 4× gives ≈ 35,000/s.
  5. Storage: 1 KB / msg × 750M msgs/day × 365 days ≈ 270 TB/year. With media + indices, ×3 → ~800 TB/yr.
  6. Bandwidth: 35k msgs/s peak × 1 KB = 35 MB/s ingest. Egress is similar for 1:1 chat, where each message is delivered to one recipient.
  7. Servers: WS gateways hold the connections. Worst case, all 25M DAU are connected at once: 25M / 100k per box = 250 boxes (with HA + headroom → 350).
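
The steps above can be reproduced in a few lines, which makes it easy to vary one input and see what moves. This is a sketch under stated assumptions: the function and parameter names are hypothetical, units are decimal (1 KB = 1,000 bytes), the peak factor of 4 is the midpoint of the 3–5× range, and 40% headroom is chosen because it lands on the 350-box figure.

```python
import math

SECONDS_PER_DAY = 86_400


def chat_estimate(
    dau: int = 25_000_000,
    msgs_per_user_per_day: int = 30,
    peak_factor: float = 4,            # assumption: midpoint of 3-5x
    msg_bytes: int = 1_000,            # 1 KB, decimal units
    storage_multiplier: float = 3,     # media + indices
    conns_per_gateway: int = 100_000,
    headroom_pct: int = 40,            # assumption: within the 30-50% band
) -> dict:
    msgs_per_day = dau * msgs_per_user_per_day
    avg_qps = msgs_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    raw_tb_per_year = msgs_per_day * msg_bytes * 365 / 1e12
    # Worst case: every daily active user holds a connection at once.
    gateways = math.ceil(dau / conns_per_gateway)
    return {
        "msgs_per_day": msgs_per_day,
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "raw_tb_per_year": raw_tb_per_year,
        "total_tb_per_year": raw_tb_per_year * storage_multiplier,
        "ingest_mb_per_s": peak_qps * msg_bytes / 1e6,
        "gateways": gateways,
        "gateways_with_headroom": math.ceil(gateways * (100 + headroom_pct) / 100),
    }


if __name__ == "__main__":
    for key, value in chat_estimate().items():
        print(f"{key:<24} {value:,.0f}")
```

The unrounded results (about 8,681 avg QPS, 273.75 TB/yr raw, 821 TB/yr total) differ slightly from the rounded figures in the list, which is expected for a back-of-envelope estimate.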

Headroom & growth#

  • Always size to peak, not average.
  • Keep 30–50% headroom for failover (lose a region or two AZs).
  • Plan for 1 year of growth; revisit quarterly.
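
A sketch of how these rules combine into a server count. The helper names and the 5,000 req/s per-server figure are hypothetical. The second function shows one way to read the 30–50% band, assuming load is spread evenly across zones: losing one of three zones pushes 50% more load onto each survivor, and losing one of four pushes about 33% more.

```python
import math


def servers_needed(
    peak_rps: float,
    per_server_rps: float,
    headroom_pct: float = 40,
    yearly_growth_pct: float = 0,
) -> int:
    """Size to peak, then apply expected growth and failover headroom."""
    target = peak_rps * (100 + yearly_growth_pct) / 100
    target = target * (100 + headroom_pct) / 100
    return math.ceil(target / per_server_rps)


def zone_loss_load_factor(zones: int, lost: int = 1) -> float:
    """Load multiplier on each surviving zone after `lost` zones fail."""
    if lost >= zones:
        raise ValueError("at least one zone must survive")
    return zones / (zones - lost)


if __name__ == "__main__":
    print(servers_needed(35_000, 5_000))                        # today
    print(servers_needed(35_000, 5_000, yearly_growth_pct=50))  # one year out
    print(zone_loss_load_factor(3))                             # 3 zones, lose 1
```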

Storage tiers#

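| Tier | Cost per GB per month | Latency (order of magnitude) |
| --- | --- | --- |
| RAM | $1+ | ns |
| NVMe SSD | $0.10 | µs |
| Standard cloud SSD | $0.10 | ms |
| Object store (hot) | $0.02 | 10s of ms |
| Cold archive (Glacier) | $0.004 | minutes to hours |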
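
To see why tiering matters, the sketch below prices the same volume of data on each tier using the table's per-GB figures. The names are illustrative, the RAM price is the table's lower bound ("$1+"), and 1 TB is taken as 1,000 GB.

```python
# Cost per GB per month in USD, copied from the table above.
COST_PER_GB_MONTH = {
    "RAM": 1.00,                 # lower bound; the table says "$1+"
    "NVMe SSD": 0.10,
    "Standard cloud SSD": 0.10,
    "Object store (hot)": 0.02,
    "Cold archive": 0.004,
}


def monthly_cost_usd(terabytes: float, tier: str) -> float:
    """Monthly cost of keeping `terabytes` of data on the given tier."""
    return terabytes * 1_000 * COST_PER_GB_MONTH[tier]


if __name__ == "__main__":
    # 800 TB is roughly one year of the chat example's data.
    for tier in COST_PER_GB_MONTH:
        print(f"{tier:<20} ${monthly_cost_usd(800, tier):>12,.0f} / month")
```

At these prices, 800 TB costs about $16,000 per month in a hot object store versus about $80,000 on SSD, a 5× difference.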

Queueing theory primer#

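  • M/M/1: W = 1 / (μ - λ), where μ = service rate, λ = arrival rate (both in req/s), and W = mean time in system.
  • With utilisation ρ = λ / μ, this is W = (1/μ) / (1 - ρ). At 70% utilisation W is about 3.3× the bare service time and the queue starts to grow; at 90% it is 10×, and latency explodes as ρ → 1.
  • Engineer for ≤ 70% steady-state.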
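
A short sketch of the M/M/1 formula shows how sharply latency rises with utilisation. The function names are hypothetical, and the model assumes a single server with Poisson arrivals and exponential service times.

```python
def mm1_time_in_system(service_rate: float, arrival_rate: float) -> float:
    """Mean time in system for M/M/1: W = 1 / (mu - lambda), in seconds."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1 / (service_rate - arrival_rate)


def slowdown(utilisation: float) -> float:
    """W relative to the bare service time 1/mu: 1 / (1 - rho)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return 1 / (1 - utilisation)


if __name__ == "__main__":
    mu = 100  # assumed service rate: 100 req/s, i.e. 10 ms per request
    for rho in (0.5, 0.7, 0.9, 0.95, 0.99):
        w_ms = mm1_time_in_system(mu, rho * mu) * 1_000
        print(f"rho={rho:.2f}  W={w_ms:7.1f} ms  ({slowdown(rho):.1f}x service time)")
```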

Common mistakes#

  • Confusing bits with bytes: network speeds are quoted in bits per second, so 1 Gbps is 125 MB/s.
  • Forgetting replication factor when sizing storage.
  • Computing QPS as avg-only; peak is what blows up.
  • Ignoring the read:write ratio (often 100:1 for content).
  • Lumping everything into a single "DB" line item instead of sizing each workload separately.
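
Three of these mistakes come down to one-line conversions, sketched below. The helper names are hypothetical, and the replication factor of 3 and the 100:1 ratio are example defaults, not recommendations.

```python
def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert link speed in gigabits/s to megabytes/s (decimal units)."""
    return gbps * 1_000 / 8


def replicated_storage_tb(logical_tb: float, replication_factor: int = 3) -> float:
    """Physical storage once every byte is stored `replication_factor` times."""
    return logical_tb * replication_factor


def split_read_write(total_qps: float, reads_per_write: float = 100):
    """Split total QPS into (reads, writes) for a given read:write ratio."""
    writes = total_qps / (reads_per_write + 1)
    return total_qps - writes, writes


if __name__ == "__main__":
    print(gbps_to_mb_per_s(1))            # 1 Gbps link in MB/s
    print(replicated_storage_tb(270))     # 270 TB logical, 3 copies
    print(split_read_write(10_100))       # 100:1 read:write
```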

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
| --- | --- | --- | --- |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Leader/follower replication | sync/semi-sync/async replication, failover | replication-leader-follower |
| HLD | Capacity planning | BOE, Little's Law, queueing | capacity-planning |
| LLD | Concurrency primitives | mutex, semaphore, RW lock, atomic, CAS | concurrency-primitives |