Home
HLD Fundamentals
Caching Strategies
HLD
LLD
Caching Strategies — Notes
Hierarchy (latency)
L1 CPU cache: 0.5–10 ns
RAM: 100 ns
App-local (Caffeine): ~µs
Remote cache (Redis same DC): 0.3–1 ms
SSD: 100 µs
HDD: 10 ms
Cross-DC: 50–200 ms
When NOT to cache
Mutates frequently and read-immediately-after-write semantics required.
Per-user data with tiny re-read probability.
Already cheap (< 1 ms) reads.
Sizing
Working-set estimation: 80/20 rule — cache hot 20% to hit 80%.
Hit ratio target: 90–99% for read-heavy.
Concurrency pitfalls
Stampede : 1000 RPS hit cache miss simultaneously → all query DB.
Fix: single-flight (one fetches, others wait), EX NX lock with backoff, or pre-compute.
Inconsistency window : DB updated, cache not yet evicted. Mitigate with short TTL or CDC.
Choosing TTL
Tolerance for staleness × refresh cost.
Add jitter (TTL ± 10%) to avoid mass expiry.
Refs
Facebook memcached at scale (NSDI '13), Netflix EVCache, DynamoDB DAX,
TinyLFU / W-TinyLFU paper (Caffeine).
Back to top