Skip to content

Caching Strategies — Notes#

Hierarchy (latency)#

  • L1 CPU cache: 0.5–10 ns
  • RAM: 100 ns
  • App-local (Caffeine): ~µs
  • Remote cache (Redis same DC): 0.3–1 ms
  • SSD: 100 µs
  • HDD: 10 ms
  • Cross-DC: 50–200 ms

When NOT to cache#

  • Mutates frequently and read-immediately-after-write semantics required.
  • Per-user data with tiny re-read probability.
  • Already cheap (< 1 ms) reads.

Sizing#

  • Working-set estimation: 80/20 rule — cache hot 20% to hit 80%.
  • Hit ratio target: 90–99% for read-heavy.

Concurrency pitfalls#

  • Stampede: 1000 RPS hit cache miss simultaneously → all query DB. Fix: single-flight (one fetches, others wait), EX NX lock with backoff, or pre-compute.
  • Inconsistency window: DB updated, cache not yet evicted. Mitigate with short TTL or CDC.

Choosing TTL#

  • Tolerance for staleness × refresh cost.
  • Add jitter (TTL ± 10%) to avoid mass expiry.

Refs#

  • Facebook memcached at scale (NSDI '13), Netflix EVCache, DynamoDB DAX, TinyLFU / W-TinyLFU paper (Caffeine).