Time-Series Database — Notes

Functional

  • Ingest timestamped numeric (and sometimes string) samples tagged by labels.
  • Query by time range, with aggregations and downsampling.
  • Retention + downsampling rules.
  • Alerting on derived series.
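As a rough illustration of a range query combined with downsampling, the sketch below buckets raw samples into fixed-width windows and averages each window. The function name `downsample_avg` and the `(ts, value)` tuple layout are assumptions made for this example, not the API of any particular TSDB.

```python
from collections import defaultdict

def downsample_avg(samples, start, end, step):
    """Average samples with start <= ts < end into buckets of width `step`.

    samples: iterable of (ts, value) pairs (hypothetical layout).
    Returns a list of (bucket_start, mean) pairs sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        if start <= ts < end:
            # Align each sample to the start of its bucket.
            bucket = start + ((ts - start) // step) * step
            sums[bucket] += value
            counts[bucket] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]

raw = [(0, 1.0), (10, 3.0), (60, 10.0), (70, 20.0), (130, 5.0)]
print(downsample_avg(raw, start=0, end=120, step=60))  # [(0, 2.0), (60, 15.0)]
```

A real engine would typically precompute such rollups per retention tier rather than averaging at query time; this only shows the bucketing arithmetic.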

Non-functional

  • Ingest 1M+ samples/s per node, stored in a compressed format.
  • p99 query < 1 s for typical dashboard panels.
  • Retention from days (hot) to years (cold tier).

Capacity

  • Active series per node: 1–10 M depending on RAM.
  • Disk: ~1–2 bytes / sample after Gorilla-style compression, versus 16 bytes raw (8-byte timestamp + 8-byte float).
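The per-sample figure can be turned into a back-of-the-envelope disk estimate. The sketch below assumes a workload of 1M samples/s and a 15-day hot retention; both numbers are illustrative assumptions, and the estimate ignores index, write-ahead log, and replication overhead.

```python
def disk_bytes(samples_per_sec, retention_days, bytes_per_sample):
    """Estimated sample storage for one node (index and WAL overhead ignored)."""
    seconds = retention_days * 24 * 60 * 60
    return samples_per_sec * seconds * bytes_per_sample

# Assumed workload: 1M samples/s kept for 15 days, at 1 and 2 bytes per sample.
low = disk_bytes(1_000_000, 15, 1.0)
high = disk_bytes(1_000_000, 15, 2.0)
print(f"{low / 1e12:.2f} to {high / 1e12:.2f} TB")  # 1.30 to 2.59 TB
```

At 16 bytes per raw sample the same workload would need roughly 20 TB, which is why the compression ratio dominates hot-tier sizing.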

Schema

  • Series = (name, label set) → unique ID.
  • Sample = (series_id, ts, value).
  • Inverted index: label=value → posting list of series IDs.
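A minimal in-memory sketch of this schema is shown below. The class name `SeriesIndex` and its methods are hypothetical; storing the metric name under a `__name__` label follows the Prometheus convention. Real engines keep posting lists sorted and compressed on disk, which this toy version does not attempt.

```python
from collections import defaultdict

class SeriesIndex:
    """Toy index: series identity -> ID, plus label=value posting lists."""

    def __init__(self):
        self._ids = {}                     # (name, sorted label pairs) -> series ID
        self._postings = defaultdict(set)  # (label, value) -> set of series IDs

    def get_or_create(self, name, labels):
        """Return the ID for (name, label set), allocating one on first sight."""
        key = (name, tuple(sorted(labels.items())))
        if key not in self._ids:
            sid = len(self._ids) + 1
            self._ids[key] = sid
            self._postings[("__name__", name)].add(sid)
            for pair in labels.items():
                self._postings[pair].add(sid)
        return self._ids[key]

    def select(self, matchers):
        """Return sorted IDs of series matching every label=value in `matchers`."""
        lists = [self._postings.get(pair, set()) for pair in matchers.items()]
        return sorted(set.intersection(*lists)) if lists else []

idx = SeriesIndex()
a = idx.get_or_create("http_requests_total", {"job": "api", "code": "200"})
b = idx.get_or_create("http_requests_total", {"job": "api", "code": "500"})
c = idx.get_or_create("http_requests_total", {"job": "web", "code": "200"})

print(idx.select({"job": "api"}))                                      # [1, 2]
print(idx.select({"__name__": "http_requests_total", "code": "200"}))  # [1, 3]
```

A query first intersects posting lists to find series IDs, then reads samples only for those IDs, so label selectivity drives query cost.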

API

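# Prometheus remote_write (ingest) + HTTP query API.
# The Pushgateway is a separate component with its own /metrics/job/<job> endpoint.
POST /api/v1/write   snappy-compressed protobuf (WriteRequest)
GET  /api/v1/query?query=rate(http_requests_total[5m])
GET  /api/v1/query_range?query=<expr>&start=<ts>&end=<ts>&step=<duration>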
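For the range endpoint, a client supplies `query`, `start`, `end`, and `step`. The sketch below only builds the request URL and does not contact a server; the base address `http://localhost:9090` and the helper name `query_range_url` are assumptions for illustration.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9090"  # assumed address of a Prometheus-compatible server

def query_range_url(expr, start, end, step):
    """Build a query_range URL; start/end are Unix timestamps, step a duration."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    # urlencode percent-escapes the parentheses and brackets in the expression.
    return f"{BASE}/api/v1/query_range?{urlencode(params)}"

url = query_range_url("rate(http_requests_total[5m])", 1700000000, 1700003600, "60s")
print(url)
```

A dashboard panel covering one hour at a 60 s step asks for about 60 points per series, which is one reason `step` is usually derived from panel width rather than fixed.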

Trade-offs

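  • Cardinality limit: every production TSDB lives or dies by its active-series count. Each distinct combination of label values is a separate series, so never label with unbounded values such as user IDs.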
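  • Pull vs push: Prometheus scrapes its targets (pull), which pairs well with service discovery; InfluxDB accepts writes pushed by clients, which suits short-lived jobs that may exit before a scrape happens.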
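  • One backend vs federated: Thanos fans queries out across many Prometheus shards, while Mimir ingests their remote_write streams into a horizontally scaled cluster; both expose a single query API.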
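  • Long retention = cold tier on object storage; the query layer merges hot (local TSDB) and cold (object-store blocks) data, e.g. in Thanos via sidecars for recent data and store gateways for uploaded blocks.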
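To make the cardinality point concrete, the sketch below computes a worst-case series count for one metric as the product of distinct values per label. The label names and counts are hypothetical, and real cardinality is usually lower because not every combination occurs.

```python
from math import prod

def worst_case_series(label_values):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_values.values())

# Hypothetical label sets for a single metric.
bounded = {"method": 5, "status": 10, "instance": 50}
with_user = {**bounded, "user_id": 100_000}

print(worst_case_series(bounded))    # 2500
print(worst_case_series(with_user))  # 250000000
```

Adding the single `user_id` label moves one metric from a few thousand series to far beyond the 1–10 M active series a node is budgeted for.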

Refs

  • "Gorilla: A Fast, Scalable, In-Memory Time Series Database" (Facebook, VLDB 2015).
  • Prometheus storage docs; TimescaleDB chunking docs.
  • Thanos / Cortex / VictoriaMetrics architecture posts.