Time-Series Database — Notes
Functional
- Ingest timestamped numeric (and sometimes string) samples tagged by labels.
- Query with range, aggregations, downsampling.
- Retention + downsampling rules.
- Alerting on derived series.
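A minimal sketch of the downsampling rule above, assuming fixed-width time buckets and a mean aggregate. The function name `downsample` and the 60-second step are illustrative choices, not taken from any particular TSDB.

```python
from collections import defaultdict
from statistics import mean

def downsample(samples, step):
    """Average (ts, value) samples into fixed-width buckets of `step` seconds.

    Returns one (bucket_start, mean_value) pair per non-empty bucket,
    sorted by bucket start time.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp down to the start of its bucket.
        buckets[ts - ts % step].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]

# Hypothetical raw samples at irregular intervals (seconds, value).
raw = [(0, 1.0), (15, 3.0), (30, 5.0), (60, 10.0), (75, 20.0)]
rollup = downsample(raw, 60)
print(rollup)  # [(0, 3.0), (60, 15.0)]
```

Real systems usually keep several aggregates per bucket (min, max, sum, count) so that later queries can still compute averages and rates correctly.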
Non-functional
- Ingest 1M+ samples/s per node, stored in a compressed format.
- p99 query latency < 1 s for typical dashboard panels.
- Retention from days (hot) to years (cold tier).
Capacity
- Active series per node: 1–10 M depending on RAM.
- Disk: ~1–2 bytes / sample after Gorilla compression.
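The capacity figures above can be tied together with back-of-the-envelope arithmetic. This is a sketch under stated assumptions: a 10-second scrape interval, 1.5 bytes per sample (the midpoint of the range above), and a 15-day hot tier are all illustrative values, not numbers from the notes.

```python
SECONDS_PER_DAY = 86_400

def ingest_rate(active_series, scrape_interval_s):
    """Samples per second produced by `active_series` scraped every interval."""
    return active_series / scrape_interval_s

def disk_bytes(samples_per_sec, bytes_per_sample, retention_days):
    """Raw sample storage for the retention window (index and WAL excluded)."""
    return samples_per_sec * bytes_per_sample * retention_days * SECONDS_PER_DAY

# Assumed inputs: 10 M active series, one sample every 10 s.
rate = ingest_rate(10_000_000, 10)          # 1,000,000 samples/s
hot = disk_bytes(rate, 1.5, 15)             # 15 days of hot retention
print(f"{rate:,.0f} samples/s, {hot / 1e12:.3f} TB")
```

Under these assumptions a node at the top of the active-series range lands right at the 1M samples/s target and needs roughly 2 TB of disk for the hot tier.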
Schema
- Series = (name, label set) → unique ID.
- Sample = (series_id, ts, value).
- Inverted index: label=value → posting list of series IDs.
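The schema above can be sketched as an in-memory index. This is a minimal illustration, not how any specific TSDB lays out its index: the class `SeriesIndex` and its methods are hypothetical, posting lists are plain Python sets, and only equality matchers are supported. Storing the metric name under the `__name__` label follows the Prometheus convention.

```python
from collections import defaultdict

class SeriesIndex:
    def __init__(self):
        self._ids = {}                     # (name, frozenset(labels)) -> series ID
        self._postings = defaultdict(set)  # (label, value) -> set of series IDs

    def get_or_create(self, name, labels):
        """Return the series ID for (name, labels), allocating one if new."""
        key = (name, frozenset(labels.items()))
        if key not in self._ids:
            sid = len(self._ids)
            self._ids[key] = sid
            # Index the metric name like any other label.
            self._postings[("__name__", name)].add(sid)
            for pair in labels.items():
                self._postings[pair].add(sid)
        return self._ids[key]

    def select(self, **matchers):
        """Intersect the posting lists of all label=value matchers."""
        sets = [self._postings.get(pair, set()) for pair in matchers.items()]
        return sorted(set.intersection(*sets)) if sets else []

idx = SeriesIndex()
a = idx.get_or_create("http_requests_total", {"job": "api", "code": "200"})
b = idx.get_or_create("http_requests_total", {"job": "api", "code": "500"})
c = idx.get_or_create("http_requests_total", {"job": "web", "code": "200"})
print(idx.select(job="api"))              # [0, 1]
print(idx.select(job="api", code="200"))  # [0]
```

Production indexes typically keep posting lists sorted and compressed so that intersections can be streamed rather than materialized as sets.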
API
# Prometheus remote_write (ingest) and HTTP query API
POST /api/v1/write   body: snappy-compressed protobuf (WriteRequest)
GET /api/v1/query?query=rate(http_requests_total[5m])
GET /api/v1/query_range?query=<expr>&start=<ts>&end=<ts>&step=<duration>
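As an illustration of the range-query endpoint, the sketch below builds a `query_range` URL with the standard `query`, `start`, `end`, and `step` parameters. The helper name `query_range_url`, the base address `http://localhost:9090`, and the timestamps are assumptions for the example; no request is actually sent.

```python
from urllib.parse import urlencode

def query_range_url(base, query, start, end, step):
    """Build a Prometheus-style range query URL (no network call is made)."""
    params = {"query": query, "start": start, "end": end, "step": step}
    # urlencode percent-escapes the PromQL expression, e.g. "(" -> "%28".
    return f"{base}/api/v1/query_range?{urlencode(params)}"

url = query_range_url(
    "http://localhost:9090",           # assumed local server address
    "rate(http_requests_total[5m])",
    1700000000,                        # start: Unix seconds (example value)
    1700003600,                        # end: one hour later
    "15s",                             # resolution of the returned points
)
print(url)
```

The `step` value determines how many points come back per series, so dashboards usually derive it from the panel width rather than hard-coding it.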
Trade-offs
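- Cardinality limit: every production TSDB lives or dies by it, because each distinct label combination is a separate series in the index. Never label with unbounded values such as user IDs.
- Pull vs push: Prometheus pulls (good for service discovery); InfluxDB pushes (good for short-lived jobs).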
- One backend vs federated: Thanos fans queries out across many Prometheus shards; Mimir ingests from them via remote_write into one horizontally scaled cluster. Both expose a single query API.
- Long retention = cold tier on object storage; the query layer merges hot data (served by sidecars or ingesters) with cold blocks read from object storage.
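The cardinality warning above follows from simple multiplication: the worst-case series count for one metric is the product of the distinct values of each label. The sketch below uses made-up label cardinalities to show how a single unbounded label overwhelms a node's series budget.

```python
from math import prod

def worst_case_series(label_cardinalities):
    """Upper bound on series for one metric: product of per-label value counts."""
    return prod(label_cardinalities.values())

# Hypothetical label cardinalities for a single HTTP metric.
base = {"method": 5, "status": 10, "instance": 100}
with_user = {**base, "user_id": 1_000_000}

print(worst_case_series(base))       # 5000
print(worst_case_series(with_user))  # 5000000000
```

With bounded labels the metric costs at most 5,000 series; adding a user ID label with a million values raises the bound to 5 billion, far past the 1–10 M active series a node can hold.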
Refs
- "Gorilla: A Fast, Scalable, In-Memory Time Series Database" (FB, VLDB '15).
- Prometheus storage docs; TimescaleDB chunking docs.
- Thanos / Cortex / VictoriaMetrics architecture posts.