Real-time Analytics — Notes
Functional
- Sessionization, funnels, retention.
- Per-event dashboards & ad-hoc queries.
- Real-time alerts.
- Export to warehouse for SQL.
Non-functional
- Pipeline lag < 5 s.
- Sub-second dashboard for last 24h.
- Petabytes long-term in lake.
Trade-offs
- Druid/Pinot for high-QPS dashboards; ClickHouse for ad-hoc SQL.
- Lambda vs Kappa: kappa simpler if stream-first.
- Exactly-once is expensive; design idempotent counters.
Refs
- Druid, Pinot, ClickHouse docs.
- "Streaming Systems" Tyler Akidau.
- Snowplow / Heap engineering posts.