Real-time Analytics — Simple

Problem statement (interviewer prompt)

Design a real-time analytics platform for clickstream events. It must ingest 1M+ events/s, sessionise events per user, compute funnel, retention, and cohort metrics with sub-minute freshness, and serve dashboard queries over both live and historical data with sub-second latency.
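
To make "sessionise per user" concrete, here is a minimal batch sketch of gap-based sessionisation. The 30-minute inactivity gap, the `Event` fields, and the `sessionise` helper are all assumptions for illustration; the prompt does not specify a session definition, and a real stream processor would do this incrementally with session windows rather than by sorting a full list.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed inactivity gap; the prompt does not define what ends a session.
SESSION_GAP_SECONDS = 30 * 60


@dataclass(frozen=True)
class Event:
    """Hypothetical clickstream event shape."""
    user_id: str
    ts: float   # event time, seconds since epoch
    name: str   # e.g. "view", "add_to_cart"


def sessionise(
    events: Iterable[Event], gap: float = SESSION_GAP_SECONDS
) -> dict[str, list[list[Event]]]:
    """Group each user's events into sessions, splitting on inactivity gaps."""
    sessions: dict[str, list[list[Event]]] = {}
    # Sort by (user, time) so each user's events are visited in time order.
    for ev in sorted(events, key=lambda e: (e.user_id, e.ts)):
        user_sessions = sessions.setdefault(ev.user_id, [])
        if user_sessions and ev.ts - user_sessions[-1][-1].ts <= gap:
            # Within the gap of the previous event: extend the current session.
            user_sessions[-1].append(ev)
        else:
            # First event for this user, or gap exceeded: start a new session.
            user_sessions.append([ev])
    return sessions


events = [
    Event("u1", 0, "view"),
    Event("u1", 600, "add_to_cart"),
    Event("u1", 4000, "view"),   # 3400 s after the previous event: new session
    Event("u2", 100, "view"),
]
result = sessionise(events)
```

Here `result["u1"]` holds two sessions (two events, then one) and `result["u2"]` holds a single one-event session. Funnel and retention metrics would then be computed per session or per user on top of this grouping.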

flowchart LR
  E[Events]
  K[[Kafka]]
  ST[[Stream Processor<br/>Flink / Kinesis]]
  AGG[(Aggregates)]
  OLAP[(OLAP store<br/>Druid / ClickHouse)]
  DASH[Dashboards]
  E --> K --> ST --> AGG --> OLAP --> DASH

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class E service;
    class AGG,OLAP datastore;
    class K,ST queue;
    class DASH obs;
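
The Stream Processor to Aggregates hop in the diagram can be sketched as a tumbling-window count. This is only an illustration under stated assumptions: the 10-second window (chosen to fit the sub-minute freshness target), the `(timestamp, event_name)` tuple shape, and the helper names are hypothetical, and a production job would use the stream processor's own windowing and watermark support instead of an in-memory counter.

```python
from collections import Counter
from typing import Iterable

# Assumed window size; any value under 60 s fits the sub-minute freshness goal.
WINDOW_SECONDS = 10


def window_start(ts: float, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window that contains ts."""
    return int(ts // size) * size


def aggregate(events: Iterable[tuple[float, str]]) -> Counter:
    """Count events per (window_start, event_name) key."""
    counts: Counter = Counter()
    for ts, name in events:
        counts[(window_start(ts), name)] += 1
    return counts


rows = aggregate([
    (0.5, "view"),
    (3.2, "view"),
    (9.9, "click"),
    (10.0, "view"),   # falls into the next window, which starts at 10
])
```

Each `(window_start, event_name)` key with its count corresponds to one pre-aggregated row that the OLAP store could ingest, so dashboards scan far fewer rows than the raw event stream.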