Change Data Capture — Simple

Problem statement (interviewer prompt)

Design a pipeline that streams every change in a production OLTP database to a data warehouse, a search index, and a cache invalidator — without dual-writes and without losing events on failure. Cover CDC, the outbox pattern, schema evolution, and exactly-once semantics.
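One common reading of the "without dual-writes" constraint is the transactional outbox: the business row and the event describing it are committed in the same local transaction, so a CDC tailer can later ship the event from the log. The following is a minimal sketch under stated assumptions, not a definitive implementation. It uses Python's built-in `sqlite3` as a stand-in for the primary database, and the `orders` and `outbox` tables and the `place_order` helper are hypothetical names that do not come from the prompt.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the primary OLTP database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL)"
    )
    # Hypothetical outbox table: one row per event, ordered by seq.
    conn.execute(
        "CREATE TABLE outbox ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " aggregate_id INTEGER NOT NULL,"
        " event_type TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def place_order(conn: sqlite3.Connection, order_id: int, total_cents: int,
                fail_before_commit: bool = False) -> bool:
    """Write the business row and its event in ONE transaction.

    Either both rows become visible or neither does, so there is no
    window in which the database and the event stream disagree.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                (order_id, total_cents),
            )
            conn.execute(
                "INSERT INTO outbox (aggregate_id, event_type, payload)"
                " VALUES (?, ?, ?)",
                (order_id, "OrderPlaced", json.dumps({"total_cents": total_cents})),
            )
            if fail_before_commit:
                # Simulated crash between the writes and the commit.
                raise RuntimeError("simulated failure")
    except RuntimeError:
        return False
    return True


def count(conn: sqlite3.Connection, table: str) -> int:
    """Row count helper for the two hypothetical tables above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

A failed call leaves both tables untouched, which is the property a dual-write (database first, message bus second) cannot give: there, a crash between the two writes loses the event or publishes one for a row that never committed.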

flowchart LR
  APP[App] --> DB[(Primary DB)]
  DB -. WAL / binlog .-> CDC[CDC tailer<br/>Debezium / DMS]
  CDC --> BUS[Kafka / Pub-Sub]
  BUS --> CACHE[Cache invalidator]
  BUS --> SEARCH[Search index]
  BUS --> DWH[Data warehouse]
  BUS --> AUDIT[Audit / event log]

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class APP,CACHE,SEARCH service;
    class DB,DWH datastore;
    class BUS,AUDIT queue;
    class CDC compute;
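
The fan-out in the diagram can be sketched as an ordered log replayed to several consumers. This is a hedged illustration rather than how Debezium or Kafka work internally: it assumes a single ordered partition, at-least-once delivery, and sinks that make redelivery harmless by tracking the last offset they applied. `ChangeEvent`, `SearchIndex`, `CacheInvalidator`, and `deliver` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    offset: int                   # position in the log, strictly increasing
    key: str                      # primary key of the changed row
    op: str                       # "upsert" or "delete"
    value: Optional[dict] = None  # row image after the change, if any


class SearchIndex:
    """Idempotent sink: skips any event at or below its applied offset."""

    def __init__(self) -> None:
        self.docs: dict = {}
        self.applied_offset = -1
        self.apply_count = 0

    def handle(self, event: ChangeEvent) -> None:
        if event.offset <= self.applied_offset:
            return  # duplicate from a redelivery, already applied
        if event.op == "delete":
            self.docs.pop(event.key, None)
        else:
            self.docs[event.key] = event.value
        self.applied_offset = event.offset
        self.apply_count += 1


class CacheInvalidator:
    """Evicting a key twice has the same effect as evicting it once."""

    def __init__(self, cache: dict) -> None:
        self.cache = cache

    def handle(self, event: ChangeEvent) -> None:
        self.cache.pop(event.key, None)


def deliver(log: list, consumers: list, from_offset: int = 0) -> None:
    """Replay the log from from_offset to every consumer, in order."""
    for event in log:
        if event.offset < from_offset:
            continue
        for consumer in consumers:
            consumer.handle(event)
```

Calling `deliver` a second time from an earlier offset models a tailer that crashed before saving its checkpoint. Because each sink ignores or tolerates events it has already seen, the end state matches a single delivery, which is the usual practical meaning of "exactly-once" in such pipelines.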