Fraud Detection: Detailed
flowchart TB
subgraph Source[Event sources]
PAY[Payment events]
LOGIN[Login events]
SIG[Signups]
CHK[Checkouts]
end
subgraph Ingest
KAFKA[[Kafka]]
NORM[Normalize / enrich]
DEVICE([Device fingerprint])
GEO[Geo + IP intel]
end
subgraph Features[Feature store]
FE_RT[Real-time features<br/>velocity, recent counts]
FE_OF[Batch features<br/>account age, history]
AGG[[Stream aggregations]]
EMB([Embeddings: device, user, merchant])
end
subgraph Models[Scoring]
GBDT[GBDT model]
DNN[Neural model]
ANOM[Anomaly detector]
GRAPH[Graph features<br/>account linkage]
end
subgraph Rules
RULE[Rule engine<br/>blocklist, velocity caps]
BIN[BIN restrictions]
BLACKLIST[Blocklist]
end
subgraph Decision
AGGD([Score aggregator])
POL[Policy: allow / step-up / deny]
REASON[Reason codes]
end
subgraph Actions
STEP[3DS / OTP / KYC step-up]
QUEUE[[Manual review queue]]
BAN([Auto-block / device ban])
REV[Reversal]
end
subgraph Loop[Feedback loop]
LBL[Labels: chargebacks, complaints]
RETRAIN[Retraining pipelines]
AB[A/B test]
end
Source --> Ingest --> Features
Features --> Models --> Decision
Rules --> Decision
Decision --> Actions
Actions --> Loop --> Models
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class DEVICE,BAN client;
class PAY,LOGIN,SIG,CHK,NORM,GEO,FE_RT,FE_OF,GBDT,DNN,ANOM,RULE,BIN,BLACKLIST,POL,REASON,STEP,REV,LBL,RETRAIN,AB service;
class KAFKA,AGG,QUEUE queue;
class EMB,AGGD compute;
Real-time scoring path
- Event arrives → enrich with device and geo data → fetch features within a few milliseconds → score → return a decision in under 100 ms.
- Cache hot features at the edge to reduce tail latency.
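The path above can be sketched as a single function that enriches, fetches features, scores, and maps the score to a policy outcome. This is a minimal illustration, not the production design: `GEO_TABLE`, `DEFAULT_FEATURES`, `toy_model`, the threshold values, and the in-memory `feature_cache` dict are all hypothetical stand-ins for the enrichment services, feature store, and models in the diagram.

```python
import time
from dataclasses import dataclass

# Hypothetical thresholds and lookup data, for illustration only.
STEP_UP_THRESHOLD = 0.5
DENY_THRESHOLD = 0.9
GEO_TABLE = {"203.0.113.7": "US"}          # stand-in for geo + IP intel
DEFAULT_FEATURES = {"txn_count_5m": 0}     # used on a feature-cache miss


@dataclass
class Decision:
    action: str          # "allow", "step_up", or "deny"
    score: float
    elapsed_ms: float
    within_budget: bool


def toy_model(event, features):
    """Placeholder scorer; a real system would call the GBDT / neural models."""
    score = 0.1
    if features["txn_count_5m"] > 5:
        score += 0.5
    if event["geo"] == "unknown":
        score += 0.4
    return min(score, 1.0)


def score_event(event, feature_cache, model=toy_model, budget_ms=100.0):
    start = time.perf_counter()
    # 1. Enrich the raw event with geo data.
    enriched = {**event, "geo": GEO_TABLE.get(event["ip"], "unknown")}
    # 2. Fetch features from a local cache, falling back to defaults on a miss.
    features = feature_cache.get(event["user_id"], DEFAULT_FEATURES)
    # 3. Score and map the score to a policy outcome.
    score = model(enriched, features)
    if score >= DENY_THRESHOLD:
        action = "deny"
    elif score >= STEP_UP_THRESHOLD:
        action = "step_up"
    else:
        action = "allow"
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return Decision(action, score, elapsed_ms, elapsed_ms < budget_ms)
```

Recording `within_budget` on every decision gives a direct way to track how often the 100 ms target is missed, which is the number that matters for tail latency.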
Velocity features
- Event counts over sliding windows (1 min / 5 min / 1 h / 24 h), keyed by user, device, IP, and BIN.
- Often implemented as a Count-Min Sketch in Redis to bound memory; the sketch may overcount on hash collisions but never undercounts.
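To make the windowed counting concrete, here is a small in-process sketch in pure Python. It is not the Redis-backed version mentioned above: it keeps one Count-Min Sketch per fixed time bucket and sums recent buckets, which approximates a sliding window at bucket granularity. The class names, the default `width` and `depth`, and the `"user:42"`-style key format are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never fall below the true count."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One salted hash per row gives `depth` independent-looking columns.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"), digest_size=8, salt=row.to_bytes(2, "big")
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row, col in self._cells(key):
            self.rows[row][col] += count

    def estimate(self, key):
        return min(self.rows[row][col] for row, col in self._cells(key))


class WindowedVelocity:
    """One sketch per time bucket; a window is the sum over recent buckets."""

    def __init__(self, bucket_seconds=60, max_buckets=1440):
        self.bucket_seconds = bucket_seconds
        self.max_buckets = max_buckets      # 1440 one-minute buckets = 24 h
        self.buckets = {}                   # bucket id -> CountMinSketch

    def record(self, key, ts):
        bucket = int(ts // self.bucket_seconds)
        self.buckets.setdefault(bucket, CountMinSketch()).add(key)
        # Drop buckets that have aged out of the longest supported window.
        cutoff = bucket - self.max_buckets
        for old in [b for b in self.buckets if b <= cutoff]:
            del self.buckets[old]

    def count(self, key, now, window_seconds):
        newest = int(now // self.bucket_seconds)
        n = max(1, int(window_seconds // self.bucket_seconds))
        return sum(
            self.buckets[b].estimate(key)
            for b in range(newest - n + 1, newest + 1)
            if b in self.buckets
        )
```

A caller would invoke `record("user:42", ts)` on each event and `count("user:42", now, 300)` for a 5-minute velocity. Memory per bucket is fixed at `width * depth` counters regardless of how many distinct keys are seen, which is the property that motivates the sketch.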
Graph signals
- Build a graph of links between entities (account ↔ device ↔ payment instrument ↔ shipping address).
- A node is suspicious if it shares many edges with known-bad nodes or lies a short path away from them.
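Both signals can be illustrated with an in-memory adjacency map and a breadth-first search. This is a sketch under assumed names: `build_graph`, `distance_to_known_bad`, `shared_neighbors`, the `max_hops` cutoff, and the `"acct:A"`-style node IDs are hypothetical, and a production system would more likely query a graph store or precompute these features.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def distance_to_known_bad(graph, start, known_bad, max_hops=4):
    """Hops from `start` to the nearest known-bad node, or None if none
    is reachable within `max_hops`."""
    if start in known_bad:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            if neighbor in known_bad:
                return dist + 1
            seen.add(neighbor)
            queue.append((neighbor, dist + 1))
    return None


def shared_neighbors(graph, a, b):
    """Entities (devices, cards, addresses) directly linked to both nodes."""
    return graph.get(a, set()) & graph.get(b, set())


links = [
    ("acct:A", "dev:1"),
    ("acct:B", "dev:1"),
    ("acct:B", "card:9"),
    ("acct:C", "card:9"),
    ("acct:D", "addr:7"),
]
graph = build_graph(links)
known_bad = {"acct:C"}
print(distance_to_known_bad(graph, "acct:B", known_bad))  # 2
print(shared_neighbors(graph, "acct:A", "acct:B"))        # {'dev:1'}
```

The hop limit keeps the search bounded on large graphs; distances beyond a few hops tend to carry little signal because most nodes become reachable from each other.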
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Probabilistic data structures | Bloom, HLL, Count-Min, MinHash, t-digest | probabilistic-data-structures |
| HLD | Event sourcing + CQRS | commands -> events; separate read model | event-sourcing-cqrs |