Spam / Abuse Detection — Detailed

flowchart TB
  subgraph Source
    SIGNUP[Signups]
    MSG[Messages / posts]
    REVIEW[Reviews]
    PAY[Payments]
  end

  subgraph Features
    TEXT([Text features: token n-gram, embeddings])
    BEHAV[Behavior velocity, time-of-day]
    DEVICE([Device fingerprint])
    NET[IP / ASN / VPN detection]
    GRAPH[Account-link graph]
  end

  subgraph Models
    GBDT[GBDT]
    NLP[NLP / transformer]
    ANOM[Anomaly detector]
    SIMHASH[SimHash dedup]
  end

  subgraph Rules
    BLK[Blocklists]
    HARDLIM[Hard limits]
    REGEX[Pattern rules]
  end

  subgraph Decision
    SCORE([Score aggregator])
    POL[Policy]
    EXPL[Explanations]
  end

  subgraph Actions
    ALLOW
    STEP[Step-up: captcha, MFA]
    SHADOW[Shadow ban]
    BLOCK
    QUEUE[Manual review]
  end

  subgraph Loop
    LBL[Labels: reports, chargebacks]
    RETRAIN[Retraining]
  end

  Source --> Features --> Models --> Decision --> Actions
  Rules --> Decision
  Actions --> Loop --> Models

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class DEVICE client;
    class SIGNUP,MSG,REVIEW,PAY,BEHAV,NET,GBDT,NLP,ANOM,SIMHASH,BLK,HARDLIM,REGEX,POL,EXPL,STEP,SHADOW,QUEUE,LBL,RETRAIN service;
    class TEXT,SCORE compute;
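
The Decision stage in the diagram combines model scores with rule hits and maps the result to one of the five actions. The following Python sketch shows one way that flow could look. The weights, the thresholds, the severity ordering of the actions, the choice to let any rule hit force a block, and all names (`WEIGHTS`, `THRESHOLDS`, `aggregate`, `decide`) are assumptions made for illustration and are not taken from the diagram.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    STEP_UP = "step_up"            # captcha or MFA
    MANUAL_REVIEW = "manual_review"
    SHADOW_BAN = "shadow_ban"
    BLOCK = "block"


@dataclass
class Decision:
    action: Action
    score: float
    explanations: list[str] = field(default_factory=list)


# Hypothetical per-model weights and score thresholds (ascending).
WEIGHTS = {"gbdt": 0.5, "nlp": 0.3, "anomaly": 0.2}
THRESHOLDS = [
    (0.4, Action.STEP_UP),
    (0.6, Action.MANUAL_REVIEW),
    (0.8, Action.SHADOW_BAN),
    (0.9, Action.BLOCK),
]


def aggregate(model_scores: dict[str, float]) -> float:
    """Weighted average over the known models that reported a score in [0, 1]."""
    present = {k: v for k, v in model_scores.items() if k in WEIGHTS}
    total = sum(WEIGHTS[k] for k in present)
    if total == 0:
        return 0.0
    return sum(WEIGHTS[k] * v for k, v in present.items()) / total


def decide(model_scores: dict[str, float], rule_hits: list[str]) -> Decision:
    # Assumption: any rule hit (blocklist, hard limit, pattern) forces BLOCK.
    if rule_hits:
        return Decision(Action.BLOCK, 1.0, [f"rule:{r}" for r in rule_hits])

    score = aggregate(model_scores)

    # Explanations: known models ordered by weighted contribution, largest first.
    ranked = sorted(
        (k for k in model_scores if k in WEIGHTS),
        key=lambda k: WEIGHTS[k] * model_scores[k],
        reverse=True,
    )
    explanations = [f"model:{k}={model_scores[k]:.2f}" for k in ranked]

    # Pick the most severe action whose threshold the score reaches.
    action = Action.ALLOW
    for threshold, candidate in THRESHOLDS:
        if score >= threshold:
            action = candidate
    return Decision(action, score, explanations)
```

Calling `decide({"gbdt": 0.5, "nlp": 0.5, "anomaly": 0.5}, [])` lands in the step-up band under these thresholds, while a non-empty `rule_hits` list skips scoring entirely. A real policy would likely vary thresholds per source (signups, messages, reviews, payments).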

Glossary & fundamentals

| Concept | What it is | Fundamentals |
| --- | --- | --- |
| Probabilistic structures | SimHash dedup | probabilistic-data-structures |
| Pub/Sub | event bus | pub-sub-pattern |
| CDC | label updates | change-data-capture |
| Caching strategies | feature cache | caching-strategies |
| Resilience patterns | safe defaults under model outage | resilience-patterns |
| Observability | drift + FPR/FNR | observability |
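
As an illustration of the SimHash dedup entry, the sketch below computes a 64-bit SimHash fingerprint over word 3-gram shingles and compares fingerprints by Hamming distance. It is a minimal version under stated assumptions: the shingle size, the use of BLAKE2b as the per-shingle hash, and the function names are illustrative choices, not details from this page.

```python
import hashlib

BITS = 64  # fingerprint width; matches the 8-byte digest used below


def _hash64(token: str) -> int:
    """Stable 64-bit hash of a shingle (BLAKE2b truncated to 8 bytes)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def shingles(text: str, n: int = 3) -> list[str]:
    """Lowercased word n-grams; shorter texts yield a single shingle."""
    words = text.lower().split()
    if not words:
        return []
    if len(words) < n:
        return [" ".join(words)]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def simhash(text: str) -> int:
    """Each shingle votes +1 or -1 per bit position; positive sums set the bit."""
    counts = [0] * BITS
    for shingle in shingles(text):
        h = _hash64(shingle)
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two messages would then be treated as near-duplicates when `hamming(simhash(a), simhash(b))` falls at or below a small cutoff. The cutoff is a tuning parameter that depends on message length and the acceptable false-positive rate, so no value is suggested here.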