OLX / Craigslist — Detailed#
flowchart TB
subgraph Apps
BR
MOB
end
subgraph Edge
CDN
GW
end
subgraph Post[Post & Manage]
POST[Post Ad Service]
IMG[Image upload]
OBJ[(S3)]
CAT[Category / Taxonomy]
MOD[Moderation pipeline<br/>ML + human]
NSFW
DUPE[Duplicate detector]
end
subgraph Search
SRCH[Search API]
IDX[(Inverted index<br/>geo + filters)]
REL([Relevance ranker])
SAVED[Saved searches]
ALERT[Alert notifier]
end
subgraph Engage
CHAT[Chat Service]
WS[WebSocket]
PUSH
EMAIL
REPORT[Report / abuse]
end
subgraph Monetize
PROMO[Featured listing]
PAY[Payment]
SUB[Subscription / shops]
end
subgraph Trust
BAN[Ban list]
KYC[Light KYC]
PHONE([Phone verify])
end
Apps --> CDN --> GW
GW --> Post --> ADS[(Listings)]
POST --> MOD
GW --> Search --> ADS
Search --> SAVED --> ALERT --> Engage
GW --> Engage --> CHAT
CHAT --> WS
Engage --> REPORT --> Trust
Monetize --- GW
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class BR,MOB client;
class CDN,GW edge;
class POST,IMG,CAT,MOD,DUPE,SRCH,SAVED,CHAT,WS,REPORT,PROMO,PAY,SUB,BAN,KYC,PHONE service;
class IDX,ADS datastore;
class REL compute;
class OBJ storage;
class ALERT obs;
Moderation pipeline#
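- On create: an ML classifier screens each new listing for prohibited content (weapons, drugs, fraud signals).
- Images: every upload is scored by an NSFW model.
- Duplicates: detected by comparing perceptual hashes of images together with text similarity.
- Borderline cases are routed to a human review queue.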
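The pipeline above can be sketched roughly as follows. This is a minimal illustration, not the actual service: the thresholds, the `Listing` fields, and the rule that a duplicate needs both a close image hash and similar text are all assumptions, and the classifier scores and the 64-bit perceptual hash are taken as already computed elsewhere.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would be tuned on labelled data.
REJECT_AT = 0.9            # score at or above this is rejected outright
REVIEW_AT = 0.5            # score in [REVIEW_AT, REJECT_AT) goes to humans
MAX_HASH_DISTANCE = 6      # max differing bits between two 64-bit hashes
MIN_TEXT_SIMILARITY = 0.8  # min Jaccard similarity between listing texts


@dataclass
class Listing:
    text: str
    image_hash: int  # 64-bit perceptual hash, assumed computed upstream


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity, a simple stand-in for text similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_duplicate(new: Listing, existing: Listing) -> bool:
    """Assumed rule: near-identical image AND near-identical text."""
    close_image = hamming(new.image_hash, existing.image_hash) <= MAX_HASH_DISTANCE
    close_text = jaccard(new.text, existing.text) >= MIN_TEXT_SIMILARITY
    return close_image and close_text


def route(prohibited_score: float, nsfw_score: float, duplicate: bool) -> str:
    """Map model outputs to a decision: approve, reject, or human_review."""
    worst = max(prohibited_score, nsfw_score)
    if duplicate or worst >= REJECT_AT:
        return "reject"
    if worst >= REVIEW_AT:
        return "human_review"
    return "approve"
```

Taking the worst of the two scores keeps the routing rule to a single comparison; a real pipeline might well apply separate thresholds per model.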
Search#
- Queries are heavily geo-filtered: index listings by city or S2 cell, combined with categorical filters.
- Saved searches are re-evaluated periodically, and new matches trigger alerts.
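The two points above can be illustrated with a small in-memory sketch. It swaps S2 cells for a plain fixed-size latitude/longitude grid to stay dependency-free, so it is not an S2 implementation. `CELL_DEG`, `Ad`, `GeoIndex`, `SavedSearch`, and `evaluate_saved_search` are hypothetical names introduced only for this example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

CELL_DEG = 0.1  # hypothetical grid cell size in degrees (stand-in for an S2 level)


def cell_id(lat: float, lng: float) -> tuple:
    """Bucket a coordinate into a fixed-size grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


@dataclass(frozen=True)
class Ad:
    ad_id: int
    lat: float
    lng: float
    category: str
    price: int
    created_at: int  # e.g. a Unix timestamp


class GeoIndex:
    """Posting lists keyed by (cell, category); other filters run at query time."""

    def __init__(self):
        self._cells = defaultdict(list)

    def add(self, ad: Ad) -> None:
        self._cells[(cell_id(ad.lat, ad.lng), ad.category)].append(ad)

    def search(self, lat, lng, category, max_price=None, since=0):
        cx, cy = cell_id(lat, lng)
        hits = []
        # Scan the query cell and its 8 neighbours to cover nearby listings.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for ad in self._cells.get(((cx + dx, cy + dy), category), []):
                    if ad.created_at > since and (max_price is None or ad.price <= max_price):
                        hits.append(ad)
        return sorted(hits, key=lambda a: a.created_at, reverse=True)


@dataclass
class SavedSearch:
    user_id: int
    lat: float
    lng: float
    category: str
    max_price: int
    last_checked: int = 0


def evaluate_saved_search(index: GeoIndex, saved: SavedSearch, now: int) -> list:
    """Periodic job: return ads created since the last run, then advance the cursor."""
    new_ads = index.search(saved.lat, saved.lng, saved.category,
                           saved.max_price, since=saved.last_checked)
    saved.last_checked = now
    return new_ads
```

A scheduler would call `evaluate_saved_search` for each saved search and hand any non-empty result to the alert notifier. Storing `last_checked` per search is one simple way to avoid re-alerting on listings the user has already been told about.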
Glossary & fundamentals#
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | CDN | edge caching for static assets | cdn |
| HLD | Realtime protocols | WS / SSE / polling / gRPC streaming | realtime-protocols |
| HLD | Geo indexing | Geohash, Quadtree, S2, H3, R-tree | geo-indexing |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |