Pastebin: Detailed
```mermaid
flowchart TB
    subgraph Client
        BR([Browser])
        CLI[CLI / API]
    end
    subgraph Edge
        DNS[DNS]
        CDN[CDN / Edge cache]
        LB[L7 LB]
        WAF[WAF + abuse]
    end
    subgraph App[App Tier]
        GW[API Gateway]
        CRT[Create Service]
        READ[Read Service]
        DEL[Delete / Expire]
        SYN[Syntax highlighter<br/>server-side]
        AUTH[Auth - opt]
    end
    subgraph ID[ID generation]
        SF[Snowflake / KGS]
        B62[Base62 encode<br/>8 chars]
    end
    subgraph Storage
        META[(Metadata SQL<br/>paste id, owner, expiry, lang)]
        BLOB[(Object store S3 / GCS<br/>raw paste body)]
        CACHE[(Redis hot pastes)]
    end
    subgraph Async
        Q[[Kafka events<br/>create / view / delete]]
        SC([Expiry sweeper<br/>cron])
        ANL[Analytics<br/>views, languages]
        AB([Abuse scanner<br/>secret detection])
    end
    subgraph Obs
        M[Metrics]
        L[Logs]
        T[Traces]
    end

    BR --> DNS --> CDN --> LB --> WAF --> GW
    CLI --> DNS
    GW --> CRT
    GW --> READ
    GW --> DEL
    CRT --> SF --> B62
    CRT --> META
    CRT --> BLOB
    READ --> CACHE
    CACHE -. miss .-> BLOB
    READ --> META
    READ --> SYN
    DEL --> META
    DEL --> BLOB
    SC -.expire.-> META
    SC -.delete.-> BLOB
    CRT --> Q
    Q --> ANL
    Q --> AB
    AB -.flag.-> META
    App -.metrics.-> M

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;

    class BR client;
    class DNS,CDN,LB,WAF,GW edge;
    class CLI,CRT,READ,DEL,SYN,AUTH,B62,ANL service;
    class SF,META datastore;
    class CACHE cache;
    class Q queue;
    class SC,AB compute;
    class BLOB storage;
    class M,L,T obs;
```
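The ID path in the diagram (Snowflake / KGS feeding a Base62 encoder that emits 8 characters) might look roughly like the sketch below. The alphabet order and the function names `encode_paste_id` / `decode_paste_id` are assumptions for illustration. Note that 8 Base62 characters cover only 62^8 ≈ 2.18 × 10^14 values, so the sketch assumes the generator hands out integers in that range; a full 64-bit Snowflake ID would need up to 11 characters.

```python
import string

# Assumed alphabet order: digits, then uppercase, then lowercase (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)
ID_LENGTH = 8
MAX_ID = BASE ** ID_LENGTH  # 62**8 = 218_340_105_584_896


def encode_paste_id(n: int) -> str:
    """Encode a non-negative integer as a fixed-width, 8-character Base62 string."""
    if not 0 <= n < MAX_ID:
        raise ValueError(f"id must be in [0, {MAX_ID}) to fit in {ID_LENGTH} chars")
    chars = []
    for _ in range(ID_LENGTH):
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_paste_id(s: str) -> int:
    """Inverse of encode_paste_id; raises ValueError on bad length or characters."""
    if len(s) != ID_LENGTH:
        raise ValueError(f"expected {ID_LENGTH} characters")
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Fixed-width output keeps every paste URL the same length; left-padding with `0` is one possible choice, not something the diagram specifies.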
Storage choice
- Body in an object store (S3 / GCS), keyed by paste id: cheap and durable.
- Metadata in SQL (or DynamoDB), indexed by owner, recency, and expiry.
- Hot pastes cached in Redis as `id → body` (TTL 1 hr).
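As a rough illustration of the read path (check the cache first, fall back to the object store on a miss, then repopulate with a one-hour TTL), here is a minimal cache-aside sketch. `TtlCache` is an in-memory stand-in for Redis and a plain dict stands in for the object store; both, along with the name `read_paste`, are assumptions made so the example runs on its own.

```python
import time
from typing import Callable, Dict, Optional, Tuple

HOT_TTL_SECONDS = 3600  # the 1-hour TTL from the notes


class TtlCache:
    """In-memory stand-in for a Redis-style key/value cache with per-key TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, bytes]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key: str, value: bytes, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def read_paste(paste_id: str, cache: TtlCache, blob_store: Dict[str, bytes]) -> Optional[bytes]:
    """Cache-aside read: serve from cache, else fetch the body and cache it."""
    body = cache.get(paste_id)
    if body is not None:
        return body
    body = blob_store.get(paste_id)  # stand-in for an object-store GET
    if body is not None:
        cache.set(paste_id, body, HOT_TTL_SECONDS)
    return body
```

Because entries stay in the cache for up to an hour, a delete or expiry path would likely also need to evict the cached key; the sketch leaves that out.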
Expiry
- Either (a) an explicit cron sweep of rows where `expires_at < now`, or (b) an S3 lifecycle policy that deletes expired objects.
- Soft-delete first, then GC, to allow an undo window.
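Option (a) combined with soft-delete-then-GC could be sketched as two passes over the metadata table. SQLite stands in for the metadata SQL store and a dict for the object store; the `deleted_at` column, the 24-hour undo window, and the function names are assumptions, not part of the design above.

```python
import sqlite3
from typing import Dict, List

UNDO_WINDOW_SECONDS = 24 * 3600  # assumed undo window

SCHEMA = """
CREATE TABLE pastes (
    id         TEXT PRIMARY KEY,
    expires_at INTEGER,  -- epoch seconds; NULL means the paste never expires
    deleted_at INTEGER   -- epoch seconds; NULL means the paste is live
)
"""


def soft_delete_expired(conn: sqlite3.Connection, now: int) -> int:
    """Pass 1: mark expired, still-live pastes as deleted. Returns rows marked."""
    cur = conn.execute(
        "UPDATE pastes SET deleted_at = ? "
        "WHERE deleted_at IS NULL AND expires_at < ?",
        (now, now),
    )
    conn.commit()
    return cur.rowcount


def gc_soft_deleted(conn: sqlite3.Connection, blob_store: Dict[str, bytes], now: int) -> List[str]:
    """Pass 2: permanently remove pastes whose undo window has elapsed."""
    cutoff = now - UNDO_WINDOW_SECONDS
    rows = conn.execute(
        "SELECT id FROM pastes WHERE deleted_at IS NOT NULL AND deleted_at <= ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    ids = [row[0] for row in rows]
    for paste_id in ids:
        blob_store.pop(paste_id, None)  # delete the body first, then the metadata row
        conn.execute("DELETE FROM pastes WHERE id = ?", (paste_id,))
    conn.commit()
    return ids
```

Deleting the body before the metadata row means a crash between the two steps leaves a row that the next sweep retries, rather than an orphaned object with no record pointing to it.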
Privacy modes
- Public / Unlisted / Private (password-protected).
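- For private: encrypt the paste on the client with a key derived from the URL fragment. Browsers do not send the fragment to the server, so the server stores only ciphertext and cannot read the paste (zero-knowledge).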
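The fragment-based scheme for private pastes depends on the part of the URL after `#` never being included in the HTTP request. The sketch below shows only that key-handling step, using the standard library: it generates a secret, places it in the fragment, and derives a 32-byte key from it. The host `paste.example`, the function names, and the use of plain SHA-256 as the derivation step are assumptions for illustration; the encryption itself (for example AES-GCM in the browser) is deliberately left out.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlsplit

BASE_URL = "https://paste.example"  # hypothetical host


def make_share_url(paste_id: str) -> str:
    """Build a share link whose fragment carries a fresh 256-bit secret."""
    secret = secrets.token_bytes(32)
    fragment = base64.urlsafe_b64encode(secret).rstrip(b"=").decode("ascii")
    return f"{BASE_URL}/{paste_id}#{fragment}"


def request_target(url: str) -> str:
    """Path and query an HTTP client would send; the fragment is never part of it."""
    parts = urlsplit(url)
    return parts.path + (f"?{parts.query}" if parts.query else "")


def key_from_url(url: str) -> bytes:
    """Derive a 32-byte key from the fragment (assumed KDF: labelled SHA-256)."""
    fragment = urlsplit(url).fragment
    if not fragment:
        raise ValueError("URL has no fragment, so there is no key material")
    return hashlib.sha256(b"paste-key-v1:" + fragment.encode("ascii")).digest()
```

Anyone holding the full link can decrypt, so the link itself becomes the credential; a real design would likely use a standard KDF such as HKDF rather than a bare hash.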
Anti-abuse
- Scan content for API keys, PII; rate-limit per IP.
- Captcha + honeypot for anonymous create.
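Two of the controls above, content scanning and per-IP rate limiting, can be sketched briefly. The two regex patterns (AWS access key IDs and PEM private-key headers) are illustrative and far from exhaustive, and the token-bucket parameters and all names here are assumptions rather than part of the design.

```python
import re
import time
from typing import Callable, Dict, List, Tuple

# Illustrative patterns only; a real scanner would carry many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_for_secrets(body: str) -> List[str]:
    """Return the sorted names of the patterns that match the paste body."""
    return sorted(name for name, pat in SECRET_PATTERNS.items() if pat.search(body))


class TokenBucketLimiter:
    """Per-IP token bucket: refills at rate_per_sec, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_sec
        self._burst = float(burst)
        self._clock = clock
        self._buckets: Dict[str, Tuple[float, float]] = {}  # ip -> (tokens, last_seen)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        tokens, last_seen = self._buckets.get(ip, (self._burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self._burst, tokens + (now - last_seen) * self._rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[ip] = (tokens, now)
        return allowed
```

In the diagram the scanner runs asynchronously off the event stream and flags metadata, while rate limiting sits on the request path, so the two would live in different services.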
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |