Pastebin — Detailed

flowchart TB
  subgraph Client
    BR([Browser])
    CLI[CLI / API]
  end

  subgraph Edge
    DNS[DNS]
    CDN[CDN / Edge cache]
    LB[L7 LB]
    WAF[WAF + abuse]
  end

  subgraph App[App Tier]
    GW[API Gateway]
    CRT[Create Service]
    READ[Read Service]
    DEL[Delete / Expire]
    SYN[Syntax highlighter<br/>server-side]
    AUTH[Auth - opt]
  end

  subgraph ID[ID generation]
    SF[Snowflake / KGS]
    B62[Base62 encode<br/>8 chars]
  end

  subgraph Storage
    META[(Metadata SQL<br/>paste id, owner, expiry, lang)]
    BLOB[(Object store S3 / GCS<br/>raw paste body)]
    CACHE[(Redis hot pastes)]
  end

  subgraph Async
    Q[[Kafka events<br/>create / view / delete]]
    SC([Expiry sweeper<br/>cron])
    ANL[Analytics<br/>views, languages]
    AB([Abuse scanner<br/>secret detection])
  end

  subgraph Obs
    M[Metrics] 
    L[Logs] 
    T[Traces]
  end

  BR --> DNS --> CDN --> LB --> WAF --> GW
  CLI --> DNS
  GW --> CRT
  GW --> READ
  GW --> DEL
  CRT --> SF --> B62
  CRT --> META
  CRT --> BLOB
  READ --> CACHE
  CACHE -. miss .-> BLOB
  READ --> META
  READ --> SYN
  DEL --> META
  DEL --> BLOB
  SC -.expire.-> META
  SC -.delete.-> BLOB
  CRT --> Q
  READ --> Q
  DEL --> Q
  Q --> ANL
  Q --> AB
  AB -.flag.-> META
  App -.metrics.-> M

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class BR,CLI client;
    class DNS,CDN,LB,WAF,GW edge;
    class CRT,READ,DEL,SYN,AUTH,B62,ANL service;
    class SF,META datastore;
    class CACHE cache;
    class Q queue;
    class SC,AB compute;
    class BLOB storage;
    class M,L,T obs;
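
The ID generation stage in the diagram (Snowflake / KGS feeding a Base62 encoder that emits 8 characters) can be sketched as below. This is a minimal illustration, not the design's actual encoder: the alphabet ordering and the names `base62_encode` and `base62_decode` are assumptions. Note that 8 Base62 characters cover 62^8 (about 2.18 × 10^14) values, which is less than 2^48, so a full 64-bit Snowflake ID would need 11 characters; the sketch assumes the upstream generator hands out integers that already fit in 8.

```python
import string

# Assumed alphabet order: 0-9, A-Z, a-z (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def base62_encode(n: int, width: int = 8) -> str:
    """Encode a non-negative integer as a fixed-width Base62 string."""
    if n < 0 or n >= BASE ** width:
        raise ValueError(f"id must be in [0, 62^{width})")
    chars = []
    for _ in range(width):
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def base62_decode(s: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Fixed-width output keeps every paste URL the same length, at the cost of rejecting IDs outside the 8-character range.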

Storage choice

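  • Paste body goes in an object store (S3 or GCS), keyed by paste id; this is cheap and durable.
  • Metadata (paste id, owner, expiry, language) goes in SQL (or DynamoDB) so pastes can be indexed by owner, recency, and expiry.
  • Hot pastes are cached in Redis as id → body with a 1-hour TTL; on a miss the Read Service falls back to the object store.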
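
The read path implied by these choices is a cache-aside lookup: check the cache, fall back to the object store on a miss, then populate the cache with a TTL. The sketch below uses in-memory stand-ins so it runs without Redis or S3; `TTLCache`, `read_paste`, and the dict used as the blob store are hypothetical names for illustration only.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 3600  # the 1-hour TTL mentioned above


class TTLCache:
    """In-memory stand-in for the Redis id -> body cache."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, bytes]] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key: str, value: bytes, ttl: float) -> None:
        self._data[key] = (self._clock() + ttl, value)


def read_paste(paste_id: str, cache: TTLCache,
               blob_store: Dict[str, bytes]) -> Optional[bytes]:
    """Cache-aside read: cache first, object store on a miss."""
    body = cache.get(paste_id)
    if body is not None:
        return body
    body = blob_store.get(paste_id)  # stand-in for an object-store GET
    if body is not None:
        cache.set(paste_id, body, CACHE_TTL_SECONDS)
    return body
```

Missing pastes are not cached here, so repeated lookups of unknown ids always reach the object store; whether to cache negative results is a separate design decision.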

Expiry

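  • Two options: (a) an explicit cron job that sweeps rows where expires_at < now, or (b) an S3 lifecycle policy that deletes expired objects. Lifecycle rules expire objects by age in whole days (or by a fixed date), so (b) suits coarse expiry buckets rather than per-paste timestamps.
  • Soft-delete first, then garbage-collect after an undo window, so an accidental delete can be reverted.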
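
Option (a) combined with soft-delete can be sketched as two passes: one marks expired rows as deleted, the other removes rows and bodies once the undo window has elapsed. The example uses an in-memory SQLite table and a dict in place of the object store; the column name `deleted_at`, the 24-hour window, and the function names are assumptions, not part of the design above.

```python
import sqlite3
from typing import Dict, List

UNDO_WINDOW_SECONDS = 24 * 3600  # assumed undo window


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pastes ("
        " id TEXT PRIMARY KEY,"
        " expires_at INTEGER,"   # NULL means the paste never expires
        " deleted_at INTEGER)"   # NULL means the paste is live
    )
    return db


def soft_delete_expired(db: sqlite3.Connection, now: int) -> int:
    """Pass 1: mark expired, still-live pastes as deleted."""
    cur = db.execute(
        "UPDATE pastes SET deleted_at = ?"
        " WHERE expires_at IS NOT NULL AND expires_at < ?"
        " AND deleted_at IS NULL",
        (now, now),
    )
    db.commit()
    return cur.rowcount


def gc_soft_deleted(db: sqlite3.Connection, blobs: Dict[str, bytes],
                    now: int) -> List[str]:
    """Pass 2: hard-delete pastes whose undo window has elapsed."""
    cutoff = now - UNDO_WINDOW_SECONDS
    rows = db.execute(
        "SELECT id FROM pastes"
        " WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (cutoff,),
    ).fetchall()
    ids = [row[0] for row in rows]
    for paste_id in ids:
        # Remove the body before the row: if the job stops in between,
        # the row is still there and the next run retries the cleanup.
        blobs.pop(paste_id, None)
        db.execute("DELETE FROM pastes WHERE id = ?", (paste_id,))
    db.commit()
    return ids
```

Undoing a delete inside the window then amounts to setting `deleted_at` back to NULL, which is why the first pass does not touch the body.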

Privacy modes

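  • Three modes: Public, Unlisted, and Private (password-protected).
  • For Private: the client encrypts the paste with a key derived from the URL fragment, and the server stores only ciphertext. Browsers do not send the fragment in HTTP requests, so the design is zero-knowledge: the server cannot read the paste.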
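
The key handling behind the Private mode can be illustrated with the standard library alone. The sketch below generates a random secret, places it in the URL fragment, shows which part of the URL a server would receive, and derives a 32-byte key from the secret. The encryption step itself is deliberately left out and would use a vetted authenticated cipher from a crypto library. All names (`new_share_url`, `server_visible_part`, `derive_key`), the host `paste.example`, and the PBKDF2 parameters are assumptions for illustration.

```python
import hashlib
import secrets
from typing import Tuple
from urllib.parse import urlsplit

PBKDF2_ITERATIONS = 100_000  # assumed work factor


def new_share_url(base_url: str, paste_id: str) -> Tuple[str, str]:
    """Return (share URL, secret); the secret lives only in the fragment."""
    secret = secrets.token_urlsafe(32)
    return f"{base_url}/{paste_id}#{secret}", secret


def server_visible_part(url: str) -> str:
    """Path and query of the URL, i.e. what an HTTP request would carry."""
    parts = urlsplit(url)
    return parts.path + (f"?{parts.query}" if parts.query else "")


def derive_key(secret: str, paste_id: str) -> bytes:
    """Derive a 32-byte key from the fragment secret, salted by paste id."""
    return hashlib.pbkdf2_hmac(
        "sha256", secret.encode(), paste_id.encode(),
        PBKDF2_ITERATIONS, dklen=32,
    )
```

With a high-entropy random secret the iteration count adds little; it matters when the secret is a user-chosen password, as in the password-protected variant.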

Anti-abuse

  • Scan content for API keys and PII; rate-limit requests per IP.
  • Require a CAPTCHA and a honeypot field for anonymous creates.
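
A rough sketch of the first bullet: a regex-based content scan plus a per-IP sliding-window rate limiter. The three patterns are illustrative only (a commonly used AWS access key id shape, a PEM private-key header, and a loose email match), and a production scanner would need a much larger rule set. The names `scan_for_secrets` and `SlidingWindowLimiter`, and the limits used, are assumptions.

```python
import re
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List

# Illustrative patterns only, not an exhaustive rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def scan_for_secrets(text: str) -> List[str]:
    """Return the sorted names of all patterns found in the text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

In the diagram's terms, the limiter would sit at the WAF or gateway, while the scan would run in the asynchronous abuse scanner and flag the metadata row.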

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Search internals | inverted index, BM25, embeddings, ANN | search-internals |