YouTube — Detailed

flowchart TB
  subgraph Creator
    APP([Creator app / Studio])
  end

  subgraph Upload[Upload Pipeline]
    PRE[Resumable upload<br/>chunked - tus / GCS]
    OBJ[(Origin: Colossus / S3)]
    META[(Video metadata)]
    QUE[[Transcode jobs queue]]
  end

  subgraph Transcode[Transcode Farm]
    SPLIT[Chunked by GoP /<br/>shot boundary]
    ENC([Encoders<br/>H.264 / VP9 / AV1 ladders])
    AUD([Audio encoder<br/>AAC / Opus])
    SUB([Auto captions / ASR])
    THUMB[Thumbnail extractor]
    PACK[HLS / DASH packager]
    DRM[DRM: Widevine / FairPlay / PlayReady]
  end

  subgraph Distribution
    EDGE[YouTube CDN<br/>Google Edge / ISP peering]
    GGC[Google Global Cache<br/>inside ISP]
    ABR([Adaptive Bitrate selector<br/>client side])
  end

  subgraph Discovery
    HOME[Home feed]
    SEARCH[Search]
    REL[Related / Up Next]
    SUBS[Subscriptions feed]
    SHORTS[Shorts]
  end

  subgraph Reco[Recommendation]
    CG([Candidate Gen<br/>two-tower + collaborative])
    RANK([Ranker DNN<br/>watch time objective])
    RR([Reranker - diversity, safety])
    SIG[Signals: watch, like, skip, search]
  end

  subgraph Engagement
    LIKE[Like / dislike]
    COMM[Comments]
    SUBN[Subscriptions]
    NOTIF[Notifications]
    PLAY[Playlists]
  end

  subgraph Ads
    ADSVR[Ad server]
    AUC[Auction / bidding]
    BR[Brand safety]
  end

  subgraph Live[Live Streaming]
    RTMP([Ingest RTMP / SRT])
    PKG[Live packager LL-HLS]
    DVR[DVR window]
  end

  subgraph Safety
    CP[Content ID]
    MOD[Moderation: ML + human]
    AGE[Age gating]
  end

  APP --> PRE --> OBJ
  PRE --> QUE
  QUE --> SPLIT --> ENC --> PACK --> DRM --> EDGE
  SPLIT --> AUD --> PACK
  SPLIT --> THUMB
  SPLIT --> SUB
  Viewer --> EDGE
  Viewer --> ABR
  Viewer --> Discovery
  Discovery --> Reco
  CG --> RANK --> RR --> Viewer
  SIG --> CG
  Engagement --> SIG
  Ads --- Viewer
  Live --> EDGE
  Safety --- ENC
  Safety --- Discovery
  CP --- OBJ

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class APP,ABR client;
    class EDGE edge;
    class SPLIT,THUMB,PACK,DRM,GGC,HOME,SEARCH,REL,SUBS,SHORTS,SIG,LIKE,COMM,SUBN,NOTIF,PLAY,ADSVR,AUC,BR,PKG,DVR,CP,MOD,AGE service;
    class META datastore;
    class QUE queue;
    class ENC,AUD,SUB,CG,RANK,RR,RTMP compute;
    class PRE,OBJ storage;

Transcoding ladders (typical)

  • Resolutions: 144p, 240p, 360p, 480p, 720p, 1080p, 1440p, 2160p (4K), plus HDR variants.
  • Codecs: H.264 (baseline codec for compatibility), VP9 (premium / 4K), AV1 (modern devices).
  • Audio: AAC + Opus.
  • Output: HLS (Apple devices) + DASH (everything else), in segments of typically 2–6 s.
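
As a minimal sketch of how a client-side adaptive bitrate selector might use such a ladder: the bitrates below are illustrative placeholders, not published YouTube values, and `Rendition`, `LADDER`, and `pick_rendition` are hypothetical names introduced only for this example. The rule shown (highest resolution that fits a safety-discounted throughput estimate) is one common heuristic, not necessarily what any real player does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution, e.g. 720 for 720p
    codec: str         # "h264", "vp9", or "av1"
    bitrate_kbps: int  # illustrative target bitrate (assumption)


# Hypothetical ladder: a subset of rungs with placeholder bitrates.
LADDER = [
    Rendition(144, "h264", 100),
    Rendition(360, "h264", 700),
    Rendition(720, "h264", 2500),
    Rendition(1080, "h264", 4500),
    Rendition(720, "vp9", 1800),
    Rendition(1080, "vp9", 3000),
    Rendition(2160, "vp9", 14000),
    Rendition(1080, "av1", 2200),
    Rendition(2160, "av1", 10000),
]


def pick_rendition(throughput_kbps, supported_codecs, max_height, safety=0.8):
    """Pick the highest-resolution rendition that fits the bandwidth budget."""
    budget = throughput_kbps * safety  # leave headroom for throughput dips
    playable = [
        r for r in LADDER
        if r.codec in supported_codecs and r.height <= max_height
    ]
    if not playable:
        raise ValueError("no rendition matches the device capabilities")
    affordable = [r for r in playable if r.bitrate_kbps <= budget]
    if not affordable:
        # Nothing fits: fall back to the cheapest playable rung.
        return min(playable, key=lambda r: r.bitrate_kbps)
    # Highest resolution wins; ties go to the lower bitrate (more efficient codec).
    return max(affordable, key=lambda r: (r.height, -r.bitrate_kbps))


if __name__ == "__main__":
    print(pick_rendition(5000, {"h264", "vp9"}, max_height=1080))
    print(pick_rendition(5000, {"h264"}, max_height=1080))
```

With the same measured throughput, a device that decodes VP9 can land on a higher rung than an H.264-only device, which is the practical reason for encoding several codecs per resolution.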

Storage

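  • Origin in Colossus (Google) or an object store such as S3 (non-Google stacks).
  • One stored copy per ladder rung × codec, so each video exists in many renditions; popular videos are pre-pushed to the edge.
  • Cold videos are kept on a cheaper, lower storage tier.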
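
The storage cost of "rung × codec" can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model under stated assumptions: the bitrates, the view-count thresholds, and the tier names (`edge-prepush`, `origin-standard`, `origin-cold`) are all hypothetical and only illustrate the idea.

```python
def rendition_bytes(bitrate_kbps, duration_s):
    """Approximate size of one rendition: bitrate (kbit/s) * duration / 8."""
    return bitrate_kbps * 1000 * duration_s // 8


def total_bytes(ladder_kbps, duration_s):
    """Sum the footprint of every (codec, height) rendition of one video."""
    return sum(rendition_bytes(kbps, duration_s) for kbps in ladder_kbps.values())


def storage_tier(views_last_30d, hot_threshold=10_000, warm_threshold=100):
    """Map recent popularity to a placement tier (thresholds are assumptions)."""
    if views_last_30d >= hot_threshold:
        return "edge-prepush"      # pushed to edge caches ahead of demand
    if views_last_30d >= warm_threshold:
        return "origin-standard"   # served from origin on a cache miss
    return "origin-cold"           # cheaper, slower tier


if __name__ == "__main__":
    # Illustrative bitrates for a 10-minute (600 s) video.
    ladder = {("h264", 360): 700, ("h264", 1080): 4500, ("vp9", 1080): 3000}
    print(total_bytes(ladder, 600))   # bytes across the three renditions
    print(storage_tier(50_000), storage_tier(5))
```

Even this three-rendition toy ladder stores the video three times over, so tiering the long tail of rarely watched videos matters more than the size of any single rendition.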

CDN strategy

  • Google Edge nodes (POPs) deployed globally.
  • Google Global Cache appliances hosted inside large ISPs, so popular content is served from within the ISP's network and peering / transit cost drops.
  • Anycast routing to the nearest POP.
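
The lookup order implied by this hierarchy (ISP-hosted cache, then edge POP, then origin) can be sketched as a tiered cache with fill-on-miss. This is a simplified model, assuming LRU eviction and a plain dictionary standing in for the origin; `LRUCache` and `fetch_segment` are hypothetical names, and real caches use far more elaborate admission and eviction policies.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache keyed by segment name."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


def fetch_segment(key, isp_cache, edge_cache, origin):
    """Return (data, tier_that_served_it), filling caches on the way back."""
    data = isp_cache.get(key)
    if data is not None:
        return data, "isp-cache"
    data = edge_cache.get(key)
    if data is not None:
        isp_cache.put(key, data)
        return data, "edge"
    data = origin[key]  # origin is modelled as a dict (assumption)
    edge_cache.put(key, data)
    isp_cache.put(key, data)
    return data, "origin"


if __name__ == "__main__":
    origin = {"v1/seg0": b"a", "v1/seg1": b"b"}
    isp, edge = LRUCache(1), LRUCache(2)
    for key in ["v1/seg0", "v1/seg0", "v1/seg1", "v1/seg0"]:
        print(key, fetch_segment(key, isp, edge, origin)[1])
```

In the usage loop, the second request for a segment is served from the ISP cache, and after that small cache evicts it, the next request falls back to the edge rather than the origin.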

Live streaming

  • Ingest over RTMP (legacy) or SRT / WebRTC; transcode on the fly.
  • Low-Latency HLS (LL-HLS) for ~2–3 s glass-to-glass latency.
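
A rough latency budget shows why LL-HLS gets into the 2–3 s range while classic segmented HLS does not. The model below is a deliberate simplification: it assumes the player holds back a fixed number of media units (whole segments for classic HLS, partial segments for LL-HLS) and adds fixed encode and network delays. The default delays and the 0.5 s part duration are assumptions chosen for illustration, and `glass_to_glass_s` is a hypothetical helper.

```python
def glass_to_glass_s(unit_duration_s, units_buffered, encode_s=0.5, network_s=0.3):
    """Estimate capture-to-display latency in seconds.

    unit_duration_s: duration of the smallest fetchable unit
                     (a full segment, or an LL-HLS partial segment).
    units_buffered:  how many units the player stays behind the live edge.
    encode_s, network_s: fixed delays (illustrative assumptions).
    """
    return encode_s + units_buffered * unit_duration_s + network_s


if __name__ == "__main__":
    classic = glass_to_glass_s(unit_duration_s=6.0, units_buffered=3)
    low_latency = glass_to_glass_s(unit_duration_s=0.5, units_buffered=3)
    print(f"classic HLS, 6 s segments: ~{classic:.1f} s")
    print(f"LL-HLS, 0.5 s parts:       ~{low_latency:.1f} s")
```

The dominant term is the buffered media, so shrinking the fetchable unit from a multi-second segment to a sub-second part is what moves the total from tens of seconds to a few seconds.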

Content ID

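  • Audio + video fingerprints of each upload are matched against a library of reference files supplied by rightsholders.
  • A match triggers an automatic claim; the rightsholder's policy decides the outcome: revenue share, block, or track.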
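
The match-then-apply-policy flow can be illustrated with a toy fingerprinting scheme. This is not how Content ID works internally (its algorithms are not described here); the sketch assumes a stand-in technique, hashing sliding windows of coarsely quantized samples, so that small amplitude changes and a time offset still produce matching hashes. `Policy`, `fingerprints`, and `match` are hypothetical names.

```python
import hashlib
from enum import Enum


class Policy(Enum):
    REVENUE_SHARE = "revenue_share"
    BLOCK = "block"
    TRACK = "track"


def fingerprints(samples, window=4):
    """Hash every sliding window of quantized samples (values 0-255)."""
    hashes = set()
    for i in range(len(samples) - window + 1):
        # Quantize to 16 levels so small distortions map to the same bytes.
        chunk = bytes(s // 16 for s in samples[i:i + window])
        hashes.add(hashlib.sha1(chunk).hexdigest())
    return hashes


def match(upload, library, min_overlap=0.5):
    """Return (reference_id, policy, overlap) for the best match, or None."""
    upload_fp = fingerprints(upload)
    best = None
    for ref_id, (ref_fp, policy) in library.items():
        if not ref_fp:
            continue
        overlap = len(upload_fp & ref_fp) / len(ref_fp)
        if overlap >= min_overlap and (best is None or overlap > best[2]):
            best = (ref_id, policy, overlap)
    return best


if __name__ == "__main__":
    reference = [10, 200, 35, 90, 120, 60, 250, 5]
    library = {"song-1": (fingerprints(reference), Policy.BLOCK)}
    # Upload = two samples of padding, then the reference slightly louder.
    upload = [0, 0] + [s + 3 for s in reference]
    print(match(upload, library))
    print(match([1, 2, 3, 4, 5, 6, 7, 8], library))
```

A production system would use perceptual audio and video features rather than exact hashes, but the control flow is the same: fingerprint the upload, score it against each reference, and hand the best match to the rightsholder's chosen policy.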

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| LLD | REST API design | verbs, statuses, pagination, errors | rest-api-design |
| LLD | Behavioural patterns | Strategy, Observer, State, Command, Chain | behavioral-patterns |