YouTube — Notes

Functional

  • Resumable video upload.
  • Multi-resolution + codec transcode.
  • Global delivery via CDN with ABR.
  • Discovery: home, search, related, subscriptions, shorts.
  • Engagement: likes, comments, subs, notifications, playlists.
  • Live streaming + DVR.
  • Ads, Premium (ad-free), monetization for creators.
  • Content ID copyright management.
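
As an illustration of the ABR requirement above, here is a minimal throughput-based rendition picker. The ladder bitrates, the `pick_rendition` name, and the 0.8 safety factor are assumptions made for this sketch, not YouTube's actual values; production players typically also weigh buffer level.

```python
# Hypothetical bitrate ladder (kbps); real ladders vary by codec and content.
LADDER_KBPS = {"240p": 400, "480p": 1000, "720p": 2500, "1080p": 5000}


def pick_rendition(throughput_kbps: float, safety: float = 0.8) -> str:
    """Return the highest rendition whose bitrate fits within a safety
    fraction of the measured throughput."""
    budget = throughput_kbps * safety
    fitting = [name for name, kbps in LADDER_KBPS.items() if kbps <= budget]
    if not fitting:
        # Nothing fits: fall back to the lowest rung rather than stalling.
        return min(LADDER_KBPS, key=LADDER_KBPS.get)
    return max(fitting, key=LADDER_KBPS.get)


choice = pick_rendition(3500)  # budget is 2800 kbps, so "720p" is the top fit
```

The safety factor leaves headroom so that a small throughput dip does not immediately cause a rebuffer.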

Non-functional

  • 2.5B MAU; 500 h of video uploaded per minute.
  • p99 first frame < 2 s globally.
  • 99.99% playback availability.
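
To make the availability target concrete, the sketch below converts 99.99% into an allowed-downtime budget. It assumes availability is measured as a simple uptime fraction over a 365-day year; a request-based SLO would be computed differently. The function name is illustrative.

```python
def downtime_budget_minutes(availability: float, period_days: float = 365.0) -> float:
    """Minutes of allowed unavailability over the period."""
    return (1.0 - availability) * period_days * 24 * 60


yearly = downtime_budget_minutes(0.9999)       # about 52.6 minutes per year
monthly = downtime_budget_minutes(0.9999, 30)  # about 4.3 minutes per 30 days
```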

Capacity

  • 500h/min uploads × 60 min × 24h = 720k h/day.
  • Avg 1 GB/h raw → 720 TB/day raw → ~263 PB/yr raw.
  • Transcoded ladders multiply stored bytes 3-5× (roughly 0.8-1.3 EB/yr); cost is offset by aggressive cold-tiering.
  • View traffic: tens of exabytes/year egress.
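
The upload arithmetic can be checked with a few lines. This is a back-of-envelope sketch using the figures from the notes (500 h/min, an assumed average of 1 GB per raw hour, a 3-5× ladder multiplier) and decimal units (1 TB = 1,000 GB); the variable names are illustrative.

```python
UPLOAD_HOURS_PER_MIN = 500    # from the notes
RAW_GB_PER_HOUR = 1.0         # assumed average from the notes
LADDER_MULTIPLIERS = (3, 5)   # transcoded-ladder expansion range from the notes

hours_per_day = UPLOAD_HOURS_PER_MIN * 60 * 24              # 720,000 h/day
raw_tb_per_day = hours_per_day * RAW_GB_PER_HOUR / 1_000    # GB -> TB: 720 TB/day
raw_pb_per_year = raw_tb_per_day * 365 / 1_000              # TB -> PB: ~263 PB/yr
# Storage after transcoding, if ladders multiply raw bytes by 3-5x.
ladder_pb_per_year = tuple(raw_pb_per_year * m for m in LADDER_MULTIPLIERS)

print(f"{hours_per_day:,} h/day, {raw_tb_per_day:.0f} TB/day raw, "
      f"{raw_pb_per_year:.1f} PB/yr raw")
```

Under these assumptions raw ingest stays in the hundreds of petabytes per year, and transcoded storage lands at roughly 0.8 to 1.3 EB per year.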

Schema highlights

  • videos(id, channel_id, title, desc, ts, lang, ladders[], status)
  • engagements(video_id, user_id, type, ts) for watch/like
  • subs(viewer, channel)
  • embeddings(video_id, vec)
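
One way to read the schema above is as typed records. The sketch below mirrors the listed columns with Python dataclasses; the field types, the epoch-seconds timestamp, the `EngagementType` enum, and the default `status` value are assumptions, since the notes give column names only.

```python
from dataclasses import dataclass, field
from enum import Enum


class EngagementType(str, Enum):
    WATCH = "watch"
    LIKE = "like"


@dataclass
class Video:
    id: str
    channel_id: str
    title: str
    desc: str
    ts: int                                           # upload time, epoch seconds (assumed)
    lang: str
    ladders: list[str] = field(default_factory=list)  # available renditions
    status: str = "processing"                        # assumed initial state


@dataclass(frozen=True)
class Engagement:
    video_id: str
    user_id: str
    type: EngagementType
    ts: int


@dataclass(frozen=True)
class Sub:
    viewer: str
    channel: str


@dataclass
class Embedding:
    video_id: str
    vec: list[float]


video = Video(id="v1", channel_id="c1", title="Demo", desc="", ts=0, lang="en")
like = Engagement(video_id=video.id, user_id="u1", type=EngagementType.LIKE, ts=1)
```

Engagements and subscriptions are modeled as immutable here because they behave like append-only events, while a video row changes as transcoding fills in `ladders` and `status`.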

Trade-offs

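  • Resumable chunked upload is mandatory: creator files are large and connections are often unreliable, so an interrupted upload must resume rather than restart.
  • VP9 / AV1 save egress bandwidth but cost more CPU time to encode; roll them out in tiers rather than to every video at once.
  • Pre-push vs on-demand fill at the edge: pre-push videos that signals predict will be popular; cold videos are pulled from origin on the first cache miss.
  • Optimizing recommendations for watch time raised dwell time but drew moderation backlash; balance the objective with safety metrics.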
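
The resumable-upload point can be sketched as an offset handshake: the server records how many bytes it has committed, and the client always resumes from that offset. Everything below is a hypothetical in-memory model (`UploadSession`, `upload`, and the `fail_on` drop simulation are invented for illustration), not a real upload API.

```python
class UploadSession:
    """In-memory stand-in for a server-side resumable upload session."""

    def __init__(self, total_size: int) -> None:
        self.total_size = total_size
        self.data = bytearray()

    @property
    def offset(self) -> int:
        """Number of bytes committed so far."""
        return len(self.data)

    @property
    def complete(self) -> bool:
        return self.offset == self.total_size

    def put_chunk(self, start: int, chunk: bytes) -> int:
        """Append the chunk if it starts at the committed offset.
        Always returns the offset the client should send from next."""
        if start != self.offset:
            return self.offset  # duplicate or out-of-order chunk: ignore it
        if start + len(chunk) > self.total_size:
            raise ValueError("chunk exceeds declared upload size")
        self.data.extend(chunk)
        return self.offset


def upload(session: UploadSession, payload: bytes, chunk_size: int,
           fail_on: frozenset = frozenset()) -> int:
    """Client loop: query the offset, send the next chunk, repeat.
    Attempt numbers in fail_on simulate a dropped connection.
    Returns the number of attempts made."""
    attempts = 0
    while not session.complete:
        attempts += 1
        start = session.offset  # ask the server where to resume
        if attempts in fail_on:
            continue            # chunk lost in transit; retry from same offset
        session.put_chunk(start, payload[start:start + chunk_size])
    return attempts


payload = bytes(range(10))
session = UploadSession(total_size=len(payload))
attempts = upload(session, payload, chunk_size=4, fail_on=frozenset({2}))
```

Because the client re-reads the committed offset before every send, a dropped chunk costs one retry of that chunk instead of a restart of the whole file.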

Refs

  • "Deep Neural Networks for YouTube Recommendations" (Covington et al., RecSys '16).
  • "Vitess" architecture (YouTube's MySQL scaler).
  • Google's CDN / Edge engineering posts; Open Connect (Netflix) for comparison.
  • ByteByteGo "Design YouTube" (Alex Xu, System Design Interview, Vol 1).