Spotify — Notes
Functional
- Catalog of music + podcasts.
- Streaming playback (multi-device).
- Playlists (personal, collaborative, algorithmic).
- Recommendations (Discover Weekly, Daily Mix, Release Radar).
- Search, social sharing.
- Offline downloads.
- Spotify Connect (device handoff).
Non-functional
- 600M+ MAU, 200M+ premium.
- p99 time to first audio < 500 ms.
- 99.99% playback availability.
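To make the 99.99% target concrete, it can be converted into a downtime budget. A minimal sketch; the choice of a 365-day year and a 30-day month as periods is an assumption for illustration, not from the notes.

```python
# Convert an availability target into an allowed-downtime budget.
# Periods (365-day year, 30-day month) are illustrative assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60     # 525,600
MINUTES_PER_30_DAYS = 30 * 24 * 60   # 43,200


def downtime_minutes(availability: float, period_minutes: int) -> float:
    """Minutes of unavailability permitted over the period."""
    return (1.0 - availability) * period_minutes


yearly = downtime_minutes(0.9999, MINUTES_PER_YEAR)
monthly = downtime_minutes(0.9999, MINUTES_PER_30_DAYS)
print(f"{yearly:.2f} min/year, {monthly:.2f} min per 30 days")
```

Roughly 53 minutes per year, or a little over 4 minutes per month, is the whole budget for playback outages.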
Capacity
- Tens of millions of tracks, each encoded at 3-5 bitrates × a few codecs.
- ~5 PB total catalog after multi-encode (much smaller than video CDN).
- 1B+ streams/day; royalty pipeline runs as a nightly batch.
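A back-of-envelope check of the capacity numbers above. Every input here is an assumed value chosen for illustration (50M tracks, 3.5-minute average track, a 24/96/160/320 kbps ladder, 2 codecs); none of them come from Spotify.

```python
# Back-of-envelope catalog size and average stream rate.
# All inputs are illustrative assumptions, not published figures.
TRACKS = 50_000_000                   # "tens of millions"
AVG_TRACK_SECONDS = 210               # assumed 3.5 min average
BITRATES_KBPS = [24, 96, 160, 320]    # assumed lossy bitrate ladder
CODECS = 2                            # "a few codecs"
STREAMS_PER_DAY = 1_000_000_000       # "1B+ streams/day"

# Bytes for one track in one codec, summed over the whole ladder.
bytes_per_track_per_codec = sum(
    kbps * 1000 / 8 * AVG_TRACK_SECONDS for kbps in BITRATES_KBPS
)
total_bytes = bytes_per_track_per_codec * CODECS * TRACKS
total_pb = total_bytes / 1e15         # decimal petabytes

avg_streams_per_sec = STREAMS_PER_DAY / 86_400

print(f"catalog ~ {total_pb:.2f} PB, avg ~ {avg_streams_per_sec:,.0f} stream starts/s")
```

Under these assumptions the lossy encodes alone land in the low single-digit petabytes, the same order of magnitude as the ~5 PB figure; lossless files and podcasts are not included in this sketch and would push it higher. The stream rate is an average, so peak traffic would be a multiple of it.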
Schema
tracks(id, album_id, artists[], duration_ms, isrc, audio_files[])
playlists(id, owner, collaborative, item_order[])
streams(user_id, track_id, ts, source, duration_ms), published as events to a Kafka topic
taste_profile(user_id, vec)
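The same schema can be written as plain records to make the field shapes explicit. This is a sketch only: the field types, the use of opaque strings for audio_files, and the move helper are assumptions for illustration, not the actual storage model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    id: str
    album_id: str
    artists: List[str]
    duration_ms: int
    isrc: str
    # Assumed: opaque references to the encoded files (one per bitrate/codec).
    audio_files: List[str] = field(default_factory=list)


@dataclass
class Playlist:
    id: str
    owner: str
    collaborative: bool = False
    # Ordered track ids; the list position is the playlist position.
    item_order: List[str] = field(default_factory=list)

    def move(self, src: int, dst: int) -> None:
        """Hypothetical helper: move the item at index src to index dst."""
        self.item_order.insert(dst, self.item_order.pop(src))


@dataclass
class TasteProfile:
    user_id: str
    vec: List[float]  # assumed: a dense embedding vector


pl = Playlist(id="p1", owner="u1", item_order=["t1", "t2", "t3"])
pl.move(0, 2)
print(pl.item_order)
```

Keeping order in a single item_order array makes reads trivial, at the cost of rewriting the array on every reorder; for collaborative playlists, concurrent edits would need some conflict handling that this sketch leaves out.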
Trade-offs
- Many small files for albums; CDN cache hit rate critical.
- Hi-Fi (lossless) explodes bandwidth — gated by subscription.
- Per-stream royalty model demands precise accounting; need idempotent stream logging.
- Podcast model differs: exclusive shows, ads via dynamic ad insertion (DAI), and serving under different SLAs.
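The idempotent stream logging point can be sketched as deduplication on a deterministic key, so that client retries or redelivered events do not inflate play counts. This is a minimal in-memory sketch: the key choice (user_id, track_id, ts) and the StreamLedger class are assumptions for illustration; a real pipeline would need a durable store and a bounded dedupe window.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StreamEvent:
    user_id: str
    track_id: str
    ts: int            # assumed: epoch milliseconds when playback started
    source: str
    duration_ms: int


def dedupe_key(e: StreamEvent) -> str:
    """Deterministic key: the same logical play always hashes the same."""
    raw = f"{e.user_id}|{e.track_id}|{e.ts}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class StreamLedger:
    """Hypothetical in-memory ledger; stands in for a durable keyed store."""

    def __init__(self) -> None:
        self._seen: Dict[str, StreamEvent] = {}

    def record(self, e: StreamEvent) -> bool:
        """Return True if the event was new, False if it was a duplicate."""
        key = dedupe_key(e)
        if key in self._seen:
            return False
        self._seen[key] = e
        return True

    def plays_per_track(self) -> Dict[str, int]:
        counts: Dict[str, int] = {}
        for e in self._seen.values():
            counts[e.track_id] = counts.get(e.track_id, 0) + 1
        return counts


ledger = StreamLedger()
ev = StreamEvent("u1", "t1", 1_700_000_000_000, "playlist", 185_000)
first = ledger.record(ev)
retry = ledger.record(ev)   # same event redelivered
print(first, retry, ledger.plays_per_track())
```

Because recording the same event twice leaves the ledger unchanged, the nightly royalty batch can safely reprocess a partition after a failure without double-counting.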
Refs
- Spotify Engineering blog (Apollo, Backstage, Discover Weekly explainers).
- "The Spotify Recommendation Stack" talks.
- ByteByteGo "Design Spotify", Alex Xu Vol 2.