Instagram — Notes
Functional
- Photo/video upload with filters.
- Feed (home, explore, reels), stories, DMs.
- Like / comment / save.
- Hashtag + user search.
- Notifications.
Non-functional
- 2B MAU, ~100M photos/day, billions of feed opens.
- p99 feed open < 250 ms.
Capacity
- Uploads: 100M/day ≈ 1.2k/s avg, 10k/s peak; at 3 MB avg → ~3.5 GB/s avg ingest, ~30 GB/s peak.
- Storage: 100M × 3 MB × 365 ≈ 110 PB/yr raw; stored at 3 resolutions with HEVC → ~250 PB/yr.
- Hot CDN: ~1 PB working set.
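Quick check of the arithmetic above, as a minimal sketch. Inputs (100M uploads/day, 3 MB avg, 10k/s peak) are taken from these notes; decimal units (1 MB = 10^6 bytes) are an assumption.

```python
# Back-of-envelope capacity figures from the inputs in these notes.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 PB = 10**15 bytes).
SECONDS_PER_DAY = 86_400
MB, GB, PB = 10**6, 10**9, 10**15

uploads_per_day = 100_000_000
avg_size_bytes = 3 * MB
peak_uploads_per_s = 10_000  # peak rate as stated in the notes

avg_uploads_per_s = uploads_per_day / SECONDS_PER_DAY
avg_ingest_gb_s = avg_uploads_per_s * avg_size_bytes / GB
peak_ingest_gb_s = peak_uploads_per_s * avg_size_bytes / GB
raw_pb_per_year = uploads_per_day * avg_size_bytes * 365 / PB

print(f"avg uploads/s  : {avg_uploads_per_s:,.0f}")
print(f"avg ingest     : {avg_ingest_gb_s:.2f} GB/s")
print(f"peak ingest    : {peak_ingest_gb_s:.1f} GB/s")
print(f"raw storage/yr : {raw_pb_per_year:.1f} PB")
```

Peak ingest is peak rate × avg size, so 10k/s at 3 MB comes to roughly 30 GB/s; the average rate gives roughly 3.5 GB/s.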
Schema
media(id PK, owner_id, type, ts, caption, location, sizes[])
follow(follower, followee)
home_feed(user_id, [(ts, media_id)]) Redis ZSET, capped 500.
stories(id, owner, expires_at)
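Sketch of the capped home_feed behaviour, using an in-memory stand-in rather than Redis so it runs anywhere. The class and method names (HomeFeed, add, latest) are hypothetical. With a real Redis ZSET the same idea would likely be ZADD followed by ZREMRANGEBYRANK to trim, and ZREVRANGE to read.

```python
import bisect

FEED_CAP = 500  # cap from the notes


class HomeFeed:
    """In-memory stand-in for one user's capped sorted set (score = ts)."""

    def __init__(self, cap: int = FEED_CAP) -> None:
        self.cap = cap
        self._entries: list[tuple[int, int]] = []  # (ts, media_id), ascending

    def __len__(self) -> int:
        return len(self._entries)

    def add(self, ts: int, media_id: int) -> None:
        entry = (ts, media_id)
        i = bisect.bisect_left(self._entries, entry)
        if i < len(self._entries) and self._entries[i] == entry:
            return  # exact duplicate; simplification vs. ZSET member uniqueness
        self._entries.insert(i, entry)
        # Trim the oldest entries so at most `cap` remain.
        overflow = len(self._entries) - self.cap
        if overflow > 0:
            del self._entries[:overflow]

    def latest(self, n: int) -> list[int]:
        """Return up to n media ids, newest first."""
        if n <= 0:
            return []
        return [media_id for _, media_id in reversed(self._entries[-n:])]


feed = HomeFeed(cap=3)
for ts in range(1, 6):
    feed.add(ts, 100 + ts)
print(len(feed), feed.latest(2))
```

Capping keeps per-user memory bounded; older items would be served from the media store on deep scroll.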
ID
- 64-bit "Instagram ID":
[41-bit timestamp ms | 13-bit shard | 10-bit seq], generated Postgres-side (PL/pgSQL); IDs sort by creation time.
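Sketch of the ID layout in Python, assuming the 41/13/10 bit split (timestamp ms / shard / sequence) described in the Instagram engineering blog. EPOCH_MS, make_id and split_id are hypothetical names, and the epoch value is illustrative only. The real generator runs inside Postgres; this only shows the bit packing.

```python
# Assumed custom epoch (2011-01-01 UTC in ms); illustrative, not Instagram's value.
EPOCH_MS = 1_293_840_000_000
TS_BITS, SHARD_BITS, SEQ_BITS = 41, 13, 10


def make_id(now_ms: int, shard_id: int, seq: int) -> int:
    """Pack [timestamp ms | shard | seq] into one 64-bit integer."""
    ts = now_ms - EPOCH_MS
    if not 0 <= ts < (1 << TS_BITS):
        raise ValueError("timestamp out of range for 41 bits")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard id out of range for 13 bits")
    seq %= 1 << SEQ_BITS  # per-shard sequence wraps at 1024
    return (ts << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(id_: int) -> tuple[int, int, int]:
    """Inverse of make_id: return (timestamp ms, shard id, seq)."""
    seq = id_ & ((1 << SEQ_BITS) - 1)
    shard_id = (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    ts = id_ >> (SHARD_BITS + SEQ_BITS)
    return ts + EPOCH_MS, shard_id, seq


example = make_id(EPOCH_MS + 1000, shard_id=5, seq=7)
print(example, split_id(example))
```

Because the timestamp sits in the high bits, sorting by ID sorts by creation time, and the shard can be read back from the ID without a lookup.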
Trade-offs
- Haystack vs S3: Haystack reduces inode metadata overhead for small files.
- Hybrid push/pull for fan-out (same as Twitter): push to followers' feeds on write for most accounts, pull at read time for very large accounts.
- Server-side resize vs client adaptive: prefer a server-side ladder of multiple resolutions + CDN.
- AI tagging: improves search/safety but adds GPU cost.
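Sketch of the hybrid push/pull fan-out from the list above, in memory only. The FanOut class, its method names and the follower threshold are hypothetical; a real cutoff would need tuning. Accounts under the threshold push to follower feeds on write; larger accounts are pulled and merged at read time.

```python
from collections import defaultdict


class FanOut:
    """Hybrid fan-out: push for small accounts, pull for large ones."""

    def __init__(self, celeb_threshold: int = 10_000, feed_cap: int = 500) -> None:
        self.celeb_threshold = celeb_threshold  # assumed cutoff, not from the notes
        self.feed_cap = feed_cap
        self.followers = defaultdict(set)       # owner -> follower ids
        self.following = defaultdict(set)       # user -> followed owner ids
        self.home_feed = defaultdict(list)      # user -> pushed [(ts, media_id)]
        self.posts_by_owner = defaultdict(list) # owner -> [(ts, media_id)]

    def follow(self, follower: str, followee: str) -> None:
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def _is_large(self, owner: str) -> bool:
        return len(self.followers[owner]) >= self.celeb_threshold

    def publish(self, owner: str, ts: int, media_id: int) -> None:
        self.posts_by_owner[owner].append((ts, media_id))
        if self._is_large(owner):
            return  # pull path: followers fetch these posts at read time
        for user in self.followers[owner]:
            feed = self.home_feed[user]
            feed.append((ts, media_id))
            feed.sort(reverse=True)     # newest first
            del feed[self.feed_cap:]    # enforce the cap

    def read_feed(self, user: str, n: int) -> list[int]:
        merged = set(self.home_feed[user])  # set dedupes push/pull overlap
        for owner in self.following[user]:
            if self._is_large(owner):
                merged.update(self.posts_by_owner[owner])
        return [media_id for _, media_id in sorted(merged, reverse=True)[:n]]


fan = FanOut(celeb_threshold=2)
fan.follow("alice", "bob")
fan.follow("alice", "celeb")
fan.follow("carol", "celeb")
fan.publish("bob", ts=1, media_id=11)    # pushed to alice
fan.publish("celeb", ts=2, media_id=22)  # not pushed; pulled on read
print(fan.read_feed("alice", 10))
```

Push keeps reads cheap for the common case; pull avoids writing one post into millions of feeds.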
Refs
- Instagram engineering blog (sharding, IDs, feed), Haystack paper,
FB TAO paper, ByteByteGo "Design Instagram", Alex Xu Vol 2.