Google Photos — Notes
Functional
- Back up photos and videos from phones and desktops.
- Multi-resolution display + downloads.
- ML-driven search (objects, faces, text, locations).
- Albums, memories, partner sharing.
- Live / motion photos, RAW handling.
Non-functional
- 11 nines (99.999999999%) durability.
- p99 thumbnail load < 300 ms (CDN).
- Search response < 500 ms.
- 1B+ users; many EB of storage.
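
To make the durability target concrete, here is a small sketch of the expected-loss arithmetic. It assumes the 11 nines figure is an annual per-object durability and that losses are independent; the notes state neither. The 100B object count is borrowed from the Capacity figures.

```python
def expected_annual_losses(objects: int, nines: int) -> float:
    """Expected number of objects lost per year at a given per-object durability."""
    # Assumption: "N nines" means a 10^-N chance of losing any one object per year.
    loss_probability = 10.0 ** -nines
    return objects * loss_probability


losses = expected_annual_losses(objects=100 * 10**9, nines=11)
print(f"{losses:.2f} objects expected lost per year")
```

Under these assumptions, 100B objects at 11 nines works out to roughly one object lost per year in expectation.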
Capacity
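- 100B+ photos. Avg original 4 MB; full ladder ~6× the original ≈ 24 MB (round to ~25 MB) per item, so roughly 2.4 to 2.5 EB in total before replication.
- Hot tier holds the last ~30 days; the rest goes to cold storage.
- Per-user search index is on the order of MBs, so it shards easily by user.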
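
A back-of-envelope check of the figures above, as a minimal Python sketch. It assumes decimal units (1 MB = 10^6 bytes) and reads "6×" as the whole ladder, original included, totalling about six times the original size. Both are assumptions, not something the notes state.

```python
# Back-of-envelope storage estimate from the capacity figures above.
# Assumptions: decimal units; the ladder multiplier includes the original.
MB = 10**6
PB = 10**15
EB = 10**18


def storage_estimate(items: int, avg_original_mb: float, ladder_multiplier: float) -> dict:
    """Return per-item and fleet-wide storage, before replication overhead."""
    originals_bytes = items * avg_original_mb * MB
    total_bytes = originals_bytes * ladder_multiplier
    return {
        "per_item_mb": avg_original_mb * ladder_multiplier,
        "originals_pb": originals_bytes / PB,
        "total_eb": total_bytes / EB,
    }


estimate = storage_estimate(items=100 * 10**9, avg_original_mb=4, ladder_multiplier=6)
print(estimate)
```

With these inputs the originals alone come to 400 PB and the full ladder to 2.4 EB, before any replication or erasure-coding overhead.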
Schema
media(id, owner_id, type, ts, original_ref, ladders[], hash, exif)
albums(id, owner, [media_ids])
labels(media_id, label, score, source)
faces(user_id, face_id, embedding, cluster_id)
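
The schema above could be sketched as Python dataclasses. The field types are assumptions, since the notes give column names only. So are two readings: `ladders[]` as a list of blob keys for derived renditions, and `hash` as a content hash of the original bytes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Media:
    id: str
    owner_id: str
    type: str                 # e.g. "photo", "video", "live" (assumed values)
    ts: int                   # capture time as epoch seconds (assumed)
    original_ref: str         # blob-store key of the original
    ladders: list[str] = field(default_factory=list)    # keys of derived renditions
    hash: str = ""            # content hash of the original (assumed hex digest)
    exif: dict[str, str] = field(default_factory=dict)


@dataclass
class Album:
    id: str
    owner: str
    media_ids: list[str] = field(default_factory=list)


@dataclass
class Label:
    media_id: str
    label: str
    score: float              # classifier confidence
    source: str               # which model or pipeline produced the label


@dataclass
class Face:
    user_id: str
    face_id: str
    embedding: list[float]
    cluster_id: str


photo = Media(id="m1", owner_id="u1", type="photo", ts=1700000000,
              original_ref="blobs/u1/m1/original")
photo.ladders.append("blobs/u1/m1/thumb_256")
label = Label(media_id=photo.id, label="dog", score=0.93, source="server-ml")
album = Album(id="a1", owner="u1", media_ids=[photo.id])
```

Keeping `labels` and `faces` as separate records rather than nesting them in `media` matches the table layout in the notes and lets ML pipelines write results without touching the media row.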
Trade-offs
- Original retention vs storage cost: move old originals to the cold tier and dedup.
- Server-side vs client-side ML: cloud models are much more capable, and offloading preserves client battery.
- Cross-user dedup: huge savings but ownership/privacy hazards — Google avoids it for non-identical content.
- Live photos = paired image + short video; transcode strategy diverges.
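
A minimal sketch of exact-match dedup scoped to a single owner, which sidesteps the cross-user hazards noted above. `UploadIndex` and its in-memory dict are hypothetical stand-ins for a real metadata store, and SHA-256 is an assumed choice of content hash.

```python
import hashlib


class UploadIndex:
    """Hypothetical per-owner dedup index keyed by (owner_id, content hash)."""

    def __init__(self) -> None:
        self._by_owner_hash: dict[tuple[str, str], str] = {}
        self._next_id = 0

    def upload(self, owner_id: str, data: bytes) -> tuple[str, bool]:
        """Return (media_id, was_duplicate) for an uploaded blob."""
        digest = hashlib.sha256(data).hexdigest()
        key = (owner_id, digest)  # owner is part of the key: no cross-user matches
        existing = self._by_owner_hash.get(key)
        if existing is not None:
            return existing, True
        self._next_id += 1
        media_id = f"m{self._next_id}"
        self._by_owner_hash[key] = media_id
        return media_id, False


index = UploadIndex()
first = index.upload("alice", b"raw-bytes")
again = index.upload("alice", b"raw-bytes")  # same owner, same bytes
other = index.upload("bob", b"raw-bytes")    # same bytes, different owner
```

Here `again` maps back to the first upload's media id, while `other` gets a fresh id. The cost is storing identical bytes once per owner; the benefit is that one user's upload never reveals whether another user already holds the same file.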
Refs
- Google Photos engineering posts (face clustering, search architecture).
- Apple iCloud Photos talks (different storage approach).
- Image dedup papers; CLIP / vision embeddings for search.
- ByteByteGo "Design Google Photos".