Google Photos — Notes

Functional

  • Backup photos + videos from phones / desktops.
  • Multi-resolution display + downloads.
  • ML-driven search (objects, faces, text, locations).
  • Albums, memories, partner sharing.
  • Live / motion photos, RAW handling.

Non-functional

  • Durability of 11 nines (99.999999999%).
  • p99 thumbnail load < 300 ms (served from a CDN).
  • Search response < 500 ms.
  • 1B+ users; many EB of storage.
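
To make the durability target concrete, here is a rough sketch of what 11 nines implies at this scale. It assumes the common reading of "N nines" as an annual per-object loss probability of 10^-N, and it borrows the 100B object count from the capacity notes; the function name is hypothetical.

```python
from fractions import Fraction


def expected_annual_losses(num_objects: int, nines: int) -> Fraction:
    """Expected number of objects lost per year, assuming each object
    has an annual loss probability of 10**-nines."""
    loss_probability = Fraction(1, 10 ** nines)
    return num_objects * loss_probability


# 100B objects at 11 nines: about one object lost per year in expectation.
losses = expected_annual_losses(num_objects=100 * 10**9, nines=11)
print(float(losses))  # 1.0
```

Exact fractions are used so the tiny probability does not pick up floating-point rounding.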

Capacity

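  • 100B+ photos. Average original is 4 MB; the full rendition ladder comes to ~6× that, or ~25 MB per item (6 × 4 MB = 24 MB).
  • Hot tier holds roughly the last 30 days of uploads; everything older sits in cold storage.
  • Per-user search index is on the order of megabytes, so it shards easily by user.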
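
The capacity figures above can be checked with a few lines of arithmetic. This is a back-of-envelope sketch only: it treats the 6× ladder multiplier as the total per-item footprint, uses decimal units (1 EB = 10^12 MB), and ignores replication or erasure-coding overhead, all of which are assumptions rather than figures from the notes.

```python
PHOTO_COUNT = 100 * 10**9   # 100B photos (lower bound from the notes)
AVG_ORIGINAL_MB = 4         # average original size
LADDER_MULTIPLIER = 6       # per-item footprint as a multiple of the original
MB_PER_EB = 10**12          # decimal units: 1 EB = 10^18 B, 1 MB = 10^6 B


def per_item_mb(avg_original_mb: int, ladder_multiplier: int) -> int:
    """Storage per item across all stored renditions, in MB."""
    return avg_original_mb * ladder_multiplier


def total_storage_eb(photo_count: int, avg_original_mb: int,
                     ladder_multiplier: int) -> float:
    """Single-copy storage for the whole corpus, in exabytes."""
    total_mb = photo_count * per_item_mb(avg_original_mb, ladder_multiplier)
    return total_mb / MB_PER_EB


print(per_item_mb(AVG_ORIGINAL_MB, LADDER_MULTIPLIER))                    # 24
print(total_storage_eb(PHOTO_COUNT, AVG_ORIGINAL_MB, LADDER_MULTIPLIER))  # 2.4
```

A single logical copy already lands in the exabyte range, which is why the tiering split between hot and cold storage matters.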

Schema

  • media(id, owner_id, type, ts, original_ref, ladders[], hash, exif)
  • albums(id, owner, [media_ids])
  • labels(media_id, label, score, source)
  • faces(user_id, face_id, embedding, cluster_id)
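
One way to read the schema above is as typed records. The sketch below mirrors the four tables as Python dataclasses; the field types, the `Rendition` shape for ladder entries, and the `media_ids_for_label` helper with its `min_score` threshold are illustrative assumptions, not part of the notes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rendition:
    """One rung of the resolution ladder (hypothetical shape)."""
    width: int
    height: int
    ref: str  # pointer to the stored blob


@dataclass
class Media:
    id: str
    owner_id: str
    type: str          # e.g. "photo" or "video"
    ts: int            # capture time; epoch seconds is an assumed unit
    original_ref: str
    hash: str
    ladders: list[Rendition] = field(default_factory=list)
    exif: dict[str, str] = field(default_factory=dict)


@dataclass
class Album:
    id: str
    owner: str
    media_ids: list[str] = field(default_factory=list)


@dataclass
class Label:
    media_id: str
    label: str
    score: float
    source: str        # e.g. which model or pipeline produced the label


@dataclass
class Face:
    user_id: str
    face_id: str
    embedding: list[float]
    cluster_id: Optional[str] = None


def media_ids_for_label(labels: list[Label], wanted: str,
                        min_score: float = 0.5) -> list[str]:
    """Return media ids carrying `wanted`, highest score first."""
    hits = [l for l in labels if l.label == wanted and l.score >= min_score]
    hits.sort(key=lambda l: l.score, reverse=True)
    return [l.media_id for l in hits]
```

For example, `media_ids_for_label([Label("m1", "dog", 0.9, "vision"), Label("m2", "dog", 0.3, "vision")], "dog")` keeps only `"m1"`, since `"m2"` falls below the assumed threshold.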

Trade-offs

  • Original retention vs storage cost: move old originals to the cold tier and deduplicate.
  • Server-side ML vs client-side: the cloud is much more capable; offloading to it also preserves client battery.
  • Cross-user dedup: huge savings but ownership/privacy hazards — Google avoids it for non-identical content.
  • Live photos are a paired still image and short video, so the transcode strategy differs for each half.
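
The dedup trade-off can be illustrated with a toy content-addressed store. This sketch deduplicates exact byte-identical uploads by SHA-256 and scopes the key to the owner, which is the most conservative way to sidestep the cross-user hazards; the `BlobStore` class and its in-memory dict are hypothetical stand-ins for a real blob service.

```python
import hashlib


class BlobStore:
    """Toy in-memory store that deduplicates identical bytes per owner."""

    def __init__(self) -> None:
        self._blobs: dict[tuple[str, str], bytes] = {}

    def put(self, owner_id: str, data: bytes) -> tuple[str, bool]:
        """Store `data` for `owner_id`.

        Returns (content digest, whether new bytes were written).
        """
        digest = hashlib.sha256(data).hexdigest()
        key = (owner_id, digest)  # owner in the key: no cross-user sharing
        is_new = key not in self._blobs
        if is_new:
            self._blobs[key] = data
        return digest, is_new

    def stored_copies(self) -> int:
        """Number of physical copies currently held."""
        return len(self._blobs)


store = BlobStore()
print(store.put("alice", b"same-bytes")[1])  # True: first upload is stored
print(store.put("alice", b"same-bytes")[1])  # False: re-upload is deduplicated
print(store.put("bob", b"same-bytes")[1])    # True: a different owner gets a copy
```

Dropping `owner_id` from the key would turn this into cross-user dedup and save the extra copy, at the price of the ownership and privacy questions noted above. Hash equality only catches identical bytes; near-duplicates would need perceptual hashing or embeddings.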

Refs

  • Google Photos engineering posts (face clustering, search architecture).
  • Apple iCloud Photos talks (different storage approach).
  • Image dedup papers; CLIP / vision embeddings for search.
  • ByteByteGo "Design Google Photos".