Skip to content

Reddit / Quora / Stack Overflow — Notes#

Functional#

  • Post / question with title, body, tags, media.
  • Threaded comments / answers (deeply nested).
  • Voting (up/down or accept).
  • Listings: hot, new, top, controversial.
  • Per-user feed, notifications, mentions.
  • Moderation (community + global).

Non-functional#

  • Read-heavy 50:1; comment voting bursts.
  • p99 listing < 200 ms.
  • Eventual consistency on vote counts OK; user must see own vote immediately.

Capacity (Reddit-scale)#

  • 50M+ DAU, 1B+ posts/comments total.
  • ~2B page views/day → ~25k/s avg.
  • Storage: text ~2 KB avg × 1B posts = 2 TB + comments many TB.

Schema highlights#

  • posts(id PK, subreddit_id, author, ts, title, body, score, num_comments)
  • comments(id, post_id, parent_id, path, author, body, score) materialized path
  • votes(user_id, target_id, value, ts) Cassandra; counters in Redis
  • subreddits(id, name, type, rules)

Trade-offs#

  • Materialized-path comments simplify thread queries; cost: rewrite path on move.
  • Eventually consistent vote counters allow huge scale; show user-personal vote from local cache.
  • Personalized vs algorithm-only: SO uses simple formulas (signals trust); Quora goes full ML.
  • Sharding by subreddit localizes hot subs but can hotspot one shard.

Refs#

  • Reddit "Hot" ranking blog post (Amir Salihefendic).
  • Quora engineering posts (Webnode, Quora Search).
  • Stack Overflow architecture (Nick Craver's blog series).
  • ByteByteGo "Design Reddit".