A/B Testing — Notes
Functional
- Define experiment + variants + metrics.
- Bucket users deterministically.
- Collect exposures + metric events.
- Compute lifts with confidence intervals and significance tests, not raw differences alone.
- Guardrails + kill switch.
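
The bucketing and lift bullets above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the SHA-256 hash, the 10,000-bucket granularity, equal variant weights, and the pooled two-proportion z-test are all assumptions chosen for the example.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment_id: str, variants: list[str]) -> str:
    """Map a user to a variant deterministically (equal weights assumed)."""
    # Salting with the experiment id keeps assignments independent across experiments.
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    # Split the bucket space into equal-width slices, one per variant.
    index = bucket * len(variants) // 10_000
    return variants[index]


def relative_lift_with_z_test(conv_c: int, n_c: int, conv_t: int, n_t: int):
    """Return (relative lift, z statistic, two-sided p-value) for conversion rates."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    lift = (p_t - p_c) / p_c
    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lift, z, p_value
```

For example, `relative_lift_with_z_test(100, 1000, 120, 1000)` gives a relative lift of 0.2 with a two-sided p-value of roughly 0.15, so a 20% observed lift at this sample size is not significant at the usual 0.05 level.
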
Non-functional
- Assignment latency < 1 ms (evaluated locally from a client-side cache).
- Daily refresh of analyses, with streaming for guardrails.
- Reliable randomization: unbiased splits, stable per user, independent across experiments.
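
One common way to monitor randomization in practice is a sample ratio mismatch (SRM) check: compare observed exposure counts against the configured split with a chi-square goodness-of-fit test. The sketch below assumes two variants and a hypothetical function name; the alert threshold is a policy choice and is not specified in these notes.

```python
import math


def srm_p_value(n_control: int, n_treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-variant split (1 degree of freedom)."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))
```

A very small p-value (for instance on counts like 5000 vs 5400 under a 50/50 split) suggests the assignment or logging pipeline is biased, and the lift numbers should not be trusted until the cause is found.
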
Trade-offs
- Server-side vs client-side assignment: server-side prevents users from inspecting or tampering with assignments on the client.
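- Fixed-horizon vs sequential: sequential methods let you check results repeatedly without inflating the false-positive rate; fixed-horizon requires committing to a sample size up front.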
- Many overlapping experiments demand mutual-exclusion groups, so conflicting changes never reach the same user.
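
A mutual-exclusion group can be sketched as a shared bucket space in which each experiment owns a disjoint range, so a user lands in at most one experiment of the group. The group name, experiment names, bucket count, and ranges below are hypothetical and only illustrate the idea.

```python
import hashlib
from typing import Optional


def bucket_for(key: str, buckets: int = 1000) -> int:
    """Hash a string key to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


# Hypothetical group: each experiment owns a disjoint slice of the 1000 buckets.
CHECKOUT_GROUP = {
    "new_button": range(0, 400),
    "one_page_flow": range(400, 800),
    # Buckets 800..999 are left unallocated as a holdout.
}


def experiment_for_user(user_id: str, group_name: str,
                        group: dict[str, range]) -> Optional[str]:
    """Return the single experiment in the group that this user is eligible for."""
    # Hash on the group name (not the experiment) so ranges partition the same space.
    b = bucket_for(f"{group_name}:{user_id}")
    for experiment, owned in group.items():
        if b in owned:
            return experiment
    return None  # user falls in the holdout
```

Variant assignment within the chosen experiment would then use a separate hash salted by the experiment id, so that membership in the group and the control/treatment split stay independent.
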
Refs
- "Trustworthy Online Controlled Experiments" Kohavi et al.
- Microsoft ExP, Booking, LinkedIn, Airbnb experimentation blog posts.
- Optimizely / GrowthBook docs.