[Tech Talk #41] Columbia DRO Tianyi Peng: When A/B Testing Platforms Meet Reinforcement Learning
A novel “Difference-in-Q” (DQ) estimator, based on reinforcement learning, is proposed to address the interference problem in A/B testing. DQ offers a better bias-variance trade-off than traditional estimators, with lower bias and exponentially lower variance. In a collaboration with ByteDance, DQ achieved a 99% reduction in mean squared error in large-scale commercial scenarios.
- applied statistics
- A/B testing
- experiment design
- interference
- off-policy evaluation
- reinforcement learning
3 minutes to read
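To make the summary concrete, here is a minimal sketch of the Difference-in-Q idea as we read it. It is not the speaker's implementation. It assumes a small, fully known Markov chain in the average-reward setting, whereas a real estimator would learn the Q-values from logged experiment data. The function names, the toy chain, and the 50/50 randomization are all assumptions made for illustration. The sketch compares three population-level quantities: the true treatment effect, the naive difference in average rewards between treated and control steps, and the difference in Q-values averaged over the states visited during the experiment.

```python
import numpy as np


def stationary(P):
    """Stationary distribution of an ergodic transition matrix P (rows sum to 1)."""
    n = P.shape[0]
    # Solve rho @ P = rho together with sum(rho) = 1 as one linear system.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    rho, *_ = np.linalg.lstsq(A, b, rcond=None)
    return rho


def compare_estimands(P0, P1, r0, r1):
    """Return (true effect, naive estimand, DQ estimand) for a known chain.

    P0, P1: transition matrices under control (0) and treatment (1).
    r0, r1: per-state reward vectors under control and treatment.
    Assumption: the experiment picks each arm with probability 1/2 at every step.
    """
    # True effect: long-run average reward if everyone is treated vs. no one.
    ate = stationary(P1) @ r1 - stationary(P0) @ r0

    # Dynamics actually observed during the 50/50 experiment.
    P_mix = 0.5 * (P0 + P1)
    r_mix = 0.5 * (r0 + r1)
    rho = stationary(P_mix)
    lam = rho @ r_mix  # average reward during the experiment

    # Naive estimand: compares immediate rewards only, so it misses
    # any effect that the treatment has through future states.
    naive = rho @ (r1 - r0)

    # Differential value function V of the experiment chain:
    # (I - P_mix) V = r_mix - lam, pinned down by rho @ V = 0.
    n = len(r_mix)
    A = np.vstack([np.eye(n) - P_mix, rho])
    b = np.append(r_mix - lam, 0.0)
    V, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Q-values of the experiment policy, then their average difference.
    Q0 = r0 - lam + P0 @ V
    Q1 = r1 - lam + P1 @ V
    dq = rho @ (Q1 - Q0)
    return ate, naive, dq


# Toy chain (hypothetical): the treatment does not change the immediate
# reward, it only makes the rewarding state 1 more likely at the next step.
P0 = np.array([[0.5, 0.5], [0.5, 0.5]])
P1 = np.array([[0.4, 0.6], [0.4, 0.6]])
r = np.array([0.0, 1.0])

ate, naive, dq = compare_estimands(P0, P1, r, r)
print(f"true effect: {ate:.3f}  naive: {naive:.3f}  DQ: {dq:.3f}")
```

The toy chain is chosen so that the result can be checked by hand: the treatment acts only through the next state, so the naive comparison reports no effect at all, while the difference in Q-values recovers the true effect. In richer chains the DQ value need not match the true effect exactly, and in practice the Q-values have to be estimated from data, which is where the bias-variance trade-off in the summary comes in.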