Reinforcement Learning

[Tech Talk #41] Columbia DRO Tianyi Peng: When A/B Testing Platforms Meet Reinforcement Learning

A novel “Difference-in-Q” (DQ) estimator, based on reinforcement learning, is proposed to address the interference problem in A/B testing. DQ achieves a better bias-variance trade-off than traditional estimators, reducing bias and lowering variance exponentially. In a collaboration with ByteDance, DQ reduced mean squared error by 99% in large-scale commercial scenarios.

Applied Statistics
A/B testing
experiment design
interference
off-policy evaluation
reinforcement learning

2023-09-03

3 minutes to read