[Tech Talk #41] Columbia DRO Tianyi Peng: When A/B Testing Platforms Meet Reinforcement Learning

A novel “Difference-in-Q” (DQ) estimator, based on reinforcement learning, is proposed to address the interference problem in A/B testing. DQ offers a better bias-variance trade-off than traditional estimators, reducing bias while decreasing variance exponentially. In a collaboration with ByteDance, DQ achieved a 99% reduction in mean squared error in large-scale commercial scenarios.
3 minutes to read