Proceedings Paper

Debiased Off-Policy Evaluation for Recommendation Systems

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3460231.3474231

Keywords

ad design; off-policy evaluation; bandit; reinforcement learning


The paper proposes an alternative to A/B testing that predicts the performance of a new algorithm from historical data, validates the method in a reinforcement learning simulation and a real-world advertisement design application, and shows that it achieves smaller mean squared errors than state-of-the-art methods.
Efficient methods for evaluating new algorithms are critical for improving interactive bandit and reinforcement learning systems such as recommendation systems. A/B tests are reliable, but they are time-consuming and costly, and entail a risk of failure. In this paper, we develop an alternative method that predicts the performance of an algorithm from historical data that may have been generated by a different algorithm. Our estimator has the property that its prediction converges in probability to the true performance of the counterfactual algorithm at rate root-N as the sample size N increases. We also show how to correctly estimate the variance of the prediction, allowing the analyst to quantify the uncertainty in it. These properties hold even when the analyst does not know which among a large number of potentially important state variables are actually important. We validate our method in a reinforcement learning simulation experiment. We then apply it to improve the advertisement design of a major advertising company. We find that our method produces smaller mean squared errors than state-of-the-art methods.
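
As a rough illustration of the off-policy evaluation setting the abstract describes (predicting a counterfactual algorithm's performance from logged data, together with a variance estimate for that prediction), the sketch below implements a plain inverse propensity weighting (IPW) estimator with a plug-in standard error on a synthetic bandit log. This is a generic baseline shown for intuition only, not the paper's debiased estimator; the function and variable names are hypothetical.

```python
import numpy as np


def ipw_value_estimate(rewards, logging_propensities, target_policy_probs):
    """Generic IPW off-policy value estimate with a plug-in standard error.

    rewards[i]              : observed reward for logged interaction i
    logging_propensities[i] : probability the logging policy assigned to the logged action
    target_policy_probs[i]  : probability the counterfactual policy assigns to that action

    Illustration only; not the paper's debiased method.
    """
    rewards = np.asarray(rewards, dtype=float)
    w = np.asarray(target_policy_probs, dtype=float) / np.asarray(logging_propensities, dtype=float)

    # Reweight each logged reward by how much more (or less) likely the new
    # policy is to take the logged action than the logging policy was.
    weighted = w * rewards

    estimate = weighted.mean()
    # Plug-in standard error; under standard conditions the estimate is
    # root-N consistent, so the error shrinks at rate 1/sqrt(N).
    std_error = weighted.std(ddof=1) / np.sqrt(len(weighted))
    return estimate, std_error


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000
    # Synthetic log: a uniform logging policy over 3 ads; click rate depends on the ad.
    actions = rng.integers(0, 3, size=n)
    rewards = rng.binomial(1, p=np.array([0.02, 0.05, 0.08])[actions])
    logging_propensities = np.full(n, 1 / 3)
    # Counterfactual policy that always shows ad 2 (true value ~0.08 by construction).
    target_policy_probs = (actions == 2).astype(float)

    est, se = ipw_value_estimate(rewards, logging_propensities, target_policy_probs)
    print(f"estimated value: {est:.4f} +/- {1.96 * se:.4f}")
```

In this toy example the estimate recovers the constructed click rate of the counterfactual ad (about 0.08) without ever running that policy online, which is the basic promise of off-policy evaluation; the paper's contribution is a debiased estimator and variance estimate that remain valid even when many potentially important state variables must be adjusted for.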

