Article

On the Bias, Risk, and Consistency of Sample Means in Multi-armed Bandits

Journal

SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE
Volume 3, Issue 4, Pages 1278-1300

Publisher

SIAM PUBLICATIONS
DOI: 10.1137/20M1361249

Keywords

multi-armed bandits; sample mean; bias; risk bounds; consistency

Abstract

This paper studies the bias, risk, and consistency of sample means in multi-armed bandit experiments and identifies four distinct sources of selection bias: sampling, stopping, choosing, and rewinding. It introduces a new notion of effective sample size to bound the risk of the sample mean, and it gives carefully designed examples that build intuition for each source of bias. The proofs combine variational representations of information-theoretic divergences with new martingale concentration inequalities.
The sample mean is among the most well-studied estimators in statistics, having many desirable properties such as unbiasedness and consistency. However, when analyzing data collected using a multi-armed bandit (MAB) experiment, the sample mean is biased and much remains to be understood about its properties. For example, when is it consistent, how large is its bias, and can we bound its mean squared error? This paper delivers a thorough and systematic treatment of the bias, risk, and consistency of MAB sample means. Specifically, we identify four distinct sources of selection bias (sampling, stopping, choosing, and rewinding) and analyze them both separately and together. We further demonstrate that a new notion of effective sample size can be used to bound the risk of the sample mean under suitable loss functions. We present several carefully designed examples to provide intuition on the different sources of selection bias we study. Our treatment is nonparametric and algorithm-agnostic, meaning that it is not tied to a specific algorithm or goal. In a nutshell, our proofs combine variational representations of information-theoretic divergences with new martingale concentration inequalities.
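
As a rough illustration of the sampling source of bias mentioned in the abstract, the sketch below simulates a two-armed bandit in which both arms have true mean 0, then compares the average sample mean of one arm under adaptive (greedy) sampling with the average under a fixed sample size. Everything specific here is an assumption made for illustration and is not taken from the paper: the greedy rule, the standard normal rewards, the horizon of 20, and the helper names greedy_sample_mean, fixed_sample_mean, and estimate_bias.

```python
import random


def greedy_sample_mean(horizon, rng):
    """Run a two-armed greedy bandit with N(0, 1) rewards (an assumed setup)
    and return the final sample mean of arm 0."""
    # Pull each arm once to initialize the running sums and counts.
    sums = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    counts = [1, 1]
    for _ in range(horizon - 2):
        # Greedy rule: pull the arm whose current sample mean is larger.
        arm = 0 if sums[0] / counts[0] >= sums[1] / counts[1] else 1
        sums[arm] += rng.gauss(0.0, 1.0)
        counts[arm] += 1
    return sums[0] / counts[0]


def fixed_sample_mean(n, rng):
    """Sample mean of n i.i.d. N(0, 1) draws, with no adaptivity."""
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n


def estimate_bias(draw, runs, seed):
    """Monte Carlo average of an estimator whose target (the true mean) is 0."""
    rng = random.Random(seed)
    return sum(draw(rng) for _ in range(runs)) / runs


adaptive_bias = estimate_bias(lambda rng: greedy_sample_mean(20, rng), 20000, seed=0)
fixed_bias = estimate_bias(lambda rng: fixed_sample_mean(10, rng), 20000, seed=0)
print("adaptive (greedy) sampling:", adaptive_bias)
print("fixed sample size:", fixed_bias)
```

In this toy setup the adaptive estimate should come out negative, while the fixed-size estimate should be close to 0. The intuition is that an arm with unlucky early draws is abandoned and keeps its low average, whereas an arm with lucky early draws is pulled again and regresses toward its true mean.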
