Journal
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 118, Issue 542, Pages 987-999
Publisher
TAYLOR & FRANCIS INC
DOI: 10.1080/01621459.2021.1967164
Keywords
Asymptotic optimality; Empirical Bayes; Linear regression; L-statistics; Nonparametric regression
In this study, we propose a method called Aurora for the estimation of effect sizes using multiple observations per unit. Aurora achieves near-Bayes optimal mean squared error without any assumptions or knowledge about the effect size distribution or noise. It leverages replication and recasts the estimation problem as a general regression problem. Aurora with linear regression matches the performance of various estimators including sample mean, trimmed mean, sample median, and James-Stein shrunk versions thereof.
We study empirical Bayes estimation of the effect sizes of N units from K noisy observations on each unit. We show that it is possible to achieve near-Bayes optimal mean squared error, without any assumptions or knowledge about the effect size distribution or the noise. The noise distribution can be heteroscedastic and vary arbitrarily from unit to unit. Our proposal, which we call Aurora, leverages the replication inherent in the K observations per unit and recasts the effect size estimation problem as a general regression problem. Aurora with linear regression provably matches the performance of a wide array of estimators including the sample mean, the trimmed mean, the sample median, as well as James-Stein shrunk versions thereof. Aurora automates effect size estimation for Internet-scale datasets, as we demonstrate on data from a large technology firm.
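The abstract does not spell out how the regression is set up, so the following is only a minimal sketch of one plausible reading, suggested by the "L-statistics" keyword: hold out one of the K replicates as the response, regress it on the sorted remaining K-1 replicates across all N units, and average the fitted values over the choice of held-out replicate. The function name `aurora_linear_sketch`, the ordinary least squares fit with an intercept, and the simulated Gaussian data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def aurora_linear_sketch(Z):
    """Hypothetical sketch: Z is an (N, K) array of K noisy replicates per unit.

    Returns an (N,) array of effect size estimates.
    """
    Z = np.asarray(Z, dtype=float)
    N, K = Z.shape
    if K < 2:
        raise ValueError("need at least two replicates per unit")

    estimates = np.zeros(N)
    for j in range(K):
        # Held-out replicate plays the role of the regression response.
        y = Z[:, j]
        # Order statistics of the remaining K-1 replicates are the features.
        X = np.sort(np.delete(Z, j, axis=1), axis=1)
        design = np.column_stack([np.ones(N), X])
        # Least squares fit across units; fitted values are linear
        # combinations of order statistics (an L-statistic plus intercept).
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        estimates += design @ coef
    # Average over the K possible choices of held-out replicate.
    return estimates / K


if __name__ == "__main__":
    # Simulated example (assumed setup): Gaussian effects, Gaussian noise.
    rng = np.random.default_rng(0)
    N, K = 2000, 5
    mu = rng.normal(0.0, 1.0, size=N)
    Z = mu[:, None] + rng.normal(0.0, 3.0, size=(N, K))
    est = aurora_linear_sketch(Z)
    print("MSE, sample mean:", np.mean((Z.mean(axis=1) - mu) ** 2))
    print("MSE, sketch:     ", np.mean((est - mu) ** 2))
```

Because the fitted coefficients are learned from the data, a linear combination of order statistics of this kind can in principle reproduce the sample mean, a trimmed mean, or the median, with the intercept and scale allowing shrinkage, which is consistent with the estimators the abstract lists.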