4.2 Article

Stratified learning: A general-purpose statistical method for improved learning under covariate shift

Publisher

WILEY
DOI: 10.1002/sam.11643

Keywords

astrostatistics; bias reduction; domain adaptation; machine learning; propensity scores

The study proposes a method to address covariate shift in supervised learning, where the training set is not representative of the target data. By conditioning on estimated propensity scores, the method improves target prediction. On two cosmology applications, it outperforms state-of-the-art importance weighting methods.
We propose a simple, statistically principled, and theoretically justified method to improve supervised learning when the training set is not representative, a situation known as covariate shift. We build upon a well-established methodology in causal inference and show that the effects of covariate shift can be reduced or eliminated by conditioning on propensity scores. In practice, this is achieved by fitting learners within strata constructed by partitioning the data based on the estimated propensity scores, leading to approximately balanced covariates and much-improved target prediction. We refer to the overall method as Stratified Learning, or StratLearn. We demonstrate the effectiveness of this general-purpose method on two contemporary research questions in cosmology, outperforming state-of-the-art importance weighting methods. We obtain the best-reported AUC (0.958) on the updated Supernovae photometric classification challenge, and we improve upon existing conditional density estimation of galaxy redshift from Sloan Digital Sky Survey (SDSS) data.
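The workflow described in the abstract (estimate propensity scores, partition the data into strata, fit a learner within each stratum) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the propensity model, the number of strata, or the learner, so the choices below are assumptions: a one-dimensional logistic regression for the propensity score, five quantile-based strata, and a least-squares line as the per-stratum learner. All function names (`fit_logistic`, `fit_line`, `strat_learn`) and the toy data are hypothetical.

```python
import math
import random


def fit_logistic(xs, labels, lr=0.5, epochs=300):
    """Fit P(label=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, s in zip(xs, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - s) * x
            grad_b += p - s
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + c (assumed per-stratum learner)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def strat_learn(x_src, y_src, x_tgt, n_strata=5):
    """Return (predictions for x_tgt, stratum index of each target point)."""
    # 1. Estimate propensity scores: probability of belonging to the source set.
    xs = x_src + x_tgt
    labels = [1] * len(x_src) + [0] * len(x_tgt)
    w, b = fit_logistic(xs, labels)

    def score(x):
        return 1.0 / (1.0 + math.exp(-(w * x + b)))

    # 2. Quantiles of the pooled scores define the strata (n_strata is assumed).
    ranked = sorted(score(x) for x in xs)
    cuts = [ranked[len(ranked) * j // n_strata] for j in range(1, n_strata)]

    def stratum(x):
        s = score(x)
        return sum(s >= c for c in cuts)

    # 3. Fit one learner per stratum, using source data only.
    pooled = fit_line(x_src, y_src)
    models = []
    for j in range(n_strata):
        pts = [(x, y) for x, y in zip(x_src, y_src) if stratum(x) == j]
        # Fall back to the pooled fit if a stratum has too few source points.
        models.append(fit_line(*zip(*pts)) if len(pts) >= 2 else pooled)

    # 4. Predict each target point with the model of its own stratum.
    tgt_strata = [stratum(x) for x in x_tgt]
    preds = [models[j][0] * x + models[j][1] for x, j in zip(x_tgt, tgt_strata)]
    return preds, tgt_strata


# Toy covariate shift: source and target covariates follow different distributions.
random.seed(0)
x_src = [random.gauss(0.0, 1.0) for _ in range(300)]
x_tgt = [random.gauss(1.0, 1.0) for _ in range(300)]
y_src = [math.sin(x) + 0.5 * x + random.gauss(0.0, 0.1) for x in x_src]
preds, tgt_strata = strat_learn(x_src, y_src, x_tgt)
```

With a single covariate, stratifying on the propensity score amounts to binning the covariate itself. The approach becomes more useful with many covariates, because the propensity score condenses them into one dimension on which source and target data can be matched.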

