Journal
COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 126, Issue -, Pages 78-91
Publisher
ELSEVIER SCIENCE BV
DOI: 10.1016/j.csda.2018.04.009
Keywords
Balanced estimation; Measurement errors; High dimensionality; Model selection; Nearest positive semi-definite projection; Combined L-1 and concave regularization
Funding
- National Natural Science Foundation of China [11471029, 11601501, 11671374, 71731010]
- Beijing Natural Science Foundation [1182003]
- Anhui Provincial Natural Science Foundation [1708085QA02]
- Research Grant by the Recruitment Program of Global Experts of China for Young Professionals
Abstract
Noisy and missing data are common in real applications, so the observed covariates often contain measurement errors. Despite rapid progress in model selection with contaminated covariates in high dimensions, methodology that performs well in prediction, variable selection, and computation simultaneously remains largely unexplored. In this paper, we propose a new method, called balanced estimation, for high-dimensional errors-in-variables regression that achieves an ideal balance between prediction and variable selection under both additive and multiplicative measurement errors. It combines the strengths of the nearest positive semi-definite projection and the combined L-1 and concave regularization, and can therefore be solved efficiently by coordinate optimization. We also provide theoretical guarantees for the proposed methodology by establishing oracle prediction and estimation error bounds equivalent to those for the Lasso with clean data, as well as an explicit and asymptotically vanishing bound on the false sign rate, which controls overfitting, a serious problem under measurement errors. Our numerical studies show that improved variable selection in turn improves prediction and estimation performance under measurement errors. (C) 2018 Elsevier B.V. All rights reserved.
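The pipeline the abstract describes (a positive semi-definite projection followed by L-1 plus concave regularization solved coordinate-wise) might be sketched roughly as below for the additive-error case. This is an illustrative sketch under several assumptions that do not come from the abstract, not the authors' implementation: the corrected Gram matrix `W.T @ W / n - Sigma_A` with a known error covariance `Sigma_A` is an assumed surrogate; the projection is the Frobenius-norm one obtained by eigenvalue clipping, which may differ from the paper's choice of norm; the concave penalty is taken to be MCP; and the concave part is handled by a few local linear approximation steps, each of which is a weighted Lasso solved by coordinate descent. All function and parameter names (`nearest_psd`, `weighted_lasso_cd`, `balanced_sketch`, `lam0`, `lam`, `gamma`) are hypothetical.

```python
import numpy as np

def nearest_psd(S, eps=1e-6):
    """Frobenius-norm projection onto symmetric matrices with eigenvalues >= eps.
    (Assumed projection; the paper may use a different norm.)"""
    S = (S + S.T) / 2.0
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.maximum(vals, eps)) @ vecs.T

def soft(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_lasso_cd(Sigma, rho, w, beta0=None, n_iter=200, tol=1e-8):
    """Coordinate descent for 0.5*b'Sigma b - rho'b + sum_j w_j*|b_j|."""
    p = len(rho)
    beta = np.zeros(p) if beta0 is None else beta0.copy()
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Gradient information for coordinate j with beta_j's own term removed.
            z = rho[j] - Sigma[j] @ beta + Sigma[j, j] * beta[j]
            new = soft(z, w[j]) / Sigma[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta

def balanced_sketch(W, y, Sigma_A, lam0, lam, gamma=3.0, n_lla=3):
    """W: noisy covariates (n x p); Sigma_A: assumed-known additive error covariance."""
    n, p = W.shape
    # Bias-corrected Gram matrix; it can be indefinite, hence the projection.
    Sigma = nearest_psd(W.T @ W / n - Sigma_A)
    rho = W.T @ y / n
    # Initial fit: plain Lasso with the full penalty level lam0 + lam.
    beta = weighted_lasso_cd(Sigma, rho, np.full(p, lam0 + lam))
    for _ in range(n_lla):
        # Local linear approximation: L1 weight plus the MCP derivative at |beta_j|.
        w = lam0 + np.maximum(lam - np.abs(beta) / gamma, 0.0)
        beta = weighted_lasso_cd(Sigma, rho, w, beta0=beta)
    return beta
```

Clipping eigenvalues at a small positive `eps` keeps every diagonal entry of the projected matrix strictly positive, so the coordinate update never divides by zero. The multiplicative-error case mentioned in the abstract would need a different correction of the Gram matrix and is not covered by this sketch.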