Journal
STATISTICA SINICA
Volume 31, Issue 1, Pages 391-420
Publisher
STATISTICA SINICA
DOI: 10.5705/ss.202018.0170
Keywords
Pairwise screening; penalized regression; sure independence screening; variable selection
Funding
- NSF [DMS-1821231, DMS-1613112, IIS-1633212, DMS-1916237]
- NIH [R01GM126550]
This study introduces a variable selection method incorporating pairwise effects in covariates and combines it with sure independence screening, showing competitive performance in terms of prediction accuracy and variable selection accuracy.
Most existing screening methods for variable selection focus on marginal effects and ignore the dependence between covariates. To improve the performance of variable selection, we incorporate pairwise effects in covariates for screening and penalization. We achieve this by studying the asymptotic distribution of the maximal absolute pairwise sample correlation between independent covariates. The novelty of the theory is that the convergence is related to the dimensionality p, and is uniform with respect to the sample size n. Moreover, we obtain an upper bound for the maximal pairwise R-squared when regressing the response onto two covariates. Based on these extreme-value results, we propose a screening procedure to detect covariate pairs that are potentially correlated and associated with the response. We further combine the pairwise screening with sure independence screening and develop a new regularized variable selection procedure. Numerical studies show that our method is competitive in terms of both prediction accuracy and variable selection accuracy.
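The screening idea described above can be illustrated with a rough sketch. This is not the authors' procedure: the abstract does not state the extreme-value thresholds, so the cutoffs `corr_cut` and `r2_cut` below are hypothetical user-supplied values, and the function names `pairwise_screen` and `sis_screen` are invented for illustration. The only formula relied on is the standard closed form for the R-squared of a two-covariate least-squares fit in terms of sample correlations.

```python
import numpy as np

def pairwise_screen(X, y, corr_cut, r2_cut):
    """Flag covariate pairs that are mutually correlated and jointly
    associated with y. Both cutoffs are hypothetical placeholders,
    not the extreme-value bounds derived in the paper."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
    ys = (y - y.mean()) / y.std()
    R = Xs.T @ Xs / n       # p x p sample correlation matrix
    r_y = Xs.T @ ys / n     # marginal correlations with the response
    pairs = []
    for j in range(p):
        for k in range(j + 1, p):
            r_jk = R[j, k]
            denom = 1.0 - r_jk ** 2
            if denom < 1e-12:   # skip (near-)collinear pairs
                continue
            # R-squared of regressing y on covariates j and k (with intercept)
            r2 = (r_y[j] ** 2 + r_y[k] ** 2
                  - 2.0 * r_y[j] * r_y[k] * r_jk) / denom
            if abs(r_jk) > corr_cut and r2 > r2_cut:
                pairs.append((j, k, r_jk, r2))
    return pairs

def sis_screen(X, y, d):
    """Marginal screening: keep the d covariates with the largest
    absolute sample correlation with y."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    r_y = Xs.T @ ys / X.shape[0]
    return np.argsort(-np.abs(r_y))[:d]

# Toy data (assumed for illustration): covariates 0 and 1 are strongly
# correlated and act on y with opposite signs, so each has only a weak
# marginal effect while the pair explains most of the variance.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
X[:, 1] = 0.9 * X[:, 0] + np.sqrt(1 - 0.9 ** 2) * rng.standard_normal(n)
y = X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(n)

pairs = pairwise_screen(X, y, corr_cut=0.5, r2_cut=0.5)
sis = sis_screen(X, y, d=5)
# Union of marginally screened covariates and members of flagged pairs
candidates = sorted(set(sis.tolist()) | {i for j, k, _, _ in pairs for i in (j, k)})
```

In this sketch the candidate set is simply the union of the two screens; a penalized regression would then be fitted on those columns. How the paper's method weights or penalizes the flagged pairs is not specified in the abstract and is left out here.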