Article

Partial sufficient variable screening with categorical controls

Publisher

ELSEVIER
DOI: 10.1016/j.csda.2023.107784

Keywords

Categorical data; Conditional independence; Sufficient dimension reduction; Sure screening; Ultrahigh dimensional data analysis

Variable screening is an important tool for dimension reduction in ultrahigh dimensional data analysis. This study proposes a partial sufficient variable screening method for regressions that include control variables; it aims to reduce the set of predictors of primary interest without losing regression information. The method constrains the reduction of the continuous variables through the subpopulations identified by the categorical variables. Its effectiveness is demonstrated through simulation studies and an application to prognostic gene screening for diffuse large-B-cell lymphoma.
Variable screening, a fast and effective dimension reduction tool, plays an important role in analyzing ultrahigh dimensional data. Although many real datasets contain both continuous and categorical variables, existing methods are mostly designed for continuous data. Partial sufficient variable screening, which aims to reduce the predictive set of primary interest without loss of regression information in the presence of some control variables, is proposed with theoretical guarantees. Specifically, for regression analyses involving mixed types of predictors, variable screening is approached under the notion of sufficiency by constraining the reduction of the continuous variables through the subpopulations identified by the categorical variables. The effectiveness of the proposed method is demonstrated through simulation studies encompassing a variety of regression and classification models, and an application in prognostic gene screening for diffuse large-B-cell lymphoma. Published by Elsevier B.V.
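
The idea of screening continuous predictors within the subpopulations defined by a categorical control can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not state the screening statistic, so the absolute Pearson correlation is used here purely as a stand-in marginal utility, and the function name `partial_screen`, its arguments, and the proportion-weighted pooling rule are all assumptions made for illustration.

```python
import numpy as np


def partial_screen(X, y, w, d):
    """Rank continuous predictors by a within-subpopulation marginal utility.

    X : (n, p) array of continuous predictors of primary interest
    y : (n,) response
    w : (n,) labels of a categorical control variable
    d : number of predictors to retain

    Assumption: the utility is the absolute Pearson correlation computed
    inside each level of w, pooled with weights n_k / n. The paper's actual
    statistic may differ.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w)
    n, p = X.shape
    score = np.zeros(p)

    for level in np.unique(w):
        idx = w == level
        n_k = int(idx.sum())
        if n_k < 3:  # too few observations to estimate a correlation
            continue
        # Center within the subpopulation so the utility is conditional on w.
        Xk = X[idx] - X[idx].mean(axis=0)
        yk = y[idx] - y[idx].mean()
        denom = np.sqrt((Xk ** 2).sum(axis=0) * (yk ** 2).sum())
        corr = np.zeros(p)
        ok = denom > 0  # skip constant columns or a constant response
        corr[ok] = np.abs(Xk[:, ok].T @ yk) / denom[ok]
        score += (n_k / n) * corr

    keep = np.sort(np.argsort(-score)[:d])  # indices of the d largest scores
    return keep, score
```

Calling `partial_screen(X, y, w, d)` returns the indices of the `d` retained predictors and the pooled scores. Scoring within each level of `w` lets a predictor whose effect changes sign across subpopulations still rank highly, whereas a single pooled correlation could cancel to nearly zero.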
