Article

Classification of High-Dimensional Data with Ensemble of Logistic Regression Models

Journal

JOURNAL OF BIOPHARMACEUTICAL STATISTICS
Volume 20, Issue 1, Pages 160-171

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10543400903280639

Keywords

Aggregation; Class prediction; Cross-validation; Decision threshold; Majority voting; Random partition

Funding

  1. Oak Ridge Institute for Science and Education
  2. U.S. Department of Energy
  3. U.S. Food and Drug Administration
  4. CSULB

A classification method is developed based on ensembles of logistic regression models, with each model fitted on a different set of predictors determined by a random partition of the feature space. The proposed method enables class prediction for a high-dimensional data set by an ensemble of logistic regression models, which a single logistic regression model cannot do because the sample size must be larger than the number of predictors. The proposed classification method is applied to gene expression data on pediatric acute myeloid leukemia (AML) patients to predict each patient's risk of treatment failure or relapse at the time of diagnosis. Hence, specific prognostic biomarkers can be used to predict outcomes in pediatric AML and to formulate individual risk-adjusted treatment. Our study shows that the proposed method is comparable to other widely used models in generalized accuracy and achieves a significantly better balance between sensitivity and specificity. The proposed ensemble algorithm thus allows a standard classification model to be used for classification of high-dimensional data.
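
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitting procedure, the number of subsets, or the exact aggregation rule, so everything below is an assumption made for illustration. The features are randomly partitioned into disjoint blocks small enough that each block has fewer predictors than samples, one logistic regression is fitted per block (here by plain gradient descent with a small L2 penalty, added only for numerical stability), and the per-model probabilities are averaged and compared with a decision threshold. All function names (`random_partition`, `fit_ensemble`, `predict_ensemble`) and default values are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=200, l2=0.01):
    """Fit one logistic regression by batch gradient descent.

    The L2 penalty is an assumption of this sketch, not part of the source.
    Returns (weights, bias).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Predicted probability of class 1 for one sample."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def random_partition(n_features, n_blocks, rng):
    """Split feature indices into disjoint blocks that together cover all features."""
    idx = list(range(n_features))
    rng.shuffle(idx)
    return [sorted(idx[k::n_blocks]) for k in range(n_blocks)]


def fit_ensemble(X, y, n_blocks, rng):
    """Fit one logistic regression per block of a random feature partition."""
    models = []
    for cols in random_partition(len(X[0]), n_blocks, rng):
        X_sub = [[row[j] for j in cols] for row in X]
        models.append((cols, fit_logistic(X_sub, y)))
    return models


def predict_ensemble(models, x, threshold=0.5):
    """Average the per-model probabilities and apply a decision threshold."""
    probs = [predict_proba(m, [x[j] for j in cols]) for cols, m in models]
    mean_prob = sum(probs) / len(probs)
    return int(mean_prob > threshold), mean_prob


if __name__ == "__main__":
    # Synthetic data: 20 samples, 50 features, so a single model would have p > n.
    rng = random.Random(0)
    n, p = 20, 50
    y = [i % 2 for i in range(n)]
    X = [[(1.0 if yi else -1.0) + rng.gauss(0, 0.5) for _ in range(p)] for yi in y]
    models = fit_ensemble(X, y, n_blocks=5, rng=rng)  # 10 predictors per model
    print(predict_ensemble(models, [1.0] * p))
```

In this sketch the `threshold` argument is the knob that trades sensitivity against specificity. Repeating the random partition several times and combining the resulting ensembles by majority vote would be a natural extension, but how the paper combines them is not stated in the abstract.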
