Article

Variable importance-weighted Random Forests

QUANTITATIVE BIOLOGY (2017)

Journal

QUANTITATIVE BIOLOGY

Volume 5, Issue 4, Pages 338-351

Publisher

HIGHER EDUCATION PRESS

DOI: 10.1007/s40484-017-0121-6

Keywords

Random Forests; variable importance score; classification; regression

Category

Mathematical & Computational Biology

Funding

National Institutes of Health [R01 GM59507, P01 CA154295, P50 CA196530]

Abstract

Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates as the number of features increases. To address this limitation, feature elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is unsatisfactory, possibly because of its rigid feature selection and the increased correlation between trees in the forest.

Methods: We propose variable importance-weighted Random Forests. Instead of sampling features with equal probability at each node when building trees, it samples features according to their variable importance scores and then selects the best split from the randomly selected features.

Results: We evaluate the performance of our method through comprehensive simulations and real data analyses, for both regression and classification. Compared with standard Random Forests and feature elimination Random Forests, our proposed method performs better in most cases.

Conclusions: By incorporating the variable importance scores into the random feature selection step, our method makes better use of the more informative features without completely ignoring the less informative ones, and hence has improved prediction accuracy in the presence of weak signals and high noise. We have implemented an R package, viRandomForests, based on the original R package randomForest; it can be freely downloaded from http://zhaocenter.org/software.
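The node-level step described in the abstract (draw candidate features with probability tied to their variable importance scores, then pick the best split among them) can be illustrated with a minimal Python sketch. This is not the viRandomForests implementation or its API. The function names `sample_features` and `best_split`, the sampling without replacement, the flooring of non-positive scores, and the squared-error split criterion are all assumptions made for illustration; the abstract does not specify these details.

```python
import random


def sample_features(importance, mtry, rng):
    """Draw up to `mtry` distinct feature indices, weighted by importance.

    Assumption: scores at or below zero are floored at a tiny positive
    value, so weak features stay eligible instead of being eliminated.
    """
    weights = [max(w, 1e-12) for w in importance]
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(min(mtry, len(candidates))):
        # Weighted draw of one position among the remaining candidates.
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen


def best_split(X, y, features):
    """Return (feature, threshold, sse) with the lowest summed squared error.

    Assumption: a regression-style criterion; only `features` are searched.
    Returns None if no candidate feature has two distinct values.
    """
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in features:
        levels = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best


# Toy data: feature 0 is informative, the others are noise (hypothetical values).
X = [[0, 3, 1, 7], [1, 1, 4, 2], [2, 2, 0, 5], [3, 0, 2, 1]]
y = [0.0, 0.0, 1.0, 1.0]
importance = [5.0, 0.1, 0.1, 0.1]

rng = random.Random(0)
features = sample_features(importance, mtry=2, rng=rng)
print(features, best_split(X, y, features))
```

With uniform weights, `sample_features` reduces to the equal-probability draw of standard Random Forests. Setting the weights of low-ranked features to zero would approximate feature elimination. The weighted draw sits between the two, which is the trade-off the abstract describes.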
