Article

Feature-ranked self-growing forest: a tree ensemble based on structure diversity for classification and regression

NEURAL COMPUTING & APPLICATIONS (2023)

Journal

NEURAL COMPUTING & APPLICATIONS

Volume 35, Issue 13, Pages 9285-9298

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s00521-023-08202-y

Keywords

Machine learning; Ensemble; Tree; Random forest

Category

Computer Science, Artificial Intelligence

Summary

Tree ensemble algorithms such as random forest are widely used in machine learning, but they require the number of trees in the ensemble to be set as a hyperparameter. The feature-ranked self-growing forest (FSF) is a new algorithm that grows the ensemble automatically based on the structural diversity of the trees' nodes. FSF was tested on classification and regression datasets and outperformed random forest in most cases.

Abstract

Tree ensemble algorithms, such as random forest (RF), are among the most widely applied methods in machine learning. However, these algorithms require an important hyperparameter to be specified: the number of classification or regression trees within the ensemble. The number of trees can adversely affect bias or computational cost and should ideally be adapted to each task. For this reason, a novel tree ensemble is described, the feature-ranked self-growing forest (FSF), which grows the ensemble automatically based on the structural diversity of the first two levels of the trees' nodes. The algorithm's performance was tested on 30 classification and 30 regression datasets and compared with that of RF. The computational complexity was also analyzed theoretically and experimentally. Compared to RF, FSF had significantly higher performance on 57% and equivalent performance on 27% of the classification datasets, and higher performance on 70% and equivalent performance on 7% of the regression datasets. The computational complexity of FSF was competitive with that of other tree ensembles, depending mainly on the number of observations in the dataset. It can therefore be inferred that FSF is a suitable out-of-the-box approach, with potential as a tool for feature ranking and for analyzing dataset complexity using the number of trees computed for a particular task. MATLAB and Python implementations of the algorithm and working examples for classification and regression are provided for academic use.
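The idea of growing an ensemble until its trees stop showing new top-level structures can be illustrated with a minimal sketch. This is not the published FSF algorithm or the authors' implementation: the structure signature (split features of the root and its two children), the `patience` stopping rule, the `max_trees` cap, and all function names are assumptions chosen for illustration, built on scikit-learn's `DecisionTreeClassifier`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def top_structure(tree):
    """Split features of the root and its two children (hypothetical signature).

    scikit-learn marks a missing child with -1 and a leaf's feature with -2.
    """
    t = tree.tree_
    sig = [int(t.feature[0])]
    for child in (t.children_left[0], t.children_right[0]):
        sig.append(int(t.feature[child]) if child != -1 else -2)
    return tuple(sig)


def grow_until_no_new_structure(X, y, patience=10, max_trees=500, seed=0):
    """Add bootstrap trees until `patience` trees in a row repeat a seen structure.

    Assumed stopping rule for illustration only, not the rule from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    forest, seen, stale = [], set(), 0
    while stale < patience and len(forest) < max_trees:
        idx = rng.integers(0, n, size=n)  # bootstrap sample with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(2**31 - 1)),
        )
        tree.fit(X[idx], y[idx])
        forest.append(tree)
        sig = top_structure(tree)
        if sig in seen:
            stale += 1
        else:
            seen.add(sig)
            stale = 0
    return forest, seen


def predict_majority(forest, X):
    """Majority vote over the per-tree predicted labels."""
    votes = np.stack([t.predict(X) for t in forest])  # (n_trees, n_samples)
    out = []
    for col in votes.T:
        labels, counts = np.unique(col, return_counts=True)
        out.append(labels[np.argmax(counts)])
    return np.array(out)


X, y = load_iris(return_X_y=True)
forest, seen = grow_until_no_new_structure(X, y)
print("trees grown:", len(forest), "distinct top structures:", len(seen))
```

In this sketch the number of trees is an output rather than an input, which mirrors the abstract's point that the ensemble size can be adapted to the task. The count of distinct signatures is only a rough stand-in for the structural diversity measure described in the paper.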
