4.1 Article

Robustness of random forests for regression

Journal

JOURNAL OF NONPARAMETRIC STATISTICS
Volume 24, Issue 4, Pages 993-1006

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/10485252.2012.715161

Keywords

random forest; quantile regression forest; robustness; median; ranks; least-absolute deviations

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. Le Fonds quebecois de la recherche sur la nature et les technologies (FQRNT)


In this paper, we empirically investigate the robustness of random forests for regression problems. We also investigate the performance of six variations of the original random forest method, all aimed at improving robustness. These variations are based on three main ideas: (1) robustifying the aggregation method, (2) robustifying the splitting criterion and (3) applying a robust transformation to the response. More precisely, with the first idea, we use the median (or weighted median), instead of the mean, to combine the predictions from the individual trees. With the second idea, we use least-absolute deviations from the median, instead of least-squares, as the splitting criterion. With the third idea, we build the trees using the ranks of the response instead of the original values. The competing methods are compared via a simulation study with artificial data using two different types of contamination and also with 13 real data sets. Our results show that all three ideas improve the robustness of the original random forest algorithm. However, a robust aggregation of the individual trees is generally more profitable than a robust splitting criterion.
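
The three ideas in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`aggregate`, `ls_impurity`, `lad_impurity`, `best_split`, `ranks`) and the toy data are hypothetical, chosen only to show how a median-based aggregate, a least-absolute-deviations split and a rank transform each react to a single contaminated value. The weighted median variant is not covered.

```python
from statistics import mean, median


def aggregate(tree_predictions, robust=True):
    """Idea 1: combine per-tree predictions with the median instead of the mean."""
    return median(tree_predictions) if robust else mean(tree_predictions)


def ls_impurity(y):
    """Least-squares node impurity: squared deviations from the node mean."""
    m = mean(y)
    return sum((v - m) ** 2 for v in y)


def lad_impurity(y):
    """Idea 2: absolute deviations from the node median."""
    m = median(y)
    return sum(abs(v - m) for v in y)


def best_split(x, y, impurity):
    """Threshold on x that minimises the summed impurity of the two children."""
    pairs = sorted(zip(x, y))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        cost = impurity(left) + impurity(right)
        if cost < best_cost:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_threshold


def ranks(y):
    """Idea 3: replace the response by its 1-based ranks (ties share the average rank)."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    result = [0.0] * len(y)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y[order[j + 1]] == y[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


# Toy data (assumed for illustration): five tree predictions, one of them wild.
tree_predictions = [2.0, 2.1, 1.9, 2.2, 50.0]
mean_prediction = aggregate(tree_predictions, robust=False)
median_prediction = aggregate(tree_predictions, robust=True)

# Toy step function with a jump between x = 4 and x = 5, and one outlier at x = 2.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 20.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
ls_threshold = best_split(x, y, ls_impurity)
lad_threshold = best_split(x, y, lad_impurity)

# Ranks do not depend on how extreme the largest response is.
ranks_mild = ranks([3.0, 100.0, 1.0, 3.0])
ranks_extreme = ranks([3.0, 1e6, 1.0, 3.0])

print(mean_prediction, median_prediction)
print(ls_threshold, lad_threshold)
print(ranks_mild, ranks_extreme)
```

On this toy data the mean of the tree predictions is pulled far above the bulk of the values while the median stays with them, the least-squares split moves towards the outlier while the least-absolute-deviations split stays at the true jump, and the ranks are unchanged however large the contaminated value becomes. A single hand-built example says nothing about the relative merits of the three ideas; the abstract's comparison rests on the simulation study and the 13 real data sets.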
