Article

Robustness of random forests for regression

Journal

JOURNAL OF NONPARAMETRIC STATISTICS
Volume 24, Issue 4, Pages 993-1006

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/10485252.2012.715161

Keywords

random forest; quantile regression forest; robustness; median; ranks; least-absolute deviations

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. Le Fonds québécois de la recherche sur la nature et les technologies (FQRNT)


In this paper, we empirically investigate the robustness of random forests for regression problems. We also investigate the performance of six variations of the original random forest method, all aimed at improving robustness. These variations are based on three main ideas: (1) robustifying the aggregation method, (2) robustifying the splitting criterion, and (3) taking a robust transformation of the response. More precisely, with the first idea, we use the median (or weighted median), instead of the mean, to combine the predictions from the individual trees. With the second idea, we use least absolute deviations from the median, instead of least squares, as the splitting criterion. With the third idea, we build the trees on the ranks of the response instead of its original values. The competing methods are compared via a simulation study on artificial data with two different types of contamination, and on 13 real data sets. Our results show that all three ideas improve the robustness of the original random forest algorithm. However, a robust aggregation of the individual trees is generally more profitable than a robust splitting criterion.
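The effect of idea (1), median aggregation, can be illustrated with a minimal sketch. The per-tree predictions below are simulated numbers, not output of the paper's actual forests; a few "trees" are deliberately contaminated with large errors to mimic the heavy-tailed settings studied in the simulations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated predictions of 100 trees for 5 test points, centred at the
# true value 10; the first 5 trees are grossly contaminated (+50).
preds = rng.normal(loc=10.0, scale=1.0, size=(100, 5))
preds[:5] += 50.0

# Classical random-forest aggregation: mean over trees (sensitive to
# the contaminated trees, shifted by roughly 5 * 50 / 100 = 2.5).
mean_agg = preds.mean(axis=0)

# Robust aggregation: median over trees (stays close to 10).
median_agg = np.median(preds, axis=0)

print(mean_agg)
print(median_agg)
```

With 5% of the trees contaminated, the mean is pulled noticeably upward while the median remains near the true value, which is the intuition behind why robust aggregation helps even when the individual trees are grown with the standard least-squares criterion.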

