Journal
AMERICAN STATISTICIAN
Volume 63, Issue 4, Pages 308-319
Publisher
TAYLOR & FRANCIS INC
DOI: 10.1198/tast.2009.08199
Keywords
Linear model; Random forest; Variable importance
Relative importance of regressor variables is an old topic that still awaits a satisfactory solution. When interest is in attributing importance in linear regression, averaging-over-orderings methods for decomposing R² are among the state-of-the-art methods, although the mechanism behind their behavior is not (yet) completely understood. Random forests, a machine-learning tool for classification and regression proposed a few years ago, have an inherent procedure for producing variable importances. This article compares the two approaches (the linear model on the one hand and two versions of random forests on the other) and finds both striking similarities and differences, some of which can be explained whereas others remain a challenge. The investigation improves understanding of the nature of variable importance in random forests. This article has supplementary material online.
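As an illustration of what "averaging over orderings" can mean for decomposing R², the following is a minimal sketch, not the article's own code. It assumes the common variant in which each regressor is credited with its sequential increase in R², averaged over all orderings of the regressors. The function names `r_squared` and `lmg_shares` and the simulated data are hypothetical, and enumerating every ordering is only practical for a handful of regressors.

```python
import itertools
import numpy as np

def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the columns in `cols`."""
    cols = list(cols)
    if not cols:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss

def lmg_shares(X, y):
    """Average each regressor's sequential R^2 gain over all orderings."""
    p = X.shape[1]
    shares = np.zeros(p)
    orderings = list(itertools.permutations(range(p)))
    for order in orderings:
        included = []
        r2_prev = 0.0
        for j in order:
            included.append(j)
            r2_new = r_squared(X, y, included)
            shares[j] += r2_new - r2_prev  # gain from adding regressor j
            r2_prev = r2_new
    return shares / len(orderings)

# Simulated data (an assumption for illustration): x0 and x1 are correlated.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = 0.7 * x0 + rng.normal(scale=0.7, size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 2.0 * x0 + 1.0 * x2 + rng.normal(size=n)

shares = lmg_shares(X, y)
full_r2 = r_squared(X, y, range(X.shape[1]))
print("shares:", shares, "sum:", shares.sum(), "full R^2:", full_r2)
```

By construction the shares are non-negative and add up to the R² of the full model, which is what makes this kind of decomposition attractive for attributing importance. A random forest importance for the same data would come from a separate fit and is not shown here.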