Article

ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R

Journal

JOURNAL OF STATISTICAL SOFTWARE
Volume 77, Issue 1, Pages 1-17

Publisher

JOURNAL STATISTICAL SOFTWARE
DOI: 10.18637/jss.v077.i01

Keywords

C++; classification; machine learning; R; random forests; Rcpp; recursive partitioning; survival analysis

Funding

  1. European Union [HEALTH-2011-278913]
  2. DFG Cluster of Excellence Inflammation at Interfaces

Abstract

We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.
