Journal
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
Volume 83, Issue 3, Pages 505-533
Publisher
WILEY
DOI: 10.1111/rssb.12425
Keywords
adaptive regression; nonparametric methods; online learning; online regression trees
Random forests are popular in supervised learning for their accuracy, small number of tuning parameters, and reasonable computational cost. This paper introduces AMF, an online random forest algorithm based on Mondrian Forests, which efficiently performs exact aggregation over all prunings of its trees and is competitive with strong baselines on multi-class classification.
Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and suitability for high-dimensional settings. The most commonly used RF variants, however, are 'offline' algorithms, which require the whole dataset to be available at once. In this paper, we introduce AMF, an online RF algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this yields a truly online, parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with several strong baselines on a large number of datasets for multi-class classification.
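To illustrate why an exact aggregation over all prunings can be computed efficiently, the following is a minimal sketch of the context-tree-weighting style recursion on a fixed binary tree. It is not the AMF implementation: the `Node` class, the learning rate `eta`, the prior that cuts each internal node with probability 1/2, and the loss values are all assumptions chosen for illustration. The sketch checks that a single bottom-up pass gives the same total weight as explicitly enumerating every pruning.

```python
import math
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    # Hypothetical cumulative loss of the predictor attached to this node.
    loss: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ctw_weight(node: Node, eta: float = 1.0) -> float:
    """Total prior-weighted exponential weight of all prunings, in one pass."""
    w = math.exp(-eta * node.loss)
    if node.left is None:  # leaf of the full tree: only one pruning
        return w
    # Either cut here (prob. 1/2) or keep the split and recurse (prob. 1/2).
    return 0.5 * w + 0.5 * ctw_weight(node.left, eta) * ctw_weight(node.right, eta)


def prunings(node: Node) -> Iterator[Tuple[float, float]]:
    """Yield (prior probability, total loss) for every pruning of the subtree."""
    if node.left is None:
        yield 1.0, node.loss
        return
    yield 0.5, node.loss  # pruning that cuts at this node
    for p_left, l_left in prunings(node.left):
        for p_right, l_right in prunings(node.right):
            yield 0.5 * p_left * p_right, l_left + l_right


def brute_force_weight(node: Node, eta: float = 1.0) -> float:
    """Same quantity as ctw_weight, by explicit enumeration of prunings."""
    return sum(p * math.exp(-eta * loss) for p, loss in prunings(node))


# Small example tree with made-up losses.
root = Node(2.0, left=Node(0.7, left=Node(0.1), right=Node(0.3)), right=Node(0.5))
fast = ctw_weight(root)
slow = brute_force_weight(root)
print(math.isclose(fast, slow))
```

The recursion visits each node once, whereas the number of prunings grows exponentially with the depth of the tree, which is what makes the aggregation tractable in an online setting.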