Journal
PATTERN RECOGNITION LETTERS
Volume 145, Pages 37-42
Publisher
ELSEVIER
DOI: 10.1016/j.patrec.2021.01.033
Keywords
Online regression; Incremental regression; Decision trees; Hoeffding trees
Funding
- Sao Paulo Research Foundation (FAPESP) [2018/07319-6]
- Fundacao de Amparo a Pesquisa do Estado de Sao Paulo (FAPESP) [18/07319-6]
This paper introduces the Quantization Observer (QO), a hashing-based algorithm for monitoring and evaluating split candidates in numerical features for online tree regressors. QO uses less memory and processing time than its competitors while still providing accurate split point suggestions.
A central aspect of online decision trees is evaluating the incoming data and performing model growth. To do so, trees must deal with different kinds of input features. Numerical features are no exception, and they pose additional challenges compared to other kinds of features, as there is no trivial strategy to choose the best point to make a split decision. Regression tasks are even more challenging because both the features and the target are continuous. Typical online solutions evaluate and store all the points monitored between split attempts, which goes against the constraints posed in real-time applications. In this paper, we introduce the Quantization Observer (QO), a simple yet effective hashing-based algorithm to monitor and evaluate split candidates in numerical features for online tree regressors. QO can be easily integrated into incremental decision trees, such as Hoeffding Trees, and it has a monitoring cost of O(1) per instance and a sub-linear cost to evaluate split candidates. Previous solutions had an O(log n) cost per insertion (in the best case) and a linear cost to evaluate split candidates. Our extensive experimental setup highlights QO's effectiveness in providing accurate split point suggestions while spending much less memory and processing time than its competitors. © 2021 Elsevier B.V. All rights reserved.
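The abstract does not spell out how the observer works, so the following is only a minimal sketch of one way a hashing-based quantization observer could be organized, not the authors' implementation. It assumes that each feature value is hashed to a bucket with `floor(x / radius)`, that each bucket keeps running sums of the feature and the target, and that split candidates are the midpoints between the mean feature values of neighbouring buckets, scored by reduction in the sum of squared errors. The names `QuantizationSketch`, `Slot`, `radius`, `update`, and `best_split` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Slot:
    """Running statistics for one quantization bucket (hypothetical layout)."""
    n: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_y2: float = 0.0


def _sse(n, sum_y, sum_y2):
    """Sum of squared errors around the mean, computed from running sums."""
    return sum_y2 - sum_y * sum_y / n if n > 0 else 0.0


class QuantizationSketch:
    """Illustrative observer for one numerical feature; `radius` is an assumed parameter."""

    def __init__(self, radius=0.25):
        self.radius = radius
        self.slots = {}  # bucket key -> Slot

    def update(self, x, y):
        # Monitoring: one hash computation and one dictionary access per instance.
        key = math.floor(x / self.radius)
        slot = self.slots.setdefault(key, Slot())
        slot.n += 1
        slot.sum_x += x
        slot.sum_y += y
        slot.sum_y2 += y * y

    def best_split(self):
        # Evaluation: work depends on the number of buckets, not on the
        # number of instances observed.
        ordered = [self.slots[k] for k in sorted(self.slots)]
        if len(ordered) < 2:
            return None
        tot_n = sum(s.n for s in ordered)
        tot_y = sum(s.sum_y for s in ordered)
        tot_y2 = sum(s.sum_y2 for s in ordered)
        total_sse = _sse(tot_n, tot_y, tot_y2)

        best = None
        left_n, left_y, left_y2 = 0, 0.0, 0.0
        for left, right in zip(ordered, ordered[1:]):
            left_n += left.n
            left_y += left.sum_y
            left_y2 += left.sum_y2
            # Reduction in squared error if we split between these two buckets.
            gain = (total_sse
                    - _sse(left_n, left_y, left_y2)
                    - _sse(tot_n - left_n, tot_y - left_y, tot_y2 - left_y2))
            # Candidate threshold: midpoint of the two buckets' mean feature values.
            threshold = 0.5 * (left.sum_x / left.n + right.sum_x / right.n)
            if best is None or gain > best[1]:
                best = (threshold, gain)
        return best
```

In this sketch the number of buckets is governed by `radius` and the range of the feature rather than by the number of instances seen, which is what would keep split evaluation cheaper than scanning every stored point. The choice of `radius` trades split resolution against memory.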