On the Suitability of Bagging-Based Ensembles with Borderline Label Noise

MATHEMATICS (2022)

Journal

MATHEMATICS

Volume 10, Issue 11

Publisher

MDPI

DOI: 10.3390/math10111892

Keywords

borderline noise; label noise; bagging; ensembles; robust learners; classification

Category

Mathematics

Funding

MCIU/AEI/ERDF, UE [PGC2018-098860-B-I00]
ERDF Operational Programme 2014-2020 [A-FQM-345-UGR18]
Economy and Knowledge Council of the Regional Government of Andalusia, Spain
MCIN/AEI [CEX2020-001105-M]

Summary

Real-world classification data often contain noise, which can affect the accuracy and complexity of models. Building ensembles of classifiers, such as bagging, has shown potential in reducing the effects of noise. This paper investigates the usage of bagging techniques in complex problems where noise impacts decision boundaries among classes, and finds that bagging can achieve better accuracy and robustness in the presence of borderline noise.

Abstract

Real-world classification data usually contain noise, which can affect both the accuracy of models and their complexity. In this context, an interesting approach to reducing the effects of noise is building ensembles of classifiers, which have traditionally been credited with the ability to tackle difficult problems. Among the alternatives for building ensembles with noisy data, bagging has shown some potential in the specialized literature. However, existing works in this field are limited and focus only on noise based on random mislabeling, which is unlikely to occur in real-world applications. Recent research shows that other types of noise, such as that occurring at class boundaries, are more common and more challenging for classification algorithms. This paper analyzes the use of bagging techniques in these complex problems, in which noise affects the decision boundaries among classes. To investigate whether bagging is able to reduce the impact of borderline noise, an experimental study is carried out considering a large number of datasets with different noise levels, several noise models, and several classification algorithms. The results show that bagging achieves better accuracy and robustness than the individual models under this complex type of noise. The highest improvements in average accuracy are around 2-4% and are generally found at medium-high noise levels (from 15-20% onwards). Because bagging only partially includes the noisy samples when it draws each subsample from the original training set, it can happen that only some parts of the decision boundaries among classes are impaired when building each model, which reduces the impact of noise on the global system.
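The closing point of the abstract, that each bagged model sees only part of the noisy samples, can be illustrated with a small sketch. The code below is not the paper's experimental setup: the 1-D toy data, the function names, and the noise model (flipping the labels of the samples closest to the boundary) are all hypothetical simplifications chosen for illustration. The only standard ingredient is the bootstrap itself, in which a sample of size n drawn with replacement contains each original instance with probability 1 - (1 - 1/n)^n, roughly 0.632 for large n.

```python
import random


def make_data(n, rng):
    """Toy 1-D two-class data (assumption): label is 1 when x > 0, else 0."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [1 if x > 0 else 0 for x in xs]
    return xs, ys


def add_borderline_noise(xs, ys, noise_level):
    """Flip the labels of the samples closest to the boundary x = 0.

    Hypothetical noise model for illustration only; the paper's own
    noise models are not described in the abstract.
    """
    n_noisy = int(round(noise_level * len(xs)))
    by_distance = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    noisy_idx = set(by_distance[:n_noisy])
    noisy_ys = [1 - y if i in noisy_idx else y for i, y in enumerate(ys)]
    return noisy_ys, noisy_idx


def bootstrap_indices(n, rng):
    """Draw n indices with replacement, as in standard bagging."""
    return [rng.randrange(n) for _ in range(n)]


def noisy_coverage(n_models, n, noisy_idx, rng):
    """Share of the distinct noisy samples that each bootstrap sample contains."""
    coverage = []
    for _ in range(n_models):
        drawn = set(bootstrap_indices(n, rng))
        coverage.append(len(drawn & noisy_idx) / len(noisy_idx))
    return coverage


rng = random.Random(0)  # fixed seed so the run is repeatable
xs, ys = make_data(500, rng)
noisy_ys, noisy_idx = add_borderline_noise(xs, ys, noise_level=0.20)
coverage = noisy_coverage(n_models=50, n=len(xs), noisy_idx=noisy_idx, rng=rng)
mean_coverage = sum(coverage) / len(coverage)

print(f"noisy samples: {len(noisy_idx)}")
print(f"mean share of noisy samples seen per model: {mean_coverage:.2f}")
```

Under these assumptions, each model is trained on a different subset of the mislabeled borderline samples, so the damaged regions of the boundary differ from model to model. The sketch only measures that coverage; it does not train classifiers or reproduce the accuracy gains reported in the abstract.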
