Article

No Free Lunch in imbalanced learning

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 227

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107222

Keywords

Supervised learning; Imbalanced domain learning; No Free Lunch

Funding

  1. National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, Portugal [UIDB/50014/2020]

Summary

The No Free Lunch theorems have sparked debate on the impact of data preprocessing methods in imbalanced domain learning. The study concludes that, in this setting and given no a priori knowledge of the data domain, any two resampling strategies have an equivalent impact on the performance of predictive models.
Abstract

The No Free Lunch (NFL) theorems have sparked intense debate since their publication, from both theoretical and practical perspectives. To date, however, no discussion has addressed their impact on the established field of imbalanced domain learning (IDL), known for its challenges regarding learning and evaluation processes. Most importantly, understanding the effect of commonly used solutions in this field would prove very useful for future research. In this paper, we study the impact of data preprocessing methods, also known as resampling strategies, under the framework of the NFL theorems. Focusing on binary classification tasks, we claim that in IDL settings, given a learning algorithm and a uniformly distributed set of target functions, the core conclusions of the NFL theorems extend to resampling strategies. As such, given no a priori knowledge or assumptions concerning data domains, any two resampling strategies are identical in their impact on the performance of predictive models. We provide a theoretical analysis and discussion of the intersection between IDL and the NFL theorems to support this claim, and we collect empirical evidence via a thorough experimental study including 98 data sets from multiple real-world knowledge domains. (C) 2021 Elsevier B.V. All rights reserved.
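
Stated in the spirit of Wolpert's original NFL theorems (the notation below is a paraphrase, not the paper's own formalism), the claim is that for a fixed learner L, training-set size m, and a uniform distribution over target functions f, any two resampling strategies s_1 and s_2 satisfy

    \sum_{f} P(\mathrm{perf} \mid f, m, L \circ s_1) \;=\; \sum_{f} P(\mathrm{perf} \mid f, m, L \circ s_2),

i.e., averaged over all possible target functions, preprocessing the training data with s_1 or with s_2 leaves the expected generalization performance of L unchanged.

The sketch below illustrates the kind of pairwise comparison such a study involves: one fixed learner combined with two resampling strategies on an imbalanced binary task. It is not the paper's experimental protocol; the synthetic data, decision-tree learner, and F1 metric are illustrative assumptions, and on any single concrete task the two scores will generally differ (the NFL-style equivalence concerns the average over a uniform set of target functions, not an individual domain).

    # Minimal sketch: one fixed learner, two resampling strategies, one imbalanced task.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import f1_score

    rng = np.random.RandomState(0)

    # Synthetic binary data with roughly a 10% minority class.
    X, y = make_classification(n_samples=2000, n_features=10,
                               weights=[0.9, 0.1], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    def random_oversample(X, y):
        # Duplicate minority-class examples until both classes have equal counts.
        minority = np.flatnonzero(y == 1)
        majority = np.flatnonzero(y == 0)
        extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
        idx = np.concatenate([majority, minority, extra])
        return X[idx], y[idx]

    def random_undersample(X, y):
        # Drop majority-class examples until both classes have equal counts.
        minority = np.flatnonzero(y == 1)
        majority = rng.choice(np.flatnonzero(y == 0), size=len(minority), replace=False)
        idx = np.concatenate([majority, minority])
        return X[idx], y[idx]

    for name, resample in [("random oversampling", random_oversample),
                           ("random undersampling", random_undersample)]:
        X_rs, y_rs = resample(X_tr, y_tr)
        model = DecisionTreeClassifier(random_state=0).fit(X_rs, y_rs)
        print(f"{name}: F1 = {f1_score(y_te, model.predict(X_te)):.3f}")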
