Journal
KNOWLEDGE-BASED SYSTEMS
Volume 227
Publisher
ELSEVIER
DOI: 10.1016/j.knosys.2021.107222
Keywords
Supervised learning; Imbalanced domain learning; No Free Lunch
Funding
- National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, Portugal [UIDB/50014/2020]
The No Free Lunch theorems have sparked debate about the impact of data preprocessing methods in the field of imbalanced domain learning. The study concludes that, absent a priori knowledge of the data domain, resampling strategies have an equivalent expected impact on predictive model performance.
The No Free Lunch (NFL) theorems have sparked intense debate since their publication, from both theoretical and practical perspectives. However, to date, no discussion has addressed their impact on the established field of imbalanced domain learning (IDL), known for its challenges regarding learning and evaluation processes. Most importantly, understanding the effect of commonly used solutions in this field would prove very useful for future research. In this paper, we study the impact of data preprocessing methods, also known as resampling strategies, under the framework of the NFL theorems. Focusing on binary classification tasks, we claim that in IDL settings, given a learning algorithm and a uniformly distributed set of target functions, the core conclusions of the NFL theorems extend to resampling strategies. As such, given no a priori knowledge or assumptions concerning data domains, any two resampling strategies are identical concerning their impact on the performance of predictive models. We provide a theoretical analysis and discussion of the intersection between IDL and the NFL theorems to support this claim. We also collect empirical evidence via a thorough experimental study, including 98 data sets from multiple real-world knowledge domains. (C) 2021 Elsevier B.V. All rights reserved.
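The NFL averaging argument the abstract invokes can be illustrated with a minimal sketch (not from the paper; the toy input space, the two predictor stand-ins, and all names are hypothetical). Over all target functions on a tiny domain, drawn uniformly, a "majority-class" predictor (standing in for learning without resampling) and a "minority-class" predictor (standing in for aggressive oversampling) achieve identical average off-training-set accuracy:

```python
from itertools import product

X = range(5)            # tiny input space of 5 points
train_idx = [0, 1, 2]   # in-sample points
test_idx = [3, 4]       # off-training-set points

def majority(train_labels):
    # Stand-in for "no resampling": always predict the majority class.
    return int(sum(train_labels) * 2 >= len(train_labels))

def minority(train_labels):
    # Stand-in for "oversampled": always predict the minority class.
    return 1 - majority(train_labels)

def avg_ots_accuracy(predictor):
    # Average off-training-set accuracy over ALL 2^5 = 32 target functions,
    # i.e., a uniform distribution over targets, as in the NFL setting.
    targets = list(product([0, 1], repeat=len(X)))
    total = 0.0
    for f in targets:
        pred = predictor([f[i] for i in train_idx])
        total += sum(pred == f[i] for i in test_idx) / len(test_idx)
    return total / len(targets)

print(avg_ots_accuracy(majority), avg_ots_accuracy(minority))  # both 0.5
```

Because the off-training-set labels are uniform and independent of the training labels, any predictor that depends only on the training data averages to 0.5 accuracy, mirroring the claim that no resampling strategy dominates without domain assumptions.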