4.7 Article

A partition-based problem transformation algorithm for classifying imbalanced multi-label data

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2023.107506

Keywords

Multi-label learning; Class imbalance learning; Hierarchical clustering; Problem transformation; Label correlations

The study proposes a novel partition-based imbalanced multi-label learning algorithm, MLHC, which divides the original label space into disconnected subspaces using hierarchical clustering. It successfully tackles the class imbalance problem in multi-label data and outperforms other class imbalance multi-label learning algorithms.
Multi-label learning has attracted considerable research interest because of its wide range of real-world applications. Many multi-label learning methods have been proposed; however, few address the class imbalance problem present in multi-label data. Among the studies that do take this issue into account, most either ignore label correlations or consider only random correlations among labels. In this study, we propose a novel partition-based imbalanced multi-label learning algorithm, named Multi-label Learning based on Hierarchical Clustering (MLHC), to tackle this problem. MLHC first performs hierarchical clustering on the original label space to divide it into several disconnected subspaces, each containing labels that are strongly correlated with one another. Then, for each label subspace, a problem transformation strategy converts the multi-label problem into a multi-class problem by binary coding. Any multi-class imbalance learning algorithm can be applied to the transformed multi-class data. Finally, the classification results are decoded back into the corresponding label subspace, and the results of all label subspaces are combined to give the predicted label vector in the original label space. We conducted experiments on thirteen benchmark multi-label datasets and on XJTU-SY, a multi-label engineering application dataset. The results indicate that MLHC outperforms several state-of-the-art class imbalance multi-label learning algorithms, demonstrating the effectiveness and necessity of discovering label correlations and of transforming the original imbalanced multi-label learning problem into multiple strongly correlated multi-class imbalanced learning problems.
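
The partition, binary-coding, and decoding steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard distance between labels, the average-linkage rule, the `threshold` parameter, and the function names `cluster_labels`, `encode`, and `decode` are all assumptions made for illustration, since the abstract does not specify the correlation measure or the stopping criterion.

```python
from itertools import combinations


def cluster_labels(Y, threshold=0.5):
    """Group label indices by average-linkage agglomerative clustering.

    Assumption: the distance between two labels is the Jaccard distance of
    the sets of instances that carry them; the paper may use another measure.
    """
    n_labels = len(Y[0])
    # For each label, the set of instance indices where it is present.
    members = [{i for i, row in enumerate(Y) if row[j]} for j in range(n_labels)]

    def dist(a, b):
        union = members[a] | members[b]
        return 1.0 - len(members[a] & members[b]) / len(union) if union else 1.0

    def avg_link(c1, c2):
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[j] for j in range(n_labels)]
    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]))
        if avg_link(clusters[i], clusters[j]) > threshold:
            break  # remaining clusters are too weakly correlated to merge
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


def encode(row, subspace):
    """Binary-code the labels of one subspace into a single class id."""
    code = 0
    for j in subspace:
        code = (code << 1) | row[j]
    return code


def decode(codes, subspaces, n_labels):
    """Invert encode() for every subspace and rebuild the full label vector."""
    out = [0] * n_labels
    for code, subspace in zip(codes, subspaces):
        for j in reversed(subspace):
            out[j] = code & 1
            code >>= 1
    return out


# Toy label matrix (rows = instances, columns = labels), purely illustrative.
Y = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]

subspaces = cluster_labels(Y)
# One multi-class target list per label subspace.
targets = [[encode(row, s) for row in Y] for s in subspaces]
print(subspaces)  # [[0, 1], [2, 3]]
print(targets)    # [[3, 3, 0, 0, 2, 0], [0, 0, 3, 3, 0, 3]]
```

In a full pipeline, a multi-class imbalance learner of the user's choice would be trained on each target list, and its predicted class ids for a new instance would be passed to `decode` to recover the label vector in the original label space. Note that a subspace of k labels yields up to 2^k classes, so in practice the subspace size would likely need to be bounded.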
