4.7 Article

Cost-sensitive positive and unlabeled learning

Journal

INFORMATION SCIENCES
Volume 558, Pages 229-245

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2021.01.002

Keywords

Positive and Unlabeled learning (PU learning); Class imbalance; Cost-sensitive learning; Generalization bound

Funding

  1. NSF of China [61973162, U1713208]
  2. Fundamental Research Funds for the Central Universities [30920032202]
  3. CCF-Tencent Open Fund [RAGR20200101]
  4. Young Elite Scientists Sponsorship Program by CAST [2018QNRC001]
  5. Hong Kong Scholars Program [XJ2019036]

This paper proposes a novel algorithm CSPU for PU learning, which addresses class imbalance by imposing different misclassification costs on different classes. The algorithm outperforms other comparators in dealing with minority classes.
Positive and Unlabeled learning (PU learning) aims to train a binary classifier solely based on positively labeled and unlabeled data when negatively labeled data are absent or distributed too diversely. However, none of the existing PU learning methods takes the class imbalance problem into account; as a result, they tend to neglect the minority class and are likely to produce a biased classifier. Therefore, this paper proposes a novel algorithm termed Cost-Sensitive Positive and Unlabeled learning (CSPU), which imposes different misclassification costs on different classes when conducting PU classification. Specifically, we assign distinct weights to the losses caused by false negative and false positive examples, and employ the double hinge loss to build our CSPU algorithm under the framework of empirical risk minimization. Theoretically, we analyze the computational complexity and derive a generalization error bound for CSPU, which guarantees the good performance of our algorithm on test data. Empirically, we compare CSPU with state-of-the-art PU learning methods on a synthetic dataset, OpenML benchmark datasets, and real-world datasets. The results clearly demonstrate the superiority of the proposed CSPU over the other comparators on class-imbalanced tasks. (C) 2021 Elsevier Inc. All rights reserved.
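
To make the idea of weighting false negative and false positive losses concrete, the following is a minimal sketch of a cost-weighted empirical PU risk using the double hinge loss. It is not the paper's exact objective or optimization procedure: the function names, the cost parameters `c_fn` and `c_fp`, and the assumption that the class prior `prior` is known are all illustrative. The negative-class risk is rewritten with the standard PU identity (unlabeled-data risk minus the prior-weighted positive-data risk), which is one common way to avoid needing negative labels.

```python
from typing import Sequence


def double_hinge(z: float) -> float:
    """Double hinge loss: max(-z, max(0, (1 - z) / 2))."""
    return max(-z, max(0.0, 0.5 * (1.0 - z)))


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def cost_sensitive_pu_risk(
    scores_pos: Sequence[float],
    scores_unl: Sequence[float],
    prior: float,
    c_fn: float = 1.0,
    c_fp: float = 1.0,
) -> float:
    """Cost-weighted empirical PU risk (illustrative sketch only).

    scores_pos: classifier outputs g(x) on labeled positive examples.
    scores_unl: classifier outputs g(x) on unlabeled examples.
    prior:      assumed-known positive class prior P(y = +1).
    c_fn, c_fp: hypothetical costs for false negatives / false positives.
    """
    # Loss of positives treated as positive (false negative side).
    loss_pos_as_pos = _mean([double_hinge(s) for s in scores_pos])
    # Loss of positives treated as negative (used to correct the unlabeled term).
    loss_pos_as_neg = _mean([double_hinge(-s) for s in scores_pos])
    # Loss of unlabeled examples treated as negative.
    loss_unl_as_neg = _mean([double_hinge(-s) for s in scores_unl])

    fn_term = c_fn * prior * loss_pos_as_pos
    # (1 - prior) * E_N[loss] rewritten as E_U[loss] - prior * E_P[loss].
    fp_term = c_fp * (loss_unl_as_neg - prior * loss_pos_as_neg)
    return fn_term + fp_term


risk = cost_sensitive_pu_risk(
    scores_pos=[2.0, 0.0], scores_unl=[-2.0, 1.0], prior=0.5, c_fn=2.0, c_fp=1.0
)
```

In this sketch, raising `c_fn` relative to `c_fp` penalizes missed positives more heavily, which is the general mechanism by which a cost-sensitive method can favor a minority positive class. Note that the corrected false positive term is an estimate and can be negative on finite samples.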
