4.8 Article

A Unifying Probabilistic Framework for Partially Labeled Data Learning

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2022.3228755

Keywords

Phase locked loops; Correlation; Training; Probabilistic logic; Testing; Task analysis; Noise measurement; Partially labeled data learning (PLDL); partial label learning (PLL); partial multi-label learning (PML); classification

Partially labeled data learning is widely used in data science, but the challenge lies in handling ambiguities caused by false-positive labels. The current strategy is to identify the ground-truth labels from the candidate set, but it lacks theoretical interpretation. Instead, we propose a novel unifying probabilistic framework that provides a clear formulation and theoretical interpretation for PLL and PML. Our framework also integrates the identifying and embedding methods, considering feature and label correlations. Experimental results show the superiority of our derived framework in both PLL and PML scenarios.
Partially labeled data learning (PLDL), which includes partial label learning (PLL) and partial multi-label learning (PML), is widely used in modern data science. Researchers typically construct separate, task-specific models for the PLL and PML classification scenarios. The main challenge in training classifiers for either scenario is handling the ambiguity caused by noisy false-positive labels in the candidate label set. The state-of-the-art strategy in both scenarios is disambiguation: identifying the ground-truth label(s) directly from the candidate label set. Existing approaches fall into two categories, 'the identifying method' and 'the embedding method'. However, both kinds of methods rely on hand-designed heuristic modeling, motivated by considerations such as feature and label correlations, with no theoretical interpretation. Instead of adopting heuristic or scenario-specific modeling, we propose a novel unifying framework, the Unifying Probabilistic Framework for Partially Labeled Data Learning (UPF-PLDL), which is derived from a clear probabilistic formulation and brings existing research on PLL and PML under one information-theoretic interpretation. Furthermore, UPF-PLDL unifies 'the identifying method' and 'the embedding method' into one integrated framework that naturally incorporates feature and label correlations. Comprehensive experiments on synthetic and real-world datasets for both PLL and PML scenarios clearly demonstrate the superiority of the derived framework.
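
To make the notion of a candidate label set concrete, here is a minimal sketch of disambiguation in the PLL setting. It is not UPF-PLDL, and it is not claimed to be an instance of either method category named in the abstract. It is a simple nearest-neighbour voting heuristic chosen only for illustration. The function name `knn_disambiguate`, the parameter `k`, and the toy data are all assumptions introduced here.

```python
import math


def knn_disambiguate(features, candidates, k=2):
    """Pick one label per instance by neighbour voting within its candidate set.

    Hypothetical toy heuristic for illustration only; it is not UPF-PLDL.
    """
    predicted = []
    for i, x in enumerate(features):
        # Indices of the k nearest *other* instances (Euclidean distance).
        others = [j for j in range(len(features)) if j != i]
        neighbours = sorted(others, key=lambda j: math.dist(x, features[j]))[:k]

        # Only labels in this instance's own candidate set can receive votes.
        votes = {label: 0.0 for label in candidates[i]}
        for j in neighbours:
            # Each neighbour spreads one vote uniformly over its candidates.
            for label in candidates[j]:
                if label in votes:
                    votes[label] += 1.0 / len(candidates[j])

        # Highest vote wins; ties go to the smallest label for determinism.
        predicted.append(min(votes, key=lambda label: (-votes[label], label)))
    return predicted


# Toy data (assumed): two well-separated clusters, true labels 0 and 1.
# Label 2 and the cross-cluster labels are false positives.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
candidates = [{0}, {0, 1}, {0, 2}, {1}, {0, 1}, {1, 2}]

print(knn_disambiguate(features, candidates))  # [0, 0, 0, 1, 1, 1]
```

Each instance keeps only labels from its own candidate set, so false-positive labels are discarded when nearby instances do not support them. In the PML setting an instance may have several ground-truth labels, so a single arg-max like this one would not apply directly.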
