Article

Modeling PU learning using probabilistic logic programming

Journal

MACHINE LEARNING

Publisher

SPRINGER
DOI: 10.1007/s10994-023-06461-3

Keywords

Positive unlabeled learning; Weak supervision; Probabilistic logic programming; Modeling; Unidentifiability

The paper investigates how to leverage the strengths of probabilistic logic programming (PLP) to formulate and integrate more realistic assumptions for learning better classifiers. It proposes a PLP-based general method, called PU ProbLog, that allows for partial modeling of the labeling mechanism and supports PU learning in relational domains. The empirical analysis demonstrates that partially modeling the labeling bias improves the performance of the learned classifiers.
The goal of learning from positive and unlabeled (PU) examples is to learn a classifier that predicts the posterior class probability. The challenge is that the available labels in the data are determined by (1) the true class, and (2) the labeling mechanism that selects which positive examples get labeled, where certain examples often have a higher probability of being selected than others. Incorrectly assuming an unbiased labeling mechanism leads to learning a biased classifier. Yet this is what most existing methods do. A handful of methods make more realistic assumptions, but they are either so general that it is impossible to distinguish between the effects of the true classification and those of the labeling mechanism, too restrictive to correctly model the real situation, or dependent on knowledge that is typically unavailable. This paper studies how to formulate and integrate more realistic assumptions for learning better classifiers by exploiting the strengths of probabilistic logic programming (PLP). Concretely, (1) we propose PU ProbLog: a PLP-based general method that allows the labeling mechanism to be (partially) modeled. (2) We show that our method generalizes existing methods, in the sense that it can model the same assumptions. (3) Thanks to the use of PLP, our method also supports PU learning in relational domains. (4) Our empirical analysis shows that partially modeling the labeling bias improves the learned classifiers.
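The interaction between the true class and the labeling mechanism described above can be illustrated with a small numeric sketch. This is not PU ProbLog and not code from the paper; it is a plain-Python toy under stated assumptions: a hypothetical population with two equally sized groups, made-up class posteriors and labeling propensities, and the standard PU factorization P(s=1 | x) = P(y=1 | x) · P(s=1 | y=1, x), where s marks "labeled". All function and variable names are invented for illustration. The constant label frequency is computed from the true values here, which is the most favorable case for the unbiased-labeling assumption; in practice it would have to be estimated.

```python
def label_probability(class_posterior, propensity):
    """P(s=1 | x) = P(y=1 | x) * P(s=1 | y=1, x): only positives can be labeled."""
    return class_posterior * propensity


# Hypothetical toy population: two equally sized groups (numbers are made up).
# Group A positives are much more likely to be selected for labeling than group B positives.
groups = {
    "A": {"class_posterior": 0.8, "propensity": 0.5},
    "B": {"class_posterior": 0.4, "propensity": 0.1},
}


def unbiased_assumption_estimates(groups):
    """Recover P(y=1 | x) while (wrongly) assuming one constant label frequency c."""
    labeled = {g: label_probability(**v) for g, v in groups.items()}
    class_prior = sum(v["class_posterior"] for v in groups.values()) / len(groups)
    label_rate = sum(labeled.values()) / len(groups)
    c = label_rate / class_prior  # P(s=1 | y=1), averaged over the whole population
    # Dividing by a constant cannot undo a group-dependent propensity.
    return {g: min(1.0, p / c) for g, p in labeled.items()}


def propensity_aware_estimates(groups):
    """Recover P(y=1 | x) when the per-group propensity is modeled correctly."""
    return {
        g: label_probability(**v) / v["propensity"] for g, v in groups.items()
    }


if __name__ == "__main__":
    print("assuming unbiased labeling:", unbiased_assumption_estimates(groups))
    print("modeling the labeling bias:", propensity_aware_estimates(groups))
```

In this toy setting the constant-frequency correction overestimates the positive probability in group A and underestimates it in group B, while dividing by the group-specific propensity recovers the assumed posteriors of 0.8 and 0.4. The sketch assumes the propensities are known; the abstract's point is that such knowledge is typically unavailable, which is why the labeling mechanism has to be modeled, at least partially, and learned.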
