Article

Improving crowd labeling using Stackelberg models

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-021-01276-x

Keywords

Crowdsourcing learning; Ground truth inference; Adversary machine learning; Stackelberg game

Funding

  1. National Natural Science Foundation of China [U1711267]
  2. Fundamental Research Funds for the Central Universities [CUG2018JM18]
  3. Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing [KLIGIP201601]

This paper introduces Stackelberg label inference (SLI), a novel label integration method based on game theory. It addresses the low quality of crowdsourced labels and, because it relies little on the multiple noisy label sets, avoids the poor results those sets can cause. SLI shows superior label quality and model quality when the number of labelers is relatively small.
Crowdsourcing systems provide an easy means of acquiring labeled training data for supervised learning. However, the labels provided by non-expert crowd workers (labelers) are often of low quality. To address this problem, in practice each sample is usually given a multiple noisy label set by several different labelers, and ground truth inference algorithms are then employed to obtain an integrated label for each sample. Ground truth inference methods therefore directly determine the label quality of the samples. In this paper, we propose a novel label integration method based on game theory. We assume that there is an adversary in the crowdsourcing system who intentionally provides incorrect integrated labels. We model the interaction between the data miner and the adversary as a Stackelberg game in which one player (the data miner) controls the predictive model, whereas the other (the adversary) tries to choose the integrated labels that would be most harmful to the current classifier. On this basis, we transform the label integration problem into a repeated Stackelberg model. We call our method Stackelberg label inference (SLI). SLI does not need to estimate the quality of labelers, and so avoids the chicken-and-egg problem that can lead to poor results. Moreover, because SLI makes little use of the multiple noisy label sets in the noisy data set, it is not very sensitive to the number of labelers, and it performs better when the number of labelers is relatively small. In terms of both label quality and model quality, the experimental results show that SLI is superior to the other state-of-the-art ground truth inference methods used for comparison.
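
To make the terms "multiple noisy label set" and "integrated label" concrete, here is a minimal sketch of majority voting, the simplest ground truth inference baseline. It is not SLI and is not taken from the paper; the function name `majority_vote`, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(noisy_label_sets):
    """Integrate each sample's multiple noisy label set into a single label.

    Baseline illustration only; this is NOT the paper's SLI method.
    """
    integrated = []
    for labels in noisy_label_sets:
        counts = Counter(labels)
        top = max(counts.values())
        # Assumed tie-break: pick the smallest label among the most frequent ones.
        integrated.append(min(label for label, c in counts.items() if c == top))
    return integrated


# Hypothetical data: 4 samples, each labeled by 3 crowd workers (binary labels).
crowd_labels = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 0]]
integrated_labels = majority_vote(crowd_labels)
print(integrated_labels)  # [1, 0, 1, 0]
```

A baseline like this depends entirely on the multiple noisy label sets, so its output tends to degrade when each sample has only a few labelers. That is the regime in which the abstract reports SLI performing comparatively well.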
