Article

Truth Inference in Crowdsourcing: Is the Problem Solved?

Journal

PROCEEDINGS OF THE VLDB ENDOWMENT
Volume 10, Issue 5, Pages 541-552

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.14778/3055540.3055547

Funding

  1. 973 Program of China [2015CB358700]
  2. NSF of China [61373024, 61632016, 61422205, 61472198]
  3. Shenzhou
  4. Tencent
  5. Research Grants Council of Hong Kong (RGC) [HKU 17229116, 17205115]
  6. University of Hong Kong [102009508, 104004129]
  7. [FDCT/116/2013/A3]
  8. [MYRG105 (Y1-L3)-FST13-GZ]

Abstract

Crowdsourcing has emerged as a novel problem-solving paradigm that helps address problems that are hard for computers, e.g., entity resolution and sentiment analysis. However, because crowdsourcing is open, workers may yield low-quality answers. A redundancy-based method is therefore widely employed: it first assigns each task to multiple workers and then infers the correct answer (called the truth) for the task from the answers of the assigned workers. A fundamental problem in this method is Truth Inference, which decides how to effectively infer the truth. Recently, the database community and the data mining community have independently studied this problem and proposed various algorithms. However, these algorithms have not been compared extensively under the same framework, and it is hard for practitioners to select appropriate algorithms. To alleviate this problem, we provide a detailed survey of 17 existing algorithms and perform a comprehensive evaluation using 5 real datasets. We make all code and datasets public for future research. Through experiments, we find that existing algorithms are not stable across different datasets and that no algorithm consistently outperforms the others. We believe that the truth inference problem is not fully solved; we identify the limitations of existing algorithms and point out promising research directions.
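
To make the redundancy-based setup concrete, the following is a minimal sketch of the simplest conceivable truth inference rule, majority voting: each task is answered by several workers, and the most frequent answer is taken as the inferred truth. This is an illustrative baseline only, not necessarily one of the 17 surveyed algorithms; the function name `majority_vote`, the `(task_id, worker_id, label)` tuple layout, and the sample answers are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Infer one label per task by majority vote.

    `answers` is an iterable of (task_id, worker_id, label) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    votes = defaultdict(Counter)
    for task_id, _worker_id, label in answers:
        votes[task_id][label] += 1

    truth = {}
    for task_id, counter in votes.items():
        # Pick the label with the highest count; break ties by the
        # label's string form so the result is deterministic.
        best_label, _count = min(
            counter.items(), key=lambda kv: (-kv[1], str(kv[0]))
        )
        truth[task_id] = best_label
    return truth


# Hypothetical sentiment-analysis answers: two tasks, three workers each.
answers = [
    ("t1", "w1", "positive"),
    ("t1", "w2", "positive"),
    ("t1", "w3", "negative"),
    ("t2", "w1", "negative"),
    ("t2", "w2", "negative"),
    ("t2", "w3", "positive"),
]

inferred = majority_vote(answers)
print(inferred)
```

Majority voting treats every worker as equally reliable. A rule like this ignores differences in worker quality, which is the kind of information a more elaborate truth inference algorithm would presumably try to exploit.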
