Article

Improving data and model quality in crowdsourcing using cross-entropy-based noise correction

INFORMATION SCIENCES (2021)

Journal

INFORMATION SCIENCES

Volume 546, Pages 803-814

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.ins.2020.08.117

Keywords

Crowdsourcing; Label noise correction; Entropy; Cross-entropy

Category

Computer Science, Information Systems

Funding

National Natural Science Foundation of China [U1711267]
Fundamental Research Funds for the Central Universities [CUGGC03]

Summary

Crowdsourcing services offer a fast and cost-effective way to obtain labeled data, but label noise is commonly present. This study introduces a Cross-Entropy-based Noise Correction method that outperforms existing techniques in mitigating label noise in crowdsourced data.

Abstract

Crowdsourcing services provide a fast, efficient, and cost-effective approach to obtaining labeled data, particularly for human-like tasks. In a crowdsourcing scenario, after ground truth inference methods have been employed to obtain integrated instance labels, label noise remains present in the integrated labels. Label noise handling techniques can then be implemented to mitigate the effects of this noise. In this study, we propose a Cross-Entropy-based Noise Correction (CENC) method for crowdsourcing. CENC uses the entropies of the label distributions generated from multiple noisy label sets to filter noisy instances. It then exploits the cross-entropies between each possible true class probability distribution and each predicted class probability distribution to rectify the noisy instances. Using both simulated benchmark data and real-world crowdsourced data, we show that CENC outperforms all other existing state-of-the-art noise correction methods. (C) 2020 Elsevier Inc. All rights reserved.
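The two steps named in the abstract (entropy-based filtering, then cross-entropy-based correction) can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the exact rules, so the fixed entropy threshold, the one-hot form of each "possible true class" distribution, the summing of cross-entropies over predictions, and all function names below are assumptions made for illustration only.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy H(target, predicted) = -sum_k target[k] * log(predicted[k])."""
    return -sum(t * math.log(max(q, eps))
                for t, q in zip(target, predicted) if t > 0)


def label_distribution(labels, n_classes):
    """Relative frequency of each class among the labels given to one instance."""
    counts = [0] * n_classes
    for y in labels:
        counts[y] += 1
    return [c / len(labels) for c in counts]


def flag_noisy(label_sets, n_classes, threshold):
    """Assumed filter rule: flag instances whose label entropy exceeds a threshold."""
    return [i for i, labels in enumerate(label_sets)
            if entropy(label_distribution(labels, n_classes)) > threshold]


def correct_label(predicted_dists, n_classes):
    """Assumed correction rule: choose the class whose one-hot distribution has
    the smallest total cross-entropy to the predicted distributions."""
    def total_ce(c):
        one_hot = [1.0 if k == c else 0.0 for k in range(n_classes)]
        return sum(cross_entropy(one_hot, q) for q in predicted_dists)
    return min(range(n_classes), key=total_ce)


# Hypothetical data: three instances, each with four noisy binary labels.
label_sets = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 1, 1, 0]]
noisy = flag_noisy(label_sets, n_classes=2, threshold=0.6)
print(noisy)  # [1]: only the evenly split instance exceeds the threshold

# Hypothetical class probabilities predicted for a flagged instance.
print(correct_label([[0.2, 0.8], [0.4, 0.6]], n_classes=2))  # 1
```

In this toy setup the threshold is chosen by hand; how CENC actually selects noisy instances and builds its predicted distributions is described in the paper itself, not in this abstract.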
