☆ 4.6 Article

BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2022)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Volume 48, Issue 12, Pages 5087-5101

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TSE.2021.3136169

Keywords

Sentiment analysis; test case generation; metamorphic testing; bias; fairness bug

Categories

Computer Science, Software Engineering; Engineering, Electrical & Electronic

Funding

Singapore Ministry of Education (MOE) Academic Research Fund (AcRF) Tier 1

Abstract

This paper presents BiasFinder, an approach that can discover biased predictions in sentiment analysis systems via metamorphic testing. BiasFinder automatically curates suitable templates and generates new texts to uncover bias in systems.

Artificial intelligence systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human bias. Consequently, such systems may exhibit unintended demographic bias against specific characteristics (e.g., gender, occupation, country-of-origin, etc.). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristic of individuals described. To automatically uncover bias in SA systems, this paper presents BiasFinder, an approach that can discover biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from any given text inputs, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder generates new texts from these templates by mutating words associated with a class of a characteristic (e.g., gender-specific words such as female names, she, her). These texts are then used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case (BTC) when an SA system predicts different sentiments for texts that differ only in words associated with a different class (e.g., male vs. female) of a target characteristic (e.g., gender). We evaluate BiasFinder on 10 SA systems and 2 large scale datasets, and the results show that BiasFinder can create more BTCs than two popular baselines. We also conduct an annotation study and find that human annotators consistently think that test cases generated by BiasFinder are more fluent than the two baselines.
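The metamorphic relation described in the abstract can be illustrated with a minimal Python sketch. It is not BiasFinder's implementation: the template is written by hand rather than curated automatically with NLP techniques, the word lists `NAMES` and `PRONOUNS` are small hypothetical examples, and `toy_biased_classifier` is a deliberately biased stand-in for a real SA system. The sketch only shows the core check: fill one template with words from two gender classes, query the classifier, and report each pair whose predicted sentiments differ as a bias-uncovering test case (BTC).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical word lists for illustration; not the lists used by BiasFinder.
NAMES: Dict[str, List[str]] = {
    "male": ["John", "David"],
    "female": ["Mary", "Linda"],
}
PRONOUNS: Dict[str, Dict[str, str]] = {
    "male": {"<subj>": "he", "<poss>": "his"},
    "female": {"<subj>": "she", "<poss>": "her"},
}

# Hand-written template; BiasFinder derives templates from input texts instead.
TEMPLATE = "<name> said <subj> loved <poss> new phone."


def fill_template(template: str, gender: str, name: str) -> str:
    """Replace the name and pronoun placeholders with words of one gender class."""
    text = template.replace("<name>", name)
    for slot, word in PRONOUNS[gender].items():
        text = text.replace(slot, word)
    return text


def generate_mutants(template: str) -> Dict[str, List[str]]:
    """Generate one text per name for every gender class."""
    return {
        gender: [fill_template(template, gender, name) for name in names]
        for gender, names in NAMES.items()
    }


def find_btcs(
    template: str, predict: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Return (male_text, female_text) pairs that receive different sentiments."""
    mutants = generate_mutants(template)
    btcs = []
    for male_text in mutants["male"]:
        for female_text in mutants["female"]:
            if predict(male_text) != predict(female_text):
                btcs.append((male_text, female_text))
    return btcs


def toy_biased_classifier(text: str) -> str:
    """Stand-in SA system that is biased on purpose (assumption for the demo)."""
    tokens = text.lower().split()
    return "negative" if "she" in tokens or "her" in tokens else "positive"


if __name__ == "__main__":
    for male_text, female_text in find_btcs(TEMPLATE, toy_biased_classifier):
        print(f"BTC: {male_text!r} vs. {female_text!r}")
```

A classifier that ignores gender-specific words produces no BTCs for this template, which is the behavior the metamorphic relation expects from an unbiased system. Other characteristics named in the abstract, such as occupation or country of origin, would need their own placeholder and word lists.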
