
Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics

Journal

ELECTRONICS
Volume 10, Issue 5

Publisher

MDPI
DOI: 10.3390/electronics10050593

Keywords

explainable machine learning; evaluation of explainability; application-grounded evaluation; human-grounded evaluation; functionality-grounded evaluation; evaluation metrics; quality of explanation

Funding

  1. University of Technology Sydney Internal Fund
  2. Austrian Science Fund (FWF) [P-32554 xAI]

Summary

The paper provides a comprehensive overview of the methods proposed in the current literature for evaluating ML explanations. It identifies properties of explainability from a review of definitions of explainability and uses them as the objectives that evaluation metrics should achieve. The survey found that quantitative metrics are used primarily to evaluate either the simplicity of interpretability or the fidelity of explainability, while subjective measures such as trust and confidence are central to the human-centered evaluation of explainable systems.
Abstract

The most successful Machine Learning (ML) systems remain complex black boxes to end-users, and even experts are often unable to understand the rationale behind their decisions. The lack of transparency of such systems can have severe consequences, including the poor use of limited, valuable resources in medical diagnosis, financial decision-making, and other high-stakes domains. The issue of ML explanation has therefore experienced a surge of interest across the research community and application domains. While numerous explanation methods have been explored, evaluations are still needed to quantify the quality of explanation methods: to determine whether, and to what extent, the offered explainability achieves the defined objective, and to compare the available explanation methods so that the best one can be suggested for a specific task.

This survey paper presents a comprehensive overview of the methods proposed in the current literature for the evaluation of ML explanations. We identify properties of explainability from a review of definitions of explainability, and the identified properties are used as the objectives that evaluation metrics should achieve. The survey found that the quantitative metrics for model-based and example-based explanations are primarily used to evaluate the parsimony/simplicity of interpretability, while the quantitative metrics for attribution-based explanations are primarily used to evaluate the soundness of the fidelity of explainability. The survey also showed that subjective measures, such as trust and confidence, have been embraced as the focal point of the human-centered evaluation of explainable systems. The paper concludes that the evaluation of ML explanations is a multidisciplinary research topic, and that no single implementation of evaluation metrics can be applied to all explanation methods.
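As a concrete illustration of the functionality-grounded, fidelity-oriented metrics the abstract refers to for attribution-based explanations, the sketch below implements a generic perturbation-based deletion test. This is not code from the paper; the model, baseline value, and function names are illustrative assumptions only.

    import numpy as np

    def deletion_fidelity(predict_fn, x, attributions, baseline=0.0, steps=10):
        """Replace the most-attributed features with a baseline value and
        track how fast the model's score drops. A faithful attribution
        ranks first the features whose removal hurts the prediction most,
        so a lower average score along the deletion curve is better."""
        order = np.argsort(-np.abs(attributions))   # most important first
        x_perturbed = x.astype(float).copy()
        scores = [predict_fn(x_perturbed)]
        chunk = max(1, len(order) // steps)
        for start in range(0, len(order), chunk):
            x_perturbed[order[start:start + chunk]] = baseline
            scores.append(predict_fn(x_perturbed))
        return float(np.mean(scores))               # lower => more faithful

    # Toy check: a linear "model" whose true importances are its weights.
    weights = np.array([3.0, -2.0, 0.5, 0.0])

    def predict(x):
        return float(x @ weights)

    x = np.ones(4)
    print(deletion_fidelity(predict, x, weights))        # faithful ranking: low score
    print(deletion_fidelity(predict, x, weights[::-1]))  # reversed ranking: high score

In the taxonomy the keywords point to, this kind of test is a functionality-grounded evaluation: it requires no human subjects, in contrast to the application- and human-grounded evaluations that measure subjective qualities such as trust and confidence.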
