Article

Boundary sampling to boost mutation testing for deep learning models

INFORMATION AND SOFTWARE TECHNOLOGY (2021)

Journal

INFORMATION AND SOFTWARE TECHNOLOGY

Volume 130

Publisher

ELSEVIER

DOI: 10.1016/j.infsof.2020.106413

Keywords

Software testing; Deep learning; Mutation testing; Boundary; Neural network

Categories

Computer Science, Information Systems; Computer Science, Software Engineering

Funding

National Key R&D Program of China [2018YFB1003901]
National Natural Science Foundation of China [61932012, 61872177, 61832009, 61772263, 61772259]

Summary

The study introduces the boundary sample selection (BSS) approach, which selects a smaller, sensitive, representative, and efficient subset of the test dataset to support mutation testing of DL models. The experimental results show that, compared with the whole test sets, the subsets generated by BSS are smaller, better at observing mutation effects, able to replace the whole test sets to a high degree in terms of mutation score, and have better Mean Reciprocal Rank (MRR) values. BSS can help reduce labeling cost, run mutation testing quickly, and identify killed mutants early.

Abstract

Context: The widespread application of Deep Learning (DL) models has raised concerns about their reliability. Because DL follows a data-driven programming paradigm, the quality of test datasets is extremely important for an accurate assessment of DL models. Recently, researchers have introduced mutation testing into DL testing: mutation operators are applied to generate mutants from DL models, and the quality of the test dataset is judged by whether the test data can identify those mutants. However, many factors (e.g., huge labeling effort and high running cost) still hinder the implementation of mutation testing for DL models.

Objective: We seek an approach that selects a smaller, sensitive, representative, and efficient subset of the whole test dataset to support current mutation testing (e.g., by reducing labeling and running costs) for DL models.

Method: We propose boundary sample selection (BSS), which uses the distance of samples to the decision boundary of DL models as the indicator for constructing an appropriate subset. To evaluate the performance of BSS, we conduct an extensive empirical study with two widely used datasets, three popular DL models, and 14 up-to-date DL mutation operators.

Results: We observe that (1) the subsets generated by BSS are much smaller (about 3%-20% of the whole test set); (2) under most mutation operators, our subsets are superior to the whole test sets (about 9.94-21.63) in observing mutation effects; (3) our subsets can replace the whole test sets to a very high degree (higher than 97%) when considering mutation score; and (4) the MRR values of our proposed subsets are clearly better (about 2.28-13.19 times higher) than those of the whole test sets.

Conclusions: The results show that BSS can help testers save labeling cost, run mutation testing quickly, and identify killed mutants early.
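To make the selection idea concrete, here is a minimal sketch of picking test samples that lie close to a classifier's decision boundary. The abstract does not say how the distance to the boundary is measured, so the sketch assumes a common stand-in: the gap between the two largest predicted class probabilities, where a small gap suggests a sample near the boundary. The function names and the `ratio` parameter are hypothetical and are not taken from the paper; `ratio` only mirrors the reported subset sizes of about 3%-20%. The MRR helper uses the standard definition (mean of 1/rank), and feeding it the rank of the first input that kills each mutant is likewise an assumption.

```python
from typing import List, Optional, Sequence


def boundary_margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities.

    Assumed proxy for distance to the decision boundary (not the paper's
    stated metric): a smaller gap means the sample is closer to the boundary.
    Requires at least two classes.
    """
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_boundary_samples(
    all_probs: Sequence[Sequence[float]], ratio: float
) -> List[int]:
    """Return indices of the `ratio` fraction of samples with the smallest margin."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    k = max(1, round(len(all_probs) * ratio))
    # Rank sample indices from nearest to farthest from the boundary.
    ranked = sorted(
        range(len(all_probs)), key=lambda i: boundary_margin(all_probs[i])
    )
    return ranked[:k]


def mean_reciprocal_rank(first_kill_ranks: Sequence[Optional[int]]) -> float:
    """Standard MRR: average of 1/rank; None (no hit at all) contributes 0."""
    if not first_kill_ranks:
        return 0.0
    total = sum(1.0 / r if r else 0.0 for r in first_kill_ranks)
    return total / len(first_kill_ranks)


if __name__ == "__main__":
    # Hypothetical softmax outputs for five test inputs and three classes.
    probs = [
        [0.98, 0.01, 0.01],
        [0.51, 0.47, 0.02],
        [0.70, 0.20, 0.10],
        [0.40, 0.39, 0.21],
        [0.90, 0.05, 0.05],
    ]
    print(select_boundary_samples(probs, ratio=0.4))
    # Hypothetical 1-based ranks of the first killing input for four mutants.
    print(mean_reciprocal_rank([1, 2, None, 4]))
```

Under these assumptions, only the selected inputs would need labels before running the mutants, which is where the labeling and running-cost savings described above would come from.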
