Boundary sampling to boost mutation testing for deep learning models

INFORMATION AND SOFTWARE TECHNOLOGY (2021)

Journal

INFORMATION AND SOFTWARE TECHNOLOGY

Volume 130

Publisher

ELSEVIER

DOI: 10.1016/j.infsof.2020.106413

Keywords

Software testing; Deep learning; Mutation testing; Boundary; Neural network

Funding

National Key R&D Program of China [2018YFB1003901]
National Natural Science Foundation of China [61932012, 61872177, 61832009, 61772263, 61772259]

Automated Summary

The study introduces the boundary sample selection (BSS) approach, which selects a smaller, sensitive, representative, and efficient subset of the test dataset to promote mutation testing of DL models. The experimental results show that, compared with the whole test sets, the subsets generated by BSS are smaller, better at observing mutation effects, able to replace the whole test sets to a high degree in mutation score, and have better Mean Reciprocal Rank (MRR) values. BSS can help reduce labeling cost, run mutation testing quickly, and identify killed mutants early.

Abstract

Context: The widespread application of Deep Learning (DL) models has raised concerns about their reliability. Because DL follows a data-driven programming paradigm, the quality of test datasets is critical to an accurate assessment of DL models. Recently, researchers have introduced mutation testing into DL testing: mutation operators are applied to a DL model to generate mutants, and the ability of the test data to identify those mutants is used to check the quality of the test dataset. However, many factors (e.g., huge labeling efforts and high running cost) still hinder the implementation of mutation testing for DL models.

Objective: We seek an approach for selecting a smaller, sensitive, representative, and efficient subset of the whole test dataset to promote current mutation testing (e.g., by reducing labeling and running cost) for DL models.

Method: We propose boundary sample selection (BSS), which uses the distance of samples to the decision boundary of a DL model as the indicator for constructing an appropriate subset. To evaluate the performance of BSS, we conduct an extensive empirical study with two widely used datasets, three popular DL models, and 14 up-to-date DL mutation operators.

Results: We observe that (1) the subsets generated by BSS are much smaller (about 3%-20% of the whole test set); (2) under most mutation operators, our subsets are superior (about 9.94-21.63) to the whole test sets in observing mutation effects; (3) our subsets can replace the whole test sets to a very high degree (higher than 97%) when considering mutation score; and (4) the Mean Reciprocal Rank (MRR) values of our subsets are clearly better (about 2.28-13.19 times higher) than those of the whole test sets.

Conclusions: The results show that BSS can help testers save labeling cost, run mutation testing quickly, and identify killed mutants early.
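The abstract does not say how the distance to the decision boundary is measured. As a rough illustration of the idea only, the sketch below uses the gap between the two highest softmax probabilities as a stand-in for that distance and keeps the samples with the smallest gap. The margin proxy, the function names, and the `fraction` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw model outputs into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def boundary_margin(probs: Sequence[float]) -> float:
    """Top-1 minus top-2 probability; a small value means 'near the boundary'.

    ASSUMPTION: this margin is an illustrative proxy, not the paper's metric.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


def select_boundary_samples(all_probs: Sequence[Sequence[float]],
                            fraction: float) -> List[int]:
    """Return indices of the `fraction` of samples with the smallest margin."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one sample, rounding the subset size up.
    k = max(1, math.ceil(fraction * len(all_probs)))
    ranked = sorted(range(len(all_probs)),
                    key=lambda i: boundary_margin(all_probs[i]))
    return ranked[:k]


# Hypothetical logits for four test samples over three classes.
example_logits = [
    [4.0, 0.1, 0.2],   # confidently class 0
    [1.0, 0.9, 0.1],   # classes 0 and 1 nearly tied
    [0.2, 2.5, 2.4],   # classes 1 and 2 nearly tied
    [0.0, 0.0, 5.0],   # confidently class 2
]
example_probs = [softmax(row) for row in example_logits]
selected = select_boundary_samples(example_probs, fraction=0.5)
print(selected)
```

In this toy input, the two samples whose top classes are nearly tied are the ones selected. The reasoning behind such a selection is that a mutated model is more likely to change its prediction on them, so only that subset would need labeling and re-running.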
