Article

How Well Do Change Sequences Predict Defects? Sequence Learning from Software Changes

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
Volume 46, Issue 11, Pages 1155-1175

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TSE.2018.2876256

Keywords

Measurement; Software; Predictive models; Semantics; History; Machine learning; Feature extraction; Defect prediction; process metrics; sequence learning

Funding

  1. Hong Kong RGC/GRF [16202917]

Abstract

Software defect prediction, which aims to identify defective modules, can assist developers in finding bugs and prioritizing limited quality assurance resources. Various features for building defect prediction models have been proposed and evaluated. Among them, process metrics are one important category. Yet, existing process metrics are mainly encoded manually from change histories and ignore the sequential information that arises from changes during software evolution. Are the change sequences derived from such information useful for characterizing buggy program modules? How can we leverage such sequences to build good defect prediction models? Unlike the traditional process metrics used in existing defect prediction models, change sequences are mostly vectors of variable length. This makes it difficult to apply such sequences directly in prediction models driven by conventional classifiers. To resolve this challenge, we utilize Recurrent Neural Networks (RNNs), a deep learning technique, to automatically encode features from sequence data. In this paper, we propose a novel approach called Fences, which extracts six types of change sequences covering different aspects of software changes via fine-grained change analysis. It approaches defect prediction by mapping it to a sequence labeling problem solvable by RNNs. Our evaluations on 10 open source projects show that Fences predicts defects with high performance. In particular, our approach achieves an average F-measure of 0.657, significantly improving over prediction models built on traditional metrics; the improvements vary from 31.6 to 46.8 percent on average. In terms of AUC, Fences achieves an average value of 0.892, and the improvements over baselines vary from 4.2 to 16.1 percent. Fences also outperforms the state-of-the-art technique that learns semantic features automatically from static code via deep learning.
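The abstract frames defect prediction as labeling variable-length change sequences with an RNN. The sketch below is only an illustration of that general idea, not the authors' Fences implementation: the class name, token vocabulary, GRU cell choice, and layer sizes are assumptions made for the example. It shows one common way to handle variable-length sequences (padding plus packing) and to produce a per-module defect probability.

```python
# Illustrative sketch (assumed architecture, not Fences itself): classify a
# variable-length change sequence as buggy vs. clean with a GRU encoder.
import torch
import torch.nn as nn

class ChangeSequenceClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        # Each change event in the sequence is assumed to be mapped to a token id.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # defect probability per module

    def forward(self, token_ids, lengths):
        x = self.embed(token_ids)
        # Pack so the GRU ignores padding positions in shorter sequences.
        packed = nn.utils.rnn.pack_padded_sequence(
            x, lengths.cpu(), batch_first=True, enforce_sorted=False)
        _, h_n = self.rnn(packed)              # final hidden state: (1, batch, hidden_dim)
        return torch.sigmoid(self.head(h_n[-1])).squeeze(-1)

# Two modules whose change sequences have different lengths, padded with 0.
model = ChangeSequenceClassifier(vocab_size=100)
batch = torch.tensor([[5, 8, 3, 0], [7, 2, 9, 4]])
lengths = torch.tensor([3, 4])
probs = model(batch, lengths)                  # one defect probability per module
```

In this framing, the RNN replaces hand-crafted process metrics: the sequence encoder learns a fixed-length representation from the raw change history, and a standard classifier head makes the buggy/clean decision.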
