Trace matrix optimization for fault localization

Journal

JOURNAL OF SYSTEMS AND SOFTWARE
Volume 208

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jss.2023.111900

Keywords

Fault localization; Coincidental correctness; Imbalanced data; Data augmentation

This paper proposes a two-stage trace matrix optimization method for fault localization that addresses two problems in the input trace matrix: coincidental correctness and data imbalance. Extensive experiments show significant improvements in fault localization effectiveness.

Fault localization (FL) techniques collect trace information as input and analyze it to relate program statements to failures, so the quality of the input trace matrix is essential. The current trace matrix, however, faces two main challenges. First, coincidental correctness (CC), in which a test executes a faulty statement yet the program still produces correct output, degrades the effectiveness of FL. Second, the large disparity between the number of failing and passing test cases creates a data imbalance problem. To overcome these issues, we propose TRAIN: a Two-stage tRace mAtrix optImizatioN method for fault localization. In the first stage, TRAIN uses an improved cluster analysis to identify and exclude CC tests from the trace matrix. In the second stage, TRAIN uses data augmentation to increase the number of failing test cases and further balance the trace matrix. The optimized trace matrix then serves as input to the FL pipeline to locate the faulty statements. In extensive experiments on 330 faulty versions of nine large programs (obtained from Defects4J, ManyBugs, and SIR) with six state-of-the-art FL methods, TRAIN markedly improves FL effectiveness.
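
To make the two problems concrete, the following is a minimal sketch and not the TRAIN implementation. It assumes a tiny hand-made binary trace matrix (rows are tests, columns are statements) with statement index 2 as the faulty one, and it uses the Ochiai spectrum-based formula as a stand-in FL method. The CC test is removed by a known index rather than found by cluster analysis, and failing rows are simply duplicated in place of the paper's data augmentation. All names (`ochiai`, `drop_tests`, `balance_by_duplication`) and the data are hypothetical.

```python
from math import sqrt


def ochiai(matrix, failed):
    """Ochiai suspiciousness for each statement (column) of a binary trace matrix."""
    total_failed = sum(failed)
    scores = []
    for j in range(len(matrix[0])):
        # ef / ep: failing / passing tests that executed statement j
        ef = sum(1 for row, f in zip(matrix, failed) if row[j] and f)
        ep = sum(1 for row, f in zip(matrix, failed) if row[j] and not f)
        denom = sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores


def drop_tests(matrix, failed, indices):
    """Return a copy of the trace matrix without the given test rows."""
    keep = [i for i in range(len(matrix)) if i not in indices]
    return [matrix[i] for i in keep], [failed[i] for i in keep]


def balance_by_duplication(matrix, failed):
    """Naive placeholder for augmentation: repeat failing rows until the
    number of failing tests reaches the number of passing tests."""
    failing_rows = [row for row, f in zip(matrix, failed) if f]
    n_pass = len(failed) - sum(failed)
    out_m, out_f = list(matrix), list(failed)
    i = 0
    while failing_rows and sum(out_f) < n_pass:
        out_m.append(list(failing_rows[i % len(failing_rows)]))
        out_f.append(True)
        i += 1
    return out_m, out_f


# Hypothetical data: 7 tests x 4 statements, statement 2 is the faulty one.
matrix = [
    [1, 1, 0, 1],  # t0 passes
    [1, 0, 1, 1],  # t1 passes although it runs the faulty statement (CC)
    [1, 1, 1, 0],  # t2 fails
    [1, 0, 1, 1],  # t3 fails
    [1, 1, 0, 0],  # t4 passes
    [1, 1, 0, 1],  # t5 passes
    [1, 0, 0, 1],  # t6 passes
]
failed = [False, False, True, True, False, False, False]

before = ochiai(matrix, failed)

# Stage 1 stand-in: exclude the CC test (index 1 is assumed known here).
cleaned_m, cleaned_f = drop_tests(matrix, failed, {1})
after = ochiai(cleaned_m, cleaned_f)

# Stage 2 stand-in: balance failing and passing tests by duplication.
balanced_m, balanced_f = balance_by_duplication(cleaned_m, cleaned_f)

print("faulty statement score before:", round(before[2], 3))
print("faulty statement score after removing the CC test:", round(after[2], 3))
print("failing / passing after balancing:",
      sum(balanced_f), "/", len(balanced_f) - sum(balanced_f))
```

In this toy setting, the CC test counts as a passing execution of the faulty statement, so excluding it raises that statement's suspiciousness score. The duplication step only illustrates what a balanced matrix looks like; the augmentation described in the abstract is not specified here and may differ substantially.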
