Journal
IEEE ACCESS
Volume 7, Pages 78870-78881
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2922686
Keywords
Bug localization; information retrieval; surface lexical similarity; semantic similarity; bug report; word embedding
Funding
- National Key Research and Development Program [2016YFC0801804, 2016YFC0801405]
- National Natural Science Foundation of China [61806067]
Abstract
Bug localization is a cumbersome and time-consuming part of software maintenance and evolution, yet it is essential, especially for large-scale software projects. To lighten developers' workload, researchers have developed various information retrieval (IR)-based bug localization models that automate the task. In this paper, we propose a new method that reduces the time required for bug localization. First, the surface lexical similarity between a bug report and a source code file is calculated using the vector space model. Second, to bridge the lexical gap between programming language and natural language, word vectors are used to calculate the semantic similarity between the bug report and the source code file. The surface lexical and semantic similarities are then combined into a total similarity score used to detect buggy source code files. The word vectors in our experiments are trained with the Skip-gram and GloVe models. We select an optimal 100-dimensional word vector for bug localization by evaluating on four open source software projects. Our experimental results show that the method outperforms classical IR-based methods at locating relevant source code files on several evaluation metrics.
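The pipeline described in the abstract can be illustrated with a minimal sketch. Several parts are assumptions rather than details taken from the paper: the TF-IDF weighting used for the vector space model, the averaging of word vectors into one vector per document, the linear combination controlled by a hypothetical weight `alpha`, and every function name. The two-dimensional `TOY_EMBEDDINGS` table is made up for illustration and stands in for the 100-dimensional Skip-gram or GloVe vectors.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; this weighting scheme is an assumption.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(tokens, embeddings):
    """Average the embeddings of known tokens (assumed aggregation)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]


def cosine_dense(a, b):
    """Cosine similarity between two dense vectors; 0.0 if either is missing."""
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_files(report, files, embeddings, alpha=0.5):
    """Rank files by alpha * lexical + (1 - alpha) * semantic similarity."""
    names = list(files)
    docs = [tokenize(report)] + [tokenize(files[name]) for name in names]
    vecs = tfidf_vectors(docs)
    report_mean = mean_vector(docs[0], embeddings)
    scores = {}
    for i, name in enumerate(names, start=1):
        lexical = cosine_sparse(vecs[0], vecs[i])
        semantic = cosine_dense(report_mean, mean_vector(docs[i], embeddings))
        scores[name] = alpha * lexical + (1 - alpha) * semantic
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 2-dimensional embeddings, invented for this example.
TOY_EMBEDDINGS = {
    "crash": [0.9, 0.1], "exception": [0.8, 0.2], "null": [0.85, 0.15],
    "save": [0.1, 0.9], "write": [0.2, 0.8], "file": [0.15, 0.85],
}
FILES = {
    "Writer.java": "write file buffer",
    "Parser.java": "null exception token",
}
ranking = rank_files("crash on startup", FILES, TOY_EMBEDDINGS)
print(ranking)
```

In this toy input the bug report shares no token with either file, so the lexical term is zero for both and the ranking is decided entirely by the semantic term. That is the lexical gap the abstract describes, reproduced on a very small scale.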