Article

A noun-based approach to feature location using time-aware term-weighting

Journal

INFORMATION AND SOFTWARE TECHNOLOGY
Volume 56, Issue 8, Pages 991-1011

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.infsof.2014.03.007

Keywords

Feature location; Software change request; Time-metadata; Term-weighting; Noun usage

Funding

  1. High Impact Research Grant - Ministry of Education, Malaysia [UM.C/625/1/HIR/MOHE/FCSIT/13]

Context: Feature location aims to identify the source code that implements a software feature. Many existing feature location methods apply text retrieval to determine how relevant a feature is to the text data extracted from software repositories. One preprocessing activity in text retrieval is term-weighting, which adjusts the importance of a term within a document or corpus. Common term-weighting techniques may not be optimal for text data from software repositories because they originate from a natural language context.

Objective: This paper describes how considering when terms were used in the repositories, while weighting only the noun terms, can improve a feature location approach.

Method: We propose a feature location approach using a new term-weighting technique that takes into account how recently a term has been used in the repositories. In this approach, only the noun terms are weighted, which reduces the dataset volume and avoids the need for dimensionality reduction.

Results: An empirical evaluation of the approach on four open-source projects reveals improvements in accuracy, effectiveness, and performance of up to 50%, 17%, and 13%, respectively, compared to the commonly used Vector Space Model approach. Comparing the proposed term-weighting technique with the Term Frequency-Inverse Document Frequency technique shows accuracy, effectiveness, and performance improvements of as much as 15%, 10%, and 40%, respectively. Using only noun terms instead of all terms in the proposed approach also yields improvements of up to 28%, 21%, and 58% in accuracy, effectiveness, and performance, respectively.

Conclusion: In general, using time in the weighting of terms, together with using only the noun terms, significantly improves a feature location approach that relies on textual information.
(C) 2014 Elsevier B.V. All rights reserved.
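
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea (weight noun terms by how recently they were used), not the paper's actual technique. It assumes tokens arrive already part-of-speech tagged with Penn Treebank style tags, and it uses an exponential half-life decay; the function names, the `half_life_days` parameter, and the decay form are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def time_aware_weights(occurrences, now, half_life_days=180.0):
    """Sum a recency-decayed score per noun term.

    occurrences: iterable of (term, pos_tag, day_used) tuples.
    now: the current day, on the same scale as day_used.
    The half-life decay is an assumed stand-in, not the paper's formula.
    """
    weights = defaultdict(float)
    for term, pos_tag, day_used in occurrences:
        # Keep only nouns (assumed Penn Treebank tags: NN, NNS, NNP, NNPS).
        if not pos_tag.startswith("NN"):
            continue
        age = max(0.0, now - day_used)
        # Each occurrence contributes less the older it is.
        weights[term.lower()] += 0.5 ** (age / half_life_days)
    return dict(weights)


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In a sketch like this, a change request and each source file would each be turned into a weight dictionary and ranked by `cosine`; dropping non-noun tokens before weighting is what keeps the vectors small.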
