4.6 Article

Towards Reliable Online Just-in-Time Software Defect Prediction

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
Volume 49, Issue 3, Pages 1342-1358

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TSE.2022.3175789

Keywords

Software; Reliability; Training; Codes; Software reliability; Software quality; Indexes; Just-in-time software defect prediction; online learning; concept drift; verification latency; class imbalance learning

Throughout its development, a software project is affected by different phases, modules, and developers, leading to challenges in Just-in-Time Software Defect Prediction (JIT-SDP) due to concept drift and verification latency. This study provides the first detailed analysis of the types and impacts of concept drift on JIT-SDP classifiers. It proposes a new approach to improve the stability and reliability of predictive performance over time.
Throughout its development period, a software project goes through different phases, comprises modules of differing complexity, and is touched by many different developers. Hence, it is natural that problems such as Just-in-Time Software Defect Prediction (JIT-SDP) are affected by changes in the defect-generating process (concept drifts), potentially hindering predictive performance. JIT-SDP also suffers from delays in receiving the labels of training examples (verification latency), potentially exacerbating the challenges posed by concept drift and further hindering predictive performance. However, little is known about what types of concept drift affect JIT-SDP and how they affect JIT-SDP classifiers in view of verification latency. This work performs the first detailed analysis of these questions. Among other findings, it reveals that different types of concept drift, together with verification latency, significantly impair the stability of the predictive performance of existing JIT-SDP approaches, drastically affecting their reliability over time. Based on the findings, a new JIT-SDP approach is proposed, aimed at providing higher and more stable (i.e., more reliable) predictive performance over time. Experiments based on ten GitHub open source projects show that the approach was capable of producing significantly more stable predictive performance on all investigated datasets while maintaining or improving the predictive performance obtained by state-of-the-art methods.
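To make verification latency concrete, here is a minimal Python sketch of which training labels an online learner can see at a given moment. It is an illustration only, not the approach proposed in the paper. The abstract does not describe a labelling rule, so the one used here is an assumption: a change is labelled clean once a fixed waiting period passes with no defect linked to it. The names `Commit`, `labelled_examples`, and `waiting_days`, and the toy data, are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Commit:
    commit_id: str
    day: int                                # day the change was committed
    defect_found_day: Optional[int] = None  # day a defect was linked to it, if ever


def labelled_examples(
    commits: List[Commit], today: int, waiting_days: int
) -> List[Tuple[str, int]]:
    """Return the (commit_id, label) pairs available for training on `today`.

    Label 1 means defect-inducing; label 0 means assumed clean.
    Assumed rule (not taken from the paper): a commit is labelled clean
    once `waiting_days` have passed without a defect being linked to it.
    """
    examples = []
    for c in commits:
        if c.day > today:
            continue  # not committed yet
        if c.defect_found_day is not None and c.defect_found_day <= today:
            examples.append((c.commit_id, 1))  # defect already discovered
        elif today - c.day >= waiting_days:
            examples.append((c.commit_id, 0))  # waiting period elapsed
        # otherwise the label is still unknown and the commit is skipped
    return examples


# Hypothetical toy stream of four commits.
commits = [
    Commit("a", day=0, defect_found_day=5),
    Commit("b", day=1),
    Commit("c", day=20, defect_found_day=60),
    Commit("d", day=40),
]

for today in (45, 50, 60):
    print(today, labelled_examples(commits, today, waiting_days=30))
```

In this toy stream, commit `c` is first labelled clean on day 50 and only relabelled as defect-inducing on day 60. That shows how delayed labels can feed a learner stale or wrong information, which is the kind of effect that could compound with concept drift.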
