4.7 Article

Defect Prediction With Semantics and Context Features of Codes Based on Graph Representation Learning

Journal

IEEE TRANSACTIONS ON RELIABILITY
Volume 70, Issue 2, Pages 613-625

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TR.2020.3040191

Keywords

Software; Software development management; Measurement; Semantics; Syntactics; Data mining; Computer bugs; Deep learning; defect prediction; graph representation learning; software defect dataset; software engineering

This article proposes a software defect identification method combining semantics and context information, using abstract syntax tree representation learning. The experiments show that this method outperforms existing methods and traditional machine learning baselines, and that information on code concepts significantly improves accuracy.
To streamline software testing and to improve software quality and reliability, many attempts have been made to develop more effective methods for predicting software defects. Previous work on defect prediction has combined machine learning with hand-crafted software metrics. Unfortunately, hand-crafted metrics cannot represent the syntactic, semantic, and context information of defective modules. In this article, therefore, we propose a practical approach for identifying software defect patterns by combining semantics and context information using abstract syntax tree representation learning. Graph neural networks are also leveraged to capture the latent defect information of defective subtrees, which are pruned based on a fix-inducing change. To validate the proposed approach, we define mining rules based on the GitHub workflow and collect 6052 defects from 307 projects. The experiments indicate that the proposed approach performs better than the state-of-the-art approach and five traditional machine learning baselines. An ablation study shows that the information about code concepts leads to a significant increase in accuracy.
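
As a rough illustration of the kind of input a graph neural network over abstract syntax trees would consume, the sketch below turns a source snippet into a list of node labels and parent-to-child edges. It is not the authors' pipeline: the use of Python's standard `ast` module, the function name `ast_to_graph`, and the choice of node-type names as labels are all assumptions made for this example, and the subtree pruning and learning steps described in the abstract are not shown.

```python
import ast


def ast_to_graph(source):
    """Parse `source` and return (labels, edges) for its syntax tree.

    labels[i] is the AST node type name of node i; edges holds
    (parent_index, child_index) pairs. Illustrative helper only.
    """
    tree = ast.parse(source)
    labels = []
    edges = []

    def visit(node, parent_index):
        # Assign a fresh index on every visit so that node objects
        # reused by the parser still become distinct graph nodes.
        index = len(labels)
        labels.append(type(node).__name__)
        if parent_index is not None:
            edges.append((parent_index, index))
        for child in ast.iter_child_nodes(node):
            visit(child, index)

    visit(tree, None)
    return labels, edges


labels, edges = ast_to_graph("def f(x):\n    return x + 1\n")
print(labels)
print(edges)
```

The resulting edge list could then be handed to a graph library of choice; the label vocabulary would typically be mapped to embeddings before any message passing.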