Machine Learning for Technical Debt Identification

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2022)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Volume 48, Issue 12, Pages 4892-4906

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TSE.2021.3129355

Keywords

Machine learning; metrics; measurement; quality analysis and evaluation; software maintenance

Categories

Computer Science, Software Engineering; Engineering, Electrical & Electronic

Funding

European Union [801015]
SmartCLIDE Project [871177]

Summary

Technical Debt (TD) is an effective metaphor for conveying the consequences of software inefficiencies and their elimination to both technical and non-technical stakeholders. To identify and quantify TD accurately, a range of metrics related to source code, repository activity, issue tracking, refactorings, duplication, and commenting rates are used as features for statistical and Machine Learning models. The results show that it is feasible to assess TD in Java projects with sufficient accuracy and reasonable effort, and they led to the implementation of a prototype tool for automated TD assessment.

Abstract

Technical Debt (TD) is a successful metaphor for conveying the consequences of software inefficiencies and their elimination to both technical and non-technical stakeholders, primarily due to its monetary nature. The identification and quantification of TD rely heavily on a small handful of sophisticated tools that check for violations of certain predefined rules, usually through static analysis. Different tools produce divergent TD estimates, calling into question the reliability of findings derived from a single tool. To alleviate this issue, we use 18 metrics pertaining to the source code, repository activity, issue tracking, refactorings, duplication, and commenting rates of each class as features for statistical and Machine Learning models, so as to classify each class as High-TD or not. As a benchmark, we exploit 18,857 classes obtained from 25 Java projects, whose high levels of TD have been confirmed by three leading tools. The findings indicate that it is feasible to identify TD issues with sufficient accuracy and reasonable effort: a subset of superior classifiers achieved an F$_2$-measure score of approximately 0.79 with an associated Module Inspection ratio of approximately 0.10. Based on the results, a tool prototype for automatically assessing the TD of Java projects has been implemented.
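The two evaluation figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard F-beta definition with beta = 2 (recall weighted more heavily than precision) and assumes that the Module Inspection ratio is the share of all classes that a classifier flags as High-TD, i.e. (TP + FP) / (TP + FP + TN + FN); the abstract does not spell out either formula. The function names and the confusion-matrix counts below are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score computed from confusion-matrix counts (beta=2 gives F2)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def module_inspection_ratio(tp: int, fp: int, tn: int, fn: int) -> float:
    """Assumed definition: fraction of all modules flagged for inspection."""
    total = tp + fp + tn + fn
    return (tp + fp) / total if total else 0.0


# Hypothetical counts for 1,000 classes, chosen only for illustration.
tp, fp, tn, fn = 80, 20, 880, 20
print(f"F2 = {f_beta(tp, fp, fn):.2f}")
print(f"MI = {module_inspection_ratio(tp, fp, tn, fn):.2f}")
```

Under these assumed definitions, a high F2 score paired with a low Module Inspection ratio means that most High-TD classes are found while only a small share of the code base has to be inspected.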
