☆ 4.5 Review

Machine learning techniques for code smell detection: A systematic literature review and meta-analysis

INFORMATION AND SOFTWARE TECHNOLOGY (2019)

Journal

INFORMATION AND SOFTWARE TECHNOLOGY

Volume 108, Issue -, Pages 115-138

Publisher

ELSEVIER

DOI: 10.1016/j.infsof.2018.12.009

Keywords

Code smells; Machine learning; Systematic literature review

Categories

Computer Science, Information Systems; Computer Science, Software Engineering

Funding

National Natural Science Foundation of China [61432001, 61802374]
CAS-TWAS
Swiss National Science Foundation through the SNF [PP00P2_170529]

Abstract

Background: Code smells indicate suboptimal design or implementation choices in the source code that often make it more change- and fault-prone. Researchers have defined dozens of code smell detectors, which exploit different sources of information to support developers when diagnosing design flaws. Despite the good accuracy of these detectors, previous work pointed out three important limitations that might preclude their use in practice: (i) subjectiveness of developers with respect to the code smells detected by such tools, (ii) scarce agreement between different detectors, and (iii) difficulties in finding good thresholds to be used for detection. To overcome these limitations, the use of machine learning techniques represents an ever-increasing research area.

Objective: While the research community has carefully studied the methodologies applied by researchers when defining heuristic-based code smell detectors, there is still a noticeable lack of knowledge on how machine learning approaches have been adopted for code smell detection and whether there are points of improvement to allow a better detection of code smells. Our goal is to provide an overview and discuss the usage of machine learning approaches in the field of code smells.

Method: This paper presents a Systematic Literature Review (SLR) on machine learning techniques for code smell detection. Our work considers papers published between 2000 and 2017. Starting from an initial set of 2456 papers, we found that 15 of them actually adopted machine learning approaches. We studied them from four different perspectives: (i) code smells considered, (ii) setup of machine learning approaches, (iii) design of the evaluation strategies, and (iv) a meta-analysis of the performance achieved by the models proposed so far.

Results: The analyses performed show that God Class, Long Method, Functional Decomposition, and Spaghetti Code have been heavily considered in the literature. Decision Trees and Support Vector Machines are the most commonly used machine learning algorithms for code smell detection. Models based on a large set of independent variables have performed well. JRip and Random Forest are the most effective classifiers in terms of performance. The analyses also reveal the existence of several open issues and challenges that the research community should focus on in the future.

Conclusion: Based on our findings, we argue that there is still room for improvement of machine learning techniques in the context of code smell detection. The open issues that emerged in this study can serve as input for researchers interested in developing more powerful techniques.
