Article

Why do builds fail? A conceptual replication study

JOURNAL OF SYSTEMS AND SOFTWARE (2021)

Journal

JOURNAL OF SYSTEMS AND SOFTWARE

Volume 177, Issue -, Pages -

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.jss.2021.110939

Keywords

Continuous integration; Build failure; Test smells; Code smells; Quantitative analysis; Cross-project prediction

Categories

Computer Science, Software Engineering; Computer Science, Theory & Methods

Summary

This study investigates a wide range of factors potentially explaining software build breakages, including features from the build history, author, code complexity, and code/test smell dimensions. Results show that these dimensions contain the most important predictors of build failures. A multivariate model reaches a precision of 58.3% to 79.0% and a recall of 52.4% to 69.6% on balanced project datasets, but only 2.5% to 37.5% precision and 2.5% to 12.4% recall on imbalanced ones. Cross-project prediction using models trained on the balanced datasets improves precision by 9.3% and recall by 16.2% on average for the imbalanced projects.

Abstract

Previous studies have investigated a wide range of factors potentially explaining software build breakages, focusing primarily on build-triggering code changes or previous CI outcomes. However, code quality factors such as the presence of code/test smells have not yet been evaluated in the context of CI, even though such factors have been linked to problems of comprehension and technical debt, and hence might introduce bugs and build breakages. This paper performs a conceptual replication study on 27,675 Travis CI builds of 15 GitHub projects, considering the features reported by Rausch et al. and Zolfagharinia et al., as well as those related to code/test smells. Using a multivariate model constructed from nine dimensions of features, results indicate a precision (recall) ranging between 58.3% and 79.0% (52.4% and 69.6%) in balanced project datasets, and between 2.5% and 37.5% (2.5% and 12.4%) in imbalanced project datasets. Models trained on the balanced project datasets were later used to perform cross-project prediction on the imbalanced projects, achieving an average improvement of 9.3% (16.2%) in precision (recall). Statistically, the results confirm that features from the build history, author, code complexity, and code/test smell dimensions are the most important predictors of build failures. © 2021 Elsevier Inc. All rights reserved.
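For readers less familiar with the two metrics reported in the abstract, the following is a minimal sketch of how precision and recall could be computed for the "failed build" class. It is not the authors' evaluation code. The function name `precision_recall` and the toy labels are hypothetical and chosen only for illustration; none of the numbers come from the paper.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for the 'failed build' class.

    actual, predicted: equal-length sequences of booleans,
    where True means the build failed.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # failures correctly flagged
    fp = sum(1 for a, p in pairs if not a and p)    # passing builds flagged as failures
    fn = sum(1 for a, p in pairs if a and not p)    # failures that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data (not from the paper): 2 failing builds out of 10,
# i.e. an imbalanced dataset where failures are rare.
actual = [True, True, False, False, False, False, False, False, False, False]
predicted = [True, False, True, True, False, False, False, False, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case a handful of false alarms is enough to pull precision well below recall, which illustrates one way rare failures can depress the scores on imbalanced datasets.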
