Article

Why do builds fail? A conceptual replication study

Journal

JOURNAL OF SYSTEMS AND SOFTWARE
Volume 177

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jss.2021.110939

Keywords

Continuous integration; Build failure; Test smells; Code smells; Quantitative analysis; Cross-project prediction

This study investigates a wide range of factors that may explain software build breakages, drawing on features from dimensions such as build history, author, code complexity, and code/test smells. The results indicate that features from these dimensions are the most important predictors of build failures. The models reach a precision of 58.3% to 79.0% and a recall of 52.4% to 69.6% on balanced project datasets, but only 2.5% to 37.5% precision and 2.5% to 12.4% recall on imbalanced ones. Applying models trained on the balanced datasets to the imbalanced projects (cross-project prediction) improves precision by 9.3% and recall by 16.2% on average.
Previous studies have investigated a wide range of factors potentially explaining software build breakages, focusing primarily on build-triggering code changes or previous CI outcomes. However, code quality factors such as the presence of code/test smells have not yet been evaluated in the context of CI, even though such factors have been linked to problems of comprehension and technical debt, and hence might introduce bugs and build breakages. This paper performs a conceptual replication study on 27,675 Travis CI builds of 15 GitHub projects, considering the features reported by Rausch et al. and Zolfagharinia et al., as well as those related to code/test smells. Using a multivariate model constructed from nine dimensions of features, results indicate a precision (recall) ranging between 58.3% and 79.0% (52.4% and 69.6%) in balanced project datasets, and between 2.5% and 37.5% (2.5% and 12.4%) in imbalanced project datasets. Models trained on our balanced project datasets were later used to perform cross-project prediction on the imbalanced projects, achieving an average improvement of 9.3% (16.2%) in precision (recall). Statistically, the results confirm that features from the build history, author, code complexity, and code/test smell dimensions are the most important predictors of build failures. © 2021 Elsevier Inc. All rights reserved.
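
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how precision and recall are computed when "failed build" is treated as the positive class. It is not the authors' evaluation code: the function name `precision_recall`, the label strings, and the six toy builds are all hypothetical and chosen only for illustration.

```python
def precision_recall(actual, predicted, positive="failed"):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(actual, predicted))
    # True positives: failed builds that were predicted to fail.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False positives: passing builds that were predicted to fail.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # False negatives: failed builds that were predicted to pass.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: six builds, three of which actually failed.
actual = ["failed", "passed", "failed", "passed", "failed", "passed"]
predicted = ["failed", "failed", "passed", "passed", "failed", "passed"]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision is the share of predicted failures that really failed, and recall is the share of real failures that were predicted. When failed builds are rare, as in the imbalanced datasets, a handful of misclassifications can move both numbers sharply.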
