4.5 Article

Large-scale intent analysis for identifying large-review-effort code changes

Journal

INFORMATION AND SOFTWARE TECHNOLOGY
Volume 130, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.infsof.2020.106408

Keywords

Change intent analysis; Review effort; Machine learning

Ask authors/readers for more resources

This study leverages change intent to characterize and identify changes with a large review effort, exploring the feasibility of automatically classifying them. Results show that code changes with certain intents are more likely to require a large review effort, and machine learning prediction models can effectively identify such changes. The study suggests that considering change intent can significantly improve the performance of prediction models for large review effort changes.
Context: Code changes to software occur due to various reasons such as bug fixing, new feature addition, and code refactoring. Change intents have been studied for years to help developers understand the rationale behind code commits. However, in most existing studies, the intent of the change is rarely leveraged to provide more specific, context aware analysis. Objective: In this paper, we present the first study to leverage change intent to characterize and identify Large-Review-Effort (LRE) changes-changes with large review effort. Method: Specifically, we first propose a feedback-driven and heuristics-based approach to identify change intents of code changes. We then characterize the changes regarding review effort by using various features extracted from change metadata and the change intents. We further explore the feasibility of automatically classifying LRE changes. We conduct our study on four large-scale projects, one from Microsoft and three are open source projects, i.e., Qt, Android, and OpenStack. Results: Our results show that, (i) code changes with some intents (i.e., Feature and Refactor) are more likely to be LRE changes, (ii) machine learning based prediction models are applicable for identifying LRE changes, and (iii) prediction models built for code changes with some intents achieve better performance than prediction models without considering the change intent, the improvement in AUC can be up to 19 percentage points and is 7.4 percentage points on average. Conclusion: The change intent analysis and its application on LRE identification proposed in this study has already been used in Microsoft to provide the review effort and intent information of changes for reviewers to accelerate the review process. To show how to deploy our approaches in real-world practice, we report a case study of developing and deploying the intent analysis system in Microsoft. Moreover, we also evaluate the usefulness of our approaches by using a questionnaire survey. The feedback from developers demonstrate its practical value.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available