Article

Empirical investigation of hyperparameter optimization for software defect count prediction

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 191

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.116217

Keywords

Defect count prediction; Hyperparameter tuning; Machine learning


The study found that optimizing the hyperparameters of learning techniques significantly improves defect count prediction performance and changes the ranking of the techniques, and that grid search optimization generally outperforms random search optimization relative to the default parameters.
Prior identification of defects in software modules can help testers allocate limited resources efficiently. Defect prediction techniques are helpful in this situation because they allow testers to identify and focus on defect-prone parts of the software. The regression approach is a machine learning technique used to predict the defect count in software segments. These regression techniques are more effective if their hyperparameters are adjusted; however, few studies are available on hyperparameter optimization of regression techniques. In this paper, we investigated the impact of hyperparameter optimization on defect count prediction. In an empirical analysis on 15 software defect datasets, we find that hyperparameter optimization of learning techniques: (1) improves the prediction performance for MLPR, Lasso, DTR, Huber, and SVR by 16.96%, 8.31%, 8.16%, 6.01%, and 5.22%, respectively; (2) linear regression is not optimization-sensitive; (3) overall, grid search optimization improved the prediction performance by 4.42%, while random search optimization improved it by 3.36%; (4) non-significant classifiers have also changed their ranking substantially; and (5) logistic regression obtained the highest ranking under hyperparameter optimization. While both random and grid search performed well, grid search always obtained better outcomes than the default parameters, whereas random search did not always do so. This emphasizes the importance of exploring the parameter space when using parameter-sensitive regression techniques.
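The grid-versus-random search comparison in the abstract can be sketched with a minimal, self-contained example. This is an illustrative toy (a one-dimensional ridge regression on made-up data, not the paper's models or datasets); all names, data values, and the search ranges are assumptions for demonstration only.

```python
import random

# Hypothetical toy data (illustrative only, not from the paper's experiments):
# validation performance decides which hyperparameter value "wins".
train = [(0.5, 1.1), (1.0, 2.0), (1.5, 3.1), (2.0, 3.9), (2.5, 5.2)]
valid = [(0.75, 1.5), (1.25, 2.5), (1.75, 3.5)]

def fit_ridge(data, alpha):
    """Closed-form 1-D ridge regression: w = sum(x*y) / (sum(x*x) + alpha)."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + alpha)

def val_mse(w, data):
    """Mean squared error of the model y = w*x on a held-out set."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grid_search(alphas):
    """Evaluate every candidate on a fixed grid; return the best alpha."""
    return min(alphas, key=lambda a: val_mse(fit_ridge(train, a), valid))

def random_search(low, high, n_iter, seed=0):
    """Sample candidates uniformly at random; return the best alpha found."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(n_iter)]
    return min(candidates, key=lambda a: val_mse(fit_ridge(train, a), valid))

# Including the default value (alpha = 0.0) in the grid guarantees grid search
# never scores worse than the default on validation data, mirroring the
# abstract's observation; random search carries no such guarantee.
best_grid = grid_search([0.0, 0.01, 0.1, 1.0, 10.0])
best_rand = random_search(0.0, 10.0, n_iter=20)
```

Because the default setting can always be a grid point, grid search is lower-bounded by the default's validation score, while random search may sample only worse configurations in a small budget.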
