Software reuse analytics using integrated random forest and gradient boosting machine learning algorithm

SOFTWARE-PRACTICE & EXPERIENCE (2021)

Journal

SOFTWARE-PRACTICE & EXPERIENCE

Volume 51, Issue 4, Pages 735-747

Publisher

WILEY

DOI: 10.1002/spe.2921

Keywords

AdaBoostM1; confusion matrix; DecisionStump; gradient boosting machine; J48; JRip; LMT; LogitBoost; one R; part; random forest; software metrics; software reuse

Automated Summary

Cleaner Production is crucial for achieving sustainable production in companies, and software reuse is pivotal for software enterprises. This paper introduces a new machine learning algorithm (RFGBM) to test the reusability of given software code, outperforming existing algorithms in performance parameters.

Abstract

Cleaner Production (CP) is regarded as influential in achieving sustainable production for production companies. CP mainly deals with the three R's: reuse, reduce, and recycle. For software enterprises, software reuse plays a pivotal role. Software reuse is the process of producing new products or software from existing software by updating it. Data mining is used to extract useful information from existing software. The algorithms currently used for software reuse face issues related to maintenance cost, accuracy, and performance, and they do not give accurate results on whether a software component can be reused. Machine learning gives the best results for predicting whether a given software component is reusable. This paper introduces an integrated Random Forest and Gradient Boosting Machine learning algorithm (RFGBM), which tests the reusability of given software code by considering object-oriented parameters such as cohesion, coupling, cyclomatic complexity, bugs, number of children, and depth of inheritance tree. The proposed algorithm is then compared with the J48, AdaBoostM1, LogitBoost, Part, One R, LMT, JRip, and DecisionStump algorithms. Performance metrics such as accuracy, error rate, Relative Absolute Error, and Mean Absolute Error are improved using RFGBM. The algorithm also applies data preprocessing with an unsupervised filter that removes missing values to improve efficiency. The proposed algorithm outperforms the existing algorithms in terms of these performance parameters.
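The performance metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary reusable/not-reusable label and a predicted probability of "reusable" per component, and it uses the common textbook definitions of Mean Absolute Error and Relative Absolute Error (error relative to always predicting the mean label). The abstract does not state how the paper computes these values, so the function name `classification_metrics`, the 0.5 threshold, and the sample data are all hypothetical.

```python
from typing import Dict, Sequence


def classification_metrics(
    actual: Sequence[int],
    prob_reusable: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, float]:
    """Accuracy, error rate, MAE and RAE for a binary reusability task."""
    if len(actual) == 0 or len(actual) != len(prob_reusable):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    # Turn probabilities into hard 0/1 predictions.
    predicted = [1 if p >= threshold else 0 for p in prob_reusable]

    # Accuracy is the share of correct labels; error rate is its complement.
    correct = sum(1 for a, y in zip(actual, predicted) if a == y)
    accuracy = correct / n

    # MAE: mean absolute gap between predicted probability and true label.
    mae = sum(abs(p - a) for a, p in zip(actual, prob_reusable)) / n

    # RAE: MAE divided by the MAE of a baseline that always predicts
    # the mean of the true labels.
    mean_actual = sum(actual) / n
    baseline = sum(abs(mean_actual - a) for a in actual) / n
    rae = mae / baseline if baseline > 0 else float("nan")

    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "mae": mae,
        "rae": rae,
    }


# Made-up example: four components, three of them truly reusable.
metrics = classification_metrics([1, 0, 1, 1], [0.9, 0.2, 0.4, 0.8])
```

A lower MAE and RAE together with a higher accuracy is the sense in which one classifier "improves" on another under these definitions; an RAE below 1 means the model beats the mean-label baseline.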
