4.6 Article

Software defect prediction using hybrid model (CBIL) of convolutional neural network (CNN) and bidirectional long short-term memory (Bi-LSTM)

Journal

PEERJ COMPUTER SCIENCE
Volume 7

Publisher

PEERJ INC
DOI: 10.7717/peerj-cs.739

Keywords

Defect; Software defect prediction; Abstract syntax tree; Machine learning; Deep learning; Convolutional neural network; Bidirectional long short-term memory


The study introduces CBIL, a novel software defect prediction model that extracts Abstract Syntax Tree tokens as vectors from source code and applies deep learning to improve prediction accuracy, showing significant improvements over traditional models.
In recent years, the software industry has invested substantial effort in improving software quality. Applying proactive software defect prediction helps developers and white-box testers find defects earlier, reducing time and effort. Traditional software defect prediction models concentrate on traditional features of source code, such as code complexity and lines of code; however, these features fail to capture the semantics of source code. In this research, we propose a hybrid model called CBIL that predicts the defective areas of source code. CBIL extracts Abstract Syntax Tree (AST) tokens from source code as integer vectors; mapping and word embedding then turn these integer vectors into dense vectors. A Convolutional Neural Network (CNN) extracts the semantics of the AST tokens, and a Bidirectional Long Short-Term Memory (Bi-LSTM) network retains the key features while discarding the rest, enhancing the accuracy of software defect prediction. CBIL is evaluated on seven open-source Java projects from the PROMISE dataset using two evaluation metrics: F-measure and area under the curve (AUC). The results show that CBIL improves the average F-measure by 25% over CNN, the best-performing baseline model on that metric, and improves the average AUC by 18% over the Recurrent Neural Network (RNN), the best-performing baseline model on that metric.
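The pipeline described in the abstract (integer AST-token vectors → word embedding → CNN → Bi-LSTM → defect probability) can be sketched as a small Keras model. This is a minimal illustration only: the vocabulary size, sequence length, embedding dimension, filter counts, and LSTM width below are assumptions for demonstration, not the configuration reported in the paper.

```python
# Minimal sketch of a CNN + Bi-LSTM defect-prediction model in the spirit of CBIL.
# All hyperparameters here are assumed for illustration.
from tensorflow.keras import layers, models

VOCAB_SIZE = 10_000   # assumed size of the AST-token vocabulary
SEQ_LEN = 2_000       # assumed maximum number of AST tokens per source file
EMBED_DIM = 30        # assumed embedding dimension

def build_cbil_like_model():
    # Input: one integer AST-token vector per source file
    inputs = layers.Input(shape=(SEQ_LEN,), dtype="int32")
    # Mapping + word embedding: integer vectors -> dense vectors
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
    # CNN extracts local semantic patterns from the token sequence
    x = layers.Conv1D(filters=64, kernel_size=5, activation="relu")(x)
    x = layers.MaxPooling1D(pool_size=2)(x)
    # Bi-LSTM keeps key sequential features in both directions
    x = layers.Bidirectional(layers.LSTM(64))(x)
    # Output: probability that the file is defective
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return models.Model(inputs, outputs)

model = build_cbil_like_model()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.summary()
```

Training on labeled instances from the PROMISE projects and computing F-measure and AUC on a held-out set would follow the standard fit/evaluate workflow; the paper's exact training settings are not reproduced here.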
