Article

A transfer cost-sensitive boosting approach for cross-project defect prediction

Journal

SOFTWARE QUALITY JOURNAL
Volume 25, Issue 1, Pages 235-272

Publisher

SPRINGER
DOI: 10.1007/s11219-015-9287-1

Keywords

Boosting; Class imbalance; Cost-sensitive learning; Cross-project defect prediction; Software defect prediction; Transfer learning

Funding

  1. National Research Foundation of Korea (NRF) - Korea government (Ministry of Science, ICT and Future Planning (MSIP)) [NRF-2013R1A1A2006985]
  2. Institute for Information & communications Technology Promotion (IITP) - Korea government (MSIP) [R0101-15-0144]
  3. Ministry of Public Safety & Security (MPSS), Republic of Korea [R0101-15-0144] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Software defect prediction is regarded as a crucial task for improving software quality because it allows valuable resources to be allocated effectively to fault-prone modules. Building a predictor requires a sufficient set of historical data. When a company lacks such data, cross-project defect prediction (CPDP) can be employed, in which data from other companies are used to build predictors. In such cases, a transfer learning technique, which extracts common knowledge from source projects and transfers it to a target project, can be used to enhance prediction performance. Defect data also exhibit the class imbalance problem, which makes it difficult for the learner to predict defects, yet the main impacts of imbalanced data under cross-project settings have not been investigated in depth. We propose a transfer cost-sensitive boosting method that considers both knowledge transfer and class imbalance for CPDP when a small amount of labeled target data is available. The proposed approach performs boosting that assigns weights to the training instances according to both their distributional characteristics and the class imbalance. Through comparative experiments against transfer learning and class imbalance learning techniques, we show that the proposed model provides significantly higher defect detection accuracy while retaining better overall performance. These results indicate that combining transfer learning with class imbalance learning is highly effective for improving prediction performance under cross-project settings. The proposed approach will help in designing effective prediction models for CPDP, and the improved defect prediction performance could help direct software quality assurance activities and reduce costs, so that software quality can be managed effectively.
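The abstract does not give the actual weight-update rule, so the following is only a minimal sketch of the general idea it describes: one boosting round that reweights training instances using both their origin (source project or target project) and their class (defective or clean). The function name `reweight`, the parameters `beta_source`, `beta_target`, and `defect_cost`, and the specific multiplicative rule are all assumptions made for illustration. They are not taken from the paper.

```python
def reweight(weights, is_source, is_defective, misclassified,
             beta_source=0.5, beta_target=0.5, defect_cost=2.0):
    """One illustrative reweighting round (hypothetical rule, not the paper's).

    Assumed rule:
      - a misclassified source instance is treated as dissimilar to the
        target project, so its weight shrinks (multiplied by beta_source < 1);
      - a misclassified target instance is treated as hard, so its weight
        grows (divided by beta_target < 1);
      - a misclassified defective instance is additionally multiplied by
        defect_cost > 1 to counter class imbalance.
    Correctly classified instances keep their weight before normalization.
    """
    updated = []
    for w, src, defect, wrong in zip(weights, is_source,
                                     is_defective, misclassified):
        if wrong:
            w *= beta_source if src else 1.0 / beta_target
            if defect:
                w *= defect_cost
        updated.append(w)
    total = sum(updated)
    # Normalize so the weights again form a distribution.
    return [w / total for w in updated]


# Four instances, all misclassified in this round:
# source/clean, source/defective, target/clean, target/defective.
new_weights = reweight(
    weights=[0.25, 0.25, 0.25, 0.25],
    is_source=[True, True, False, False],
    is_defective=[False, True, False, True],
    misclassified=[True, True, True, True],
)
```

Under these assumed settings, a misclassified defective target instance ends up with the largest weight and a misclassified clean source instance with the smallest, which is the qualitative behavior the abstract attributes to combining knowledge transfer with cost-sensitive learning.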
