Article

Heterogeneous fault prediction with cost-sensitive domain adaptation

Journal

Publisher

WILEY
DOI: 10.1002/stvr.1658

Keywords

cost-sensitive learning; class imbalance; heterogeneous domain adaptation; heterogeneous fault prediction; mixed project; software quality assurance

Funding

  1. NSFC-Key Project of General Technology Fundamental Research United Fund [U1736211]
  2. National Key Research and Development Program of China [2017YFB0202001]
  3. National Natural Science Foundation of China [61672208, 41571417]
  4. Program of State Key Laboratory of Software Engineering [SKLSE-1216-14]
  5. Science and Technology Program in Henan province [1721102410064]
  6. Science and Technique Development Program of Henan [172102210186]
  7. Research Foundation of Henan University [2015YBZR024]

In the early phases of software testing, a project may have only limited historical defect data. Learning a prediction model from such insufficient training data limits the efficacy of the learned predictor. In practice, many fault prediction datasets are publicly available, and heterogeneous fault prediction (HFP), which uses source projects whose metrics differ from those of the target, has recently been proposed to exploit them. However, existing HFP models do not investigate how to use mixed project data (within-project and cross-project data together) to predict faults in the target project. Furthermore, defect data are often imbalanced. An imbalanced class distribution in the source data usually leads to serious misclassification of fault-prone instances, which degrades the predictor's performance, yet existing HFP methods do not consider the class imbalance problem in the training stage. In this paper, we propose a novel Cost-sensitive Label and Structure-consistent Unilateral Projection (CLSUP) approach for HFP. CLSUP not only makes better use of the within-project and cross-project data but also alleviates the class imbalance problem by setting different misclassification costs for fault-prone and non-fault-prone instances. Extensive experiments on 30 projects demonstrate the effectiveness of CLSUP.
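
The idea of assigning different misclassification costs to the two classes can be illustrated with a generic cost-sensitive decision rule. The sketch below is not the CLSUP method, whose details are not given in the abstract; it only shows how a higher cost for missing a fault-prone instance lowers the probability threshold at which an instance is flagged. The function names and the cost values (5.0 and 1.0) are assumptions chosen for illustration.

```python
def cost_sensitive_predict(p_fault, cost_fn=5.0, cost_fp=1.0):
    """Return 1 (fault-prone) or 0 (non-fault-prone) by minimising expected cost.

    p_fault : estimated probability that the instance is fault-prone
    cost_fn : assumed cost of missing a fault-prone instance (false negative)
    cost_fp : assumed cost of flagging a clean instance (false positive)
    """
    # Expected cost if we label the instance non-fault-prone: we pay cost_fn
    # whenever it is actually fault-prone.
    expected_cost_if_clean = p_fault * cost_fn
    # Expected cost if we label it fault-prone: we pay cost_fp whenever it is
    # actually clean.
    expected_cost_if_faulty = (1.0 - p_fault) * cost_fp
    return 1 if expected_cost_if_clean > expected_cost_if_faulty else 0


def decision_threshold(cost_fn=5.0, cost_fp=1.0):
    """Probability above which the rule above predicts fault-prone."""
    return cost_fp / (cost_fp + cost_fn)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.6):
        print(p, cost_sensitive_predict(p), cost_sensitive_predict(p, 1.0, 1.0))
    print("threshold:", decision_threshold())
```

With equal costs the rule reduces to the usual 0.5 threshold; raising the false-negative cost shifts the threshold down, so more minority-class (fault-prone) instances are caught at the price of more false alarms.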
