Article

Fuzzy rule-based oversampling technique for imbalanced and incomplete data learning

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 158, Pages 154-174

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2018.05.044

Keywords

Fuzzy rules; Imbalanced data; Missing values; Attribute correlation; Synthesize minority instances

Funding

  1. National Natural Science Foundation of China [61573266]
  2. Natural Science Basic Research Plan in Shaanxi Province [2017JQ1013]
  3. Fundamental Research Funds for the Central Universities [JB180705]

Abstract

Datasets with skewed class distributions are difficult for learning algorithms in pattern classification. A number of methods for dealing with this problem have been developed in recent years. In particular, synthetic oversampling techniques balance the distribution of training instances between the majority and minority classes by generating additional artificial minority class instances. Unfortunately, few of them can be extended to handle imbalanced data with missing values. Moreover, in most cases, existing oversampling methods do not make full use of the correlation between attributes. To this end, in this paper, we propose a fuzzy rule-based oversampling technique (FRO) to handle the class imbalance problem. FRO first creates fuzzy rules from the training data and assigns each of them a rule weight, which represents the certainty degree of an instance belonging to the fuzzy subspace. It then synthesizes new minority instances under the guidance of the fuzzy rules. The number of minority instances to be generated under a given fuzzy rule is determined by the rule weight. In a similar way, FRO can also recover the missing values that exist in the imbalanced dataset. Extensive experiments on 55 real-world imbalanced datasets evaluate the performance of the proposed FRO technique. The results show that our method is better than or comparable with a set of alternative state-of-the-art imbalanced classification algorithms in terms of various assessment metrics.
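
The abstract's central idea, that each fuzzy rule receives a share of the synthetic minority instances according to its rule weight, can be illustrated with a minimal Python sketch. This is not the authors' FRO implementation; everything in it is an assumption made for illustration. Attributes are assumed to be scaled to [0, 1], each attribute is assumed to use three evenly spaced triangular fuzzy sets, the rule weight is simplified to the summed compatibility of the minority instances covered by the rule, and new instances are produced by SMOTE-style interpolation between two minority instances under the same rule. The function names (`membership`, `antecedent`, `rule_weights`, `allocate`, `oversample`) are hypothetical.

```python
import random
from collections import defaultdict

N_SETS = 3  # assumed number of triangular fuzzy sets per attribute


def membership(x, k, n_sets=N_SETS):
    """Triangular membership of x (scaled to [0, 1]) in fuzzy set k."""
    width = 1.0 / (n_sets - 1)
    return max(0.0, 1.0 - abs(x - k * width) / width)


def antecedent(instance, n_sets=N_SETS):
    """For each attribute, pick the fuzzy set with the highest membership."""
    return tuple(
        max(range(n_sets), key=lambda k: membership(x, k, n_sets))
        for x in instance
    )


def rule_weights(minority, n_sets=N_SETS):
    """Group minority instances by rule antecedent.

    Assumed weight: the sum of the instances' compatibility degrees
    (product of per-attribute memberships). This is a simplification,
    not the weight definition used in the paper.
    """
    groups, weights = defaultdict(list), defaultdict(float)
    for inst in minority:
        rule = antecedent(inst, n_sets)
        compat = 1.0
        for x, k in zip(inst, rule):
            compat *= membership(x, k, n_sets)
        groups[rule].append(inst)
        weights[rule] += compat
    return groups, weights


def allocate(weights, total):
    """Split `total` across rules in proportion to weight (largest remainder)."""
    weight_sum = sum(weights.values())
    exact = {r: total * w / weight_sum for r, w in weights.items()}
    counts = {r: int(v) for r, v in exact.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - counts[r], reverse=True)
    for r in by_remainder[:leftover]:
        counts[r] += 1
    return counts


def oversample(minority, total, seed=0):
    """Generate `total` synthetic minority instances, rule by rule."""
    rng = random.Random(seed)
    groups, weights = rule_weights(minority)
    synthetic = []
    for rule, n_new in allocate(weights, total).items():
        for _ in range(n_new):
            # Swapped-in technique: interpolate between two instances of the
            # same rule, so the new point stays inside that fuzzy subspace.
            a, b = rng.choice(groups[rule]), rng.choice(groups[rule])
            t = rng.random()
            synthetic.append(tuple(p + t * (q - p) for p, q in zip(a, b)))
    return synthetic
```

Calling `oversample([(0.1, 0.1), (0.2, 0.05), (0.9, 0.95)], total=10)` returns ten synthetic tuples, with more of them drawn from the rule that covers the first two instances because its summed compatibility is higher. The sketch does not cover the missing-value recovery or the attribute-correlation handling that the abstract mentions.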
