Article

Hybrid-ensemble-based interpretable TSK fuzzy classifier for imbalanced data

Journal

INFORMATION FUSION
Volume 98

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2023.101845

Keywords

Imbalanced TSK fuzzy classifier; Ensemble learning; Hybrid ensemble structure; Generalization capability; Interpretability; Imbalanced residual sketch learning

This study proposes a novel hybrid-ensemble-based imbalanced interpretable TSK fuzzy classifier (HI-TSK-FC) that aims for enhanced generalization and better interpretability on imbalanced data. HI-TSK-FC integrates an imbalanced global linear regression sub-classifier (IGLRc) and several imbalanced TSK fuzzy sub-classifiers (I-TSK-FCs). Its training method, called imbalanced residual sketch learning (IRSL), is devised to share the virtues of both deep and wide learning.
Owing to its distinguished nonlinear mapping capability and interpretability, a Takagi-Sugeno-Kang (TSK) fuzzy classifier is often employed to achieve both enhanced generalization and better interpretability on imbalanced datasets. In this study, inspired by ensemble learning, a novel hybrid-ensemble-based imbalanced interpretable TSK fuzzy classifier (HI-TSK-FC) is proposed to integrate an imbalanced global linear regression sub-classifier (IGLRc) and several imbalanced TSK fuzzy sub-classifiers (I-TSK-FCs). Following both the human cognitive behavior of moving from the wholly coarse to the locally fine and the stacked generalization principle, the training method of HI-TSK-FC, called imbalanced residual sketch learning (IRSL), is further devised to share the virtues of both deep and wide learning. Concretely, IRSL first generates IGLRc by calling a newly proposed imbalanced global sparse-representation-based regression method (IGSR) on all training samples to achieve a wholly coarse result and to identify the nonlinearly distributed training samples in imbalanced datasets. Subsequently, the nonlinearly distributed training samples are partitioned by the proposed residual-based partition method (RPM) into several imbalanced residual sketches, which are themselves likely to be highly imbalanced. After that, several I-TSK-FCs are generated in parallel on the corresponding imbalanced residual sketches to achieve locally fine results. Finally, a new minimal-distance-based voting strategy is applied to the stacked outputs of IGLRc and all I-TSK-FCs to obtain the final output of HI-TSK-FC. In addition, the proposed minimal-distance-based voting strategy guarantees an interpretable and clear prediction route of HI-TSK-FC for each testing sample. Consequently, HI-TSK-FC offers both feature-importance-based and linguistic-based interpretability.
Both experimental and statistical results confirm that, in addition to feature-importance-based interpretability, HI-TSK-FC achieves at least comparable generalization capability, better linguistic interpretability (fewer adopted fuzzy rules and lower model complexity), and faster running speed on all adopted imbalanced datasets.
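
The coarse-to-fine training flow described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the abstract does not give the details of IGSR, RPM, the I-TSK-FCs, or the voting rule, so every component below is a simplified stand-in chosen for illustration. A class-weighted ridge regression plays the role of IGLRc, sorting hard samples by residual and splitting them into equal chunks plays the role of RPM, local class-weighted linear models replace the fuzzy sub-classifiers, and nearest-center routing is one plausible reading of "minimal-distance-based voting". All function names, the margin threshold, and the synthetic data are assumptions.

```python
import numpy as np

def fit_weighted_ridge(X, y, lam=1e-2):
    """Class-weighted ridge regression onto labels in {-1, +1}.
    Hypothetical stand-in for IGLRc; the paper's IGSR is not reproduced here."""
    Xb = np.hstack([X, np.ones((len(X), 1))])          # append a bias column
    n_pos = max(int((y == 1).sum()), 1)
    n_neg = max(int((y == -1).sum()), 1)
    # Each class receives the same total weight, whatever its size.
    w = np.where(y == 1, len(y) / (2.0 * n_pos), len(y) / (2.0 * n_neg))
    A = Xb.T @ (w[:, None] * Xb) + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ (w * y))

def linear_score(beta, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ beta

def fit_coarse_to_fine(X, y, n_sketches=2, margin=0.5):
    """Global coarse model first, then local models on residual 'sketches'."""
    beta0 = fit_weighted_ridge(X, y)                    # wholly coarse result
    residual = np.abs(y - linear_score(beta0, X))
    hard = np.flatnonzero(residual > margin)            # poorly fit samples
    easy = np.flatnonzero(residual <= margin)
    models = [beta0]
    centers = [X[easy].mean(axis=0) if len(easy) else X.mean(axis=0)]
    if len(hard) >= n_sketches:
        # Assumed partition rule: sort by residual, split into equal chunks.
        order = hard[np.argsort(residual[hard])]
        for idx in np.array_split(order, n_sketches):
            # Local linear model as a stand-in for an I-TSK-FC.
            models.append(fit_weighted_ridge(X[idx], y[idx]))
            centers.append(X[idx].mean(axis=0))
    return models, np.vstack(centers)

def predict_coarse_to_fine(models, centers, X):
    """Route each sample to the sub-model whose center is nearest."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    route = dist.argmin(axis=1)                         # index of chosen sub-model
    scores = np.stack([linear_score(b, X) for b in models], axis=1)
    chosen = scores[np.arange(len(X)), route]
    return np.where(chosen >= 0, 1, -1), route

# Synthetic imbalanced data (assumption): 180 majority vs. 20 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(180, 2)),
               rng.normal(2.0, 1.0, size=(20, 2))])
y = np.concatenate([-np.ones(180), np.ones(20)])

models, centers = fit_coarse_to_fine(X, y)
pred, route = predict_coarse_to_fine(models, centers, X)
```

The returned `route` array records which sub-model produced each prediction, which loosely mirrors the "clear prediction route" mentioned in the abstract. In this sketch it is only an index, not a set of linguistic fuzzy rules.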
