Article

Understanding CNN fragility when learning with imbalanced data

Journal

MACHINE LEARNING

Publisher

SPRINGER
DOI: 10.1007/s10994-023-06326-9

Keywords

Machine learning; Deep learning; Class imbalance; Computer vision; CNN fragility

Abstract

Convolutional neural networks (CNNs) struggle to generalize to minority classes on imbalanced image data, and their decision-making is opaque. This study examines the latent features of CNNs to demystify their decisions. The class top-K classification and feature embeddings (CE and FE) are found to contain important information about a CNN's ability to generalize to minority classes. The work also highlights the importance of diversifying class latent features when developing effective methods for imbalanced learning.
Convolutional neural networks (CNNs) have achieved impressive results on imbalanced image data, but they still have difficulty generalizing to minority classes, and their decisions are difficult to interpret. These problems are related, because the process by which CNNs generalize to minority classes, which requires improvement, is wrapped in a black box. To demystify CNN decisions on imbalanced data, we focus on their latent features. Although CNNs embed the pattern knowledge learned from a training set in model parameters, the effect of this knowledge is contained in feature and classification embeddings (FE and CE). These embeddings can be extracted from a trained model, and their global, class-level properties (e.g., frequency, magnitude, and identity) can be analyzed. We find that important information regarding the ability of a neural network to generalize to minority classes resides in the class top-K CE and FE. We show that a CNN learns a limited number of class top-K CE per category, and that their magnitudes vary depending on whether the same class is balanced or imbalanced. We hypothesize that latent class diversity is as important as the number of class examples, which has important implications for re-sampling and cost-sensitive methods. These methods generally focus on rebalancing model weights, class counts, and margins, rather than on diversifying class latent features. We also demonstrate that a CNN has difficulty generalizing to test data if the magnitudes of its top-K latent features do not match those of the training set. We use three popular image datasets and two cost-sensitive algorithms commonly employed in imbalanced learning for our experiments.
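The abstract describes extracting embeddings from a trained model and analyzing their per-class top-K properties (frequency and magnitude). The following is a minimal NumPy sketch of that idea, not the authors' exact procedure: given a matrix of penultimate-layer feature embeddings and class labels, it ranks feature dimensions by mean activation magnitude within each class and returns each class's top-K dimensions. The `class_top_k` function name and the toy data are illustrative assumptions.

```python
import numpy as np

def class_top_k(features, labels, k=5):
    """For each class, return the indices of the k feature dimensions
    with the largest mean activation magnitude. This is a rough proxy
    for the paper's class top-K embedding analysis (illustrative only)."""
    top = {}
    for c in np.unique(labels):
        # Mean absolute activation per dimension, over this class's examples.
        mean_mag = np.abs(features[labels == c]).mean(axis=0)
        # Indices of the k largest dimensions, descending by magnitude.
        top[c] = np.argsort(mean_mag)[::-1][:k]
    return top

# Toy data: 64-dim embeddings for a majority class (100 examples)
# and a minority class (20 examples), each relying on different dims.
rng = np.random.default_rng(0)
feats = rng.normal(size=(120, 64))
labels = np.array([0] * 100 + [1] * 20)
feats[labels == 0, :3] += 5.0    # class 0 activates dims 0-2 strongly
feats[labels == 1, 60:] += 5.0   # class 1 activates dims 60-63 strongly

top = class_top_k(feats, labels, k=3)
print(top[0], top[1])
```

In a real experiment the `features` matrix would come from a forward pass of the trained CNN (e.g., activations captured at the penultimate layer), and the same statistic could be compared between balanced and imbalanced versions of a class, as the abstract suggests.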
