Article

GEV-NN: A deep neural network architecture for class imbalance problem in binary classification

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 194

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2020.105534

Keywords

Neural networks; Auto-encoder; Gumbel distribution; Imbalanced classification

Funding

  1. Basic Science Research Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT & Future Planning, Republic of Korea [2017R1A2B4010826, 2019K2A9A2A06020672]

Class imbalance is a common issue in many applications, such as medical diagnosis, fraud detection, and web advertising. Although standard deep learning methods have achieved remarkably high performance on datasets with balanced classes, their ability to classify imbalanced datasets is still limited. This paper proposes a novel end-to-end deep neural network architecture and adopts the Gumbel distribution as an activation function in neural networks to address the class imbalance problem in binary classification. The proposed architecture, named GEV-NN, consists of three components: the first scores the input variables to determine a suitable set of inputs; the second is an auto-encoder that learns efficient explanatory features for the minority class; and the third combines the scored inputs with the extracted features to make the final prediction. The three components are optimized jointly through end-to-end training. Extensive experiments on real-world imbalanced datasets showed that GEV-NN significantly outperforms state-of-the-art baselines, by up to around 2%. In addition, GEV-NN makes variable importance interpretable. Using its first component, we identify key risk factors for hypertension that are consistent with other scientific research. (C) 2020 Elsevier B.V. All rights reserved.
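The abstract does not spell out the form of the Gumbel activation. As a minimal sketch, assuming it is the cumulative distribution function of the standard Gumbel distribution, F(x) = exp(-exp(-x)), the Python snippet below compares it with the logistic sigmoid. The function names `gumbel_activation` and `sigmoid` are illustrative and not taken from the paper. The point of the comparison is that the Gumbel curve is asymmetric around zero, which is one plausible reason it might suit a rare positive class.

```python
import math


def gumbel_activation(x: float) -> float:
    """Standard Gumbel CDF, exp(-exp(-x)); assumed form of the activation."""
    # Cap the inner exponent so math.exp does not overflow for very negative x.
    return math.exp(-math.exp(min(-x, 700.0)))


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid, shown for comparison."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    # Print both activations side by side on a few sample inputs.
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        g = gumbel_activation(x)
        s = sigmoid(x)
        print(f"x={x:+.1f}  gumbel={g:.4f}  sigmoid={s:.4f}")
```

Unlike the sigmoid, which satisfies s(x) + s(-x) = 1, this function approaches 0 much faster on the negative side than it approaches 1 on the positive side, and its value at x = 0 is exp(-1) rather than 0.5.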
