☆ 4.7 Article

Imputation of missing data with neural networks for classification

KNOWLEDGE-BASED SYSTEMS (2019)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 182, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2019.07.009

Keywords

Classification; Data imputation; Gradient decent; Missing attribute value; Neural network

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

We propose a mechanism to use data with missing values for designing classifiers which is different from predicting missing values for classification. Our imputation method uses an auto-encoder neural network. We make an innovative use of the training data without missing values to train the autoen-coder so that it is better equipped to predict missing values. It is a two-stage training scheme. Unlike most of the existing auto-encoder based methods which use a bottleneck layer for missing data handling, we justify and use a latent space of much higher dimension than that of the input. Now to design a classifier using a training set with missing values, we use the trained auto-encoder to predict missing values based on the hypothesis that a good choice for a missing value would be the one which can reconstruct itself via the auto-encoder. For this we make an initial guess of the missing value using the nearest neighbor rule and then refine the missing value minimizing the reconstruction error. We train several classifiers using the union of the imputed instances and the remaining training instances without missing values. We also train another classifier of the same type with the same configuration using the corresponding complete dataset. The performances of these classifiers are compared. We compare the proposed method with eight state-of-the-art imputation techniques using fourteen datasets and eight classification strategies. (C) 2019 Elsevier B.V. All rights reserved.

Imputation of missing data with neural networks for classification

Journal

KNOWLEDGE-BASED SYSTEMS

Publisher

ELSEVIER

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Imputation of missing data with neural networks for classification

Journal

KNOWLEDGE-BASED SYSTEMS

Publisher

ELSEVIER

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper