4.3 Article

A simulation comparison of imputation methods for quantitative data in the presence of multiple data patterns

Journal

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
Volume 88, Issue 18, Pages 3588-3619

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/00949655.2018.1530773

Keywords

Forward imputation; iterative principal component analysis; Mahalanobis distance; missForest; missing data; Monte Carlo simulation; multivariate exponential power distribution; multivariate skew-normal distribution; nearest-neighbour imputation

Ask authors/readers for more resources

An extensive investigation via simulation is carried out with the aim of comparing three nonparametric, single imputation methods in the presence of multiple data patterns. The ultimate goal is to provide useful hints for users needing to quickly pick the most effective imputation method among the following: Forward Imputation (Forlmp), considered in the two variants of with the principal component analysis (PCA), which alternates the use of PCA and the Nearest-Neighbour Imputation (NNI) method in a forward, sequential procedure, and with the Mahalanobis distance, which involves the use of the Mahalanobis distance when performing NNI; the iterative PCA technique, which imputes missing values simultaneously via PCA; the method, which is based on random forests and is developed for mixed-type data. The performance of these methods is compared under several data patterns characterized by different levels of kurtosis or skewness and correlation structures.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.3
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available