
PCA model building with missing data: New proposals and a comparative study

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
Volume 146, Pages 77-88

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2015.05.006

Keywords

Missing data; PCA model building; PCA model exploitation

Funding

  1. Spanish Ministry of Science and Innovation
  2. FEDER funds from the European Union [DPI2011-28112-C04-02]
  3. Spanish Ministry of Economy and Competitiveness [ECO2013-43353-R]


This paper introduces new methods for building principal component analysis (PCA) models with missing data: projection to the model plane (PMP), known data regression (KDR), KDR with principal component regression (PCR), KDR with partial least squares regression (PLS), and trimmed scores regression (TSR). These methods are adapted from their PCA model exploitation versions to address the more general problem of PCA model building when the training set has missing values. A comparative study evaluates the new methods against standard ones, such as the modified nonlinear iterative partial least squares (NIPALS) algorithm, the iterative algorithm (IA), the data augmentation (DA) method, and the nonlinear programming (NIP) approach. Performance is assessed using the mean squared prediction error of the reconstructed matrix and the cosines between the actual principal components and those extracted by each method. The comparison uses four data sets, two simulated and two real, with several percentages of missing data. © 2015 Elsevier B.V. All rights reserved.
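To make the setting concrete, the following is a minimal sketch of a generic iterative-imputation PCA fit together with the two performance measures the abstract names (mean squared prediction error and cosines between principal components). It is an illustration under stated assumptions, not the paper's implementation: the function names (`iterative_pca_impute`, `mspe`, `loading_cosines`), the column-mean initialization, the stopping rule, and the synthetic data are all hypothetical choices made here, and the paper's own IA, TSR, and KDR variants may differ in their details.

```python
import numpy as np


def iterative_pca_impute(X, n_components, max_iter=500, tol=1e-10):
    """Hypothetical helper: fit PCA on data with NaNs by repeated imputation."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumption: start from column means computed over the observed entries.
    X_hat = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(max_iter):
        mean = X_hat.mean(axis=0)
        Xc = X_hat - mean
        # Loadings are the leading right singular vectors of the centered data.
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        P = Vt[:n_components].T
        # Reconstruct from the PCA model and refresh only the missing entries.
        recon = Xc @ P @ P.T + mean
        X_new = np.where(missing, recon, X)
        change = np.sum((X_new - X_hat) ** 2)
        X_hat = X_new
        if change < tol:
            break
    return X_hat, P


def mspe(X_true, X_rec):
    """Mean squared prediction error between two arrays of equal shape."""
    return float(np.mean((X_true - X_rec) ** 2))


def loading_cosines(P_true, P_est):
    """Absolute cosine between matching columns (the sign of a PC is arbitrary)."""
    num = np.abs(np.sum(P_true * P_est, axis=0))
    den = np.linalg.norm(P_true, axis=0) * np.linalg.norm(P_est, axis=0)
    return num / den


# Synthetic rank-2 data with small noise (illustrative only, not from the paper).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
X_full += 0.01 * rng.normal(size=X_full.shape)

# Remove about 5% of the entries completely at random.
mask = rng.random(X_full.shape) < 0.05
X_obs = np.where(mask, np.nan, X_full)

X_hat, P_est = iterative_pca_impute(X_obs, n_components=2)

# Reference loadings from the complete data, for the cosine comparison.
_, _, Vt_full = np.linalg.svd(X_full - X_full.mean(axis=0), full_matrices=False)
P_true = Vt_full[:2].T

# Errors on the removed entries: iterative imputation vs. plain mean imputation.
X_mean_fill = np.where(mask, np.nanmean(X_obs, axis=0), X_obs)
err_ia = mspe(X_full[mask], X_hat[mask])
err_mean = mspe(X_full[mask], X_mean_fill[mask])
cos = loading_cosines(P_true, P_est)
```

Cosines near 1 indicate that the estimated components point in nearly the same direction as those obtained from the complete data. Comparing `err_ia` with `err_mean` gives a rough sense of how much the model-based imputation helps on this toy example.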
