4.8 Article

Flexible High-Dimensional Unsupervised Learning with Missing Data

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2018.2885760

Keywords

Analytical models; Computational modeling; Data models; Unsupervised learning; Covariance matrices; Clustering algorithms; Mixture models; Clustering; factor analysis; generalized hyperbolic; missing data; mixture of factor analyzers; mixture model; model-based clustering; unsupervised classification

Funding

  1. Ontario Graduate Scholarship, Faculty of Science
  2. Canada Research Chairs programs

The mixture of factor analyzers (MFA) model is a well-known mixture model-based approach for unsupervised learning with high-dimensional data. It can be useful, inter alia, in situations where the data dimensionality far exceeds the number of observations. In recent years, the MFA model has been extended to non-Gaussian mixtures to account for clusters with heavier tail weight and/or asymmetry. The mixture of generalized hyperbolic factor analyzers (MGHFA) model is one such extension, which leads to a flexible modelling paradigm that accounts for both heavier tail weight and cluster asymmetry. In many practical applications, missing values complicate data analysis. A generalization of the MGHFA model is presented to accommodate missing values. Under a missing-at-random mechanism, we develop a computationally efficient alternating expectation conditional maximization algorithm for parameter estimation of the MGHFA model with different patterns of missing values. The imputation of missing values under an incomplete-data structure of the MGHFA model is also investigated. The performance of our proposed methodology is illustrated through the analysis of simulated and real data.
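
To make the imputation idea concrete, here is a minimal sketch of conditional-mean imputation for a single component with a factor-analytic covariance, Sigma = Lambda Lambda^T + Psi. It is not the paper's algorithm: it assumes a Gaussian component rather than a generalized hyperbolic one, treats the parameters as already known, and leaves out the mixture and the AECM updates altogether. The function name `conditional_mean_impute` and all numerical values are hypothetical, chosen only for illustration.

```python
import numpy as np

def conditional_mean_impute(x, mu, sigma):
    """Fill NaN entries of x with E[x_miss | x_obs] under N(mu, sigma).

    Simplifying assumption: a single Gaussian component with known
    parameters (the MGHFA model itself is non-Gaussian).
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    if not miss.any():
        return x.copy()
    if not obs.any():
        return mu.copy()  # nothing observed: fall back to the mean
    s_oo = sigma[np.ix_(obs, obs)]    # covariance of the observed block
    s_mo = sigma[np.ix_(miss, obs)]   # cross-covariance missing/observed
    filled = x.copy()
    # E[x_m | x_o] = mu_m + S_mo S_oo^{-1} (x_o - mu_o)
    filled[miss] = mu[miss] + s_mo @ np.linalg.solve(s_oo, x[obs] - mu[obs])
    return filled

# Hypothetical factor-analytic covariance with p = 5 variables, q = 2 factors.
rng = np.random.default_rng(0)
p, q = 5, 2
Lambda = rng.normal(size=(p, q))                  # factor loadings
Psi = np.diag(rng.uniform(0.5, 1.0, size=p))      # diagonal noise covariance
Sigma = Lambda @ Lambda.T + Psi
mu = np.zeros(p)

x = np.array([0.3, np.nan, -1.2, np.nan, 0.8])    # one observation, two values missing
x_filled = conditional_mean_impute(x, mu, Sigma)
```

Each observation can have its own missingness pattern, since the observed and missing blocks are selected per row. In a mixture setting, one would typically weight such component-wise conditional means by the posterior component probabilities.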
