Lifting the limitations of Gaussian mixture regression through coupling with principal component analysis and deep autoencoding

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2021.104437

Keywords

Direct inverse analysis; Gaussian mixture regression (GMR); Principal component analysis (PCA); Deep autoencoding (DAE); Material design; Generative topographic mapping regression (GTMR); Gaussian mixture modeling (GMM); Root-mean-squared error (RMSE)

Funding

  1. Japan Society for the Promotion of Science [19K15352, 20H02553, 20H04538]
  2. Center of Innovation (COI) program from Japan Science and Technology Agency [R1WD12]
  3. Grants-in-Aid for Scientific Research [20H04538, 19K15352, 20H02553] Funding Source: KAKEN

The study introduces a modeling approach that transforms the explanatory variables into latent variables by PCA or DAE and then applies Gaussian mixture regression to the latent and objective variables, which keeps direct inverse analysis possible. The resulting PCA-GMR and DAE-GMR models outperform traditional GMR on all tested datasets, reducing prediction errors by up to 63%.

The mathematical modeling of correlations between target properties and the factors that influence them, particularly modeling that allows inverse analysis, is an essential part of molecular, material, and process design. In contrast to approaches employing pseudo-inverse analysis, Gaussian mixture regression (GMR), which assumes that the relationships between variables can be represented as a mixture of Gaussian distributions, allows for direct inverse analysis. However, because this model optimizes the means and variance-covariance matrices of all variables, parameter estimation becomes increasingly difficult as the number of variables grows. Herein, this drawback is addressed by transforming the explanatory variables X into latent variables Z before GMR modeling. As the inverse (Z to X) transformation is necessary for direct inverse analysis, principal component analysis (PCA) and deep autoencoding (DAE) are employed as dimensionality reduction methods. After X is transformed to Z by PCA or DAE, a GMR model is constructed with Z and the objective variables Y; the proposed method is accordingly denoted PCA-GMR or DAE-GMR, respectively. Because Z values can be predicted by inputting Y values into the GMR model and then transformed back to X values, direct inverse analysis remains possible. Given that unlabeled data can be used to construct the PCA and DAE models, the proposed methods can also serve as semi-supervised learning techniques. The predictive abilities of PCA-GMR and DAE-GMR are verified using molecular, material, and spectral datasets and surpass those of traditional GMR on all datasets, with a maximum reduction in prediction error of 63%.
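
The PCA-GMR workflow described above (X to Z by PCA, a Gaussian mixture over Z and Y, then Y to Z to X for direct inverse analysis) can be illustrated with a minimal sketch. It is not the authors' implementation: the use of NumPy and scikit-learn, the helper names `fit_pca_gmr`, `gaussian_logpdf`, and `conditional_mean`, the synthetic data, and the choices of two latent variables and three mixture components are all assumptions made for illustration. The prediction used here is the mixture-weighted conditional mean, which is one common way to obtain a GMR estimate.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def fit_pca_gmr(X_all, X_labeled, Y_labeled, n_latent=2, n_components=3, seed=0):
    """Fit PCA on all X (labeled or not), then a GMM on the joint [Z, Y]."""
    pca = PCA(n_components=n_latent).fit(X_all)
    Z = pca.transform(X_labeled)
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          random_state=seed).fit(np.hstack([Z, Y_labeled]))
    return pca, gmm


def gaussian_logpdf(v, mean, cov):
    """Row-wise log density of a multivariate normal."""
    d = v - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ni->n", d, np.linalg.solve(cov, d.T).T)
    return -0.5 * (maha + logdet + mean.size * np.log(2 * np.pi))


def conditional_mean(gmm, given, given_idx, target_idx):
    """Mixture-weighted E[target | given] under the fitted joint GMM."""
    given = np.atleast_2d(given)
    log_w = np.empty((given.shape[0], gmm.n_components))
    cond = np.empty((gmm.n_components, given.shape[0], len(target_idx)))
    for k in range(gmm.n_components):
        mu, S = gmm.means_[k], gmm.covariances_[k]
        S_gg = S[np.ix_(given_idx, given_idx)]
        S_tg = S[np.ix_(target_idx, given_idx)]
        # Unnormalized log responsibility of component k for each given row.
        log_w[:, k] = np.log(gmm.weights_[k]) + gaussian_logpdf(
            given, mu[given_idx], S_gg)
        # Per-component conditional mean: mu_t + S_tg S_gg^{-1} (g - mu_g).
        resid = np.linalg.solve(S_gg, (given - mu[given_idx]).T).T
        cond[k] = mu[target_idx] + resid @ S_tg.T
    log_w -= log_w.max(axis=1, keepdims=True)  # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("nk,knt->nt", w, cond)


# Synthetic data (assumption): 10 explanatory variables driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(300, 10))
Y = latent[:, :1] + 0.5 * latent[:, 1:] + 0.01 * rng.normal(size=(300, 1))

# Semi-supervised use: PCA sees every X, the GMM only the 200 labeled samples.
pca, gmm = fit_pca_gmr(X_all=X, X_labeled=X[:200], Y_labeled=Y[:200])
z_idx, y_idx = [0, 1], [2]  # column positions of Z and Y in the joint [Z, Y]

# Forward analysis: X -> Z -> Y on the held-out samples.
Y_pred = conditional_mean(gmm, pca.transform(X[200:]), z_idx, y_idx)
rmse = float(np.sqrt(np.mean((Y_pred - Y[200:]) ** 2)))
print("forward RMSE:", rmse)

# Direct inverse analysis: target Y -> Z -> X.
Z_inv = conditional_mean(gmm, np.array([[1.0]]), y_idx, z_idx)
X_inv = pca.inverse_transform(Z_inv)
print("candidate X for Y = 1.0:", X_inv.round(3))
```

The same `conditional_mean` helper serves both directions because the mixture is fitted on the joint distribution of Z and Y; only the index sets are swapped. A DAE-GMR variant would replace `pca.transform` and `pca.inverse_transform` with the encoder and decoder of a trained autoencoder.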
