Gaussian mixture model with feature selection: An embedded approach

Journal

COMPUTERS & INDUSTRIAL ENGINEERING
Volume 152

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.cie.2020.107000

Keywords

Gaussian Mixture Model (GMM); Expectation Maximization (EM); Feature selection

This paper proposes a new algorithm, Expectation Selection Maximization (ESM), which adds a feature selection step to GMM fitting so that irrelevant features neither confuse the model nor increase computational cost. A relevancy index (RI), which indicates the probability of assigning a data point to a specific clustering group, guides the feature selection. A theoretical analysis justifies the use of RI for this purpose.
The Gaussian Mixture Model (GMM) is a popular clustering algorithm owing to its statistical properties, which enable soft clustering and determination of the number of clusters. Expectation-Maximization (EM) is usually applied to estimate the GMM parameters. However, including features that do not contribute to clustering may confuse the model and increase computational cost. To address this issue, we propose a new algorithm, termed Expectation Selection Maximization (ESM), which adds a feature selection step (S). Specifically, we introduce a relevancy index (RI), a metric indicating the probability of assigning a data point to a specific clustering group. The RI reveals the contribution of each feature to the clustering process and can therefore assist feature selection. We conduct a theoretical analysis to justify the use of RI for feature selection. To demonstrate the efficacy of the proposed ESM, we study two synthetic datasets, four benchmark datasets, and an Alzheimer's Disease dataset.
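The general idea of placing a selection step between the E-step and the M-step can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the RI formula, so the per-feature score below is a stand-in (a ratio of between-component to within-component variance). The function name `esm_sketch`, the diagonal covariance, the farthest-point initialisation, the `keep` and `warmup` parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def esm_sketch(X, k=2, keep=2, n_iter=30, warmup=5, seed=0):
    """EM for a diagonal-covariance GMM with a selection step between E and M.

    The feature score is a stand-in (between/within variance ratio),
    NOT the relevancy index defined in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Farthest-point initialisation of the component means (an assumption).
    means = [X[rng.integers(n)]]
    for _ in range(1, k):
        dist = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[np.argmax(dist)])
    means = np.array(means, dtype=float)
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    active = np.arange(d)  # features currently used by the E-step
    scores = np.zeros(d)

    for it in range(n_iter):
        # E-step: responsibilities computed from the active features only.
        diff = X[:, None, active] - means[None, :, active]
        var = variances[None, :, active]
        log_p = -0.5 * (diff ** 2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # Responsibility-weighted statistics for every feature.
        nk = resp.sum(axis=0) + 1e-12
        new_weights = nk / n
        new_means = (resp.T @ X) / nk[:, None]
        new_vars = np.maximum((resp.T @ X ** 2) / nk[:, None] - new_means ** 2, 1e-6)

        # S-step (stand-in score): how strongly each feature separates components.
        overall = new_weights @ new_means
        between = new_weights @ (new_means - overall) ** 2
        within = new_weights @ new_vars
        scores = between / within
        if it >= warmup:  # plain EM for the first few iterations
            active = np.sort(np.argsort(scores)[-keep:])

        # M-step: commit the updated parameters.
        means, variances, weights = new_means, new_vars, new_weights

    return active, scores, resp

# Synthetic check: two informative features (0 and 1) and three noise features.
data_rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 200)
X = data_rng.normal(size=(400, 5))
X[:, :2] += 8.0 * labels[:, None]
active, scores, resp = esm_sketch(X, k=2, keep=2)
print("selected features:", active)
print("scores:", np.round(scores, 3))
```

Keeping a fixed number of top-scoring features is only one possible selection rule; a threshold on the score would work as well. Neither should be read as the rule used in the paper.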
