Article

An adaptive optimization method for estimating the number of components in a Gaussian mixture model

Journal

JOURNAL OF COMPUTATIONAL SCIENCE
Volume 64

Publisher

ELSEVIER
DOI: 10.1016/j.jocs.2022.101874

Keywords

GMM; MIGMM; $\chi^2$ distribution; Mahalanobis distance; Adaptive optimal number; Adaptive interval

Funding

  1. Key Projects of Hunan Provincial Department of Education [21A0403, 21A0405]
  2. Hunan Provincial Natural Science Foundation of China [2022JJ30282]
  3. Key Laboratory of Hunan Province [2019TP1014]
  4. University-Industry Collaborative Project [202102211006]


This study introduces a novel method for adaptively determining the optimal number of components (M) in a Gaussian mixture model when fitting a dataset, avoiding underfitting and overfitting.
This study addresses how to determine the number of components (M) in a Gaussian mixture model (GMM). It proposes a novel method for adaptively locating an optimal value of M when a GMM is fitted to a given dataset; the method avoids the underfitting and overfitting that result from an unreasonable, manually specified interval for M. The major contributions are as follows: (1) An adaptive interval for M, denoted $M \in [M_{\mathrm{Min}}^{\mathrm{Ada}}, M_{\mathrm{Max}}^{\mathrm{Ada}}]$, is determined via an adjustable parameter $\beta$, based on two procedures of a novel method, the modified incremental Gaussian mixture model (MIGMM). (2) Using several typical criteria, the optimal number of components within the obtained adaptive interval, denoted $M_{\mathrm{Opt}}^{\mathrm{Ada}}$, is ultimately determined. For the adaptive interval, extensive experiments on typical synthetic datasets show that $[M_{\mathrm{Min}}^{\mathrm{Ada}}, M_{\mathrm{Max}}^{\mathrm{Ada}}]$ corresponds to the parameter range $[\beta_{\mathrm{Min}} = 10^{-11}, \beta_{\mathrm{Max}} = 10^{-2}]$. The performance of the $M_{\mathrm{Opt}}^{\mathrm{Ada}}$ determination based on several typical criteria is evaluated on both synthetic and real-world datasets.
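The second step described in the abstract, picking one value of M inside a candidate interval by a model-selection criterion, can be illustrated with a minimal sketch. This is not the paper's MIGMM procedure: the interval is fixed by hand here rather than derived adaptively from $\beta$, the criterion is assumed to be the Bayesian information criterion (the abstract does not name its criteria), and the data are one-dimensional and synthetic. All function names (`fit_gmm_1d`, `bic_score`, `select_components`) are hypothetical.

```python
import math
import random


def mixture_log_likelihood(xs, weights, means, variances):
    """Total log-likelihood of xs under a 1-D Gaussian mixture."""
    total = 0.0
    for x in xs:
        p = sum(w * math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, mu, v in zip(weights, means, variances))
        total += math.log(p + 1e-300)  # guard against log(0) on underflow
    return total


def fit_gmm_1d(xs, m, n_iter=50, var_floor=1e-3):
    """Fit an m-component 1-D GMM with plain EM and return its log-likelihood."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic initialisation: one mean per equal-sized chunk of sorted data.
    chunks = [ordered[i * n // m:(i + 1) * n // m] for i in range(m)]
    means = [sum(c) / len(c) for c in chunks]
    grand_mean = sum(xs) / n
    spread = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [spread] * m
    weights = [1.0 / m] * m
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, mu, v in zip(weights, means, variances)]
            s = sum(dens) + 1e-300
            resp.append([d / s for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(m):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, var_floor)  # floor avoids collapsed components
            weights[k] = nk / n
    return mixture_log_likelihood(xs, weights, means, variances)


def bic_score(xs, m):
    """BIC = k ln(n) - 2 ln(L); a 1-D GMM with m components has k = 3m - 1."""
    log_lik = fit_gmm_1d(xs, m)
    return (3 * m - 1) * math.log(len(xs)) - 2.0 * log_lik


def select_components(xs, m_min, m_max):
    """Return (best M, {M: BIC}) over the closed interval [m_min, m_max]."""
    scores = {m: bic_score(xs, m) for m in range(m_min, m_max + 1)}
    return min(scores, key=scores.get), scores


# Synthetic example: two well-separated clusters (an assumption for illustration).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
best_m, scores = select_components(data, 1, 4)
print(best_m, scores)
```

A hand-picked interval such as `[1, 4]` is exactly what the paper's adaptive interval is meant to replace; the sketch only shows how a criterion ranks candidate values of M once some interval is available.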
