Journal
JOURNAL OF COMPUTATIONAL SCIENCE
Volume 64, Issue -, Pages -
Publisher
ELSEVIER
DOI: 10.1016/j.jocs.2022.101874
Keywords
GMM; MIGMM; $\chi^2$ distribution; Mahalanobis distance; Adaptive optimal number; Adaptive interval
Funding
- Key Projects of Hunan Provincial Department of Education [21A0403, 21A0405]
- Hunan Provincial Natural Science Foundation of China [2022JJ30282]
- Key Laboratory of Hunan Province [2019TP1014]
- university-industry collaborative project [202102211006]
This study introduces a novel method for adaptively determining the optimal number of components (M) in a Gaussian mixture model when fitting a dataset, avoiding underfitting and overfitting.
To determine the number of components ($M$) in a Gaussian mixture model (GMM), this study proposes a novel method that adaptively locates an optimal value of $M$ when a GMM is used to fit a given dataset; the method avoids the underfitting and overfitting caused by an unreasonable, manually specified interval. The major contributions of this study are as follows: (1) An adaptive interval for $M$, denoted $M \in [M_{Min}^{Ada}, M_{Max}^{Ada}]$, is determined via an adjustable parameter $\beta$, based on two procedures of a novel method, the modified incremental Gaussian mixture model (MIGMM). (2) Using several typical criteria, the optimal number $M$ within the obtained adaptive interval $[M_{Min}^{Ada}, M_{Max}^{Ada}]$, denoted $M_{Opt}^{Ada}$, is ultimately determined. Regarding the adaptive interval, extensive experiments with typical synthetic datasets show that $[M_{Min}^{Ada}, M_{Max}^{Ada}]$ is determined by the parameter range $[\beta^{Min} = 10^{-11}, \beta^{Max} = 10^{-2}]$. The performance of the $M_{Opt}^{Ada}$ determination based on several typical criteria is evaluated on both synthetic and real-world datasets.
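The second step described in the abstract, choosing an optimal M inside a given interval by a model-selection criterion, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which criteria the authors use, so the Bayesian information criterion (BIC) stands in for them; the interval bounds are passed in by hand rather than derived from MIGMM and beta; and the helper names `fit_gmm_1d` and `select_m_by_bic` are hypothetical. The sketch fits a one-dimensional GMM with plain EM, not the authors' method.

```python
import math
from statistics import NormalDist


def fit_gmm_1d(xs, m, iters=100, var_floor=1e-3):
    """Fit a 1-D GMM with m components by plain EM (not MIGMM).

    Returns the log-likelihood computed in the final E-step.
    """
    n = len(xs)
    srt = sorted(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [srt[int((k + 0.5) * n / m)] for k in range(m)]
    varis = [var_all] * m
    weights = [1.0 / m] * m
    loglik = -math.inf
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [
                w * math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, mu, v in zip(weights, means, varis)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(m):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            varis[k] = max(var_k, var_floor)  # floor avoids collapsed components
    return loglik


def select_m_by_bic(xs, m_min, m_max):
    """Return (best M, {M: BIC}) over the closed interval [m_min, m_max]."""
    n = len(xs)
    scores = {}
    for m in range(m_min, m_max + 1):
        n_params = 3 * m - 1  # m means + m variances + (m - 1) free weights
        scores[m] = n_params * math.log(n) - 2.0 * fit_gmm_1d(xs, m)
    best = min(scores, key=scores.get)
    return best, scores


# Toy data (assumed): two well-separated unit-variance clusters,
# built from Gaussian quantiles so the example is deterministic.
data = [NormalDist(0.0, 1.0).inv_cdf((i + 0.5) / 150) for i in range(150)]
data += [NormalDist(10.0, 1.0).inv_cdf((i + 0.5) / 150) for i in range(150)]

m_opt, bic = select_m_by_bic(data, 1, 4)  # interval [1, 4] is hand-picked here
print(m_opt, {m: round(score, 1) for m, score in bic.items()})
```

On this toy data the BIC for two components should be far lower than for one, since a single Gaussian cannot represent two separated clusters. A hand-picked interval such as [1, 4] is exactly the kind of manual choice the abstract says the adaptive interval is meant to replace.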