☆ 4.7 Article

A New Clustering Method Based on the Inversion Formula

MATHEMATICS (2022)

Journal

MATHEMATICS

Volume 10, Issue 15, Pages -

Publisher

MDPI

DOI: 10.3390/math10152559

Keywords

artificial intelligence; unsupervised machine learning; clustering; nonparametric density estimation; inversion formula

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

This research presents a new density clustering method based on the modified inversion formula density estimation, which divides data into different classes by discovering the internal structure of data set objects and their relationship. The method performs well, but currently can only handle low-dimensional data.

Data clustering is one area of data mining that falls into the data mining class of unsupervised learning. Cluster analysis divides data into different classes by discovering the internal structure of data set objects and their relationship. This paper presented a new density clustering method based on the modified inversion formula density estimation. This new method should allow one to improve the performance and robustness of the k-means, Gaussian mixture model, and other methods. The primary process of the proposed clustering algorithm consists of three main steps. Firstly, we initialized parameters and generated a T matrix. Secondly, we estimated the densities of each point and cluster. Third, we updated mean, sigma, and phi matrices. The new method based on the inversion formula works quite well with different datasets compared with K-means, Gaussian Mixture Model, and Bayesian Gaussian Mixture model. On the other hand, new methods have limitations because this one method in the current state cannot work with higher-dimensional data (d > 15). This will be solved in the future versions of the model, detailed further in future work. Additionally, based on the results, we can see that the MIDEv2 method works the best with generated data with outliers in all datasets (0.5%, 1%, 2%, 4% outliers). The interesting point is that a new method based on the inversion formula can cluster the data even if data do not have outliers; one of the most popular, for example, is the Iris dataset.

A New Clustering Method Based on the Inversion Formula

Journal

MATHEMATICS

Publisher

MDPI

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

A New Clustering Method Based on the Inversion Formula

Journal

MATHEMATICS

Publisher

MDPI

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper