4.7 Article

Clustering Through Probability Distribution Analysis Along Eigenpaths

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSMC.2018.2884839

Keywords

Connectedness index; density-based clustering; eigenpath

Funding

  1. Natural Science Foundation of China [61471216, 61771276]
  2. National Key Research and Development Program of China [2016YFB0101001]
  3. Special Foundation for the Development of Strategic Emerging Industries of Shenzhen [JCYJ20170307153940960, JCYJ20170817161845824]

The paper introduces a method that models high-dimensional clustering problems as a one-dimensional analysis of a probability distribution. It uses eigenpaths and connectedness indices to describe the connections between vertices, draws indicative curves to identify cluster forms, and partly eliminates the curse of dimensionality.
Data clustering is one of the most fundamental techniques in exploratory data analysis. It is widely used for determining the underlying data structure, classifying natural data, and compressing data in engineering, business management, social statistics, computer science, and medicine. Under the assumption that clusters are high-density regions in the feature space separated by relatively low-density neighboring regions, a novel approach is proposed for modeling any high-dimensional clustering problem as a one-dimensional analysis of the probability distribution. First, a special path between two vertices, called an eigenpath, is defined to represent their close connection. Second, a connectedness index based on the eigenpath is proposed to describe the connection between two vertices quantitatively. Third, the connectedness index is applied to the candidate cluster centers to measure the connection between different candidates. An indicative curve can then be drawn from the connectedness indices. This approach not only provides an effective indicative curve for unknown data sets but also partly eliminates the curse of dimensionality, correctly recognizes clusters of arbitrary form, and automatically excludes outliers. Extensive experiments demonstrate the effectiveness and efficiency of the proposed approach.
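
The abstract does not give the formal definitions of the eigenpath or the connectedness index, so the following Python sketch illustrates only the underlying density assumption and is not the authors' method. As a stand-in for the eigenpath it uses a maximum-bottleneck path on a k-nearest-neighbor graph, that is, the path whose lowest-density vertex is as dense as possible. The index it computes is the ratio of that bottleneck density to the lower of the two endpoint densities. All function names, the Gaussian kernel, and the parameters k and bandwidth are assumptions made for illustration.

```python
import heapq
import numpy as np


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph; return (neighbors, dists)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    order = np.argsort(dists, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    neighbors = [set() for _ in range(len(points))]
    for i, row in enumerate(order):
        for j in row:
            neighbors[i].add(int(j))
            neighbors[int(j)].add(i)
    return neighbors, dists


def kernel_density(dists, bandwidth):
    """Unnormalized Gaussian kernel density at every point (assumed estimator)."""
    return np.exp(-(dists / bandwidth) ** 2).sum(axis=1)


def bottleneck_density(neighbors, density, src, dst):
    """Largest achievable minimum vertex density over all paths src -> dst."""
    best = {src: density[src]}
    heap = [(-density[src], src)]  # max-heap on the bottleneck value
    while heap:
        neg_b, u = heapq.heappop(heap)
        b = -neg_b
        if u == dst:
            return b
        if b < best.get(u, -np.inf):
            continue  # stale heap entry
        for v in neighbors[u]:
            cand = min(b, density[v])
            if cand > best.get(v, -np.inf):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return 0.0  # dst is unreachable from src


def connectedness_index(points, a, b, k=2, bandwidth=0.2):
    """Hypothetical index in [0, 1]: bottleneck density / lower endpoint density."""
    neighbors, dists = knn_graph(points, k)
    density = kernel_density(dists, bandwidth)
    bottleneck = bottleneck_density(neighbors, density, a, b)
    return float(bottleneck / min(density[a], density[b]))
```

Under these assumptions, two points in the same dense region give an index near 1, because a path between them never has to leave the dense region. Two points whose every connecting path crosses a sparse gap give a noticeably lower index.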
