Article

Clustering assessment in weighted networks

Journal

PEERJ COMPUTER SCIENCE

Publisher

PEERJ INC
DOI: 10.7717/peerj-cs.600

Keywords

Clustering; Weighted networks; Significance; Stability; Randomized graph; Bootstrap; Mutual information; Stochastic block model; R

Funding

  1. MINECO (Ministerio de Economia, Industria y Competitividad) [TIN2017-89244-R]
  2. AGAUR (Generalitat de Catalunya) [2017SGR-856]


The study introduces a systematic approach for validating clustering results on weighted networks, assessing significance through community scoring functions and stability through a nonparametric bootstrap method. Tests on synthetic and real-world networks show that the methods identify the best-performing algorithms, which suggests they are suitable for cases where the clustering structure is unknown. The methods are implemented in R and will be released in the upcoming clustAnalytics package.
We provide a systematic approach to validate the results of clustering methods on weighted networks, in particular for cases where the existence of a community structure is unknown. Our validation comprises a set of criteria for assessing the significance and stability of the clusters. To test for cluster significance, we introduce a set of community scoring functions adapted to weighted networks, and systematically compare their values to those of a suitable null model. For this we propose a switching model that produces randomized graphs with weighted edges while keeping the degree distribution constant. To test for cluster stability, we introduce a nonparametric bootstrap method combined with similarity metrics derived from information theory and combinatorics. To assess the effectiveness of our clustering quality evaluation methods, we test them on synthetically generated weighted networks with a ground-truth community structure of varying strength, based on the stochastic block model construction. When applied to the clusters of these synthetic ground-truth networks, as well as to other weighted networks with known community structure, the proposed methods correctly identify the best-performing algorithms, which suggests their adequacy for cases where the clustering structure is not known. We test our clustering validation methods on a varied collection of well-known clustering algorithms applied to the synthetically generated networks and to several real-world weighted networks. All our clustering validation methods are implemented in R, and will be released in the upcoming package clustAnalytics.
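The idea of a degree-preserving switching null model mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R implementation or the clustAnalytics API. The function names `degrees` and `switch_edges`, the edge-list representation, and the choice to let each weight travel with its rewired edge are all assumptions made for illustration; the abstract does not say how the proposed model treats weights.

```python
import random
from collections import Counter


def degrees(edges):
    """Count incident edges per node (unweighted degree)."""
    deg = Counter()
    for u, v, _ in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def switch_edges(edges, n_attempts, seed=0):
    """Randomize an undirected weighted graph with degree-preserving switches.

    edges: list of (u, v, weight) triples without self-loops or duplicates.
    Each attempt picks two edges (a, b) and (c, d) and rewires them to
    (a, d) and (c, b). Moves that would create a self-loop or a repeated
    edge are rejected. Keeping each weight attached to its rewired edge
    is an illustrative assumption, not the paper's weighting scheme.
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = {frozenset((u, v)) for u, v, _ in edges}
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        a, b, w1 = edges[i]
        c, d, w2 = edges[j]
        # Flip the second edge at random so both possible rewirings are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: rewiring would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # rewiring would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i] = (a, d, w1)
        edges[j] = (c, b, w2)
    return edges


# Toy example: a weighted ring of 8 nodes.
ring = [(k, (k + 1) % 8, float(k + 1)) for k in range(8)]
randomized = switch_edges(ring, n_attempts=200, seed=1)
print(degrees(randomized) == degrees(ring))
```

In a significance test along the lines the abstract describes, a community scoring function would be evaluated on the clustering of the original graph and on clusterings of many such randomized graphs, and the two sets of values compared.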
