Article

K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks

Journal

GENES
Volume 12, Issue 1

Publisher

MDPI
DOI: 10.3390/genes12010087

Keywords

gene co-expression networks; distance correlation; connectivity; enrichment analysis

Funding

  1. National Natural Science Foundation of China [41876100]
  2. Development Project of Applied Technology in Harbin [2016RAXXJ071]

This paper introduces the k-module algorithm, a new method for constructing gene co-expression networks. It uses distance correlation to calculate the similarity matrix and assigns each gene to the module with which it has the highest mean connectivity, improving the clustering results of WGCNA. The algorithm converges in few iterations, which keeps its complexity low, and it readjusts the hierarchical clustering results while retaining the advantages of the dynamic tree cut method.
Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for constructing co-expression networks is weighted gene co-expression network analysis (WGCNA), which identifies clusters of highly co-expressed genes (modules) using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, the merge cannot be reversed, so an unsuitable assignment can never be corrected. In this paper, we calculate the similarity matrix for WGCNA with the distance correlation to construct a gene co-expression network, and we present a new approach, the k-module algorithm, to improve the WGCNA clustering results. The method assigns each gene to the module with which it has the highest mean connectivity. It thereby readjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified on six microarray and RNA-seq datasets. The k-module algorithm needs few iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering and can be applied to any clustering algorithm based on a similarity matrix; it is not limited to gene co-expression network analysis.
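The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `distance_correlation` and `k_module_reassign`, the `max_iter` parameter, the synchronous update, and the stopping rule (stop when no gene changes module) are all assumptions made for illustration, since the abstract does not specify them. The sketch also works directly on a raw similarity matrix and omits the soft-thresholding and dynamic tree cut steps that WGCNA would normally supply for the initial labels.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D expression profiles."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute differences within each profile.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each distance matrix (subtract row/column means, add grand mean).
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0  # a constant profile carries no dependence information
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def k_module_reassign(similarity, labels, max_iter=100):
    """Move each gene to the module with the highest mean connectivity.

    similarity: (n_genes, n_genes) symmetric matrix (hypothetical input).
    labels:     initial module label per gene, e.g. from hierarchical clustering.
    Returns the final labels and the number of passes performed.
    """
    sim = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(sim, 0.0)  # a gene is not connected to itself
    labels = np.asarray(labels).copy()
    modules = np.unique(labels)

    for iteration in range(1, max_iter + 1):
        mean_conn = np.zeros((sim.shape[0], modules.size))
        for k, module in enumerate(modules):
            members = labels == module
            total = sim[:, members].sum(axis=1)
            # Do not count the gene itself when it already belongs to the module.
            counts = members.sum() - members.astype(int)
            mean_conn[:, k] = total / np.maximum(counts, 1)
        new_labels = modules[mean_conn.argmax(axis=1)]
        if np.array_equal(new_labels, labels):
            return labels, iteration  # no gene moved: stop
        labels = new_labels
    return labels, max_iter
```

In this sketch, a similarity matrix could be filled by calling `distance_correlation` on every pair of gene profiles, and the initial `labels` would come from whatever hierarchical clustering step precedes the reassignment. Because every gene is re-evaluated on each pass, an early merge made by the hierarchical step is no longer final, which is the property the abstract emphasizes.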
