Journal
COMPUTERS IN BIOLOGY AND MEDICINE
Volume 138, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2021.104933
Keywords
Protein-protein interaction network; Protein complexes identification; Spectral clustering; Graph embedding; Affinity matrix
The paper introduces TADWSC, a new spectral clustering method for identifying protein complexes in attributed networks. By combining topological structure with node features, the method improves the accuracy of protein complex identification: it computes embedding vectors for the proteins and builds the affinity matrix from them. The proposed method performs unexpectedly well compared with existing state-of-the-art methods on both real protein network datasets and synthetic networks.
The identification of protein complexes in protein-protein interaction networks is a fundamental and essential problem for revealing the underlying mechanisms of biological processes. However, most existing protein complex identification methods consider only a network's topological structure, and in doing so they miss the advantage of using the nodes' feature information. In protein-protein interaction networks, both topological structure and node features are essential ingredients for protein complexes. Spectral clustering uses the eigenvalues of the data's affinity matrix to map the data to a low-dimensional space, where it is then clustered. It has attracted much attention in recent years as one of the most efficient algorithms among dimensionality reduction approaches. In this paper, a new version of spectral clustering, named text-associated DeepWalk-Spectral Clustering (TADWSC), is proposed for attributed networks, in which the identified protein complexes have both structural cohesiveness and attribute homogeneity. Since the performance of spectral clustering depends heavily on the quality of the affinity matrix, the proposed method uses text-associated DeepWalk (TADW) to calculate the embedding vectors of the proteins. The affinity matrix is then computed from the cosine similarity between pairs of these low-dimensional vectors, which considerably improves its accuracy. Experimental results show that the method performs unexpectedly well in comparison with existing state-of-the-art methods on both real protein network datasets and synthetic networks.
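The pipeline described in the abstract (embedding vectors, then a cosine-similarity affinity matrix, then spectral clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the TADW step is not reproduced, so the embeddings below are synthetic stand-ins, and the function names, the clipping of negative similarities to zero, the normalized-affinity formulation, and the small k-means routine are all assumptions made for illustration.

```python
import numpy as np


def cosine_affinity(emb):
    """Pairwise cosine similarity between embedding rows.

    Negative similarities are clipped to 0 and the diagonal is zeroed;
    both choices are assumptions, not taken from the paper.
    """
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    aff = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(aff, 0.0)
    return aff


def spectral_embedding(aff, k):
    """Row-normalized top-k eigenvectors of D^-1/2 A D^-1/2."""
    deg = aff.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(deg, 1e-12, None))
    norm_aff = aff * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(norm_aff)  # eigenvalues in ascending order
    top = vecs[:, -k:]
    row_norms = np.linalg.norm(top, axis=1, keepdims=True)
    return top / np.clip(row_norms, 1e-12, None)


def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_embeddings(emb, k):
    """Embeddings -> cosine affinity -> spectral embedding -> k-means labels."""
    return kmeans(spectral_embedding(cosine_affinity(emb), k), k)


# Synthetic stand-in for TADW embeddings: two groups of four "proteins",
# each group scattered around its own direction in a 3-dimensional space.
rng = np.random.default_rng(0)
group_a = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((4, 3))
group_b = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((4, 3))
embeddings = np.vstack([group_a, group_b])

labels = cluster_embeddings(embeddings, k=2)
print(labels)
```

On this toy input the first four and last four rows are expected to fall into separate clusters. In a real setting the embeddings would come from a TADW model trained on the interaction network and the protein attributes, and the number of clusters would have to be chosen or estimated.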