4.6 Article

Efficient link prediction in the protein-protein interaction network using topological information in a generative adversarial network machine learning model

Journal

BMC BIOINFORMATICS
Volume 23, Issue 1

Publisher

BMC
DOI: 10.1186/s12859-022-04598-x

Keywords

Edge prediction; PPI prediction; Protein interaction prediction; Interactome; Conditional GAN

Funding

  1. Semmelweis University
  2. National Research, Development and Innovation Office of Hungary (NKFIA) [NVKP-16-1-2016-0017, K131458]
  3. Thematic Excellence Programme of the Ministry for Innovation and Technology in Hungary of the Semmelweis University [2020-4.1.1.-TKP2020, TKP2021-EGA-24]
  4. Research Excellence Programme of the National Research, Development and Innovation Office of the Ministry of Innovation and Technology in Hungary (TKP/ITM/NKFIH)
  5. New National Excellence Program of the Ministry for Innovation and Technology from the National Research, Development and Innovation Fund [UNKP-20-4-I-SE-7, UNKP-21-4-II-SE-18]
  6. Semmelweis 250+ Kivalosagi PhD Osztondij grant [EFOP3.6.3-VEKOP-16-2017-00009]

The authors developed a software tool for link prediction in protein-protein interaction networks using machine learning. They utilized a modified breadth-first search algorithm for data processing and a conditional generative adversarial network model with Wasserstein distance-based loss improved with gradient penalty to predict potential unknown edge connections.
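For readers unfamiliar with the loss named above, the standard Wasserstein objective with gradient penalty, written for a conditional critic, is commonly stated as follows. This is a sketch of the textbook form only. The symbols (critic D, condition c, real sample x, generated sample x-tilde, interpolate x-hat, penalty weight lambda) are notation assumed here, and the authors' exact formulation may differ, for example by including additional loss terms.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x} \mid c)\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x \mid c)\big]
     + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x} \mid c) \rVert_2 - 1\big)^2\Big], \\
L_G &= -\,\mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x} \mid c)\big], \\
\hat{x} &= \epsilon x + (1 - \epsilon)\,\tilde{x}, \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The penalty term pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated samples, which is how this variant enforces the Lipschitz constraint without weight clipping.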
Background: The investigation of possible interactions between two proteins in intracellular signaling is an expensive and laborious procedure in the wet lab; therefore, several in silico approaches have been implemented to narrow down the candidates for future experimental validation. Reformulating the problem in terms of network theory, the set of proteins can be represented as the nodes of a network and the interactions between them as the edges. The resulting protein-protein interaction (PPI) network enables the use of link prediction techniques to discover new probable connections. Here we aimed to offer a novel approach to the link prediction task in PPI networks, utilizing a generative machine learning model.

Results: We created a tool that consists of two modules: the data processing framework and the machine learning model. For data processing, we used a modified breadth-first search algorithm to traverse the network and extract induced subgraphs, which served as image-like input data for our model. For machine learning, we used a conditional generative adversarial network (cGAN) model inspired by image-to-image translation, with a Wasserstein distance-based loss improved with gradient penalty. The model takes the combined representation from the data processing as input, and the generator is trained to predict the probable unknown edges in the provided induced subgraphs. Our link prediction tool was evaluated on the protein-protein interaction networks of five different species from the STRING database by calculating the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the normalized discounted cumulative gain (NDCG). Test runs yielded averaged results of AUROC = 0.915, AUPRC = 0.176 and NDCG = 0.763 across all investigated species.

Conclusion: We developed software for link prediction in PPI networks utilizing machine learning. The evaluation of our software serves as the first demonstration that a cGAN model, conditioned on raw topological features of the PPI network, is an applicable solution for the PPI prediction problem without requiring often unavailable molecular node attributes. The corresponding scripts are available at https://github.com/semmelweis-pharmacology/ppi_pred.
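
To make the data processing step more concrete, the following is a minimal sketch of how a breadth-first traversal can yield an induced subgraph as an image-like adjacency matrix. It uses a plain BFS, not the authors' modified variant, whose details are not given in the abstract. The function name `bfs_induced_subgraph`, the `max_nodes` parameter, and the toy network with node labels A to F are all hypothetical and chosen only for illustration.

```python
from collections import deque


def bfs_induced_subgraph(adj, start, max_nodes):
    """Collect up to max_nodes nodes in BFS order from start and return
    (nodes, matrix), where matrix is the adjacency matrix of the subgraph
    induced by those nodes. Plain BFS; an illustrative assumption only."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(order) < max_nodes:
        node = queue.popleft()
        # Sort neighbors so the traversal order is deterministic.
        for nbr in sorted(adj[node]):
            if nbr not in seen:
                seen.add(nbr)
                order.append(nbr)
                queue.append(nbr)
                if len(order) == max_nodes:
                    break
    # Induced subgraph: keep every edge whose two endpoints were both collected.
    index = {n: i for i, n in enumerate(order)}
    size = len(order)
    matrix = [[0] * size for _ in range(size)]
    for u in order:
        for v in adj[u]:
            if v in index:
                matrix[index[u]][index[v]] = 1
    return order, matrix


# Hypothetical toy network (undirected), not real PPI data.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

nodes, matrix = bfs_induced_subgraph(adj, "A", max_nodes=4)
print(nodes)
for row in matrix:
    print(row)
```

A fixed `max_nodes` gives every extracted matrix the same maximum size, which is one simple way to obtain uniformly shaped, image-like inputs. Whether the published tool sizes its subgraphs this way is not stated in the abstract.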
