4.6 Article

Structure-aware protein solubility prediction from sequence through graph convolutional network and predicted contact map

Journal

JOURNAL OF CHEMINFORMATICS
Volume 13, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s13321-021-00488-1

Keywords

Protein solubility prediction; Graph neural network; Predicted contact map; Deep learning

Funding

  1. National Key R&D Program of China [2020YFB020003]
  2. National Natural Science Foundation of China [61772566]
  3. Guangdong Key Field RD Plan [2019B020228001, 2018B010109006]
  4. Introducing Innovative and Entrepreneurial Teams [2016ZT06D211]
  5. Guangzhou ST Research Plan [202007030010]

This study developed GraphSol, a new structure-aware method that predicts protein solubility with an attentive graph convolutional network (GCN) operating on a protein topology attribute graph constructed from the sequence. The model showed superior performance and stability, and is reported as the first to use a GCN for sequence-based protein solubility prediction.
Protein solubility is important for producing new soluble proteins that can reduce the cost of biocatalysts or therapeutic agents. A computational model that accurately predicts protein solubility from the amino acid sequence is therefore highly desirable. Many methods have been developed, but most rely on one-dimensional embeddings of amino acids, which are limited in their ability to capture spatial structural information. In this study, we developed GraphSol, a new structure-aware method that predicts protein solubility with an attentive graph convolutional network (GCN), where the protein topology attribute graph is constructed from contact maps predicted from the sequence alone. GraphSol substantially outperformed other sequence-based methods. The model was shown to be stable, with a consistent R² of 0.48 in both cross-validation and the independent test on the eSOL dataset. To the best of our knowledge, this is the first study to use a GCN for sequence-based protein solubility prediction. More importantly, the architecture could be easily extended to other protein prediction tasks that take a raw protein sequence as input.
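
To make the idea in the abstract concrete, the following is a minimal sketch of graph convolution over a predicted contact map. It is not the authors' GraphSol implementation. The function names, the 0.5 contact threshold, the two-layer depth, the weight shapes, and the sigmoid output are all assumptions made for illustration, and plain mean pooling stands in for the attention mechanism the paper describes.

```python
import numpy as np

def normalize_adjacency(contact_prob, threshold=0.5):
    """Build a symmetrically normalized adjacency matrix from a predicted
    contact-probability map of shape (L, L). The threshold is an assumption."""
    adj = (contact_prob >= threshold).astype(float)
    adj = np.maximum(adj, adj.T)        # enforce symmetry
    np.fill_diagonal(adj, 1.0)          # add self-loops so every degree is >= 1
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_hat, features, weight):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_hat @ features @ weight, 0.0)

def predict_solubility(contact_prob, features, w1, w2, w_out):
    """Two GCN layers, mean pooling over residues (a simplification that
    replaces attention pooling), and a sigmoid to map the score into [0, 1]."""
    a_hat = normalize_adjacency(contact_prob)
    hidden = gcn_layer(a_hat, features, w1)
    hidden = gcn_layer(a_hat, hidden, w2)
    pooled = hidden.mean(axis=0)        # one vector per protein
    return 1.0 / (1.0 + np.exp(-(pooled @ w_out)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    length, n_feat, n_hidden = 6, 4, 8  # toy sizes, chosen arbitrarily
    contact_prob = rng.random((length, length))
    features = rng.normal(size=(length, n_feat))
    score = predict_solubility(
        contact_prob,
        features,
        rng.normal(size=(n_feat, n_hidden)),
        rng.normal(size=(n_hidden, n_hidden)),
        rng.normal(size=n_hidden),
    )
    print(f"toy solubility score: {score:.3f}")
```

The weights here are random, so the printed score is meaningless. The sketch only shows how per-residue features and a sequence-derived contact map combine into a single protein-level prediction.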
