Article

Learning the local landscape of protein structures with convolutional neural networks

Journal

JOURNAL OF BIOLOGICAL PHYSICS
Volume 47, Issue 4, Pages 435-454

Publisher

SPRINGER
DOI: 10.1007/s10867-021-09593-6

Keywords

Protein structure; Mutation; Microenvironment; Convolutional neural network

Funding

  1. Welch Foundation [F-1654]
  2. Department of Defense - Defense Threat Reduction Agency [HDTRA12010011]
  3. National Institutes of Health [R01 AI148419]
  4. Jane and Roland Blumberg Centennial Professorship in Molecular Evolution
  5. Dwight W. and Blanche Faye Reeder Centennial Fellowship in Systematic and Evolutionary Biology at UT Austin

The study investigates whether 3D convolutional neural networks can predict the amino acid at a site from the local structural context around it, which is the inverse of predicting structure from sequence. The network predicts the wild-type residue with good accuracy, and its confidence is a reliable indicator of whether a prediction is correct. Predictions of the alignment consensus are less accurate and depend mainly on whether the consensus matches the wild type. High-confidence mispredictions of the wild type may indicate sites that are primed for mutation and could be targets for protein engineering.
One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding the site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mispredictions of the wild type may identify sites that are primed for mutation and are likely targets for protein engineering.
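
The idea of treating network confidence as a filter for candidate sites can be illustrated with a minimal sketch. It assumes the network emits one score (logit) per canonical amino acid at each site, that confidence is the largest softmax probability, and that a site is flagged when the predicted residue differs from the wild type at or above a cutoff. The function names, the residue ordering, and the 0.8 cutoff are hypothetical choices for illustration, not details taken from the paper.

```python
import math

# Assumed ordering of the 20 canonical residues (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_residue(logits):
    """Return (predicted residue, confidence) for one site."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return AMINO_ACIDS[best], probs[best]


def flag_candidate_sites(site_logits, wild_type, threshold=0.8):
    """List sites where the prediction disagrees with the wild type
    and the confidence is at least `threshold` (hypothetical cutoff).

    Returns tuples of (0-based position, wild type, predicted, confidence).
    """
    flagged = []
    for pos, (logits, wt) in enumerate(zip(site_logits, wild_type)):
        pred, conf = predict_residue(logits)
        if pred != wt and conf >= threshold:
            flagged.append((pos, wt, pred, conf))
    return flagged
```

Sites returned by `flag_candidate_sites` are the high-confidence mispredictions; low-confidence disagreements are dropped because, under the assumption above, they carry little information about the site.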
