Article

De novo protein design by deep network hallucination

Journal

NATURE
Volume 600, Issue 7889, Pages 547-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41586-021-04184-w

Keywords

-

Funding

  1. NSF [DBI 1937533, MCB 2032259]
  2. NIH [DP5OD026389, R01 GM120574, R35GM141818, P30GM124165, S10OD021527]
  3. Open Philanthropy
  4. Eric and Wendy Schmidt by the Schmidt Futures program
  5. Audacious project
  6. Washington Research Foundation
  7. Novo Nordisk Foundation [NNF17OC0030446]
  8. Howard Hughes Medical Institute
  9. STF at the University of Washington
  10. Rosetta@Home volunteers in ab initio structure prediction calculations
  11. DOE [DE-AC02-06CH11357]

Deep neural networks trained to predict protein structures from sequences can be inverted to generate new folded proteins whose sequences are unrelated to those of naturally occurring proteins; experimentally determined structures of three such proteins closely matched the hallucinated models.
There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences (1-3). Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
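
The sampling loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not call trRosetta: `TOY_TABLE` is a hypothetical stand-in predictor that maps each amino acid pair to a fixed distribution over `N_BINS` distance bins, and `BACKGROUND` is simply the average of those toy distributions. The Metropolis acceptance rule, the single-residue mutation move, the `temperature` value, and the step count are all assumptions chosen for illustration; the abstract only states that Monte Carlo sampling was used to optimize the Kullback-Leibler divergence.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # hypothetical number of distance bins (assumption)


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_toy_table(seed=0):
    """Toy stand-in for a structure prediction network (NOT trRosetta):
    each amino acid pair gets a fixed distribution over distance bins."""
    rng = random.Random(seed)
    return {
        (a, b): softmax([rng.gauss(0.0, 2.0) for _ in range(N_BINS)])
        for a in AMINO_ACIDS
        for b in AMINO_ACIDS
    }


TOY_TABLE = build_toy_table()

# Background distribution: average of the toy predictions over all pairs.
BACKGROUND = [
    sum(dist[k] for dist in TOY_TABLE.values()) / len(TOY_TABLE)
    for k in range(N_BINS)
]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def contrast(seq):
    """Mean KL divergence between predicted and background distributions
    over all residue pairs i < j."""
    n = len(seq)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += kl_divergence(TOY_TABLE[(seq[i], seq[j])], BACKGROUND)
    return total / (n * (n - 1) / 2)


def hallucinate(length=20, steps=500, temperature=0.01, seed=1):
    """Metropolis Monte Carlo in sequence space, maximizing the contrast.
    Returns the best sequence seen, the starting score, and the best score."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]  # random start
    score = contrast(seq)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a single-residue mutation
        new_score = contrast(seq)
        delta = new_score - score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            score = new_score
            if score > best_score:
                best_seq, best_score = list(seq), score
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), start_score, best_score


if __name__ == "__main__":
    sequence, start, best = hallucinate()
    print(sequence, round(start, 3), round(best, 3))
```

Running `hallucinate` with different `seed` values mimics optimization from different random starting points. In the actual study the predictor would be the trained network and the background would come from distributions averaged over all proteins, so the toy scores here carry no structural meaning.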
