4.8 Article

Single-sequence protein structure prediction using a language model and deep learning

Journal

NATURE BIOTECHNOLOGY
Volume 40, Issue 11, Pages 1617-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41587-022-01432-w

Funding

  1. NVIDIA Corporation
  2. DARPA PANACEA program [HR0011-19-2-0022]
  3. National Cancer Institute [U54-CA225088]

This article reports the development of an end-to-end differentiable recurrent geometric network (RGN) called RGN2 for predicting protein structure using the protein sequence. Compared to AlphaFold2 and RoseTTAFold, RGN2 performs better on orphan proteins and designed protein classes, while achieving a computational speedup of up to 10^6-fold. These findings demonstrate the practical and theoretical advantages of protein language models over multiple sequence alignments (MSAs) in protein structure prediction.
RGN2 predicts a protein's structure from its sequence without a multiple sequence alignment.
AlphaFold2 and related computational systems predict protein structure using deep learning and co-evolutionary relationships encoded in multiple sequence alignments (MSAs). Despite high prediction accuracy achieved by these systems, challenges remain in (1) prediction of orphan and rapidly evolving proteins for which an MSA cannot be generated; (2) rapid exploration of designed structures; and (3) understanding the rules governing spontaneous polypeptide folding in solution. Here we report development of an end-to-end differentiable recurrent geometric network (RGN) that uses a protein language model (AminoBERT) to learn latent structural information from unaligned proteins. A linked geometric module compactly represents C-alpha backbone geometry in a translationally and rotationally invariant way. On average, RGN2 outperforms AlphaFold2 and RoseTTAFold on orphan proteins and classes of designed proteins while achieving up to a 10^6-fold reduction in compute time. These findings demonstrate the practical and theoretical strengths of protein language models relative to MSAs in structure prediction.
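
The invariance property mentioned in the abstract can be illustrated with a small sketch. This is not RGN2's geometric module, whose parameterization the abstract does not spell out; it is a generic example, under that assumption, of describing a C-alpha trace by internal coordinates (segment lengths, bend angles, torsion angles) that do not change when the whole chain is rotated or translated. The coordinates in `TRACE` and all function names are hypothetical and chosen only for illustration.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bend_angle(p0, p1, p2):
    """Angle at p1 between the two neighbouring segments, in radians."""
    u, v = sub(p0, p1), sub(p2, p1)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle defined by four consecutive points, in radians."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)
    y = dot(cross(n1, n2), b1) / norm(b1)
    x = dot(n1, n2)
    return math.atan2(y, x)

def internal_coordinates(trace):
    """Segment lengths, bend angles, and torsion angles of a 3D point chain."""
    lengths = [norm(sub(q, p)) for p, q in zip(trace, trace[1:])]
    bends = [bend_angle(*trace[i:i + 3]) for i in range(len(trace) - 2)]
    torsions = [torsion_angle(*trace[i:i + 4]) for i in range(len(trace) - 3)]
    return lengths + bends + torsions

def rigid_transform(trace, angle, shift):
    """Rotate every point about the z-axis by `angle`, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in trace]

# Hypothetical five-residue C-alpha trace (made-up coordinates, not real data).
TRACE = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
         (8.0, 4.0, 2.3), (9.0, 7.0, 4.0)]
MOVED = rigid_transform(TRACE, 0.7, (10.0, -4.0, 2.5))

# The Cartesian coordinates differ, but the internal coordinates agree
# up to floating-point error.
max_diff = max(abs(a - b) for a, b in
               zip(internal_coordinates(TRACE), internal_coordinates(MOVED)))
print(f"largest internal-coordinate difference: {max_diff:.2e}")
```

A model that predicts quantities of this kind, rather than raw x, y, z positions, never has to learn that a rotated or shifted copy of a structure is the same structure.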
