Article

Evolutionary-scale prediction of atomic-level protein structure with a language model

Journal

SCIENCE
Volume 379, Issue 6637, Pages 1123-1130

Publisher

AMER ASSOC ADVANCEMENT SCIENCE
DOI: 10.1126/science.ade2574

Recent machine learning methods predict protein structure from multiple sequence alignments. Using a large language model, we instead infer full atomic-level protein structure directly from primary sequence. This yields an order-of-magnitude acceleration in high-resolution structure prediction, which makes structural characterization of metagenomic proteins feasible at large scale. With this capability, we constructed the ESM Metagenomic Atlas, which offers a view into the diversity of natural proteins.
Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.