4.8 Article

Using deep learning to annotate the protein universe

Journal

NATURE BIOTECHNOLOGY
Volume 40, Issue 6, Pages 932-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41587-021-01179-w

Keywords

-

Funding

  1. Simons Foundation

This article describes deep learning models that predict functional annotations for unaligned protein amino acid sequences. Evaluated on rigorous benchmarks built from the 17,929 families of the Pfam database, the models accurately predict function across protein families. Combining them with existing methods significantly improves remote homology detection, and the approach extends the coverage of Pfam by more than 9.5%. The authors suggest that such models will be a core component of future protein annotation tools.
A deep learning model predicts protein functional annotations for unaligned amino acid sequences. Understanding the relationship between amino acid sequence and protein function is a long-standing challenge with far-reaching scientific and translational implications. State-of-the-art alignment-based techniques cannot predict function for one-third of microbial protein sequences, hampering our ability to exploit data from diverse organisms. Here, we train deep learning models to accurately predict functional annotations for unaligned amino acid sequences across rigorous benchmark assessments built from the 17,929 families of the protein families database Pfam. The models infer known patterns of evolutionary substitutions and learn representations that accurately cluster sequences from unseen families. Combining deep models with existing methods significantly improves remote homology detection, suggesting that the deep models learn complementary information. This approach extends the coverage of Pfam by >9.5%, exceeding additions made over the last decade, and predicts function for 360 human reference proteome proteins with no previous Pfam annotation. These results suggest that deep learning models will be a core component of future protein annotation tools.
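To make "unaligned amino acid sequences" concrete, the following is a minimal sketch of how raw sequences of different lengths could be turned into fixed-shape model input without any alignment step. The one-hot encoding, the zero-padding scheme, and the names `one_hot_encode` and `encode_batch` are illustrative assumptions; the abstract does not specify the input representation the authors used.

```python
# Illustrative only: the encoding and all names here are assumptions,
# not details taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode one unaligned sequence as a max_len x 20 matrix of 0/1 values.

    Shorter sequences are padded with all-zero rows, longer ones are
    truncated, and non-standard residues (e.g. 'X') become all-zero rows.
    """
    matrix = []
    for residue in sequence[:max_len].upper():
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        matrix.append(row)
    # Pad so every sequence in a batch ends up with the same shape.
    while len(matrix) < max_len:
        matrix.append([0] * len(AMINO_ACIDS))
    return matrix


def encode_batch(sequences, max_len=None):
    """Encode several sequences, padding to the longest one by default."""
    if max_len is None:
        max_len = max(len(s) for s in sequences)
    return [one_hot_encode(s, max_len) for s in sequences]


if __name__ == "__main__":
    batch = encode_batch(["ACD", "MKVLA"])
    print(len(batch), len(batch[0]), len(batch[0][0]))
```

A classifier over protein families would consume matrices like these directly, so no multiple sequence alignment or profile is needed at prediction time.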
