Editorial Material

Transfer learning: The key to functionally annotate the protein universe

PATTERNS (2023)

Journal

PATTERNS

Volume 4, Issue 2

Publisher

CELL PRESS

DOI: 10.1016/j.patter.2023.100691

Automated Summary

The automatic annotation of the protein universe remains a challenge. Currently, only 0.25% of the 229,149,489 entries in the UniProtKB database have been functionally annotated. Manual annotation draws on knowledge from the protein families database Pfam, but the growth of annotations has been slow. Deep learning models can learn evolutionary patterns, but many protein families lack the large-scale data such models require. Transfer learning can overcome this limitation, reducing errors in protein family prediction by 55% with respect to standard methods.

Abstract

The automatic annotation of the protein universe is still an unresolved challenge. Today, there are 229,149,489 entries in the UniProtKB database, but only 0.25% of them have been functionally annotated. This annotation is a manual process that integrates knowledge from the protein families database Pfam, annotating family domains using sequence alignments and hidden Markov models. This approach has grown the Pfam annotations at a low rate in recent years. Recently, deep learning models have appeared that can learn evolutionary patterns from unaligned protein sequences. However, this requires large-scale data, while many families contain just a few sequences. Here, we contend that this limitation can be overcome by transfer learning, exploiting the full potential of self-supervised learning on large unannotated data followed by supervised learning on a small labeled dataset. We show results where errors in protein family prediction can be reduced by 55% with respect to standard methods.
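The two-stage idea in the abstract (self-supervised learning on unannotated sequences, then supervised learning on a few labeled ones) can be illustrated with a deliberately tiny sketch. This is not the authors' method or model: the next-residue statistics below are a toy stand-in for a pretrained representation, the nearest-centroid classifier is a stand-in for the supervised stage, and all sequences, family names, and function names are hypothetical. It assumes sequences use only the 20 standard amino acid letters.

```python
import math
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues only


def pretrain_residue_embeddings(unlabeled_seqs):
    """Stage 1 (no labels): embed each residue as its smoothed
    next-residue distribution, estimated from unannotated sequences."""
    counts = defaultdict(Counter)
    for seq in unlabeled_seqs:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    embeddings = {}
    for aa in AMINO_ACIDS:
        total = sum(counts[aa].values())
        # Add-one smoothing: unseen residues get a uniform vector.
        embeddings[aa] = [
            (counts[aa][nxt] + 1) / (total + len(AMINO_ACIDS))
            for nxt in AMINO_ACIDS
        ]
    return embeddings


def embed_sequence(seq, embeddings):
    """Frozen feature extractor: mean of the residue embeddings."""
    vec = [0.0] * len(AMINO_ACIDS)
    for aa in seq:
        for i, value in enumerate(embeddings[aa]):
            vec[i] += value
    return [value / len(seq) for value in vec]


def fit_centroids(labeled_seqs, embeddings):
    """Stage 2 (few labels): one centroid per family in embedding space."""
    grouped = defaultdict(list)
    for seq, family in labeled_seqs:
        grouped[family].append(embed_sequence(seq, embeddings))
    return {
        family: [sum(column) / len(vectors) for column in zip(*vectors)]
        for family, vectors in grouped.items()
    }


def predict(seq, centroids, embeddings):
    """Assign the family whose centroid is closest to the sequence embedding."""
    vec = embed_sequence(seq, embeddings)
    return min(centroids, key=lambda family: math.dist(vec, centroids[family]))


# Hypothetical toy data: many unlabeled sequences, one labeled example per family.
unlabeled = ["KLKLKLKL", "DEDEDEDE", "KLKLDEDE", "DEDEKLKL", "LKLKLK", "EDEDED"]
labeled = [("KLKLKL", "family_A"), ("DEDEDE", "family_B")]

embeddings = pretrain_residue_embeddings(unlabeled)  # learned without labels
centroids = fit_centroids(labeled, embeddings)       # learned from two labels
print(predict("LKLKLKLK", centroids, embeddings))
```

The point of the sketch is the division of labor: the representation is built once from plentiful unlabeled data and then reused unchanged, so the supervised stage only has to place a handful of labeled sequences in an already structured space. That is why it can work for families with very few members.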
