MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction

SCIENTIFIC REPORTS (2021)

Journal

SCIENTIFIC REPORTS

Volume 11, Issue 1

Publisher

NATURE RESEARCH

DOI: 10.1038/s41598-021-92395-6

Funding

National Science Foundation [DBI1759934, IIS1763246]
National Institutes of Health [GM093123]
Department of Energy, USA [DE-AR0001213, DE-SC0020400, DE-SC0021303]
U.S. Department of Energy (DOE) [DE-SC0020400, DE-SC0021303]

Automated Summary

This paper introduces MULTICOM2, the latest open-source protein tertiary structure prediction system, which integrates template-based and template-free modeling methods and can predict good tertiary structures across the board. The accuracy of the template-free modeling method on both TBM and FM targets is very close to that of the combined template-based and template-free approach, which demonstrates that distance-based template-free modeling powered by deep learning can largely replace traditional template-based modeling even on TBM targets.

Abstract

Protein structure prediction is an important problem in bioinformatics and has been studied for decades. However, few comprehensive open-source protein structure prediction packages are publicly available. In this paper, we present our latest open-source protein tertiary structure prediction system, MULTICOM2, which integrates template-based modeling (TBM) and template-free modeling (FM) methods. The template-based modeling uses sequence alignment tools with deep multiple sequence alignments to search for structural templates, and it is much faster and more accurate than that of MULTICOM1. The template-free (ab initio or de novo) modeling uses the inter-residue distances predicted by DeepDist to reconstruct tertiary structure models without using any known structure as a template. In the blind CASP14 experiment, the average TM-score of the models predicted by our server predictor based on the MULTICOM2 system is 0.720 for 58 TBM (regular) domains and 0.514 for 38 FM and FM/TBM (hard) domains, indicating that MULTICOM2 is capable of predicting good tertiary structures across the board. It can predict the correct fold for 76 CASP14 domains (95% of regular domains and 55% of hard domains) if only one prediction is made per domain. The success rate increases by 3% for both regular and hard domains if five predictions are made per domain. Moreover, the prediction accuracy of the pure template-free structure modeling method on both TBM and FM targets is very close to that of the combination of template-based and template-free modeling methods. This demonstrates that the distance-based template-free modeling method powered by deep learning can largely replace the traditional template-based modeling method, even on the TBM targets that TBM methods used to dominate, and therefore provides a uniform structure modeling approach for any protein. Finally, on the 38 CASP14 FM and FM/TBM hard domains, the MULTICOM2 server predictors (MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-DIST) were ranked among the top 20 automated server predictors in the CASP14 experiment. After combining multiple predictors from the same research group as one entry, MULTICOM-HYBRID was ranked no. 5. The source code of MULTICOM2 is freely available at https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.
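
The abstract reports accuracy as an average TM-score and as a correct-fold success rate. As a rough illustration of how such numbers are usually computed, here is a minimal Python sketch. It is not taken from the MULTICOM2 code. It uses the standard TM-score definition with the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8, and it assumes that the model has already been superimposed on the native structure (the real TM-score searches for the superposition that maximizes the score). The function names, the 0.5 Å floor on d0, and the TM-score > 0.5 cutoff for a "correct fold" are assumptions of this sketch; the cutoff is a common convention, not something stated on this page.

```python
def tm_d0(target_length):
    """Distance scale d0 (angstroms) from the standard TM-score definition.

    Assumption of this sketch: very short chains are clamped to 0.5,
    which also avoids a cube root of a negative number.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: per-residue distances (angstroms) between aligned
               model and native residues after superposition.
    target_length: number of residues in the native (target) domain.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def summarize(scores, fold_threshold=0.5):
    """Return (mean TM-score, fraction of domains above the threshold).

    fold_threshold=0.5 is the assumed 'correct fold' cutoff.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    mean = sum(scores) / len(scores)
    success = sum(1 for s in scores if s > fold_threshold) / len(scores)
    return mean, success
```

Calling `summarize` on the per-domain scores of the top-ranked models gives a one-prediction success rate. Taking the maximum score over five models per domain before calling it gives the best-of-five variant.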
