Article

Inter-domain distance prediction based on deep learning for domain assembly

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 24, Issue 3

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbad100

Keywords

multi-domain protein; domain assembly; inter-domain distance; deep learning


AlphaFold2 achieved a breakthrough in protein structure prediction with an end-to-end deep learning method that accurately predicts single-domain proteins. Its accuracy on full-chain proteins is lower, however, because interactions between domains are often predicted incorrectly. This study introduces DeepIDDP, an inter-domain distance prediction method that uses attention mechanisms and new inter-domain features to better capture domain interactions. Integrated into the SADA domain assembly method as SADA-DeepIDDP, it achieves inter-domain distance prediction accuracy 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively, and its domain assembly accuracy is 2.5% higher than that of SADA. Additionally, reassembling human multi-domain protein models with this method improves the average TM-score by 11.8%. The online server can be found at .
AlphaFold2 achieved a breakthrough in protein structure prediction through an end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy for full-chain proteins is generally lower than for single-domain proteins because of incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, in which two new inter-domain features are used to enhance the ability to capture interactions between domains. Furthermore, we propose a data enhancement strategy, termed DPMSA, to deal with the absence of co-evolutionary information for targets. We integrate DeepIDDP into our previously developed domain assembly method SADA; the combined method is termed SADA-DeepIDDP. Tested on a multi-domain benchmark dataset, the inter-domain distance prediction accuracy of SADA-DeepIDDP is 11.3% and 21.6% higher than that of trRosettaX and trRosetta, respectively, and the accuracy of the domain assembly model is 2.5% higher than that of SADA. In addition, we reassemble 68 human multi-domain protein models with TM-score <= 0.80 from the AlphaFold protein structure database; the average TM-score improves by 11.8% after reassembly by our method. The online server is at .
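
To make the term "inter-domain distance" concrete, the following is a minimal sketch of how the inter-domain block of a distance map could be computed from residue coordinates. It is not the DeepIDDP network or its feature pipeline; the function name `inter_domain_distances`, the single `boundary` index splitting the chain into two domains, and the toy coordinates are all assumptions made for illustration.

```python
import math

def inter_domain_distances(coords, boundary):
    """Distances between every residue of domain 1 and every residue of domain 2.

    coords   : list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    boundary : hypothetical index of the first residue of the second domain.
    Returns a len(domain1) x len(domain2) nested list of Euclidean distances.
    """
    if not 0 < boundary < len(coords):
        raise ValueError("boundary must split the chain into two non-empty domains")
    domain1 = coords[:boundary]
    domain2 = coords[boundary:]
    # Only cross-domain residue pairs are kept; intra-domain pairs are skipped.
    return [[math.dist(a, b) for b in domain2] for a in domain1]

# Toy four-residue chain: residues 0-1 form domain 1, residues 2-3 form domain 2.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
dists = inter_domain_distances(coords, boundary=2)
```

Here `dists[i][j]` is the distance between residue `i` of the first domain and residue `j` of the second. A distance predictor such as the one described above estimates this cross-domain block from sequence-derived features instead of from known coordinates.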
