Article

Benchmarking the Accuracy of AlphaFold 2 in Loop Structure Prediction

Journal

BIOMOLECULES
Volume 12, Issue 7

Publisher

MDPI
DOI: 10.3390/biom12070985

Keywords

AlphaFold 2; loop structure prediction

Funding

  1. National Science Foundation Graduate Research Fellowship Program [DGE-1939267]
  2. National Science Foundation [2137558]
  3. Substance Use Disorders Grand Challenge Pilot Research Award by the University of New Mexico
  4. NIH [P20GM121176]
  5. Research Allocations Committee (RAC) Award by the University of New Mexico
  6. Direct For Mathematical & Physical Scien
  7. Division Of Chemistry [2137558] Funding Source: National Science Foundation

The inhibition of protein-protein interactions is a growing strategy in drug development, and protein loop regions are potential drug targets. AlphaFold 2 performs well in predicting protein loop structures, especially for short loops. However, as the length of the loop increases, the accuracy of AlphaFold 2's prediction decreases.
The inhibition of protein-protein interactions is a growing strategy in drug development. In addition to structured regions, many protein loop regions are involved in protein-protein interactions and have therefore been identified as potential drug targets. To target such regions effectively, knowledge of the protein structure is critical. Loop structure prediction is a challenging subproblem in protein structure prediction because loop sequences are less conserved than those of secondary structure elements. AlphaFold 2 has been described as one of the greatest achievements in the field of protein structure prediction: it predicted protein structures with near X-ray-resolution accuracy in the 14th Critical Assessment of protein Structure Prediction (CASP14) competition in 2020. The purpose of this work is to survey the performance of AlphaFold 2 specifically in predicting protein loop regions. We constructed an independent dataset of 31,650 loop regions from 2,613 proteins (deposited after AlphaFold 2 was trained) with both experimentally determined structures and AlphaFold 2 predicted structures. Extensive evaluation on this dataset indicates that AlphaFold 2 is a good predictor of the structure of loop regions, especially short ones. Loops less than 10 residues in length have an average Root Mean Square Deviation (RMSD) of 0.33 Å and an average Template Modeling score (TM-score) of 0.82. However, as the number of residues in a loop increases, the accuracy of AlphaFold 2's prediction decreases: loops more than 20 residues in length have an average RMSD of 2.04 Å and an average TM-score of 0.55. This correlation between accuracy and loop length is directly linked to the greater flexibility of longer loops. Moreover, AlphaFold 2 slightly over-predicts alpha-helices and beta-strands in proteins.
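
To make the length-dependent comparison above concrete, the following is a minimal sketch of how per-loop RMSD values might be computed and averaged by loop length. It is not the authors' pipeline. The function names (`loop_rmsd`, `length_bin`, `mean_rmsd_by_length`) are hypothetical, the coordinates are assumed to be already superimposed (no alignment step is performed here), and the middle bin of 10 to 20 residues is an assumption, since the abstract only reports the "less than 10" and "more than 20" groups.

```python
import math
from collections import defaultdict


def loop_rmsd(pred, ref):
    """RMSD (in the units of the input, e.g. angstroms) between two
    equal-length lists of (x, y, z) coordinates.

    Assumption: the predicted and experimental coordinates have already
    been superimposed; this function does not align them.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def length_bin(n_residues):
    """Assign a loop to a length group. The '<10' and '>20' groups follow
    the abstract; the '10-20' group is an assumed middle bin."""
    if n_residues < 10:
        return "<10"
    if n_residues <= 20:
        return "10-20"
    return ">20"


def mean_rmsd_by_length(loops):
    """Average RMSD per length group.

    loops: iterable of (n_residues, rmsd) pairs.
    """
    groups = defaultdict(list)
    for n_residues, rmsd in loops:
        groups[length_bin(n_residues)].append(rmsd)
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    toy_loops = [(5, 0.2), (7, 0.4), (14, 1.1), (25, 2.0)]
    print(mean_rmsd_by_length(toy_loops))
```

In practice the superposition step (for example, aligning on the whole chain or on the residues flanking the loop) strongly affects loop RMSD, so any comparison should state which alignment was used.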
