Sub-AQUA: real-value quality assessment of protein structure models

Journal

PROTEIN ENGINEERING DESIGN & SELECTION
Volume 23, Issue 8, Pages 617-632

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/protein/gzq030

Keywords

error estimation; homology modeling; model quality assessment; protein structure prediction; regression analysis; threading

Funding

  1. National Institute of General Medical Sciences of the National Institutes of Health [U24GM077905, R01GM075004]
  2. National Science Foundation [DMS604776, DMS800568, IIS0915801, EF0850009]
  3. Purdue Research Foundation
  4. Department of Biological Sciences, Purdue University
  5. National Science Foundation, Directorate for Biological Sciences, Emerging Frontiers [0850009]

Computational protein tertiary structure prediction has made significant progress in recent years. However, most existing structure prediction methods cannot predict the accuracy of the models they construct. Knowing the accuracy of a structure model is crucial for its practical use, since the accuracy determines the potential applications of the model. Here we have developed quality assessment methods that predict real values of the global and local quality of protein structure models. The global quality of a model is defined as the root mean square deviation (RMSD) and the LGA score relative to its native structure. The local quality is defined as the distance between the corresponding Cα positions of a model and its native structure when they are superimposed. Three regression methods are employed to combine different types of quality assessment measures of models, including alignment-level scores, residue-position-level scores, atomic-detailed structure-level scores and composite scores. The regression models were tested on a large benchmark data set of template-based protein structure models of various qualities. In predicting RMSD and the LGA score, a combination of two terms performs strongly: length-normalized SPAD, a score that assesses alignment stability by considering suboptimal alignments, and Verify3D normalized by the square of the model length. This combination achieves 97.1% and 83.6% accuracy in identifying models with an RMSD of < 2 and 6 Å, respectively. For predicting the local quality of models, we find that a two-step approach, in which the global RMSD predicted in the first step is further combined with the other terms, can dramatically increase the accuracy. Finally, the developed regression equations are applied to assess the quality of structure models of the whole E. coli proteome.
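The two quality definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes the model and native Cα coordinates are already superimposed and listed in the same residue order, and the function names `ca_distances` and `global_rmsd` and the toy coordinates are hypothetical. It computes only the target quantities that the regression is trained to predict, not the SPAD or Verify3D terms.

```python
import math

def ca_distances(model_ca, native_ca):
    """Local quality: per-residue distance between corresponding C-alpha atoms.

    Assumes both structures are already superimposed (illustrative assumption).
    """
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("need two non-empty coordinate lists of equal length")
    return [math.dist(m, n) for m, n in zip(model_ca, native_ca)]

def global_rmsd(model_ca, native_ca):
    """Global quality: root mean square of the per-residue C-alpha distances."""
    d = ca_distances(model_ca, native_ca)
    return math.sqrt(sum(x * x for x in d) / len(d))

# Toy coordinates in angstroms (made up for illustration only).
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
native = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]

local = ca_distances(model, native)   # per-residue errors: 0, 3 and 4 angstroms
rmsd = global_rmsd(model, native)     # sqrt((0 + 9 + 16) / 3)
print(local, round(rmsd, 3))
```

In a real evaluation the superposition would be computed first (for example with a least-squares fitting procedure), and these values would serve as the regression targets rather than as predictions.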
