4.3 Article

Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm

Journal

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS
Volume 56, Issue 3, Pages 502-518

Publisher

WILEY
DOI: 10.1002/prot.20106

Keywords

protein structure prediction; fold recognition; structural alignment; weakly homologous/analogous proteins; M. genitalium; E. coli; S. cerevisiae; genomes

Funding

  1. NIGMS NIH HHS [GM-48835] Funding Source: Medline

This article describes the PROSPECTOR_3 threading algorithm, which combines various scoring functions designed to match structurally related target/template pairs. Each variant described was found to have a Z-score above which most identified templates have good structural (threading) alignments, Z(struct) (Z(good)). 'Easy' targets with accurate threading alignments are identified as single templates with Z > Z(good) or two templates, each with Z > Z(struct), having a good consensus structure in mutually aligned regions. 'Medium' targets have a pair of templates lacking a consensus structure, or a single template for which Z(struct) < Z < Z(good). PROSPECTOR_3 was applied to a comprehensive Protein Data Bank (PDB) benchmark composed of 1491 single domain proteins, 41-200 residues long and no more than 30% identical to any threading template. Of the proteins, 878 were found to be easy targets, with 761 having a root mean square deviation (RMSD) from native of less than 6.5 Å. The average contact prediction accuracy was 46%, and on average 17.6-residue continuous fragments were predicted with RMSD values of 2.0 Å. There were 606 medium targets identified, 87% (31%) of which had good structural (threading) alignments. On average, 9.1-residue continuous fragments with RMSD of 2.5 Å were predicted. Combining easy and medium sets, 63% (91%) of the targets had good threading (structural) alignments compared to native; the average target/template sequence identity was 22%. Only nine targets lacked matched templates. Moreover, PROSPECTOR_3 consistently outperforms PSI-BLAST. Similar results were predicted for open reading frames (ORFs) of ≤200 residues in the M. genitalium, E. coli and S. cerevisiae genomes. Thus, progress has been made in identification of weakly homologous/analogous proteins, with very high alignment coverage, both in a comprehensive PDB benchmark and in genomes. © 2004 Wiley-Liss, Inc.
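
The easy/medium rule stated in the abstract can be written as a small decision function. The following is a minimal sketch only, not the PROSPECTOR_3 implementation: the names `Hit` and `classify_target`, the `has_consensus` flag (standing in for the consensus-structure check, which the abstract does not specify), the "unassigned" label, and the numeric thresholds in the usage lines are all hypothetical. It assumes Z(struct) < Z(good), as the medium-target condition implies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One threading template hit (hypothetical container)."""
    template_id: str
    z_score: float


def classify_target(hits: List[Hit], z_struct: float, z_good: float,
                    has_consensus: bool) -> str:
    """Label a target 'easy', 'medium', or 'unassigned'.

    Assumes z_struct < z_good. `has_consensus` is supplied by the caller
    and says whether the templates above z_struct share a good consensus
    structure in mutually aligned regions.
    """
    # Rule 1: any single template with Z > Z(good) makes the target easy.
    if any(h.z_score > z_good for h in hits):
        return "easy"

    # Templates that clear the lower, structural-alignment threshold.
    above_struct = [h for h in hits if h.z_score > z_struct]

    # Rule 2: two or more such templates are easy with a consensus,
    # medium without one.
    if len(above_struct) >= 2:
        return "easy" if has_consensus else "medium"

    # Rule 3: a lone template with Z(struct) < Z <= Z(good) is medium.
    if len(above_struct) == 1:
        return "medium"

    # No template cleared Z(struct); this label is an assumption.
    return "unassigned"


# Usage with made-up thresholds and scores (illustration only).
Z_STRUCT, Z_GOOD = 5.0, 10.0
print(classify_target([Hit("t1", 12.0)], Z_STRUCT, Z_GOOD, False))
print(classify_target([Hit("t1", 7.0), Hit("t2", 6.0)], Z_STRUCT, Z_GOOD, True))
```

Passing the consensus decision in as a flag keeps the sketch independent of how structural agreement between templates is actually measured.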
