☆ 4.8 Article

The protein structure prediction problem could be solved using the current PDB library

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2005)

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

卷 102, 期 4, 页码 1029-1034

出版社

NATL ACAD SCIENCES

DOI: 10.1073/pnas.0407152101

关键词

类别

Multidisciplinary Sciences

资金

NIGMS NIH HHS [R01 GM037408, R01 GM048835, GM-37408, GM-48835] Funding Source: Medline

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

For single-domain proteins, we examine the completeness of the structures in the current Protein Data Bank (PDB) library for use in full-length model construction of unknown sequences. To address this issue, we employ a comprehensive benchmark set of 1,489 medium-size proteins that cover the PDB at the level of 35% sequence identity and identify templates by structure alignment. With homologous proteins excluded, we can always find similar folds to native with an average rms deviation (RMSD) from native of 2.5 Angstrom with approximate to 82% alignment coverage. These template structures often contain a significant number of insertions/deletions. The TASSER algorithm was applied to build full-length models, where continuous fragments are excised from the top-scoring templates and reassembled under the guide of an optimized force field, which includes consensus restraints taken from the templates and knowledge-based statistical potentials. For almost all targets (except for 2/1,489), the resultant full-length models have an RMSD to native below 6 Angstrom (97% of them below 4 Angstrom). On average, the RMSD of full-length models is 2.25 Angstrom, with aligned regions improved from 2.5 A to 1.88 Angstrom, comparable with the accuracy of low-resolution experimental structures. Furthermore, starting from state-of-the-art structural alignments, we demonstrate a methodology that can consistently bring template-based alignments closer to native. These results are highly suggestive that the protein-folding problem can in principle be solved based on the current PDB library by developing efficient fold recognition algorithms that can recover such initial alignments.

The protein structure prediction problem could be solved using the current PDB library

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

出版社

NATL ACAD SCIENCES

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

The protein structure prediction problem could be solved using the current PDB library

期刊

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

出版社

NATL ACAD SCIENCES

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文