Journal
ACTA CRYSTALLOGRAPHICA SECTION D-STRUCTURAL BIOLOGY
Volume 63, Issue -, Pages 941-950
Publisher
INT UNION CRYSTALLOGRAPHY
DOI: 10.1107/S0907444907033847
The genomics era has seen the propagation of numerous databases containing easily accessible data that are routinely used by investigators to interpret results and generate new ideas. Most investigators consider data extracted from scientific databases to be error-free. However, data generated by all experimental techniques contain errors and some, including the coordinates in the Protein Data Bank (PDB), also integrate the subjective interpretations of experimentalists. This paper explores the determinants of the protein structure quality metrics used routinely by protein crystallographers. These metrics, which include the R factor, R-free, the real-space correlation coefficient and Ramachandran violations, are available for most structures in the database. All structures in the PDB were analyzed for their overall quality based on nine different quality metrics. Multivariate statistical analysis revealed that while technological improvements have increased the number of structures determined, the overall quality of structures has remained constant. The quality of structures deposited by structural genomics initiatives is generally better than the quality of structures from individual investigator laboratories. The most striking result is the association between structure quality and the journal in which the structure was first published. The worst offenders are the apparently high-impact general science journals. The rush to publish high-impact work in a competitive atmosphere may have led to the proliferation of poor-quality structures.