Article

Percolation of annotation errors through hierarchically structured protein sequence databases

Journal

MATHEMATICAL BIOSCIENCES
Volume 193, Issue 2, Pages 223-234

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.mbs.2004.08.001

Keywords

annotation errors; biological function; database quality; hierarchical classification; homology; probability model; protein database; protein sequence

Funding

  1. Medical Research Council [MC_U105260556] Funding Source: Medline

Abstract

Databases of protein sequences have grown rapidly in recent years as a result of genome sequencing projects. Annotating protein sequences with descriptions of their biological function ideally requires careful experimentation, but this work lags far behind. Instead, biological function is often imputed by copying annotations from similar protein sequences. This gives rise to annotation errors and, more seriously, to chains of misannotation. An earlier study [Percolation of annotation errors in a database of protein sequences (2002)] developed a probabilistic framework for exploring the consequences of this percolation of errors through protein databases, and applied the theory to a simple database model. Here we apply the theory to hierarchically structured protein sequence databases, and draw conclusions about database quality at different levels of the hierarchy. © 2005 Elsevier Inc. All rights reserved.
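
To make the idea of error percolation concrete, the following is a minimal toy simulation and not the probabilistic model from the paper or the 2002 study. It assumes a flat (non-hierarchical) database in which each new entry is either annotated experimentally or copies the annotation of a uniformly chosen existing entry; the function name and all parameters (`p_experimental`, `p_copy_error`, `p_experiment_error`) are hypothetical and chosen only for illustration.

```python
import random


def simulate_error_fraction(n_entries, p_experimental, p_copy_error,
                            p_experiment_error=0.0, seed=0):
    """Return the fraction of wrongly annotated entries in a toy database.

    Assumed toy rules (not taken from the paper):
    - with probability p_experimental an entry is annotated by experiment,
      which is wrong with probability p_experiment_error;
    - otherwise it copies the annotation of a random earlier entry, inheriting
      any existing error and adding a new one with probability p_copy_error.
    """
    rng = random.Random(seed)
    correct = []  # correct[i] is True if entry i carries a correct annotation
    for _ in range(n_entries):
        if not correct or rng.random() < p_experimental:
            # Experimental annotation (the first entry has nothing to copy).
            is_correct = rng.random() >= p_experiment_error
        else:
            # Copied annotation: an error in the source is propagated.
            source_is_correct = rng.choice(correct)
            is_correct = source_is_correct and rng.random() >= p_copy_error
        correct.append(is_correct)
    return 1.0 - sum(correct) / n_entries


if __name__ == "__main__":
    for p_exp in (0.5, 0.1, 0.01):
        frac = simulate_error_fraction(10_000, p_exp, p_copy_error=0.05)
        print(f"p_experimental={p_exp}: error fraction {frac:.3f}")
```

Under these assumptions, lowering `p_experimental` lengthens the chains of copied annotations, so errors introduced early can be inherited by many later entries. This is the qualitative effect the abstract calls chains of misannotation.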
