A question of size: the eukaryotic proteome and the problems in defining it

Journal

NUCLEIC ACIDS RESEARCH
Volume 30, Issue 5, Pages 1083-1090

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/30.5.1083

Funding

  1. NATIONAL CANCER INSTITUTE [R01CA077808] Funding Source: NIH RePORTER
  2. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [P50HG002357] Funding Source: NIH RePORTER
  3. NCI NIH HHS [CA77808, R01 CA077808] Funding Source: Medline
  4. NHGRI NIH HHS [HG02357-01, P50 HG002357] Funding Source: Medline

We discuss the problems in defining the extent of the proteomes for completely sequenced eukaryotic organisms (i.e. the total number of protein-coding sequences), focusing on yeast, worm, fly and human. (i) Six years after completion of its genome sequence, the true size of the yeast proteome is still not defined. New small genes are still being discovered, and a large number of existing annotations are being called into question, with these questionable ORFs (qORFs) comprising up to one-fifth of the 'current' proteome. We discuss these in the context of an ideal genome-annotation strategy that considers the proteome as a rigorously defined subset of all possible coding sequences ('the orfome'). (ii) Despite the greater apparent complexity of the fly (more cells, more complex physiology, longer lifespan), the nematode worm appears to have more genes. To explain this, we compare the annotated proteomes of worm and fly, relating to both genome-annotation and genome evolution issues. (iii) The unexpectedly small size of the gene complement estimated for the complete human genome provoked much public debate about the nature of biological complexity. However, in the first instance, for the human genome, the relationship between gene number and proteome size is far from simple. We survey the current estimates for the numbers of human genes and, from this, we estimate a range for the size of the human proteome. The determination of this is substantially hampered by the unknown extent of the cohort of pseudogenes ('dead' genes), in combination with the prevalence of alternative splicing. (Further information relating to yeast is available at http://genecensus.org/yeast/orfome).
