Article

Novel gene and gene model detection using a whole genome open reading frame analysis in proteomics

Journal

GENOME BIOLOGY
Volume 7, Issue 4, Pages -

Publisher

BMC
DOI: 10.1186/gb-2006-7-4-r35

Keywords

-

Funding

  1. NCRR NIH HHS [P41 RR018627] Funding Source: Medline
  2. NIDA NIH HHS [U54 DA021519] Funding Source: Medline
  3. NLM NIH HHS [R01 LM008106] Funding Source: Medline
  4. PHS HHS [84982] Funding Source: Medline
  5. NATIONAL CENTER FOR RESEARCH RESOURCES [P41RR018627] Funding Source: NIH RePORTER
  6. NATIONAL INSTITUTE ON DRUG ABUSE [U54DA021519] Funding Source: NIH RePORTER
  7. NATIONAL LIBRARY OF MEDICINE [R01LM008106] Funding Source: NIH RePORTER

Background: Defining the location of genes and the precise nature of gene products remains a fundamental challenge in genome annotation. Interrogating tandem mass spectrometry data using genomic sequence provides an unbiased method to identify novel translation products. A six-frame translation of the entire human genome was used as the query database to search for novel blood proteins in the data from the Human Proteome Organization Plasma Proteome Project. Because this target database is orders of magnitude larger than the databases traditionally employed in tandem mass spectra analysis, careful attention to significance testing is required. Confidence of identification is assessed using our previously described Poisson statistic, which estimates the significance of multi-peptide identifications incorporating the length of the matching sequence, number of spectra searched and size of the target sequence database.

Results: Applying a false discovery rate threshold of 0.05, we identified 282 significant open reading frames, each containing two or more peptide matches. There were 627 novel peptides associated with these open reading frames that mapped to a unique genomic coordinate placed within the start/stop points of previously annotated genes. These peptides matched 1,110 distinct tandem MS spectra. Peptides fell into four categories based upon where their genomic coordinates placed them relative to annotated exons within the parent gene.

Conclusion: This work provides evidence for novel alternative splice variants in many previously annotated genes. These findings suggest that annotation of the genome is not yet complete and that proteomics has the potential to further add to our understanding of gene structures.
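To make the idea of a six-frame translation concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes the standard genetic code, an uppercase DNA string, and a stop-to-stop definition of an open reading frame. The function names (`reverse_complement`, `translate`, `six_frame_translation`, `stop_to_stop_orfs`) and the `min_len` parameter are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, built with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate codon by codon; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate all three frames on both strands (assumed frame labels)."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def stop_to_stop_orfs(protein: str, min_len: int = 2) -> list:
    """Split a translated frame at stop codons ('*') and keep long segments."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCCTAA").items():
        print(label, protein, stop_to_stop_orfs(protein))
```

Searching spectra against every frame of a whole genome yields a far larger candidate set than a curated protein database, which is why the abstract stresses significance testing for the resulting matches.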
