Article

Predicting Novel Human Gene Ontology Annotations Using Semantic Analysis

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2008.29

Keywords

Gene function prediction; gene annotation; Gene Ontology; vector space model; latent semantic indexing; weighting schemes

Funding

  1. NSF [DBI-0234806]
  2. NIH(NCRR) [1S10 RR017857-01]
  3. MLSC [MEDC-538, GR-352]
  4. NIH [1R21 CA10074001, 1R21 EB00990-01, 1R01 NS045207-01]
  5. NATIONAL CENTER FOR RESEARCH RESOURCES [S10RR017857] Funding Source: NIH RePORTER
  6. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [R01HG003491] Funding Source: NIH RePORTER
  7. NATIONAL INSTITUTE OF BIOMEDICAL IMAGING AND BIOENGINEERING [R21EB000990] Funding Source: NIH RePORTER
  8. NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES [R01DK089167] Funding Source: NIH RePORTER
  9. NATIONAL INSTITUTE OF NEUROLOGICAL DISORDERS AND STROKE [R01NS045207] Funding Source: NIH RePORTER

The correct interpretation of many molecular biology experiments depends critically on the accuracy and consistency of the existing annotation databases. Such databases are meant to act as repositories for our biological knowledge as we acquire and refine it. Hence, by definition, they are incomplete at any given time. In this paper, we describe a technique that improves on our previous method for predicting novel Gene Ontology (GO) annotations by extracting implicit semantic relationships between genes and functions. In this work, we use a vector space model and a number of weighting schemes in addition to our previous latent semantic indexing approach. The technique described here takes into consideration the hierarchical structure of the GO and can assign different weights to GO terms situated at different depths. The prediction abilities of 15 different weighting schemes are compared and evaluated. Nine of these schemes were previously used in other problem domains, while six are introduced in this paper. The best weighting scheme was a novel scheme, n2tn. Of the top 50 functional annotations predicted using this weighting scheme, we found support in the literature for 84 percent, while 6 percent were contradicted by the existing literature. For the remaining 10 percent, we did not find any relevant publications to confirm or contradict the predictions. The n2tn weighting scheme also outperformed the simple binary scheme used in our previous approach.
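The general idea of combining a weighted vector space model with latent semantic indexing can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not define the n2tn scheme, so a standard inverse-document-frequency weight stands in for it, and the sketch ignores the GO hierarchy and term depth. The function names, the toy gene-by-term matrix, and the choices of `k` and `top_n` are all assumptions made for illustration.

```python
import numpy as np

def idf_weights(A):
    """Global weight per GO term: log(n_genes / genes annotated with it).

    Stand-in weighting only; this is NOT the paper's n2tn scheme.
    """
    n_genes = A.shape[0]
    df = A.sum(axis=0)
    # Terms with no annotations get weight 0 (avoids division by zero).
    return np.where(df > 0, np.log(n_genes / np.maximum(df, 1)), 0.0)

def low_rank_scores(W, k):
    """Rank-k reconstruction of the weighted matrix via truncated SVD (LSI)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def predict_annotations(A, k, top_n):
    """Return the top_n (gene, term) index pairs that are not yet annotated."""
    W = A * idf_weights(A)            # weight each term column
    S = low_rank_scores(W, k)         # latent-space association scores
    S = np.where(A > 0, -np.inf, S)   # exclude existing annotations
    order = np.argsort(S, axis=None)[::-1][:top_n]
    return [tuple(int(i) for i in np.unravel_index(idx, S.shape))
            for idx in order]

# Hypothetical toy data: rows are genes, columns are GO terms,
# 1 means the gene is currently annotated with the term.
A = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
], dtype=float)

predictions = predict_annotations(A, k=2, top_n=3)
for gene, term in predictions:
    print(f"candidate annotation: gene {gene} -> GO term {term}")
```

Each returned pair is a candidate annotation absent from the input matrix. In the study, such candidates were checked against the literature, which a sketch like this does not attempt.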
