Article

Integrating Biological Knowledge with Gene Expression Profiles for Survival Prediction of Cancer

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 16, Issue 2, Pages 265-278

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2008.12TT

Keywords

gene expression; gene ontology; microarrays; pathway analysis; survival prediction

Funding

  1. NHLBI SCCOR [1-P50-HL-077107]
  2. NICHD [5P30-HD015052-25]
  3. NIH [1P50-MH078028-01A1]

Due to the large variability in survival times between cancer patients and the plethora of genes on microarrays unrelated to outcome, building accurate prediction models that are easy to interpret remains a challenge. In this paper, we propose a general strategy for improving the performance and interpretability of prediction models by integrating gene expression data with prior biological knowledge. First, we link gene identifiers in the expression dataset with gene annotation databases such as Gene Ontology (GO). Then we construct a supergene for each gene category by summarizing information from the genes related to outcome using a modified principal component analysis (PCA) method. Finally, instead of using individual genes as predictors, we use these supergenes, each representing information from one gene category, as predictors of survival outcome. In addition to identifying gene categories associated with outcome, the proposed approach also performs within-category selection to identify important genes within each gene set. We show, using two real breast cancer microarray datasets, that prediction models built on gene set (or pathway) information outperform prediction models based on expression values of single genes, with improved prediction accuracy and interpretability.
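
The three steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the PCA is modified, so this example assumes plain PCA applied to genes screened by absolute Pearson correlation with a numeric outcome, and it ignores censoring. The function names (`supergene`, `build_supergenes`), the `top_k` parameter, the category labels, and the simulated data are all hypothetical.

```python
import numpy as np

def supergene(expr, outcome, gene_idx, top_k=3):
    """Summarize one gene category as a single score per sample.

    expr     : (n_samples, n_genes) expression matrix
    outcome  : (n_samples,) numeric outcome (assumed uncensored here)
    gene_idx : list of column indices annotated to the category
    Returns the first principal component of the top_k screened genes
    and the column indices of those genes.
    """
    sub = expr[:, gene_idx]
    sub = sub - sub.mean(axis=0)              # center each gene
    y = outcome - outcome.mean()
    # Assumed screening score: absolute Pearson correlation with the outcome.
    denom = np.linalg.norm(sub, axis=0) * np.linalg.norm(y)
    score = np.abs(sub.T @ y) / np.where(denom > 0, denom, 1.0)
    keep = np.argsort(score)[::-1][:top_k]    # within-category gene selection
    selected = sub[:, keep]
    # First principal component via SVD of the centered submatrix.
    u, s, _ = np.linalg.svd(selected, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    return pc1, [gene_idx[i] for i in keep]

def build_supergenes(expr, outcome, categories, top_k=3):
    """Compute one supergene per category; categories maps name -> gene indices."""
    feats, chosen = {}, {}
    for name, idx in categories.items():
        feats[name], chosen[name] = supergene(expr, outcome, list(idx), top_k)
    return feats, chosen

# Simulated data: 40 samples, 12 genes, two hypothetical categories.
rng = np.random.default_rng(0)
expr = rng.normal(size=(40, 12))
outcome = expr[:, 0] + expr[:, 1] + 0.1 * rng.normal(size=40)
categories = {"category_A": [0, 1, 2, 3, 4, 5], "category_B": [6, 7, 8, 9, 10, 11]}

feats, chosen = build_supergenes(expr, outcome, categories)
```

The resulting `feats` values would then replace individual genes as covariates in a survival model (for example, a Cox proportional hazards model), which is not shown here. In a real analysis the screening and PCA steps would need to be refit inside each cross-validation fold to avoid selection bias.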
