Article

Group additive regression models for genomic data analysis

Journal

BIOSTATISTICS
Volume 9, Issue 1, Pages 100-113

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/biostatistics/kxm015

Keywords

AFT models; boosting; gradient descent boosting; pathway

Funding

  1. NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES [R01ES009911] Funding Source: NIH RePORTER
  2. NIEHS NIH HHS [ES009911] Funding Source: Medline

One important problem in genomic research is to identify genomic features, such as gene expression data or DNA single nucleotide polymorphisms (SNPs), that are related to clinical phenotypes. Often these genomic data can be naturally divided into biologically meaningful groups, such as genes belonging to the same pathways or SNPs within genes. In this paper, we propose group additive regression models and a group gradient descent boosting procedure for identifying groups of genomic features that are related to clinical phenotypes. Our simulation results show that by dividing the variables into appropriate groups, we can obtain better identification of the group features that are related to the phenotypes. In addition, the prediction mean square errors are smaller than those of the component-wise boosting procedure. We demonstrate the application of the methods to pathway-based analysis of microarray gene expression data of breast cancer. Results from analysis of a breast cancer microarray gene expression data set indicate that the pathways of metalloendopeptidases (MMPs) and MMP inhibitors, as well as cell proliferation, cell growth, and maintenance, are important to breast cancer-specific survival.
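
To make the idea of boosting over groups rather than single variables concrete, the following is a minimal sketch only, not the procedure from the paper. It assumes squared-error loss and a plain least-squares base learner per group, whereas the paper's keywords indicate it also covers AFT models. The function name `group_l2_boost` and the parameters `groups`, `n_steps`, and `nu` are hypothetical names chosen for this illustration.

```python
import numpy as np


def group_l2_boost(X, y, groups, n_steps=100, nu=0.1):
    """Illustrative group-wise L2 boosting (assumed setup, not the paper's code).

    X       : (n, p) feature matrix, e.g. gene expression values
    y       : (n,) continuous phenotype
    groups  : list of integer index arrays, one per group (e.g. per pathway)
    n_steps : number of boosting iterations
    nu      : step size (shrinkage) applied to each group update
    """
    n, p = X.shape
    intercept = y.mean()
    beta = np.zeros(p)
    counts = np.zeros(len(groups), dtype=int)  # how often each group is chosen
    resid = y - intercept  # negative gradient of squared-error loss

    for _ in range(n_steps):
        best = None
        # Fit the current residuals with each group's variables jointly.
        for g, idx in enumerate(groups):
            Xg = X[:, idx]
            coef, *_ = np.linalg.lstsq(Xg, resid, rcond=None)
            rss = np.sum((resid - Xg @ coef) ** 2)
            if best is None or rss < best[0]:
                best = (rss, g, coef)

        # Update only the group that reduces the residual sum of squares most.
        _, g, coef = best
        beta[groups[g]] += nu * coef
        resid -= nu * (X[:, groups[g]] @ coef)
        counts[g] += 1

    return intercept, beta, counts
```

In this sketch, groups that are selected often (large entries in `counts`) are the candidate phenotype-related groups; component-wise boosting corresponds to the special case where every group contains a single variable.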
