Journal
BRIEFINGS IN BIOINFORMATICS
Volume 15, Issue 2, Pages 279-291
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bib/bbs087
Keywords
entropy; gene-centric association; mutual information; set-based association
Funding
- US National Science Foundation [DMS-1209112, IOS-1237969]
- Direct For Mathematical & Physical Scien
- Division Of Mathematical Sciences [1209112] Funding Source: National Science Foundation
- Division Of Integrative Organismal Systems
- Direct For Biological Sciences [1237969] Funding Source: National Science Foundation
Set-based association studies based on genes or pathways have shown great promise in interpreting association signals for complex diseases. These approaches are particularly useful when the variants in a set have moderate effects and are difficult to detect with single-marker analysis, especially when the variants function jointly in a complicated manner. Set-based analyses either use a summary statistic, such as the maximum or average of the individual signals (e.g. a chi-square statistic) over all variants in a set, or consider the joint distribution of those signals to assess the significance of the set. The signal obtained in this way, however, can be diluted when noisy variants are not handled properly, inflating either false negatives or false positives. Thus, the selection of disease-informative single-nucleotide polymorphisms (diSNPs) plays a crucial role in improving the power of a set-based association study. In this work, we propose an efficient diSNP selection method based on information theory. We select diSNP variants by considering their relative information contribution to disease status, which is different from the usual tag SNP selection. The relative merit of pre-selecting diSNPs in a set-based association analysis is demonstrated through extensive simulation studies and real data analysis.
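To make the information-theoretic idea concrete, here is a minimal Python sketch that ranks SNPs by their empirical mutual information with a binary disease status. It is an illustration only, not the authors' procedure: the abstract does not specify how the "relative information contribution" is computed or thresholded, so the function names `mutual_information` and `rank_snps`, the 0/1/2 genotype coding, and the toy data are all assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(genotypes, status):
    """Empirical mutual information I(G; D), in bits, of two discrete sequences."""
    n = len(genotypes)
    if n == 0 or n != len(status):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(genotypes, status))   # counts of (genotype, status) pairs
    g_counts = Counter(genotypes)             # marginal genotype counts
    d_counts = Counter(status)                # marginal disease-status counts
    mi = 0.0
    for (g, d), c in joint.items():
        # p(g, d) * log2( p(g, d) / (p(g) * p(d)) ), written with raw counts
        mi += (c / n) * log2(c * n / (g_counts[g] * d_counts[d]))
    return mi


def rank_snps(snps, status):
    """Return (name, MI) pairs sorted by mutual information, highest first.

    `snps` is a hypothetical dict mapping a SNP name to its genotype list.
    """
    scores = {name: mutual_information(g, status) for name, g in snps.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (assumed): 1 = case, 0 = control; genotypes coded as minor-allele counts.
status = [1, 1, 1, 1, 0, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 1, 1, 0, 0, 0, 0],  # genotype fully determines status
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],  # genotype independent of status
}
ranking = rank_snps(snps, status)
```

In this toy example `snp_a` ranks above `snp_b`, because its genotype carries information about case/control status while `snp_b` carries none. A real analysis would also need to decide how many top-ranked variants to keep and to account for the upward bias of plug-in mutual information estimates in small samples.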