Article

Variable selection for large p small n regression models with incomplete data: Mapping QTL with epistases

Journal

BMC Bioinformatics
Volume 9

Publisher

BMC
DOI: 10.1186/1471-2105-9-251

Funding

  1. NIGMS NIH HHS [R01 GM083606, 1R01GM083606-01] Funding Source: Medline

Background: Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations. Missing trait and/or marker values prevent one from directly applying classical model selection criteria such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC).

Results: We propose a two-step Bayesian variable selection method that deals with both the sparse parameter space and the small sample size. The regression coefficient priors are flexible enough to incorporate the characteristics of large p small n data. Specifically, sparseness and possible asymmetry of the significant coefficients are dealt with by developing a Gibbs sampling algorithm that stochastically searches through low-dimensional subspaces for significant variables. The superior performance of the approach is demonstrated via a simulation study. We also applied it to real QTL mapping datasets.

Conclusion: The two-step procedure coupled with Bayesian classification offers flexibility in modeling large p small n data, especially for a sparse and asymmetric parameter space. This approach can be extended to other settings characterized by high dimension and low sample size.
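To make the idea of a Gibbs sampler that stochastically searches for significant variables more concrete, the following is a minimal sketch of a generic spike-and-slab sampler. It is not the two-step procedure proposed in the paper: it assumes complete data, a known noise variance, a symmetric normal slab, and a point-mass spike at zero. The function names (`simulate`, `gibbs_ssvs`) and every parameter value are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate(n, p, effects, noise_sd, rng):
    """Simulate y = X beta + noise with standard-normal predictors.

    `effects` maps a column index to its true coefficient; every other
    column is a null variable (hypothetical toy data, not QTL data).
    """
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(row[j] * b for j, b in effects.items()) + rng.gauss(0.0, noise_sd)
         for row in X]
    return X, y


def gibbs_ssvs(X, y, sigma2=1.0, tau2=10.0, prior_incl=0.05,
               n_iter=400, burn_in=100, rng=None):
    """Single-site Gibbs sampler for a point-mass spike-and-slab prior.

    Assumptions (ours, for illustration): noise variance `sigma2` is known,
    included coefficients have a N(0, tau2) prior, and each variable is
    included a priori with probability `prior_incl`.
    Returns the estimated posterior inclusion probability of each column.
    """
    rng = rng or random.Random(0)
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_ss = [sum(v * v for v in col) for col in cols]
    beta = [0.0] * p
    resid = list(y)  # equals y - X beta, updated incrementally
    incl_count = [0] * p
    prior_log_odds = math.log(prior_incl / (1.0 - prior_incl))

    for it in range(n_iter):
        for j in range(p):
            col = cols[j]
            if beta[j] != 0.0:
                # Remove variable j from the fit: residual given all others.
                for i in range(n):
                    resid[i] += col[i] * beta[j]
            z = sum(col[i] * resid[i] for i in range(n))
            # Conditional posterior of beta_j if variable j is included.
            post_var = 1.0 / (col_ss[j] / sigma2 + 1.0 / tau2)
            post_mean = post_var * z / sigma2
            # Log posterior odds of inclusion with beta_j integrated out.
            log_odds = (prior_log_odds
                        + 0.5 * math.log(post_var / tau2)
                        + 0.5 * post_mean ** 2 / post_var)
            if log_odds >= 0.0:
                prob = 1.0 / (1.0 + math.exp(-log_odds))
            else:
                e = math.exp(log_odds)
                prob = e / (1.0 + e)
            if rng.random() < prob:
                beta[j] = rng.gauss(post_mean, math.sqrt(post_var))
                for i in range(n):
                    resid[i] -= col[i] * beta[j]
                if it >= burn_in:
                    incl_count[j] += 1
            else:
                beta[j] = 0.0
    return [c / (n_iter - burn_in) for c in incl_count]


if __name__ == "__main__":
    rng = random.Random(1)
    true_effects = {3: 3.0, 17: -3.0}  # two real effects among 50 candidates
    X, y = simulate(n=30, p=50, effects=true_effects, noise_sd=1.0, rng=rng)
    pip = gibbs_ssvs(X, y, rng=rng)
    ranked = sorted(range(len(pip)), key=lambda j: -pip[j])
    for j in ranked[:5]:
        print(f"variable {j}: inclusion probability {pip[j]:.2f}")
```

In this toy setting there are more candidate variables (50) than observations (30), so ordinary least squares cannot be fitted directly. The sampler instead visits small subsets of variables and reports how often each one is included. Handling missing trait or marker values, asymmetric priors, and the second classification step described in the abstract would need additions that this sketch does not attempt.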
