4.3 Article

Using the EM algorithm for Bayesian variable selection in logistic regression models with related covariates

Journal

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
Volume 88, Issue 3, Pages 575-596

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/00949655.2017.1398255

Keywords

Bayesian inference; binary outcomes; deterministic annealing; expectation-maximization; grouped covariates; heredity constraint; inheritance property; variable selection; 62F15; 62J12; 68U20

Funding

  1. University of Texas Health Science Center at Houston School of Public Health, Cancer Education and Career Development Program National Cancer Institute/NIH [R25 CA57712]
  2. University of Texas Health Science Center at Houston School of Public Health, Training Program in Biostatistics National Institute of General Medical Sciences [T32GM074902]
  3. National Institute of Child Health and Human Development [1R03HD083674]
  4. Michael & Susan Dell Foundation, Michael & Susan Dell Center for Healthy Living
  5. EUNICE KENNEDY SHRIVER NATIONAL INSTITUTE OF CHILD HEALTH & HUMAN DEVELOPMENT [R03HD083674] Funding Source: NIH RePORTER
  6. NATIONAL CANCER INSTITUTE [R25CA057712] Funding Source: NIH RePORTER
  7. NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES [T32GM074902] Funding Source: NIH RePORTER

We develop a Bayesian variable selection method for logistic regression models that can simultaneously accommodate qualitative covariates and interaction terms under various heredity constraints. We use expectation-maximization variable selection (EMVS) with a deterministic annealing variant as the platform for our method, due to its proven flexibility and efficiency. We propose a variance adjustment of the priors for the coefficients of qualitative covariates, which controls false-positive rates, and a flexible parameterization for interaction terms, which accommodates user-specified heredity constraints. This method can handle all pairwise interaction terms as well as a subset of specific interactions. Using simulation, we show that this method selects associated covariates better than the grouped LASSO and the LASSO with heredity constraints in various exploratory research scenarios encountered in epidemiological studies. We apply our method to identify genetic and non-genetic risk factors associated with smoking experimentation in a cohort of Mexican-heritage adolescents.
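
To make the heredity constraints mentioned in the abstract concrete, the following is a minimal sketch in Python of the usual textbook definitions: under strong heredity an interaction may enter the model only if both of its main effects are selected, and under weak heredity only if at least one is. The function name `allowed_interactions` and its arguments are hypothetical illustrations; the abstract does not describe the paper's actual parameterization, so this should not be read as the authors' method.

```python
from itertools import combinations


def allowed_interactions(selected, n_covariates, heredity="strong"):
    """Return the pairwise interactions (j, k) permitted by a heredity rule.

    selected     -- indices of main effects currently in the model
    n_covariates -- total number of candidate main effects
    heredity     -- "strong", "weak", or "none" (illustrative labels)
    """
    if heredity not in ("strong", "weak", "none"):
        raise ValueError("heredity must be 'strong', 'weak', or 'none'")

    selected = set(selected)
    pairs = []
    for j, k in combinations(range(n_covariates), 2):
        # Count how many parent main effects of the interaction are selected.
        n_parents = (j in selected) + (k in selected)
        if heredity == "strong":
            permitted = n_parents == 2   # both parents required
        elif heredity == "weak":
            permitted = n_parents >= 1   # at least one parent required
        else:
            permitted = True             # no constraint
        if permitted:
            pairs.append((j, k))
    return pairs
```

For example, with four candidate covariates and only covariate 0 selected, the strong rule permits no interactions, while the weak rule permits every pair that involves covariate 0.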
