Article

Sparse modeling of spatial environmental variables associated with asthma

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 53, Pages 320-329

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2014.12.005

Keywords

Asthma; Sparsity; Spatial statistics; Environmental variables; Electronic health record

Funding

  1. Clinical and Translational Science Award program through the National Center for Research Resources [1UL1RR025011]
  2. National Center for Advancing Translational Sciences [9U54TR000021]
  3. NIH [T32 GM008692]
  4. National Heart Lung and Blood Institute Fellowship [F30HL112491]
  5. Wisconsin Division of Public Health from the Center for Disease Control and Prevention through the Wisconsin Environmental Public Health Tracking grant [1U38EH000951-01]
  6. Public Health Improvement Initiative [5U58CD001316-02]

Abstract

Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify a sparse set of environmental variables associated with asthma diagnosis in a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a method called Sparse Spatial Environmental Analysis (SASEA). Using this method, the dimensionality of the environmental variables was first reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from the sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured the spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter-occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to logistic thin plate regression spline modeling. This method allowed geographically distributed environmental factors to be associated with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. (C) 2014 Elsevier Inc. All rights reserved.
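
The dimension-reduction step described in the abstract can be illustrated with a small sketch. The block below is not the authors' implementation: it assumes a soft-thresholded power iteration as a simple stand-in for sparse principal component analysis, and the four column names, the penalty value, and the toy numbers are all hypothetical. It only shows how a sparsity penalty can drive the loadings of uninformative variables to exactly zero; the spatial step (logistic thin plate regression spline modeling) is not covered.

```python
import math

# Hypothetical block-group variables (names are illustrative only).
COLUMNS = ["food_at_home", "dog_ownership", "noise_a", "noise_b"]

# Toy data: one row per block group. The first two columns vary together;
# the last two carry almost no variance.
ROWS = [
    [1.0, 1.1, 0.1, -0.1],
    [2.0, 1.9, -0.1, 0.1],
    [3.0, 3.2, 0.1, 0.1],
    [4.0, 3.8, -0.1, -0.1],
    [5.0, 5.0, 0.0, 0.0],
]


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values in [-lam, lam] become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def center_columns(rows):
    """Subtract each column's mean so the covariance is computed correctly."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def sparse_first_component(cov, lam, iters=200):
    """Power iteration with soft-thresholding after each multiplication.

    Assumption: this is a simplified stand-in for sparse PCA, chosen because
    it fits in a few lines; lam controls how many loadings are zeroed.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        w = [soft_threshold(x, lam) for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("lam is too large: every loading was zeroed")
        v = [x / norm for x in w]
    return v


loadings = sparse_first_component(covariance(center_columns(ROWS)), lam=0.5)
selected = [name for name, w in zip(COLUMNS, loadings) if w != 0.0]
print(selected)
```

In this toy setting only the two correlated columns keep nonzero loadings. In a workflow like the one the abstract describes, component scores built from such loadings would then be candidate covariates for the spatial model, added one at a time and kept or dropped by model selection.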
