Joint semiparametric kernel network regression

Journal

STATISTICS IN MEDICINE
Volume 42, Issue 28, Pages 5247-5265

Publisher

WILEY
DOI: 10.1002/sim.9910

Keywords

graphical model; least square kernel machine; semiparametric model

Variable selection and graphical modeling are crucial in analyzing highly correlated and high-dimensional data. Gaussian graphical models have limitations in handling nonadditive, nonparametric regression with high-dimensional variables. This paper proposes a joint semiparametric kernel network regression method to address this limitation and provide a connection between variable selection and graphical modeling.
Variable selection and graphical modeling play essential roles in highly correlated and high-dimensional (HCHD) data analysis. Variable selection methods have been developed under both parametric and nonparametric model settings. However, variable selection for nonadditive, nonparametric regression with high-dimensional variables is challenging because of the difficulty of modeling unknown dependence structures among HCHD variables. Gaussian graphical models are a popular and useful tool for investigating the conditional dependence between variables by estimating sparse precision matrices. For a given class of interest, the estimated precision matrices can be mapped onto networks for visualization. However, Gaussian graphical models are limited in that they are only applicable to discretized response variables and to the case where $p\log(p) \ll n$, where $p$ is the number of variables and $n$ is the sample size. It is therefore necessary to develop a joint method for variable selection and graphical modeling. To the best of our knowledge, methods for simultaneously selecting variables and estimating networks among variables in semiparametric regression settings are quite limited. Hence, in this paper, we develop a joint semiparametric kernel network regression method to address this limitation and to provide a connection between variable selection and graphical modeling. Our approach is a unified and integrated method that can simultaneously identify important variables and build a network among those variables. We develop our approach under a semiparametric kernel machine regression framework, which allows for nonlinear or nonadditive associations and complicated interactions among the variables.
The advantages of our approach are that it can (1) simultaneously select variables and build a network among HCHD variables under a regression setting; (2) model unknown and complicated interactions among the variables and estimate the network among these variables; (3) allow for any form of semiparametric model, including nonadditive, nonparametric models; and (4) provide an interpretable network that considers important variables and a response variable. We demonstrate our approach using a simulation study and a real application to genetic pathway-based analysis.
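
For readers unfamiliar with the kernel machine framework named in the keywords, the following is a minimal sketch of a generic least-squares kernel machine fit for a semiparametric model of the form y = Xβ + h(Z) + ε. It is not the paper's joint method: it performs no variable selection and estimates no network. The Gaussian kernel, the function names `gaussian_kernel` and `fit_lskm`, the tuning values `lam` and `rho`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(Z, rho=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||z_i - z_j||^2 / rho). (assumed kernel choice)"""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / rho)

def fit_lskm(y, X, Z, lam=1.0, rho=1.0):
    """Minimize ||y - X beta - h||^2 + lam * ||h||_H^2 with h = K alpha.

    Closed form: beta = (X' A^-1 X)^-1 X' A^-1 y and alpha = A^-1 (y - X beta),
    where A = K + lam * I.
    """
    n = len(y)
    K = gaussian_kernel(Z, rho)
    A = K + lam * np.eye(n)
    Ainv_X = np.linalg.solve(A, X)
    Ainv_y = np.linalg.solve(A, y)
    beta = np.linalg.solve(X.T @ Ainv_X, X.T @ Ainv_y)   # parametric part
    alpha = np.linalg.solve(A, y - X @ beta)             # kernel coefficients
    h = K @ alpha                                        # nonparametric part at the data
    return beta, h

# Simulated data (illustrative only): linear effect of X, nonadditive effect of Z.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, 3))
y = X @ np.array([1.0, 2.0]) + np.sin(Z[:, 0]) * Z[:, 1] + 0.1 * rng.normal(size=n)

beta_hat, h_hat = fit_lskm(y, X, Z, lam=0.5, rho=3.0)
```

Because h is modeled only through the kernel matrix, interactions among the columns of Z need not be specified in advance, which is the property the abstract relies on when it refers to nonlinear or nonadditive associations.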