Deep-gKnock: Nonlinear group-feature selection with deep neural networks

Journal

NEURAL NETWORKS
Volume 135, Pages 139-147

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2020.12.004

Keywords

Deep neural networks; False discovery rate; Group feature selection; Knockoffs

Feature selection is crucial in high-dimensional data analysis, and group structure among features naturally occurs in scientific problems. The Deep-gKnock method proposed in this study uses a new deep neural network architecture and knockoff technique to achieve nonlinear group-feature selection with controlled group-wise False Discovery Rate, demonstrating superior performance in high-dimensional synthetic data experiments.
Feature selection is central to contemporary high-dimensional data analysis. Group structure among features arises naturally in various scientific problems. Many methods have been proposed to incorporate group structure information into feature selection. However, these methods are normally restricted to a linear regression setting. To relax the linear constraint, we design a new Deep Neural Network (DNN) architecture and integrate it with the recently proposed knockoff technique to perform nonlinear group-feature selection with controlled group-wise False Discovery Rate (gFDR). Experimental results on high-dimensional synthetic data demonstrate that our method achieves the highest power and accurate gFDR control compared with state-of-the-art methods. The performance of Deep-gKnock is especially superior in the following five situations: (1) nonlinear relationships; (2) dimension p greater than sample size n; (3) high between-group correlation; (4) high within-group correlation; (5) a large number of associated groups. Deep-gKnock is also shown to be robust to misspecification of the feature distribution and to changes in network architecture. Moreover, Deep-gKnock achieves scientifically meaningful group-feature selection results for cutting-edge real-world datasets. (c) 2020 Elsevier Ltd. All rights reserved.
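
As a rough illustration of what controlling gFDR with knockoffs involves, the sketch below applies the standard knockoff+ threshold to a vector of group-level statistics W_g, where a large positive W_g suggests that group g matters more than its knockoff copy. It is a minimal sketch under stated assumptions, not the Deep-gKnock implementation: the abstract does not say how the network produces W_g, so the statistics are taken as given, and the function names `knockoff_plus_threshold` and `select_groups` and the example values are hypothetical.

```python
import numpy as np


def knockoff_plus_threshold(w, q):
    """Return the smallest t > 0 whose estimated gFDR is at most q.

    w : group-level knockoff statistics (assumed given, one per group).
    q : target gFDR level.
    Returns np.inf when no threshold meets the target.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = np.sort(np.unique(np.abs(w[w != 0])))
    for t in candidates:
        # Knockoff+ estimate: (1 + #{W_g <= -t}) / max(1, #{W_g >= t}).
        estimate = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if estimate <= q:
            return float(t)
    return float("inf")


def select_groups(w, q):
    """Indices of groups whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [g for g, stat in enumerate(w) if stat >= t]


# Hypothetical statistics for nine groups (illustrative values only).
example_w = [5.0, 4.0, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, 0.0]
selected = select_groups(example_w, q=0.2)
```

A stricter target can return no groups at all: with these example values, no threshold satisfies q = 0.1, so the selection is empty.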
