4.7 Article

MGRFE: Multilayer Recursive Feature Elimination Based on an Embedded Genetic Algorithm for Cancer Classification

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2019.2921961

Keywords

Feature extraction; Microwave integrated circuits; Genetic algorithms; Cancer; Nonhomogeneous media; Gene expression; Heuristic algorithms; Gene selection; genetic algorithm; recursive feature elimination; microarray data; cancer classification

Funding

  1. National Natural Science Foundation of China [61572105, 61872418, 71774154]
  2. Natural Science Foundation of Jilin Province [20180101331JC]

Gene selection is a challenging task aimed at choosing a minimal number of genes closely associated with a phenotype, and existing feature selection algorithms mostly utilize heuristic rules.
Microarray gene expression data have attracted great interest for cancer classification and for further research in bioinformatics. However, such data follow the "large p, small n" paradigm: the number of biosamples is limited while the number of features is very high. This makes gene selection, which aims to choose a minimal number of discriminatory genes closely associated with a phenotype, a demanding task. Feature (gene) selection remains challenging because the underlying search problem is NP-hard, so most existing feature selection algorithms rely on heuristic rules. This paper proposes MGRFE, a multilayer recursive feature elimination method based on an embedded integer-coded genetic algorithm, which aims to select the gene combination with minimal size and maximal information. On 19 benchmark microarray datasets, including multiclass and imbalanced datasets, MGRFE outperforms state-of-the-art feature selection algorithms, achieving better cancer classification accuracy with fewer selected genes. MGRFE could be regarded as a promising feature selection method for high-dimensional datasets, especially gene expression data. Moreover, the genes selected by MGRFE have close biological relevance to cancer phenotypes. The source code of the proposed algorithm and all 19 datasets used in the paper are available at https://github.com/Pengeace/MGRFE-GaRFE.
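
To make the idea of layered feature elimination driven by an integer-coded genetic algorithm more concrete, the following is a minimal sketch and not the authors' implementation (which is in the repository linked above). Everything in it is an assumption made for illustration: the function names (loo_accuracy, ga_select, layered_elimination), the nearest-centroid classifier used as the fitness function, the crossover and mutation rules, the layer sizes, and the toy data. Each individual is a list of distinct feature indices, and each layer searches only among the features kept by the previous layer.

```python
import random

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier on `feats`."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(feats))
            for k, f in enumerate(feats):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # squared distance from sample i to the class centroid
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(feats))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)

def ga_select(X, y, pool, size, rng, pop_size=20, generations=15):
    """Integer-coded GA: an individual is `size` distinct indices from `pool`."""
    def fitness(ind):
        return loo_accuracy(X, y, ind)

    pop = [rng.sample(pool, size) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            genes = list(dict.fromkeys(a + b))  # union of parent genes
            child = rng.sample(genes, size)     # crossover (assumed rule)
            if rng.random() < 0.2:              # mutation (assumed rate)
                unused = [f for f in pool if f not in child]
                if unused:
                    child[rng.randrange(size)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def layered_elimination(X, y, n_features, sizes, seed=0):
    """Run one GA per layer; `sizes` must be decreasing subset sizes."""
    rng = random.Random(seed)
    pool = list(range(n_features))
    best = pool
    for size in sizes:
        best = ga_select(X, y, pool, size, rng)
        pool = sorted(best)  # the next layer only sees surviving features
    return sorted(best), loo_accuracy(X, y, best)

# Toy data (assumed): 20 samples, 30 features, features 3 and 7 informative.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[data_rng.gauss(0, 1) for _ in range(30)] for _ in y]
for row, label in zip(X, y):
    row[3] += 4 * label
    row[7] -= 4 * label

selected, accuracy = layered_elimination(X, y, 30, sizes=[8, 4, 2])
print(selected, accuracy)
```

The sketch trades fidelity for brevity: it uses only the Python standard library, and the fixed seed makes repeated runs return the same subset. A real experiment on microarray data would need a stronger classifier and a proper cross-validation protocol.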
