Accelerating wrapper-based feature selection with K-nearest-neighbor

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 83, Pages 81-91

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2015.03.009

Keywords

Gene selection; Microarray data; Wrapper; Filter; k-nearest-neighbor

Funding

  1. 111 Project of the Ministry of Education and State Administration of Foreign Experts Affairs [B14025]
  2. International S&T Cooperation Program of China [2014DFA11310]
  3. Major Project of the Natural Science Foundation for Anhui Province Higher Education [KJ2011ZD06]
  4. Natural Science Foundation of China [61472057, 61305064, 51274078]
  5. University Featured Project of the Ministry of Education [TS2013HFGY031]
  6. China Scholarship Council

Wrapper-based feature subset selection (FSS) methods tend to obtain better classification accuracy than filter methods but are considerably more time-consuming, particularly for applications that have thousands of features, such as microarray data analysis. Accelerating this process without degrading its high accuracy would be of great value for gene expression analysis. In this study, we explored how to reduce the time complexity of wrapper-based FSS with an embedded K-Nearest-Neighbor (KNN) classifier. Instead of considering KNN as a black box, we proposed to construct a classifier distance matrix and incrementally update the matrix to accelerate the calculation of the relevance criteria in evaluating the quality of the candidate features. Extensive experiments on eight publicly available microarray datasets were first conducted to demonstrate the effectiveness of the wrapper methods with KNN for selecting informative features. To demonstrate the performance gain in terms of time cost reduction, we then conducted experiments on the eight microarray datasets with the embedded KNN classifiers and analyzed the theoretical time/space complexity. Both the experimental results and theoretical analysis demonstrated that the proposed approach markedly accelerates the wrapper-based feature selection process without degrading the high classification accuracy, and the space complexity analysis indicated that the additional space overhead is affordable in practice. © 2015 Elsevier B.V. All rights reserved.
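
The idea of keeping a distance matrix and updating it incrementally can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared Euclidean distance, greedy forward selection, and leave-one-out KNN accuracy as the relevance criterion, and the names `loo_knn_accuracy` and `forward_select` are hypothetical. The point it shows is that evaluating a candidate feature only requires adding that feature's pairwise squared differences to the stored matrix, rather than recomputing distances over the whole selected subset.

```python
import numpy as np


def loo_knn_accuracy(dist, y, k=3):
    """Leave-one-out KNN accuracy from a precomputed squared-distance matrix.

    Assumes y holds non-negative integer class labels.
    """
    d = dist.copy()
    np.fill_diagonal(d, np.inf)  # a sample may not vote for itself
    nn = np.argsort(d, axis=1, kind="stable")[:, :k]  # k nearest neighbours per row
    votes = y[nn]  # neighbour labels, shape (n, k)
    pred = np.array([np.bincount(row).argmax() for row in votes])
    return float(np.mean(pred == y))


def forward_select(X, y, n_select, k=3):
    """Greedy forward selection (assumed search strategy) with an
    incrementally updated distance matrix. Assumes n_select <= X.shape[1]."""
    n, p = X.shape
    dist = np.zeros((n, n))  # squared distances over the selected subset (empty at first)
    selected = []
    for _ in range(n_select):
        best_j, best_acc, best_dist = None, -1.0, None
        for j in range(p):
            if j in selected:
                continue
            col = X[:, j]
            # Incremental update: adding feature j contributes
            # (x_aj - x_bj)^2 to the distance of every pair (a, b).
            cand = dist + (col[:, None] - col[None, :]) ** 2
            acc = loo_knn_accuracy(cand, y, k)
            if acc > best_acc:
                best_j, best_acc, best_dist = j, acc, cand
        selected.append(best_j)
        dist = best_dist  # keep the matrix of the winning candidate
    return selected, dist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 6))   # synthetic data, for illustration only
    y = (X[:, 2] > 0).astype(int)
    print(forward_select(X, y, n_select=2))
```

Under these assumptions, each candidate evaluation costs one O(n^2) matrix update regardless of how many features are already selected, at the price of storing one n-by-n matrix, which is consistent with the abstract's remark that the extra space overhead is affordable.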
