Article

Efficient feature selection filters for high-dimensional data

Journal

PATTERN RECOGNITION LETTERS
Volume 33, Issue 13, Pages 1794-1804

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2012.05.019

Keywords

Feature selection; Filters; Dispersion measures; Similarity measures; High-dimensional data

Funding

  1. Polytechnic Institute of Lisbon [SFRH/PROTEC/67605/2010]
  2. FCT project [PEst-OE/EEI/LA0008/2011]

Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors for computationally intensive methods, letting them focus on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster. © 2012 Elsevier B.V. All rights reserved.
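
To make the relevance/redundancy idea concrete, here is a minimal sketch of an unsupervised filter of this general kind. It is an illustration under stated assumptions, not necessarily the authors' exact criteria: relevance is taken to be the variance of each feature (one possible dispersion measure), redundancy is taken to be the absolute cosine between a candidate and the most recently kept feature (one possible similarity measure), and the names `select_features`, `max_features`, and `max_similarity` are hypothetical.

```python
import math


def variance(column):
    """Population variance of one feature column (assumed dispersion measure)."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def abs_cosine(a, b):
    """Absolute cosine similarity between two feature columns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return abs(dot) / norm if norm > 0 else 0.0


def select_features(rows, max_features, max_similarity=0.8):
    """Return indices of the kept features, most dispersed first.

    rows is a list of instances, each a list of feature values.
    """
    columns = list(zip(*rows))
    # Relevance step: rank features by decreasing dispersion.
    ranked = sorted(range(len(columns)),
                    key=lambda j: variance(columns[j]),
                    reverse=True)
    selected = []
    for j in ranked:
        if len(selected) == max_features:
            break
        # Redundancy step: compare only with the most recently kept feature,
        # so the number of similarity computations is linear in the number
        # of features (an assumption of this sketch).
        if selected and abs_cosine(columns[j], columns[selected[-1]]) > max_similarity:
            continue
        selected.append(j)
    return selected


# Feature 0 is a scaled copy of feature 1, so it is dropped as redundant.
rows = [
    [1.5, 2, 0.5, 5],
    [3.0, 4, 0.4, 1],
    [4.5, 6, 0.6, 4],
    [6.0, 8, 0.5, 2],
]
print(select_features(rows, max_features=2))
```

Comparing each candidate only against the last kept feature keeps the cost low, but it can miss redundancy with features kept earlier; a heavier method run on the reduced subset could catch those cases.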
