4.7 Review

A review of microarray datasets and applied feature selection methods

Journal

INFORMATION SCIENCES
Volume 282, Pages 111-135

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2014.05.042

Keywords

Feature selection; Microarray data; Unbalanced data; Dataset shift

Funding

  1. Secretaria de Estado de Investigacion of the Spanish Government [TIN 2011-28488, TIN 2012-37954]
  2. Conselleria de Industria of the Xunta de Galicia [CN2011/007, CN2012/211]
  3. FEDER funds of the European Union
  4. Xunta de Galicia under Plan I2C Grant Program
  5. Genii Grant
  6. [P11-TIC-9704]

Microarray data classification is a difficult challenge for machine learning researchers because of the high number of features and the small sample sizes. Feature selection was soon considered a de facto standard in this field after its introduction, and a huge number of feature selection methods have been used to reduce the input dimensionality while improving classification performance. This paper reviews the most up-to-date feature selection methods developed in this field and the microarray databases most frequently used in the literature. We also make the interested reader aware of problematic data characteristics in this domain, such as the imbalance of the data, their complexity, or the so-called dataset shift. Finally, an experimental evaluation on the most representative datasets using well-known feature selection methods is presented, bearing in mind that the aim is not to identify the best feature selection method but to facilitate comparative study by the research community. (C) 2014 Elsevier Inc. All rights reserved.

Reviews

Primary Rating

4.7
Not enough ratings
