Article

Ensemble feature selection for stable biomarker identification and cancer classification from microarray expression data

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 142

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2021.105208

Keywords

Feature selection; Gene expression profiles; Stability; Ensemble learning

Funding

  1. Guangdong Basic and Applied Basic Research Foundation [2020A1515011499]
  2. Natural Science Foundation of China [62176082]


In this study, an ensemble feature selection framework is proposed to improve the discrimination and stability of the selected features. Sampling and aggregation strategies are used to obtain accurate feature selection in small-sample, high-dimensional settings, which supports diagnostic accuracy and the understanding of disease mechanisms.
Microarray technology measures the expression of tens of thousands of genes simultaneously and enables the study of cancers and tumors at the molecular level. Because microarray data are typically characterized by small sample sizes and high dimensionality, accurate and stable feature selection is of fundamental importance to diagnostic accuracy and to a deeper understanding of disease mechanisms. In this study, we present an ensemble feature selection framework to improve the discrimination and stability of the finally selected features. Specifically, we use sampling techniques to obtain multiple sampled datasets and apply a base feature selector to each of them to select a subset of features. We then develop two aggregation strategies to combine the multiple feature subsets into one set. Finally, comparative experiments are conducted on four publicly available microarray datasets, covering both binary and multi-class cases, in terms of classification accuracy and three stability metrics. Results show that the proposed method obtains better stability scores and achieves classification performance comparable to, or even better than, that of its competitors.
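
The pipeline described in the abstract (sample, select per sample, aggregate, measure stability) can be illustrated with a minimal sketch. The abstract does not name the sampling technique, the base feature selector, the two aggregation strategies, or the three stability metrics, so everything concrete below is an assumption chosen for illustration: a class-stratified bootstrap, a simple mean-difference score for binary labels, frequency voting as the aggregation step, and the mean pairwise Jaccard index as one possible stability measure. All function names are hypothetical and this is not the authors' implementation.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def make_toy_data(n_per_class=10, n_features=50, seed=1):
    """Synthetic stand-in for expression data; features 0 and 1 are informative."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 8.0 * label  # shifted up in class 1
            row[1] -= 8.0 * label  # shifted down in class 1
            X.append(row)
            y.append(label)
    return X, y


def score_features(X, y):
    """Assumed base selector: |difference of class means| / summed class spreads."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-12  # guard against division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features as a set."""
    scores = score_features(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def bootstrap_sample(X, y, rng):
    """Resample with replacement within each class so both classes stay present."""
    idx = []
    for label in sorted(set(y)):
        members = [i for i, v in enumerate(y) if v == label]
        idx.extend(rng.choices(members, k=len(members)))
    return [X[i] for i in idx], [y[i] for i in idx]


def ensemble_select(X, y, k, n_rounds=30, seed=0):
    """Select k features per bootstrap round, then aggregate by selection frequency."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_rounds):
        Xb, yb = bootstrap_sample(X, y, rng)
        subsets.append(select_top_k(Xb, yb, k))
    votes = Counter(j for s in subsets for j in s)
    # Keep the k most frequently selected features; ties go to the lower index.
    final = sorted(votes, key=lambda j: (-votes[j], j))[:k]
    return final, subsets


def mean_jaccard(subsets):
    """Average pairwise Jaccard index of the per-round subsets (1.0 = identical)."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

Calling `ensemble_select(X, y, k=2)` on the output of `make_toy_data()` returns the aggregated feature indices together with the per-round subsets, and `mean_jaccard(subsets)` summarizes how consistently the base selector picked the same features across resamples. A multi-class setting, as covered in the paper's experiments, would need a different base score than this two-class one.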
