Article

Generalisation Power Analysis for finding a stable set of features using evolutionary algorithms for feature selection

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 231

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107450

Keywords

Feature selection; Generalisation Power Analysis; Generalisation Power Index; Machine learning; Evolutionary computation; Feature selection stability

Funding

  1. The Leverhulme Trust, United Kingdom Research Project [RPG-2016-252]

Evolutionary Computation (EC) algorithms are powerful techniques for feature selection, but they often suffer from the stability issue of reaching different solutions in each run. This paper introduces a novel algorithm called Generalisation Power Analysis (GPA) to evaluate feature subsets based on their generalisation power over multiple classifiers, outperforming alternative methods in achieving high generalisation power. Despite requiring more computation time, using GPA during feature selection results in a robust prediction model developed with features not biased towards a specific classifier.
Evolutionary Computation (EC) algorithms are powerful techniques for feature selection tasks; however, they reach different solutions in each run, which is known as the stability issue. Existing solutions for finding a stable subset of features when using an EC algorithm include aggregation and frequency-based methods. These methods may return feature subsets that achieve weak or inconsistent classification performance when used to build classifiers, a limitation known as 'lack of generalisation power'. To address this limitation, this paper proposes a novel algorithm called Generalisation Power Analysis (GPA) that measures the performance of feature subsets in terms of generalisation power and hence evaluates their ability to achieve optimal or near-optimal accuracy over multiple classifiers. GPA has been designed to work with the stochastic nature of EC algorithms. Experiments with eleven benchmark datasets revealed that the proposed GPA approach consistently outperformed alternative methods in finding subsets that achieved high generalisation power. Although GPA requires more computation time than alternative approaches because it embeds multiple classifiers, the advantages of using GPA during feature selection outweigh this limitation, since the outcome is a robust prediction model developed using a subset of features that is not biased towards a specific classifier. (C) 2021 Elsevier B.V. All rights reserved.
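
The ideas named in the abstract can be illustrated with a small sketch. It is not the paper's GPA algorithm or its Generalisation Power Index, whose definitions are not given here. It assumes three things for illustration only: that stability across runs is measured by mean pairwise Jaccard similarity, that a frequency-based method keeps features selected in at least half of the runs, and that a subset's generalisation can be approximated by its worst accuracy relative to the best accuracy seen per classifier. All function names, feature indices, classifier names, and accuracy values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(subsets):
    """Average similarity over all pairs of runs; low values mean unstable runs."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def frequency_aggregate(subsets, min_fraction=0.5):
    """Frequency-based aggregation: keep features chosen in enough runs."""
    counts = Counter(f for s in subsets for f in set(s))
    threshold = min_fraction * len(subsets)
    return sorted(f for f, c in counts.items() if c >= threshold)


def worst_case_relative_accuracy(acc, best):
    """Hypothetical score (NOT the paper's index): the lowest ratio of a
    subset's accuracy to the best known accuracy, taken over classifiers."""
    return min(acc[name] / best[name] for name in best)


# Feature subsets returned by three hypothetical EC runs (feature indices).
runs = [{0, 1, 2}, {1, 2, 3}, {1, 2, 4}]
stability = mean_pairwise_jaccard(runs)
aggregated = frequency_aggregate(runs)

# Made-up accuracies of two candidate subsets under three classifiers.
acc_a = {"knn": 0.90, "svm": 0.70, "tree": 0.88}
acc_b = {"knn": 0.86, "svm": 0.85, "tree": 0.84}
best = {name: max(acc_a[name], acc_b[name]) for name in acc_a}

score_a = worst_case_relative_accuracy(acc_a, best)
score_b = worst_case_relative_accuracy(acc_b, best)
print(stability, aggregated, round(score_a, 3), round(score_b, 3))
```

In this made-up example, subset A is the best choice for two classifiers but falls well behind on the third, so a score that looks at the weakest classifier prefers subset B. This is the kind of classifier-specific bias the abstract says GPA is designed to avoid.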
