4.7 Article

Surrogate Sample-Assisted Particle Swarm Optimization for Feature Selection on High-Dimensional Data

Journal

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
Volume 27, Issue 3, Pages 595-609

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TEVC.2022.3175226

Keywords

Clustering algorithms; Computational efficiency; Feature extraction; Particle swarm optimization; Optimization; Costs; Training; Ensemble surrogate; feature selection (FS); particle swarm optimization; surrogate-assisted evolutionary optimization

This article proposes a hybrid feature selection algorithm using surrogate sample-assisted particle swarm optimization (SS-PSO), which divides the sample and feature spaces concurrently to reduce the computational cost and search space. Experimental results show that SS-PSO can obtain good feature subsets at the smallest computational cost on most datasets, making it a highly competitive method for high-dimensional feature selection.
As the number of features and the sample size grow, existing feature selection (FS) methods based on evolutionary optimization still face challenges such as the curse of dimensionality and high computational cost. To address this, this article proposes a hybrid FS algorithm using surrogate sample-assisted particle swarm optimization (SS-PSO), which divides or clusters the sample and feature spaces at the same time. First, a nonrepetitive uniform sampling strategy is employed to divide the whole sample set into several small sample subsets. Next, treating each sample subset as a surrogate unit, a collaborative feature clustering mechanism is proposed to divide the feature space, with the aim of reducing both the computational cost of clustering features and the search space of PSO. Following that, an ensemble surrogate-assisted integer PSO is proposed. To ensure the prediction accuracy of the ensemble surrogate when evaluating particles, an ensemble surrogate construction and management strategy is designed. Because the whole sample set is replaced by a small number of surrogate units, SS-PSO significantly reduces the cost of evaluating particles in PSO. Finally, the proposed algorithm is applied to several typical datasets and compared with six typical evolutionary FS algorithms, as well as several of its own variants. The experimental results show that SS-PSO can obtain good feature subsets at the smallest computational cost on most datasets. These results verify that SS-PSO is a highly competitive method for high-dimensional FS.
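The first and last steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "nonrepetitive uniform sampling" means partitioning the samples, without replacement, into disjoint subsets of nearly equal size, and that the ensemble surrogate scores a particle by averaging per-unit fitness values. The abstract does not specify the actual construction and management strategy, so the averaging rule and the names `split_into_surrogate_units`, `ensemble_fitness`, and `evaluate_on_unit` are hypothetical.

```python
import random
from statistics import mean


def split_into_surrogate_units(n_samples, n_units, seed=0):
    """Partition sample indices into n_units disjoint, near-equal subsets.

    Assumption: each sample lands in exactly one subset (no repetition).
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Striding the shuffled list yields disjoint subsets whose sizes
    # differ by at most one.
    return [indices[k::n_units] for k in range(n_units)]


def ensemble_fitness(feature_subset, units, evaluate_on_unit):
    """Score a candidate feature subset by averaging over surrogate units.

    evaluate_on_unit(feature_subset, unit) is a caller-supplied function
    (hypothetical here), e.g. classifier accuracy measured only on the
    samples in that unit. Plain averaging is an assumption of this sketch.
    """
    return mean(evaluate_on_unit(feature_subset, unit) for unit in units)


if __name__ == "__main__":
    units = split_into_surrogate_units(n_samples=10, n_units=3)
    # Toy evaluator, for illustration only: it ignores the data entirely.
    score = ensemble_fitness([0, 2], units, lambda fs, unit: 1.0)
    print(len(units), score)
```

In a PSO loop, each particle would be scored with something like `ensemble_fitness` rather than with an evaluation over the whole sample set, which is where the abstract locates the cost saving.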
