Article

Binary coordinate ascent: An efficient optimization technique for feature subset selection for machine learning

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 110, Pages 191-201

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2016.07.026

Keywords

Machine learning; Classification; Feature selection; Wrapper; Optimization; Heuristic

Feature subset selection (FSS) has been an active area of research in machine learning. Many techniques have been developed for selecting an optimal or sub-optimal subset of features, because the choice of features is a major factor in determining the performance of a machine-learning technique. In this paper, we propose and develop a novel optimization technique, the binary coordinate ascent (BCA) algorithm, an iterative, deterministic local optimization method that can be coupled with wrapper or filter FSS. The algorithm searches the space of binary-coded input variables by iteratively optimizing the objective function one dimension at a time. We investigated our BCA approach in wrapper-based FSS, using the area under the receiver-operating-characteristic (ROC) curve (AUC) as the criterion for the best subset of features in classification. We evaluated our BCA-based FSS in optimizing features for support vector machine, multilayer perceptron, and Naive Bayes classifiers on 12 datasets. The datasets differ in the number of attributes (ranging from 18 to 11,340) and the number of classes (binary or multi-class classification). The efficiency in terms of the number of subset evaluations improved substantially (by factors of 5-37) compared with two popular FSS meta-heuristics, sequential forward selection (SFS) and sequential floating forward selection (SFFS), while the classification performance on unseen data was maintained. © 2016 Elsevier B.V. All rights reserved.
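
The abstract describes BCA only at a high level, so the following is a minimal sketch of a one-bit-at-a-time coordinate ascent over a binary feature mask, not the authors' implementation. The starting point (empty subset), the fixed sweep order, the stopping rule, and the names `binary_coordinate_ascent` and `toy_objective` are all assumptions made for illustration. The toy objective stands in for the wrapper criterion; in a wrapper setting it would presumably be something like a cross-validated AUC of the chosen classifier.

```python
from typing import Callable, List, Tuple


def binary_coordinate_ascent(
    objective: Callable[[List[int]], float],
    n_features: int,
    max_sweeps: int = 100,
) -> Tuple[List[int], float, int]:
    """Greedy bit-flip search over a binary feature mask (illustrative sketch)."""
    mask = [0] * n_features  # assumption: start from the empty subset
    best = objective(mask)
    evaluations = 1

    for _ in range(max_sweeps):
        improved = False
        for i in range(n_features):
            mask[i] ^= 1  # flip one coordinate (add or remove feature i)
            score = objective(mask)
            evaluations += 1
            if score > best:
                best = score  # keep the flip
                improved = True
            else:
                mask[i] ^= 1  # revert the flip
        if not improved:
            break  # a full sweep changed nothing: local optimum reached

    return mask, best, evaluations


def toy_objective(mask: List[int]) -> float:
    """Hypothetical stand-in for a wrapper score such as cross-validated AUC."""
    useful = {0, 3, 5}  # hypothetical indices of informative features
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in useful)
    return hits - 0.1 * noise  # reward useful features, penalize the rest


mask, score, evaluations = binary_coordinate_ascent(toy_objective, n_features=8)
print(mask, score, evaluations)
```

Each sweep costs one objective evaluation per feature, so the total cost in this sketch grows with the number of features times the number of sweeps. The efficiency figures reported in the abstract come from the paper's own experiments and are not reproduced by this toy example.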
