Article

Feature selection based on artificial bee colony and gradient boosting decision tree

Journal

APPLIED SOFT COMPUTING
Volume 74, Pages 634-642

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2018.10.036

Keywords

Bee colony algorithm; Decision tree; Feature selection; Dimensionality reduction

Funding

  1. National Natural Science Foundation of China [3177167, 31371533, 31671589]
  2. Anhui Foundation for Science and Technology Major Project, China [16030701092]
  3. Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture of China [AEC2018003, AEC2018006]
  4. Natural Science Major Project of the Anhui Higher Education Institutions, China [kJ2016A836]

Abstract

Data from many real-world applications can be high dimensional, and the features of such data are usually highly redundant. Identifying informative features has become an important step in data mining, not only to circumvent the curse of dimensionality but also to reduce the amount of data to be processed. In this paper, we propose a novel feature selection method based on the artificial bee colony algorithm and the gradient boosting decision tree, aimed at improving both the efficiency of selection and the informative quality of the selected features. Our method uses the bee colony algorithm to globally optimize the inputs of the decision tree and thereby identify informative features. The method first initializes the feature space spanned by the dataset; the artificial bee colony algorithm then suppresses less relevant features according to how much information they contribute to the decision making. Experiments are conducted with two breast cancer datasets and six datasets from a public data repository. Experimental results demonstrate that the proposed method effectively reduces the dimensions of the dataset and achieves superior classification accuracy using the selected features. (C) 2018 Elsevier B.V. All rights reserved.
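
To make the described approach concrete, below is a minimal sketch of an artificial-bee-colony wrapper around a gradient boosting decision tree fitness function. This is not the authors' implementation: the binary food-source encoding, the use of scikit-learn's GradientBoostingClassifier as the GBDT, the 3-fold cross-validated accuracy as fitness, and the parameters (n_food_sources, limit, n_iterations) are all illustrative assumptions.

# Sketch: ABC-driven wrapper feature selection scored by a GBDT.
# Assumptions (not from the paper): binary feature-mask encoding,
# sklearn GradientBoostingClassifier, 3-fold CV accuracy as fitness.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def fitness(X, y, mask):
    # Cross-validated accuracy of a GBDT trained on the selected features.
    if mask.sum() == 0:
        return 0.0
    clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

def neighbour(mask, rng):
    # ABC neighbourhood move: flip one randomly chosen feature bit.
    new = mask.copy()
    j = rng.integers(len(mask))
    new[j] = 1 - new[j]
    return new

def abc_feature_selection(X, y, n_food_sources=10, limit=5, n_iterations=30, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Initialize food sources (candidate feature subsets) at random.
    sources = rng.integers(0, 2, size=(n_food_sources, n_features))
    fits = np.array([fitness(X, y, s) for s in sources])
    trials = np.zeros(n_food_sources, dtype=int)

    for _ in range(n_iterations):
        # Employed-bee phase: local search around each food source.
        for i in range(n_food_sources):
            cand = neighbour(sources[i], rng)
            f = fitness(X, y, cand)
            if f > fits[i]:
                sources[i], fits[i], trials[i] = cand, f, 0
            else:
                trials[i] += 1
        # Onlooker-bee phase: exploit sources in proportion to their fitness.
        probs = fits / fits.sum() if fits.sum() > 0 else np.full(n_food_sources, 1.0 / n_food_sources)
        for _ in range(n_food_sources):
            i = rng.choice(n_food_sources, p=probs)
            cand = neighbour(sources[i], rng)
            f = fitness(X, y, cand)
            if f > fits[i]:
                sources[i], fits[i], trials[i] = cand, f, 0
            else:
                trials[i] += 1
        # Scout-bee phase: abandon exhausted sources and re-initialize them.
        for i in range(n_food_sources):
            if trials[i] > limit:
                sources[i] = rng.integers(0, 2, size=n_features)
                fits[i] = fitness(X, y, sources[i])
                trials[i] = 0

    best = fits.argmax()
    return sources[best].astype(bool)  # boolean mask of selected features

# Example usage on a toy dataset (scikit-learn's breast cancer data, used here
# only as a stand-in for the datasets evaluated in the paper).
# from sklearn.datasets import load_breast_cancer
# X, y = load_breast_cancer(return_X_y=True)
# selected = abc_feature_selection(X, y)
# print(f"{selected.sum()} of {len(selected)} features selected")

The design choice worth noting is that the GBDT appears only inside the fitness function, so the bee colony performs a global, model-guided search over feature subsets rather than ranking features individually.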
