4.2 Article

LAGOA: Learning automata based grasshopper optimization algorithm for feature selection in disease datasets

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s12652-021-03155-3

Keywords

Grasshopper optimization algorithm; Learning automata; Two-phase mutation; Biomedical data; Feature selection; Cancer data

This paper emphasizes the importance of feature selection in predictive modeling, especially in disease datasets. The authors introduce a wrapper-based feature selection model and an improved Grasshopper Optimization Algorithm, LAGOA, which utilizes Learning Automata (LA) and two-phase mutation for enhancing algorithm performance.
In predictive modelling, feature selection is important because irrelevant features, when used with powerful classifiers, can lead to over-fitting and produce models that perform worse than models built without those features. This matters particularly for disease datasets, where many features or attributes are available through patients' medical records and many of them may not be relevant to the diagnosis of a specific disease. Wrong models in this setting can be disastrous, leading to wrong diagnoses and, in extreme cases, to loss of life. To this end, we use a wrapper-based feature selection model. In recent years, the Grasshopper Optimization Algorithm (GOA) has proved its superiority over other optimization algorithms in different research areas. In this paper, we propose an improved version of GOA, called LAGOA, which uses Learning Automata (LA) to adjust the parameters of GOA adaptively and two-phase mutation to enhance the exploitation capability of the algorithm. LA adjusts the parameter values of each grasshopper in the population individually. In two-phase mutation, the first phase reduces the number of selected features while preserving high classification accuracy, and the second phase adds relevant features that increase the classification accuracy. The proposed method has been applied to the Breast Cancer (Wisconsin), Breast Cancer (Diagnosis), Statlog (Heart), Lung Cancer, SpectF Heart and Hepatitis datasets taken from the UCI Machine Learning Repository. Experimental results confirm its superiority over the state-of-the-art methods considered here for comparison.
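
The two-phase mutation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `two_phase_mutation`, the mutation probability `p_mut`, the acceptance rules (keep a removal if accuracy does not drop, keep an addition only if accuracy strictly rises) and the stand-in scorer `toy_accuracy` are all assumptions made for illustration. In a real wrapper model, the scorer would be the accuracy of a classifier trained on the selected features.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = feature selected, 0 = feature dropped


def two_phase_mutation(
    mask: Mask,
    accuracy: Callable[[Mask], float],
    rng: random.Random,
    p_mut: float = 0.5,  # hypothetical per-feature mutation probability
) -> Tuple[Mask, float]:
    """Sketch of a two-phase mutation on a binary feature mask."""
    best = list(mask)
    best_acc = accuracy(best)

    # Phase 1: try to drop selected features; keep a removal
    # only if accuracy is preserved (assumed acceptance rule).
    for i in range(len(best)):
        if best[i] == 1 and rng.random() < p_mut:
            trial = list(best)
            trial[i] = 0
            if sum(trial) == 0:
                continue  # never leave the mask empty
            acc = accuracy(trial)
            if acc >= best_acc:
                best, best_acc = trial, acc

    # Phase 2: try to add unselected features; keep an addition
    # only if accuracy strictly improves (assumed acceptance rule).
    for i in range(len(best)):
        if best[i] == 0 and rng.random() < p_mut:
            trial = list(best)
            trial[i] = 1
            acc = accuracy(trial)
            if acc > best_acc:
                best, best_acc = trial, acc

    return best, best_acc


# Hypothetical stand-in for a wrapper classifier: only features 0 and 2
# carry signal, and the score is the fraction of them that is selected.
RELEVANT = {0, 2}


def toy_accuracy(mask: Mask) -> float:
    hits = sum(1 for i in RELEVANT if mask[i] == 1)
    return hits / len(RELEVANT)
```

With `p_mut=1.0` every eligible feature is tried, so a call such as `two_phase_mutation([1, 1, 0, 1, 0], toy_accuracy, random.Random(0), p_mut=1.0)` first drops the two irrelevant features and then adds the missing relevant one.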
