Article

HBDFA: An intelligent nature-inspired computing with high-dimensional data analytics

Publisher

SPRINGER
DOI: 10.1007/s11042-023-16039-9

Keywords

Binary dragonfly algorithm; COVID-19; Text mining; Nature inspired algorithms; Feature selection

The rapid development of data science has led to the emergence of high-dimensional datasets in machine learning. The curse of dimensionality is a significant problem caused by high-dimensional data with a small sample size. This paper proposes a novel hybrid binary dragonfly algorithm (HBDFA) that incorporates a distance-based similarity evaluation algorithm to select the most discriminating features. The model achieved promising results in terms of accuracy and feature selection.
The rapid development of data science has led to the emergence of high-dimensional datasets in machine learning. The curse of dimensionality is a significant problem caused by high-dimensional data with a small sample size. This paper proposes a novel hybrid binary dragonfly algorithm (HBDFA) in which a distance-based similarity evaluation algorithm is applied before the dragonfly algorithm (DA) search to select the most discriminating features. This two-step feature selection mechanism lets the DA search explore only the feature space already reduced by the distance-based similarity evaluation. The model was evaluated on two datasets. The first contained 200 reports from Daily Mail Online, evenly distributed across four categories: COVID-19, economy, science, and sports. The second was the publicly available Spam dataset. The proposed model was compared with binary versions of four popular metaheuristic algorithms. On the first dataset, it achieved an accuracy of 96.75% while eliminating 66.5% of the top 100 features. On the Spam dataset, HBDFA gave the best classification results, with over 95% accuracy. The experimental results indicate that HBDFA outperforms the compared algorithms in searching high-dimensional data, improving classification results, and reducing the number of selected features.
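
The two-step idea described in the abstract (a distance-based filter first, then a binary swarm search over the reduced feature space) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the distance measure, the classifier, the fitness function, or the DA update rules, so everything below is an assumption made for illustration. The function names (`distance_scores`, `prefilter`, `fitness`, `binary_swarm_search`), the class-mean distance score, the nearest-centroid classifier, the weight `alpha`, and the simplified attraction-plus-noise step update are all hypothetical stand-ins. The only element borrowed from common practice in binary swarm optimizers is the V-shaped transfer function that turns a continuous step into a bit-flip probability.

```python
import math
import random

def distance_scores(X, y):
    """Score each feature by the distances between its per-class means
    (an assumed stand-in for the paper's similarity evaluation)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        means = []
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            means.append(sum(vals) / len(vals))
        # Sum of pairwise absolute distances between class means.
        scores.append(sum(abs(a - b)
                          for i, a in enumerate(means) for b in means[i + 1:]))
    return scores

def prefilter(X, y, top_k):
    """Step 1: keep the indices of the top_k highest-scoring features."""
    scores = distance_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:top_k]

def fitness(mask, X, y, candidates, alpha=0.99):
    """Weighted sum of nearest-centroid training accuracy and feature reduction."""
    chosen = [candidates[i] for i, bit in enumerate(mask) if bit]
    if not chosen:
        return 0.0
    centroids = {}
    for c in set(y):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in chosen]
    correct = 0
    for row, label in zip(X, y):
        pred = min(centroids, key=lambda c: sum(
            (row[j] - m) ** 2 for j, m in zip(chosen, centroids[c])))
        correct += pred == label
    accuracy = correct / len(y)
    return alpha * accuracy + (1 - alpha) * (1 - len(chosen) / len(mask))

def binary_swarm_search(X, y, candidates, n_agents=8, n_iter=30, seed=0):
    """Step 2: simplified binary swarm search over the pre-filtered features."""
    rng = random.Random(seed)
    dim = len(candidates)
    agents = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_agents)]
    steps = [[0.0] * dim for _ in range(n_agents)]
    best = max(agents, key=lambda m: fitness(m, X, y, candidates))[:]
    best_fit = fitness(best, X, y, candidates)
    for _ in range(n_iter):
        for agent, step in zip(agents, steps):
            for d in range(dim):
                # Inertia + attraction toward the best mask + random exploration.
                step[d] = (0.7 * step[d]
                           + rng.random() * (best[d] - agent[d])
                           + 0.1 * rng.uniform(-1.0, 1.0))
                # V-shaped transfer: larger |step| gives a higher flip probability.
                if rng.random() < abs(step[d] / math.sqrt(1 + step[d] ** 2)):
                    agent[d] = 1 - agent[d]
            f = fitness(agent, X, y, candidates)
            if f > best_fit:
                best, best_fit = agent[:], f
    return [candidates[i] for i, bit in enumerate(best) if bit], best_fit
```

In this sketch, `prefilter(X, y, top_k)` plays the role of the first step and `binary_swarm_search(X, y, candidates)` the second: the search only ever toggles features that survived the filter, which is the point of the two-step design. `X` is assumed to be a list of numeric feature rows and `y` a list of class labels of the same length.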
