4.7 Article

Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction

期刊

SCIENTIFIC REPORTS
卷 12, 期 1, 页码 -

出版社

NATURE PORTFOLIO
DOI: 10.1038/s41598-022-10358-x

关键词

-

向作者/读者索取更多资源

This paper studies different variants of the k-nearest neighbour (KNN) algorithm and compares their performance in disease prediction. By implementing and experimenting on eight datasets, the study found that accuracy values ranged from 64.22% to 83.62%, with Hassanaat KNN showing the highest accuracy. The study also proposes a relative performance index based on accuracy, precision, and recall measures, identifying Hassanaat KNN as the best performing variant.
Disease risk prediction is a rising challenge in the medical domain. Researchers have widely used machine learning algorithms to solve this challenge. The k-nearest neighbour (KNN) algorithm is the most frequently used among the wide range of machine learning algorithms. This paper presents a study on different KNN variants (Classic one, Adaptive, Locally adaptive, k-means clustering, Fuzzy, Mutual, Ensemble, Hassanat and Generalised mean distance) and their performance comparison for disease prediction. This study analysed these variants in-depth through implementations and experimentations using eight machine learning benchmark datasets obtained from Kaggle, UCI Machine learning repository and OpenML. The datasets were related to different disease contexts. We considered the performance measures of accuracy, precision and recall for comparative analysis. The average accuracy values of these variants ranged from 64.22% to 83.62%. The Hassanaat KNN showed the highest average accuracy (83.62%), followed by the ensemble approach KNN (82.34%). A relative performance index is also proposed based on each performance measure to assess each variant and compare the results. This study identified Hassanat KNN as the best performing variant based on the accuracy-based version of this index, followed by the ensemble approach KNN. This study also provided a relative comparison among KNN variants based on precision and recall measures. Finally, this paper summarises which KNN variant is the most promising candidate to follow under the consideration of three performance measures (accuracy, precision and recall) for disease prediction. Healthcare researchers and stakeholders could use the findings of this study to select the appropriate KNN variant for predictive disease risk analytics.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据