4.7 Article

Medical data mining by fuzzy modeling with selected features

Journal

ARTIFICIAL INTELLIGENCE IN MEDICINE
Volume 43, Issue 3, Pages 195-206

Publisher

ELSEVIER
DOI: 10.1016/j.artmed.2008.04.004

Keywords

feature selection; fuzzy models; data mining; medical data; diagnosis

Objective: Medical data are often very high-dimensional. Depending on the use, some data dimensions may be more relevant than others. In processing medical data, choosing the optimal subset of features is important, not only to reduce the processing cost but also to improve the usefulness of the model built from the selected data. This paper presents a data mining study of medical data with fuzzy modeling methods that use feature subsets selected by various indices/methods. Methods: Three fuzzy modeling methods are employed: the fuzzy k-nearest neighbor algorithm, a fuzzy clustering-based modeling method, and the adaptive network-based fuzzy inference system. For feature selection, a total of 11 indices/methods are used. The medical data mined include the Wisconsin breast cancer dataset and the Pima Indians diabetes dataset. Classification accuracy and computational time are reported. To show how good the best performer is, the global optimum was also found by exhaustively testing all possible feature subsets of three features. Results: For the Wisconsin breast cancer dataset, the best accuracy obtained was 97.17%, which is only 0.25% lower than that obtained by exhaustive testing. For the Pima Indians diabetes dataset, the best accuracy obtained was 77.65%, which is only 0.13% lower than that obtained by exhaustive testing. Conclusion: This paper has shown that feature selection is important in mining medical data, both for reducing processing time and for increasing classification accuracy. However, not all combinations of feature selection and modeling methods are equally effective, and the best combination is often data-dependent, as supported by the breast cancer and diabetes data analyzed in this paper. (C) 2008 Elsevier B.V. All rights reserved.
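
The exhaustive baseline mentioned in the abstract (testing every three-feature subset) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the distance-weighted fuzzy k-nearest neighbor vote with crisp training labels, the fuzzifier `m = 2`, `k = 3`, leave-one-out scoring, and the toy data `TOY_X`/`TOY_Y` are all assumptions chosen for illustration, and all function names are hypothetical.

```python
from itertools import combinations
from math import dist


def fuzzy_knn_predict(train_X, train_y, x, k=3, m=2.0):
    """Pick the class with the largest distance-weighted (fuzzy) membership
    among the k nearest training points. Assumes crisp training labels."""
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    memberships = {}
    for point, label in neighbors:
        d = max(dist(point, x), 1e-12)       # guard against zero distance
        weight = 1.0 / d ** (2.0 / (m - 1.0))
        memberships[label] = memberships.get(label, 0.0) + weight
    return max(memberships, key=memberships.get)


def project(row, subset):
    """Keep only the feature columns listed in subset."""
    return tuple(row[i] for i in subset)


def loo_accuracy(X, y, subset, k=3):
    """Leave-one-out accuracy using only the features in subset."""
    hits = 0
    for i in range(len(X)):
        train_X = [project(r, subset) for j, r in enumerate(X) if j != i]
        train_y = [lab for j, lab in enumerate(y) if j != i]
        hits += fuzzy_knn_predict(train_X, train_y, project(X[i], subset), k) == y[i]
    return hits / len(X)


def best_subset(X, y, size=3, k=3):
    """Score every feature subset of the given size; return (accuracy, subset)."""
    scored = [(loo_accuracy(X, y, s, k), s)
              for s in combinations(range(len(X[0])), size)]
    return max(scored, key=lambda t: t[0])   # first best subset wins ties


# Synthetic toy data (not from the paper): features 0-2 separate the two
# classes, features 3-4 are large-scale noise.
TOY_X = [
    (0, 0, 0, 50, -50), (1, 0, 1, -50, 50), (0, 1, 1, 0, 0), (1, 1, 0, 100, 100),
    (10, 10, 10, 50, -50), (11, 10, 11, -50, 50), (10, 11, 11, 0, 0), (11, 11, 10, 100, 100),
]
TOY_Y = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    print(best_subset(TOY_X, TOY_Y))
```

The number of subsets grows combinatorially with the number of features, which is why an exhaustive search like this is practical only as a reference point for small subset sizes.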
