4.7 Article

Data mining for case-based reasoning in high-dimensional biological domains

期刊

出版社

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2005.124

关键词

machine learning; data mining; clustering; feature selection; case-based reasoning classifiers; microarray data analysis; mass spectrometry data analysis; biomarker discovery

向作者/读者索取更多资源

Case-based reasoning (CBR) is a suitable paradigm for class discovery in molecular biology, where the rules that define the domain knowledge are difficult to obtain and the number and the complexity of the rules affecting the problem are too large for formal knowledge representation. To extend the capabilities of CBR, we propose the mixture of experts for case-based reasoning (MOE4CBR), a method that combines an ensemble of CBR classifiers with spectral clustering and logistic regression. Our approach not only achieves higher prediction accuracy, but also leads to the selection of a subset of features that have meaningful relationships with their class labels. We evaluate MOE4CBR by applying the method to a CBR system called TA3-a computational framework for CBR systems. For two ovarian mass spectrometry data sets, the prediction accuracy improves from 80 percent to 93 percent and from 90 percent to 98.4 percent, respectively. We also apply the method to leukemia and lung microarray data sets with prediction accuracy improving from 65 percent to 74 percent and from 60 percent to 70 percent, respectively. Finally, we compare our list of discovered biomarkers with the lists of selected biomarkers from other studies for the mass spectrometry data sets.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据