Article

Random subspace and random projection nearest neighbor ensembles for high dimensional data

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 191, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.116078

Keywords

Nearest neighbor ensemble; High dimensional data; Random subspace method; Random projection method

The random subspace and random projection methods were investigated for forming ensembles of nearest neighbor classifiers in high dimensional feature spaces, with results showing improvements in predictive performance compared to standard nearest neighbor classifiers. The choice between the two methods depends on the type of data, with random projection outperforming random subspace for microarray and chemoinformatics datasets, while the opposite is true for image datasets. Additionally, the resulting ensembles using random projection perform on par with random forests for microarray and chemoinformatics datasets.
The random subspace and the random projection methods are investigated and compared as techniques for forming ensembles of nearest neighbor classifiers in high dimensional feature spaces. The two methods have been empirically evaluated on three types of high-dimensional datasets: microarrays, chemoinformatics, and images. Experimental results on 34 datasets show that both the random subspace and the random projection method lead to improvements in predictive performance compared to using the standard nearest neighbor classifier, while the best method to use depends on the type of data considered; for the microarray and chemoinformatics datasets, random projection outperforms the random subspace method, while the opposite holds for the image datasets. An analysis using data complexity measures, such as the attribute-to-instance ratio and Fisher's discriminant ratio, provides more detailed indications of what relative performance can be expected for specific datasets. The results also indicate that the resulting ensembles may be competitive with state-of-the-art ensemble classifiers; the nearest neighbor ensembles using random projection perform on par with random forests for the microarray and chemoinformatics datasets.
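
To make the difference between the two ensemble constructions concrete, the following is a minimal, dependency-free sketch rather than the authors' implementation. The abstract does not state the ensemble size, the reduced dimensionality, the type of projection matrix, or the voting rule, so everything here is an assumption chosen for illustration: 25 members, 5 reduced dimensions, a Gaussian projection matrix, 1-nearest-neighbor members, and majority voting. All function names and the toy data generator are hypothetical.

```python
import math
import random
from collections import Counter


def subspace_map(n_features, k, rng):
    """Random subspace: keep k original features, sampled without replacement."""
    idx = rng.sample(range(n_features), k)
    return lambda x: [x[j] for j in idx]


def projection_map(n_features, k, rng):
    """Random projection: k random Gaussian linear combinations of all features."""
    rows = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(n_features)]
            for _ in range(k)]
    return lambda x: [sum(r * v for r, v in zip(row, x)) for row in rows]


def nearest_label(points, labels, query):
    """1-NN: label of the stored point with the smallest squared Euclidean distance."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    best = min(range(len(points)), key=lambda i: sq_dist(points[i]))
    return labels[best]


def fit_ensemble(X, y, make_map, n_members=25, k=5, seed=0):
    """Each member gets its own random map and stores the mapped training set."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        f = make_map(len(X[0]), k, rng)
        members.append((f, [f(x) for x in X]))
    return members, list(y)


def predict(ensemble, x):
    """Majority vote over the members' 1-NN predictions."""
    members, y = ensemble
    votes = Counter(nearest_label(Z, y, f(x)) for f, Z in members)
    return votes.most_common(1)[0][0]


def make_toy(n_per_class, n_features, seed):
    """Hypothetical toy data: two Gaussian classes with means 0 and 5 in every feature."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def accuracy(ensemble, X, y):
    """Fraction of points whose ensemble prediction matches the true label."""
    return sum(predict(ensemble, x) == t for x, t in zip(X, y)) / len(y)


# Compare the two constructions on the same toy split.
X_train, y_train = make_toy(20, 50, seed=1)
X_test, y_test = make_toy(10, 50, seed=2)
for name, make_map in (("random subspace", subspace_map),
                       ("random projection", projection_map)):
    model = fit_ensemble(X_train, y_train, make_map)
    print(name, accuracy(model, X_test, y_test))
```

The only difference between the two ensembles is the map applied before the nearest neighbor search: a random subspace member sees a subset of the original features unchanged, whereas a random projection member sees linear mixtures of all features. The toy data are deliberately easy and say nothing about which method would win on real microarray, chemoinformatics, or image datasets.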
