4.6 Article

The Effect of Points Dispersion on the k-nn Search in Random Projection Forests

Journal

IEEE ACCESS
Volume 10, Issue -, Pages 80858-80868

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3195488

Keywords

k-nearest neighbor search; random projection trees; random projection forests; unsupervised learning


Partitioning trees are efficient for k-nearest neighbor search, but kd-trees can be ineffective in high dimensions. Random projection trees (rpTrees) address this problem by splitting the data along random directions. k-nn search in an rpForest is influenced by the dispersion of points along the random direction and by the number of rpTrees in the rpForest.
Partitioning trees are efficient data structures for k-nearest neighbor (k-nn) search. Machine learning libraries commonly use a special type of partitioning tree, the kd-tree, to perform k-nn search. Unfortunately, kd-trees can be ineffective in high dimensions because they need more tree levels to decrease the vector quantization (VQ) error. Random projection trees (rpTrees) solve this scalability problem by using random directions to split the data. A collection of rpTrees is called an rpForest. k-nn search in an rpForest is influenced by two factors: 1) the dispersion of points along the random direction and 2) the number of rpTrees in the rpForest. In this study, we investigate how these two factors affect the k-nn search with varying k values and different datasets. We found that with a larger number of trees, the dispersion of points has a very limited effect on the k-nn search. One should therefore use the original rpTree algorithm, which picks a random direction regardless of the dispersion of points.
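
The two factors named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified rpTree that splits each node at the median of the projections onto a random unit direction, and every name in it (`split_node`, `build_tree`, `query_tree`, `knn_forest`, `n_candidates`, `leaf_size`) is hypothetical. Setting `n_candidates=1` picks a single random direction regardless of dispersion, while a larger value keeps the candidate direction with the highest projection variance, used here as a stand-in for point dispersion.

```python
import numpy as np

def split_node(points, rng, n_candidates=1):
    """Split at the median projection onto a random unit direction.

    Assumption: with n_candidates > 1, the direction with the largest
    projection variance (a proxy for dispersion) is kept.
    """
    dim = points.shape[1]
    best_dir, best_proj, best_var = None, None, -1.0
    for _ in range(n_candidates):
        direction = rng.standard_normal(dim)
        direction /= np.linalg.norm(direction)
        proj = points @ direction
        var = proj.var()
        if var > best_var:
            best_dir, best_proj, best_var = direction, proj, var
    threshold = np.median(best_proj)
    return best_dir, threshold, best_proj <= threshold

def build_tree(points, indices, rng, leaf_size=20, n_candidates=1):
    """Recursively partition the rows of `points` listed in `indices`."""
    if len(indices) <= leaf_size:
        return {"leaf": indices}
    direction, threshold, left = split_node(points[indices], rng, n_candidates)
    if left.all() or not left.any():
        # Degenerate split (e.g. duplicate points): stop and keep a leaf.
        return {"leaf": indices}
    return {
        "direction": direction,
        "threshold": threshold,
        "left": build_tree(points, indices[left], rng, leaf_size, n_candidates),
        "right": build_tree(points, indices[~left], rng, leaf_size, n_candidates),
    }

def query_tree(tree, q):
    """Descend to the leaf that the query point q falls into."""
    while "leaf" not in tree:
        side = "left" if q @ tree["direction"] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]

def knn_forest(points, forest, q, k):
    """Approximate k-nn: pool the leaf of every tree, then rank by distance."""
    candidates = np.unique(np.concatenate([query_tree(t, q) for t in forest]))
    dists = np.linalg.norm(points[candidates] - q, axis=1)
    return candidates[np.argsort(dists)[:k]]

# Usage: a forest of 5 trees over 500 random 10-dimensional points.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
forest = [build_tree(X, np.arange(len(X)), rng, n_candidates=1) for _ in range(5)]
neighbors = knn_forest(X, forest, X[0] + 0.01, k=5)
```

In this sketch, adding trees enlarges the pooled candidate set for each query, which gives one intuition for why the choice of direction might matter less as the forest grows. The paper's actual experimental setup may differ from this simplification.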
