4.7 Article

Scalable feature selection using ReliefF aided by locality-sensitive hashing

Journal

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
Volume 36, Issue 11, Pages 6161-6179

Publisher

WILEY
DOI: 10.1002/int.22546

Keywords

big data; feature selection; locality-sensitive hashing; ReliefF; scalability

Funding

  1. Ministerio de Economia y Competitividad [PID2019-109238GB-C2, TIN 2015-65069-C2-1-R, TIN 2015-65069-C2-2-R]
  2. Xunta de Galicia [ED431C 2018/34]
  3. European Union

The ReliefF-LSH algorithm simplifies the costliest step of the ReliefF algorithm by approximating the nearest neighbor graph using locality-sensitive hashing. It can process data sets that are too large for the original ReliefF, and it obtains better results and is more generally applicable than currently available alternatives.
Feature selection algorithms, such as ReliefF, are very important for processing high-dimensionality data sets. However, the widespread use of such popular and effective algorithms is limited by their computational cost. We describe an adaptation of the ReliefF algorithm that simplifies its costliest step by approximating the nearest neighbor graph using locality-sensitive hashing (LSH). The resulting ReliefF-LSH algorithm can process data sets that are too large for the original ReliefF, a capability further enhanced by a distributed implementation in Apache Spark. Furthermore, ReliefF-LSH obtains better results and is more generally applicable than currently available alternatives to the original ReliefF, as it can handle regression and multiclass data sets. Because it requires no additional hyperparameters with respect to ReliefF, it also avoids costly tuning. A set of experiments demonstrates the validity of this new approach and confirms its good scalability.
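
The core idea in the abstract, replacing the exact nearest-neighbor search in ReliefF with neighbors drawn from LSH buckets, can be illustrated with a small single-machine sketch. This is not the authors' implementation: the random-hyperplane hashing scheme, the function names `lsh_buckets` and `relieff_lsh`, and the `n_bits` parameter are all assumptions made for illustration (the paper states that ReliefF-LSH needs no extra hyperparameters, so its hashing is presumably configured differently). The sketch covers multiclass classification only and omits both regression and the Apache Spark distribution.

```python
import numpy as np


def lsh_buckets(X, n_bits=4, seed=0):
    """Assign each row of X to a bucket using random-hyperplane hashing.

    Hypothetical helper: rows falling on the same side of every random
    hyperplane share a bucket id and are treated as candidate neighbors.
    """
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(X.shape[1], n_bits))
    bits = ((X - X.mean(axis=0)) @ planes > 0).astype(np.int64)
    return bits @ (1 << np.arange(n_bits, dtype=np.int64))


def relieff_lsh(X, y, k=5, n_bits=4, seed=0):
    """Approximate ReliefF feature weights for classification data.

    Nearest hits and misses are searched only inside each point's LSH
    bucket instead of across the whole data set (illustrative sketch).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Scale features to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    # Integer class codes and class priors for the multiclass miss weighting.
    classes, codes, counts = np.unique(
        np.asarray(y), return_inverse=True, return_counts=True
    )
    prior = counts / n

    buckets = lsh_buckets(Xs, n_bits, seed)
    w = np.zeros(d)
    for b in np.unique(buckets):
        idx = np.flatnonzero(buckets == b)
        for i in idx:
            others = idx[idx != i]
            if others.size == 0:
                continue
            diff = np.abs(Xs[others] - Xs[i])  # per-feature differences
            dist = diff.sum(axis=1)            # Manhattan distance
            for c in range(len(classes)):
                mask = codes[others] == c
                if not mask.any():
                    continue
                nearest = np.argsort(dist[mask])[:k]
                contrib = diff[mask][nearest].mean(axis=0)
                if c == codes[i]:
                    # Nearest hits: differing features are penalized.
                    w -= contrib / n
                else:
                    # Nearest misses: differing features are rewarded,
                    # weighted by the prior of the miss class.
                    w += prior[c] / (1.0 - prior[codes[i]]) * contrib / n
    return w
```

Calling `relieff_lsh(X, y)` returns one weight per feature, with higher weights suggesting more relevant features. Restricting the search to buckets avoids computing all pairwise distances, at the price of occasionally missing a true nearest neighbor that was hashed into a different bucket.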
