Article

Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond

Journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Publisher

IEEE Computer Society
DOI: 10.1109/TPAMI.2021.3097011

Keywords

Kernel; Approximation algorithms; Taxonomy; Scalability; Risk management; Prediction algorithms; Loss measurement; Random features; kernel approximation; generalization properties; over-parameterized models

Funding

  1. European Research Council through the European Union's Horizon 2020 research and innovation program/ERC Advanced Grant EDUALITY [787960]
  2. Research Council KU Leuven: Through Optimization frameworks for deep kernel machines [C14/18/068]
  3. Flemish Government through FWO [GOA4917N]
  4. Flemish Government (AI Research Program)
  5. Ford KU Leuven Research Alliance Project [KUL0076]
  6. EU H2020 ICT-48 Network TAILOR (Foundations of Trustworthy AI -Integrating Reasoning, Learning and Optimization), Leuven. AI Institute
  7. National Natural Science Foundation of China [61977046]
  8. National Science Foundation [CCF-1657420, CCF-1704828]
  9. SJTU Global Strategic Partnership Fund (2020 SJTUCORNELL)
  10. Shanghai Municipal Science and Technology Major Project [2021SHZDZX0102]

The class of random features is one of the most popular techniques for speeding up kernel methods in large-scale problems. Related works have been recognized by the NeurIPS Test-of-Time award in 2017 and as an ICML Best Paper Finalist in 2019. The body of work on random features has grown rapidly, and hence it is desirable to have a comprehensive overview of this topic explaining the connections among various algorithms and theoretical results. In this survey, we systematically review the work on random features from the past ten years. First, the motivations, characteristics, and contributions of representative random features based algorithms are summarized according to their sampling schemes, learning procedures, variance reduction properties, and how they exploit training data. Second, we review theoretical results that center around the following key question: how many random features are needed to ensure high approximation quality, or no loss in the empirical/expected risk of the learned estimator? Third, we provide a comprehensive evaluation of popular random features based algorithms on several large-scale benchmark datasets and discuss their approximation quality and prediction performance in classification. Last, we discuss the relationship between random features and modern over-parameterized deep neural networks (DNNs), including the use of high-dimensional random features in the analysis of DNNs as well as the gaps between current theoretical and empirical results. This survey may serve as a gentle introduction to the topic, and as a users' guide for practitioners interested in applying the representative algorithms and understanding the theoretical results under various technical assumptions. We hope that this survey will facilitate discussion of the open problems in this area and, more importantly, shed light on future research directions. Due to the page limit, we suggest that readers refer to the full version of this survey at https://arxiv.org/abs/2004.11154.
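To make the abstract's central idea concrete, the following is a minimal sketch of the classical random Fourier features construction of Rahimi and Recht, which the surveyed algorithms build on: frequencies are sampled from the spectral density of a shift-invariant kernel (here the Gaussian kernel, whose density is Gaussian), and the resulting explicit feature map `Z` approximates the kernel matrix via `Z @ Z.T`. This is an illustrative NumPy sketch, not code from the survey; the function name and parameters are chosen for exposition.

```python
import numpy as np

def rff_features(X, D, gamma, rng):
    """Random Fourier features z(x) such that z(x)^T z(y) ~= exp(-gamma * ||x - y||^2).

    For the Gaussian kernel with bandwidth gamma, the spectral density is
    N(0, 2*gamma*I), so frequencies W are drawn with std sqrt(2*gamma).
    """
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))  # sampled frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)                # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)              # explicit feature map

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Approximate kernel matrix from D = 2000 random features.
Z = rff_features(X, D=2000, gamma=0.5, rng=rng)
K_approx = Z @ Z.T

# Exact Gaussian kernel matrix for comparison.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_exact = np.exp(-0.5 * sq_dists)

err = np.max(np.abs(K_approx - K_exact))  # uniform approximation error
```

The key question the survey studies is how large `D` must be (relative to the sample size) for `K_approx` to be close enough to `K_exact` that downstream learning suffers no loss in risk.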

Authors

Fanghui Liu; Xiaolin Huang; Yudong Chen; Johan A. K. Suykens
