Journal
JOURNAL OF MACHINE LEARNING RESEARCH
Volume 22, Pages 1-59
Publisher
MICROTOME PUBL
Keywords
Adjacency Matrix; Spectral Embedding; Graph Laplacian; Out-of-Sample Extension; Random Dot Product Graph
Funding
- NSF [DMS-1646108, DMS-2052632]
- University of Wisconsin-Madison OVCRGE
- Wisconsin Alumni Research Foundation
- Australian Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Australian Research Council Discovery Early Career Researcher Award [DE180100923]
- DARPA [FA8750-17-2-0112]
- ARO
- ONR
This paper investigates the out-of-sample extension problem for graph embedding techniques, proving that out-of-sample extension based on a random dot product graph follows a central limit theorem. It provides a convenient framework for analyzing trade-offs between estimation accuracy and computational expenses, and demonstrates the performance of out-of-sample extensions on simulated and real-world data, showing significant computational savings with minimal loss in embedding quality.
Graph embeddings, a class of dimensionality reduction techniques designed for relational data, have proven useful in exploring and modeling network structure. Most dimensionality reduction methods allow out-of-sample extensions, by which an embedding can be applied to observations not present in the training set. Applied to graphs, the out-of-sample extension problem concerns how to compute the embedding of a vertex that is added to the graph after an embedding has already been computed. In this paper, we consider the out-of-sample extension problem for two graph embedding procedures: the adjacency spectral embedding and the Laplacian spectral embedding. In both cases, we prove that when the underlying graph is generated according to a latent space model called the random dot product graph, which includes the popular stochastic block model as a special case, an out-of-sample extension based on a least-squares objective obeys a central limit theorem. In addition, we prove a concentration inequality for the out-of-sample extension of the adjacency spectral embedding based on a maximum-likelihood objective. Our results also yield a convenient framework in which to analyze trade-offs between estimation accuracy and computational expenses, which we explore briefly. Finally, we explore the performance of these out-of-sample extensions as applied to both simulated and real-world data. We observe significant computational savings with minimal losses to the quality of the learned embeddings, in keeping with our theoretical results.
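The least-squares out-of-sample extension described above can be sketched in a few lines: embed an observed graph via the adjacency spectral embedding (top scaled eigenvectors of the adjacency matrix), then estimate the latent position of a new vertex by regressing its observed edge vector onto the in-sample embedding. The RDPG parameters below (n, d, the latent-position distribution) are illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a small random dot product graph (RDPG): edge probabilities
# are inner products of latent positions.
n, d = 300, 2
X = rng.uniform(0.2, 0.7, size=(n, d))            # latent positions
P = X @ X.T                                       # edge probabilities (all < 1 here)
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                       # symmetric, hollow adjacency matrix

# Adjacency spectral embedding (ASE): top-d eigenvectors of A,
# scaled by the square roots of the eigenvalue magnitudes.
vals, vecs = np.linalg.eigh(A)
idx = np.argsort(np.abs(vals))[::-1][:d]
X_hat = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))  # shape (n, d)

# Out-of-sample extension: a new vertex arrives with edge vector a
# (its connections to the n in-sample vertices). The least-squares
# estimate of its latent position is
#     w_hat = argmin_w || a - X_hat w ||_2^2,
# i.e., the pseudoinverse of X_hat applied to a.
x_new = rng.uniform(0.2, 0.7, size=d)              # new vertex's latent position
a = (rng.uniform(size=n) < X @ x_new).astype(float)
w_hat, *_ = np.linalg.lstsq(X_hat, a, rcond=None)  # shape (d,)
```

This avoids re-eigendecomposing the augmented (n+1)-vertex adjacency matrix: the extension costs one least-squares solve against the fixed n-by-d embedding, which is the source of the computational savings the paper analyzes.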