Proceedings Paper

Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3490099.3511122

Keywords

machine learning; embedding spaces; visualization system; interactive; small multiples

Funding

  1. MIT-IBM Watson AI Lab
  2. United States Air Force Research Laboratory [FA8750-19-2-1000]

This article presents an interactive system called "Embedding Comparator," which provides both fine-grained inspection of local neighborhoods and a global comparison of embedding spaces. Through case studies across multiple modalities, it is demonstrated that the system can rapidly reveal insights and accelerate the comparison process.
Embeddings that map high-dimensional discrete input to lower-dimensional continuous vector spaces have been widely adopted in machine learning applications as a way to capture domain semantics. From interviews with 13 embedding users across disciplines, we find that comparing embeddings is a key task for deployment or downstream analysis, but that it unfolds in a tedious fashion that poorly supports systematic exploration. In response, we present the Embedding Comparator, an interactive system that presents a global comparison of embedding spaces alongside fine-grained inspection of local neighborhoods. It systematically surfaces points of comparison by computing the similarity of the k-nearest neighbors of every embedded object between a pair of spaces. Through case studies across multiple modalities, we demonstrate that our system rapidly reveals insights, such as semantic changes following fine-tuning, language changes over time, and differences between seemingly similar models. In evaluations with 15 participants, we find that our system accelerates comparisons by shifting from laborious manual specification to browsing and manipulating visualizations.
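
The neighborhood comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance metric or set-similarity measure is used, so the choices below (cosine similarity to find neighbors, Jaccard overlap to compare the two neighbor sets) are assumptions made for illustration, and the function names are hypothetical rather than taken from the system.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(space, index, k):
    """Indices of the k objects most similar to space[index], excluding itself."""
    target = space[index]
    others = [j for j in range(len(space)) if j != index]
    others.sort(key=lambda j: cosine_similarity(target, space[j]), reverse=True)
    return set(others[:k])


def neighborhood_similarity(space_a, space_b, k):
    """Per-object Jaccard overlap of k-nearest-neighbor sets in two spaces.

    Assumes row i of space_a and row i of space_b embed the same object.
    Jaccard overlap is an illustrative choice, not confirmed by the abstract.
    """
    if len(space_a) != len(space_b):
        raise ValueError("both spaces must embed the same objects")
    if not 1 <= k < len(space_a):
        raise ValueError("k must be between 1 and the number of objects minus 1")
    scores = []
    for i in range(len(space_a)):
        neighbors_a = nearest_neighbors(space_a, i, k)
        neighbors_b = nearest_neighbors(space_b, i, k)
        scores.append(len(neighbors_a & neighbors_b) / len(neighbors_a | neighbors_b))
    return scores


# Toy data: four objects embedded in two 2-D spaces; only object 1 moves.
space_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
space_b = [[1.0, 0.0], [0.2, 0.8], [0.0, 1.0], [0.1, 0.9]]
scores = neighborhood_similarity(space_a, space_b, k=1)

# Objects with the lowest score changed the most between the two spaces.
most_changed = sorted(range(len(scores)), key=lambda i: scores[i])
```

In this toy example, object 1 is the only object whose nearest neighbor differs between the two spaces, so it sorts first in `most_changed`. Sorting objects by such a score is one plausible way to surface points of comparison systematically. The quadratic neighbor search here is only suitable for small vocabularies.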
