Article

Efficient and Effective Multi-Modal Queries Through Heterogeneous Network Embedding

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 34, Issue 11, Pages 5307-5320

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2021.3052871

Keywords

Information retrieval; Data models; Semantics; Videos; Games; Task analysis; Heterogeneous networks; Query embedding; Graph embedding; Heterogeneous information network

Funding

  1. ARC Discovery Early Career Researcher Award [DE200101465]

The heterogeneity of today's Web sources requires information retrieval systems to handle multi-modal queries. Existing methods for handling multi-modal queries are either inefficient or ineffective. To address this issue, we propose an information retrieval system based on heterogeneous network embedding, which can accurately answer multi-modal queries with a single pass over the data.
The heterogeneity of today's Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user's information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-modal queries. However, depending on the chosen operationalisation, such an approach is inefficient or ineffective. It either requires multiple passes over the data or leads to inaccuracies, since the relations between data modalities are neglected in the relevance assessment. To mitigate these challenges, we present an IR system that has been designed to answer genuine multi-modal queries. It relies on a heterogeneous network embedding, so that features from diverse modalities can be incorporated when representing both a query and the data over which it is evaluated. By embedding a query and the data in the same vector space, the relations across modalities are made explicit and exploited for more accurate query evaluation. At the same time, multi-modal queries are answered with a single pass over the data. An experimental evaluation using diverse real-world and synthetic datasets illustrates that our approach returns twice as much relevant information as baseline techniques, while scaling to large multi-modal databases.
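
The idea of placing a query and the data in one vector space and answering the query in a single pass can be illustrated with a minimal sketch. It is not the authors' system: the vectors are hand-written toy values rather than a learned heterogeneous network embedding, averaging the query components is an assumed aggregation, cosine similarity is an assumed relevance measure, and the names `embed_query`, `top_k`, `node_vectors`, and `items` are hypothetical.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embed_query(parts, node_vectors):
    """Map a multi-modal query to one vector by averaging its components.

    Averaging is an assumption made for this sketch, not the paper's method.
    """
    vecs = [node_vectors[p] for p in parts if p in node_vectors]
    if not vecs:
        raise ValueError("no query component has an embedding")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def top_k(query_vec, items, k):
    """Scan the items once and return the k names most similar to the query."""
    return heapq.nlargest(k, items, key=lambda name: cosine(query_vec, items[name]))


# Toy embeddings: a hashtag, a user profile, and a keyword share one space.
node_vectors = {
    "#gaming": [1.0, 0.0],
    "user:alice": [0.8, 0.6],
    "keyword:recipe": [0.0, 1.0],
}

# Toy data items embedded in the same space as the query components.
items = {
    "video_1": [0.9, 0.3],
    "post_2": [0.1, 0.9],
    "video_3": [-1.0, 0.0],
}

# A query that combines two modalities: a hashtag and a user profile.
query_vec = embed_query(["#gaming", "user:alice"], node_vectors)
results = top_k(query_vec, items, k=2)
print(results)
```

Because the hashtag and the user profile are merged into one query vector before the scan, the data is traversed once, instead of once per modality followed by a merge of separate result lists.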
