Article

A Survey on Multi-modal Summarization

Journal

ACM COMPUTING SURVEYS
Volume 55, Issue 13S

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3584700

Keywords

Summarization; multi-modal content processing; neural networks

This article presents a comprehensive survey of existing research on automatic multi-modal summarization (MMS), covering various modalities such as text, image, audio, and video. It discusses the importance of MMS, highlights different evaluation metrics and datasets used for this task, and identifies current challenges and future directions in the field.
Advances in technology have made it convenient for people to share their opinions across an abundance of platforms. These platforms let users express themselves in multiple forms, including text, images, videos, and audio. The resulting volume and variety of content, however, makes it difficult for users to obtain all the key information about a topic, which makes automatic multi-modal summarization (MMS) essential. In this article, we present a comprehensive survey of existing research on MMS, covering modalities such as text, image, audio, and video. Apart from highlighting the evaluation metrics and datasets used for the MMS task, our work also discusses current challenges and future directions in the field.
