Article

LD-MAN: Layout-Driven Multimodal Attention Network for Online News Sentiment Recognition

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 1785-1798

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TMM.2020.3003648

Keywords

Sentiment analysis; Visualization; Layout; Feature extraction; Analytical models; Neural networks; Image recognition; Multimodal sentiment recognition; online news; attention mechanism; article layout

Funding

  1. Major Project for New Generation of AI Grant [2018AAA0100403]
  2. NSFC [61876094, U1933114, U1836109, U1903128, U1936206]
  3. Natural Science Foundation of Tianjin, China [18JCYBJC15400, 18ZXZNGX00110]

Abstract
The prevailing use of both images and text to express opinions on the web leads to the need for multimodal sentiment recognition. Commonly used social media data containing short text and few images, such as tweets and product reviews, have been well studied. However, it is still challenging to predict readers' sentiment after reading online news articles, since news articles often have more complicated structures, e.g., longer text and more images. To address this problem, we propose a layout-driven multimodal attention network (LD-MAN) to recognize news sentiment in an end-to-end manner. Rather than modeling text and images individually, LD-MAN uses the layout of online news to align images with the corresponding text. Specifically, it exploits a set of distance-based coefficients to model the image locations and measure the contextual relationship between images and text. LD-MAN then learns the affective representations of the articles from the aligned text and images using a multimodal attention mechanism. Considering the lack of relevant datasets in this field, we collect two multimodal online news datasets, containing a total of 14,566 articles with 56,260 images and 251,202 words. Experimental results demonstrate that the proposed method performs favorably compared with state-of-the-art approaches. We will release all the code, models, and datasets to the community.
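The abstract's layout-driven alignment can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy version, not the paper's implementation: it takes scalar layout positions for text blocks and images, forms distance-based coefficients as a softmax over negative distances (a hypothetical choice of formula; the paper's exact coefficients may differ), and uses them to build a per-text-block weighted image context, the step before any multimodal attention fusion.

```python
import math


def alignment_coefficients(text_pos, image_pos, tau=1.0):
    """For each text block, a softmax over images of negative layout
    distance, so nearer images get larger weights. The softmax form and
    the temperature tau are illustrative assumptions, not LD-MAN's
    published formula."""
    coeffs = []
    for t in text_pos:
        logits = [-abs(t - i) / tau for i in image_pos]
        m = max(logits)                      # stabilize the softmax
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        coeffs.append([e / z for e in exps])
    return coeffs


def image_context(image_feats, coeffs):
    """Per-text-block image context: a coefficient-weighted sum of the
    image feature vectors (each a list of floats)."""
    dim = len(image_feats[0])
    return [
        [sum(w * f[d] for w, f in zip(row, image_feats)) for d in range(dim)]
        for row in coeffs
    ]
```

For example, with text blocks at layout positions 0 and 10 and images at positions 1 and 9, the first text block's coefficients weight the first image more heavily, so its image context lies closer to that image's features.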
