Article

Region Reinforcement Network With Topic Constraint for Image-Text Matching

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Journal

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

Volume 32, Issue 1, Pages 388-397

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCSVT.2021.3060713

Keywords

Semantics; Visualization; Electronic mail; Petroleum; Linear programming; Cameras; Training data; Image-text matching; cross-modal retrieval; topic constraint

Funding

Key Research and Development Plan of Shandong Province [2019GGX101015]
National Natural Science Foundation of China [61671482]
Fundamental Research Funds for the Central Universities [19CX05003A11]
China National Study Abroad Fund


Automated Summary

Image and sentence matching, which combines vision and language, has gained increasing attention. Previous methods ignored the relationships between image regions and weighted all region-word pairs equally. This paper proposes a novel method, the Region Reinforcement Network with Topic Constraint (RRTC), to explore the correspondences between images and texts. It builds a region reinforcement network that infers fine-grained correspondence by considering the relationships between regions and re-assigning region-word similarities. The topic constraint module summarizes the central theme of each image and limits deviation from the original image's meaning.

Abstract

Image and sentence matching has attracted increasing attention because it connects two important modalities, vision and language. Previous methods aim to find the latent correspondences between image regions and words by aggregating the similarities of region-word pairs. However, these approaches pay little attention to the relationships among the diverse regions in an image, and they treat the similarities of all region-word pairs equally. Moreover, an excessive focus on fine-grained alignment is likely to distort the true meaning of the original image. In this paper, a novel Region Reinforcement Network with Topic Constraint (RRTC) is proposed to explore the correspondences between images and texts. Specifically, the region reinforcement network is built to infer fine-grained correspondence by considering the relationships between regions and re-assigning region-word similarities. Meanwhile, the topic constraint module is presented to summarize the central theme of each image, which limits deviation from the original image's meaning. Extensive experimental results on the MSCOCO and Flickr30k datasets verify the effectiveness of the proposed RRTC.
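To make the contrast in the abstract concrete, the sketch below compares two ways of aggregating region-word similarities into a single image-sentence score: a plain mean, which treats every pair equally, and a softmax-weighted mean, which gives well-matched pairs more weight. This is an illustration only and not the RRTC model: the function names, the toy 2-D feature vectors, and the softmax re-weighting with a `temperature` parameter are all assumptions chosen for the example, since the abstract does not specify how similarities are re-assigned.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero

def similarity_matrix(regions, words):
    """Pairwise cosine similarities: one row per region, one column per word."""
    return [[cosine(r, w) for w in words] for r in regions]

def uniform_score(sim):
    """Baseline aggregation: every region-word pair counts equally."""
    flat = [s for row in sim for s in row]
    return sum(flat) / len(flat)

def reweighted_score(sim, temperature=0.1):
    """Hypothetical re-weighting: softmax over all pairs, so that
    high-similarity pairs contribute more to the final score."""
    flat = [s for row in sim for s in row]
    peak = max(flat)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in flat]
    total = sum(exps)
    return sum((e / total) * s for e, s in zip(exps, flat))

# Toy features (assumed, not from the paper): two image regions, two words.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.5, 0.5]]

sim = similarity_matrix(regions, words)
plain = uniform_score(sim)
weighted = reweighted_score(sim)
print(plain, weighted)
```

Because the softmax weights grow with similarity, the weighted score is never lower than the plain mean on the same matrix. Lowering `temperature` concentrates the weight on the best-matching pairs, while a very large value approaches the uniform baseline.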
