Journal
IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 24, Issue 12, Pages 5543-5556Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2015.2466106
Keywords
Multi modal; pairwise constraint; subspace clustering
Funding
- National Natural Science Foundation of China [61135002, 61473289]
- National Basic Research Program of China [2012CB316300]
Ask authors/readers for more resources
In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to address the pairwise constraint, which can be used as a general platform for developing cross-modal algorithms. For unsupervised learning, we propose a multi-modal subspace clustering method to learn a common structure for different modalities. For supervised learning, to reduce the semantic gap and the outliers in pairwise constraints, we propose a cross-modal matching method based on compound l(21) regularization. Extensive experiments demonstrate the benefits of joint text and image modeling with semantically induced pairwise constraints, and they show that the proposed cross-modal methods can further reduce the semantic gap between different modalities and improve the clustering/matching accuracy.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available