Journal
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022)
Pages 4700-4709
Publisher
IEEE
DOI: 10.1109/CVPRW56347.2022.00516
Abstract
In this paper, we address the problem of cross-modal retrieval in the presence of multi-view and multi-label data. For this, we present Multi-view Multi-label Canonical Correlation Analysis (MVMLCCA), a generalization of CCA to multi-view data that also makes use of high-level semantic information available in the form of multi-label annotations in each view. While CCA relies on explicit pairings/associations of samples between two views (or modalities), MVMLCCA uses the available multi-label annotations to establish correspondence across multiple (two or more) views without the need for explicit pairing of multi-view samples. Extensive experiments on two multi-modal datasets demonstrate that the proposed approach offers much more flexibility than related approaches without compromising scalability or cross-modal retrieval performance. Our code and precomputed features are available at https://github.com/Rushil231100/MVMLCCA.
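The core idea in the abstract is that correspondence between unpaired views is induced by their shared multi-label annotations rather than by explicit sample pairing. The sketch below illustrates one plausible reading of that idea for two views: a label co-occurrence matrix serves as a soft pairing, which weights the cross-covariance in an otherwise standard regularized CCA solve. This is a minimal illustrative interpretation, not the authors' exact MVMLCCA formulation; the function name, the `reg` parameter, and the whitening route are all assumptions introduced here.

```python
# Illustrative sketch only: pairs two unaligned views through their
# shared multi-label annotations, then solves a regularized CCA-style
# problem. Not the authors' exact formulation.
import numpy as np

def label_paired_cca(X1, Y1, X2, Y2, n_components=10, reg=1e-3):
    """X1: (n1, d1) view-1 features, Y1: (n1, c) binary multi-labels;
    X2: (n2, d2) view-2 features, Y2: (n2, c) binary multi-labels.
    Returns projections W1 (d1, k) and W2 (d2, k) into a shared space."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)

    # Soft correspondence from label agreement instead of explicit pairs:
    # S[i, j] counts labels shared by sample i (view 1) and j (view 2).
    S = Y1 @ Y2.T                                   # (n1, n2)

    C12 = X1.T @ S @ X2                             # label-weighted cross-covariance
    C11 = X1.T @ X1 + reg * np.eye(X1.shape[1])     # regularized auto-covariances
    C22 = X2.T @ X2 + reg * np.eye(X2.shape[1])

    # Whiten each view via Cholesky, then take the top singular
    # directions of the whitened cross term (standard CCA recipe).
    R1 = np.linalg.cholesky(C11)
    R2 = np.linalg.cholesky(C22)
    M = np.linalg.solve(R1, C12) @ np.linalg.inv(R2).T
    U, _, Vt = np.linalg.svd(M, full_matrices=False)

    W1 = np.linalg.solve(R1.T, U[:, :n_components])
    W2 = np.linalg.solve(R2.T, Vt[:n_components].T)
    return W1, W2
```

Given learned W1 and W2, cross-modal retrieval would project queries from one view and gallery items from the other into the shared space (X1 @ W1, X2 @ W2) and rank by cosine similarity; because the pairing enters only through the label matrix S, the two views need not contain the same samples.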