3.8 Proceedings Paper

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Related References

Note: Only some of the references are listed; download the original paper for the complete reference information.
Proceedings Paper Computer Science, Artificial Intelligence

Probabilistic Embeddings for Cross-Modal Retrieval

Sanghyuk Chun et al.

Summary: Cross-modal retrieval methods aim to build a common representation space for samples from different modalities, such as vision and language. This paper introduces Probabilistic Cross-Modal Embedding (PCME), which represents samples as probability distributions, improving retrieval performance and providing uncertainty estimates that make the embeddings more interpretable. Evaluated on the CUB dataset with exhaustive annotations, PCME outperforms deterministic methods at capturing one-to-many correspondences.

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

StacMR: Scene-Text Aware Cross-Modal Retrieval

Andres Mafla et al.

Summary: This paper introduces a new dataset for cross-modal retrieval involving scene-text instances, proposes approaches leveraging scene text, and conducts experiments to confirm the benefits of utilizing scene text.

2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding

Zhenxing Niu et al.

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Self-supervised learning of visual features through embedding images into text topic spaces

Lluis Gomez et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Beyond instance-level image retrieval: Leveraging captions to learn a global visual representation for semantic retrieval

Albert Gordo et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Computer Science, Artificial Intelligence

Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics

Micah Hodosh et al.

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH (2013)