Article

Gaze-Dependent Image Re-Ranking Technique for Enhancing Content-Based Image Retrieval

Journal

APPLIED SCIENCES-BASEL
Volume 13, Issue 10

Publisher

MDPI
DOI: 10.3390/app13105948

Keywords

re-ranking; gaze trace; content-based image retrieval; image captioning

This article proposes a re-ranking method for content-based image retrieval (CBIR) that uses a user's gaze trace as interactive feedback to predict where the user's attention lies. Generating image captions conditioned on the gaze trace expresses the relationship between images and gaze information as text, so the re-ranked results align more closely with the user's preferences or interests.
Content-based image retrieval (CBIR) aims to find images similar to a query image supplied by the user, and it is widely used in practice. Conventional CBIR methods do not consider user preferences, because they determine retrieval results solely from the degree of similarity between the query and the candidate images. This leaves a semantic gap: the model may not accurately understand the intention that the user has embedded in the query image. In this article, we propose a re-ranking method for CBIR that treats a user's gaze trace as interactive information to help the model predict the user's inherent attention. The proposed method uses the gaze trace recorded on the images returned by the initial retrieval as the user's preference information. We introduce image captioning to express the relationship between images and gaze information, generating image captions based on the gaze trace. This transforms the coordinate data into text and makes the semantic information of the images explicit. Finally, image retrieval is performed again using the generated gaze-dependent image captions to obtain images that align more accurately with the user's preferences or interests. Experimental results on an open image dataset with corresponding gaze traces and human-generated descriptions demonstrate the effectiveness of the proposed method. Our method treats visual information as user feedback to achieve user-oriented image retrieval.
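The two-stage flow described in the abstract (initial similarity-based retrieval, then a second pass driven by a gaze-dependent caption) can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the gallery, the feature vectors, and the function names `initial_retrieval` and `rerank` are hypothetical, the gaze-dependent caption is supplied as a fixed string instead of being generated by a captioning model, and bag-of-words cosine similarity against per-image descriptions stands in for whatever text-based retrieval the paper actually uses.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like maps."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bag_of_words(text):
    """Lowercased whitespace tokenization into word counts (toy text encoder)."""
    return Counter(text.lower().split())


def initial_retrieval(query_feat, gallery, top_k):
    """Stage 1: rank gallery images by visual similarity to the query only."""
    ranked = sorted(gallery, key=lambda g: cosine(query_feat, g["feat"]), reverse=True)
    return ranked[:top_k]


def rerank(candidates, gaze_caption):
    """Stage 2: reorder candidates by text similarity to the gaze-dependent caption."""
    query_bow = bag_of_words(gaze_caption)
    return sorted(
        candidates,
        key=lambda c: cosine(query_bow, bag_of_words(c["caption"])),
        reverse=True,
    )


# Hypothetical gallery: toy visual features plus a short description per image.
gallery = [
    {"id": "img_a", "feat": {"x": 1.0, "y": 0.1}, "caption": "a dog running on the beach"},
    {"id": "img_b", "feat": {"x": 0.9, "y": 0.3}, "caption": "a red kite flying over the beach"},
    {"id": "img_c", "feat": {"x": 0.1, "y": 1.0}, "caption": "a city street at night"},
]
query_feat = {"x": 1.0, "y": 0.0}

initial = initial_retrieval(query_feat, gallery, top_k=2)

# Assumed output of a gaze-conditioned captioning model (not generated here):
# the user looked mostly at the kite, so the caption describes that region.
gaze_caption = "a red kite in the sky"

reranked = rerank(initial, gaze_caption)
print([c["id"] for c in initial])
print([c["id"] for c in reranked])
```

In this toy setup the visual ranking alone places `img_a` first, while the caption derived from the gaze trace moves `img_b` to the top, which is the kind of preference-driven reordering the abstract describes.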
