Article

Gaze-Dependent Image Re-Ranking Technique for Enhancing Content-Based Image Retrieval

Journal

APPLIED SCIENCES-BASEL
Volume 13, Issue 10, Pages -

Publisher

MDPI
DOI: 10.3390/app13105948

Keywords

re-ranking; gaze trace; content-based image retrieval; image captioning

Abstract

This article proposes a re-ranking method for content-based image retrieval (CBIR) that utilizes a user's gaze trace as interactive information to predict their inherent attention. By generating image captions based on the gaze trace, the proposed method effectively expresses the relationship between images and gaze information, resulting in more accurate alignment with user preferences or interests.
Content-based image retrieval (CBIR) aims to find images similar to a query image supplied by the user and is widely used in practice. Conventional CBIR methods do not consider user preferences, since they determine retrieval results solely from the visual similarity between the query and candidate images. As a result, a semantic gap appears: the model may not accurately capture the intention behind the user's query image. In this article, we propose a re-ranking method for CBIR that uses a user's gaze trace as interactive information to help the model predict the user's inherent attention. The proposed method treats the gaze trace recorded over the images returned by the initial retrieval as the user's preference information. We introduce image captioning to express the relationship between images and gaze information, generating image captions conditioned on the gaze trace. This transforms the raw gaze coordinates into text and makes the semantic content of the attended image regions explicit. Finally, retrieval is performed again using the generated gaze-dependent captions to obtain images that align more closely with the user's preferences or interests. Experimental results on an open image dataset with corresponding gaze traces and human-generated descriptions demonstrate the effectiveness of the proposed method. By treating visual attention as user feedback, our method achieves user-oriented image retrieval.
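To make the pipeline described in the abstract concrete, the sketch below mirrors its three stages: an initial similarity-based retrieval, gaze-dependent captioning of a result the user inspected, and a caption-driven re-ranking of the initial candidates. It is a minimal illustration under assumed placeholder components: embed_image, embed_text, and caption_from_gaze are hypothetical stand-ins, not the models used in the paper.

```python
# A minimal, illustrative sketch of the gaze-dependent re-ranking pipeline,
# NOT the authors' implementation. All model components here (embed_image,
# embed_text, caption_from_gaze) are hypothetical placeholders; in the paper
# these roles are filled by learned image features, a text encoder, and a
# gaze-conditioned image-captioning model.
import numpy as np

rng = np.random.default_rng(0)
DIM = 128  # assumed shared embedding dimensionality


def embed_image(image: np.ndarray) -> np.ndarray:
    """Placeholder global image feature (random here; a learned encoder in practice)."""
    return rng.standard_normal(DIM)


def caption_from_gaze(image: np.ndarray, gaze_trace: np.ndarray) -> str:
    """Placeholder gaze-dependent captioner: the gaze coordinates would steer
    the captioning model toward the regions the user actually attended to."""
    return "a caption describing the gaze-attended region of the image"


def embed_text(caption: str) -> np.ndarray:
    """Placeholder text encoder mapping a caption into the retrieval space."""
    return rng.standard_normal(DIM)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def initial_retrieval(query_image, gallery, k=10):
    """Step 1: rank gallery images by visual similarity to the query."""
    q = embed_image(query_image)
    scores = [cosine(q, embed_image(g)) for g in gallery]
    return list(np.argsort(scores)[::-1][:k])


def gaze_rerank(inspected_image, gaze_trace, gallery, candidate_ids):
    """Steps 2-3: caption the inspected result under the user's gaze trace,
    then re-rank the initial candidates against that caption."""
    caption = caption_from_gaze(inspected_image, gaze_trace)
    c = embed_text(caption)
    return sorted(candidate_ids,
                  key=lambda i: cosine(c, embed_image(gallery[i])),
                  reverse=True)


if __name__ == "__main__":
    gallery = [rng.standard_normal((32, 32, 3)) for _ in range(100)]
    query = rng.standard_normal((32, 32, 3))
    top = initial_retrieval(query, gallery, k=10)
    gaze = rng.uniform(0, 32, size=(50, 2))  # (x, y) fixations on the top result
    print("initial ranking:   ", top)
    print("gaze-based re-rank:", gaze_rerank(gallery[top[0]], gaze, gallery, top))
```

In the actual method, the caption-to-image scoring would come from learned cross-modal features rather than the random placeholders above; the sketch only fixes the data flow from gaze trace to caption to re-ranked results.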

