Article

Generating Descriptive Visual Words and Visual Phrases for Large-Scale Image Applications

IEEE TRANSACTIONS ON IMAGE PROCESSING (2011)

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING

Volume 20, Issue 9, Pages 2664-2677

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TIP.2011.2128333

Keywords

Image retrieval; image search re-ranking; object recognition; visual phrase; visual word

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

Microsoft Research Asia (MSRA)
National Science Foundation [IIS 1052581]
National Natural Science Foundation of China [61025011, 60833006]
National Basic Research Program of China (973 Program) [2009CB320906]
Beijing Natural Science Foundation [4092042]
Google Faculty Research Award
FXPAL Faculty Research Award
Div Of Information & Intelligent Systems
Direct For Computer & Info Scie & Enginr [1052851] Funding Source: National Science Foundation

Abstract

The Bag-of-visual Words (BoWs) representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to text words. Notwithstanding its great success and wide adoption, a visual vocabulary created from single-image local descriptors is often shown to be less effective than desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual counterparts of text words and phrases, where visual phrases refer to frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed of the visual words and their combinations that are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive of certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with text words than the classic visual words. We apply the identified DVWs and DVPs in several applications, including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency, and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster.
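To make the two notions in the abstract concrete, the following is a minimal illustrative sketch, not the authors' framework. It assumes each image has already been quantized into a list of integer visual word IDs, builds a bag-of-visual-words histogram for one image, and collects word pairs that co-occur in several images as candidate visual phrases. The function names, the toy data, and the `min_images` threshold are hypothetical, and co-occurrence is simplified to "appears in the same image", ignoring any spatial or descriptiveness criterion the paper may use.

```python
from collections import Counter
from itertools import combinations


def bow_histogram(word_ids):
    """Count how often each visual word ID occurs in one image."""
    return Counter(word_ids)


def frequent_word_pairs(images, min_images=2):
    """Return unordered visual word pairs found together in at least `min_images` images.

    Assumption: "co-occurrence" here only means both words appear in the
    same image; this is a simplification for illustration.
    """
    pair_counts = Counter()
    for word_ids in images:
        unique_words = sorted(set(word_ids))
        # Each unordered pair is counted at most once per image.
        pair_counts.update(combinations(unique_words, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_images}


# Toy data: each image is a list of hypothetical visual word IDs.
images = [
    [3, 7, 7, 12],
    [3, 7, 20],
    [7, 12, 20],
]

histogram = bow_histogram(images[0])
pairs = frequent_word_pairs(images, min_images=2)
```

With this toy input, word 7 is counted twice in the first image's histogram, and the pairs (3, 7), (7, 12), and (7, 20) each appear in two images, so they are kept as candidate phrases.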
