Article

Actively Learning Human Gaze Shifting Paths for Semantics-Aware Photo Cropping

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 23, Issue 5, Pages 2235-2245

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2014.2311658

Keywords

Photo cropping; semantics; active graphlet path; aesthetics

Funding

  1. Project of National Science Foundation of China [61125106, 61327902, 61035002, 61373076, 61002009]
  2. Fundamental Research Funds for the Central Universities [2013121026]
  3. 985 Project of Xiamen University
  4. Key Science and Technology Program of Zhejiang Province of China [2012C01035-1]
  5. Zhejiang Provincial Natural Science Foundation of China [LZ13F020004]
  6. Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative


Photo cropping is a widely used tool in the printing industry, photography, and cinematography. Conventional cropping models face three challenges. First, they de-emphasize semantic content, which is many times more important than low-level features in photo aesthetics. Second, they lack a sequential ordering, whereas humans look at semantically important regions sequentially when viewing a photo. Third, they have difficulty leveraging input from multiple users, which is particularly critical in cropping because photo assessment is a highly subjective task. To address these challenges, this paper proposes semantics-aware photo cropping, which crops a photo by simulating how humans sequentially perceive its semantically important regions. We first project the local features (graphlets in this paper) onto a semantic space constructed from the category information of the training photos. An efficient learning algorithm is then derived to sequentially select semantically representative graphlets of a photo; the selection process can be interpreted as a path that simulates humans actively perceiving semantics in a photo. Furthermore, we learn a prior distribution of such active graphlet paths from training photos that multiple users have marked as aesthetically pleasing. The learned priors constrain the active graphlet path of a test photo to be maximally similar to those of the training photos. Experimental results show that 1) the active graphlet path accurately predicts human gaze shifting and is thus more indicative of photo aesthetics than conventional saliency maps, and 2) the cropped photos produced by our approach outperform those of its competitors in both qualitative and quantitative comparisons.
