4.7 Article

Repurposing existing deep networks for caption and aesthetic-guided image cropping

Journal

PATTERN RECOGNITION
Volume 126, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.108485

Keywords

Image cropping; Aesthetics; Deep network re-purposing; Image captioning

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant Deep Visual Geometry Machines [RGPIN-2018-03788]
  2. MoD/Dstl [EP/N019415/1]
  3. EPSRC [EP/N019415/1]
  4. NVIDIA
  5. Institute of Information and Communications Technology Planning and Evaluation (IITP) grant - Korean government (MSIT) [2021-0-00537, IITP-2020-0-01789]


This study proposes a novel optimization framework that optimizes image cropping parameters based on a user description and aesthetics. Instead of training a separate network, it repurposes networks pre-trained on image captioning and aesthetic tasks. The framework employs three strategies to keep the optimization stable and produces crops that align with the user description and are aesthetically pleasing.
We propose a novel optimization framework that crops a given image based on user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing networks pre-trained on image captioning and aesthetic tasks, without any fine-tuning, thereby avoiding training a separate network. Specifically, we search for the best crop parameters that minimize a combined loss of the initial objectives of these networks. To make the optimization stable, we propose three strategies: (i) multi-scale bilinear sampling, (ii) annealing the scale of the crop region, thereby effectively reducing the parameter space, and (iii) aggregation of multiple optimization results. Through various quantitative and qualitative evaluations, we show that our framework can produce crops that are well-aligned to intended user descriptions and aesthetically pleasing. (c) 2022 Elsevier Ltd. All rights reserved.
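
The three stabilizing strategies named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: `toy_loss` is a hypothetical stand-in for the combined captioning and aesthetic loss (it simply rewards brighter crops), central finite differences stand in for backpropagation through the frozen networks, and every function name, the linear scale schedule, and the median aggregation are assumptions made for illustration only.

```python
import random
import statistics


def clamp(v, lo, hi):
    return min(max(v, lo), hi)


def bilinear(img, x, y):
    """Bilinearly interpolate a grayscale image (list of rows) at pixel coords (x, y)."""
    h, w = len(img), len(img[0])
    x, y = clamp(x, 0.0, w - 1.0), clamp(y, 0.0, h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def sample_crop(img, cx, cy, scale, res):
    """Resample the square crop centred at (cx, cy), in normalised [0, 1] coords, to res x res."""
    h, w = len(img), len(img[0])
    crop = []
    for j in range(res):
        v = cy + scale * ((j + 0.5) / res - 0.5)
        crop.append([
            bilinear(img, (cx + scale * ((i + 0.5) / res - 0.5)) * (w - 1), v * (h - 1))
            for i in range(res)
        ])
    return crop


def toy_loss(crop):
    """Hypothetical stand-in for the combined loss: lower for brighter crops."""
    return -sum(map(sum, crop)) / (len(crop) * len(crop[0]))


def multiscale_loss(img, cx, cy, scale, resolutions=(4, 8)):
    """(i) Multi-scale bilinear sampling: average the loss over several crop resolutions."""
    losses = [toy_loss(sample_crop(img, cx, cy, scale, r)) for r in resolutions]
    return sum(losses) / len(losses)


def optimize_crop(img, seed, steps=60, lr=0.05, s_start=0.9, s_end=0.4, eps=1e-2):
    """Gradient descent on the crop centre while the crop scale follows a fixed schedule."""
    rng = random.Random(seed)
    cx, cy = rng.uniform(0.3, 0.7), rng.uniform(0.3, 0.7)
    scale = s_start
    for t in range(steps):
        # (ii) The scale is annealed rather than optimised, so only (cx, cy) are free.
        scale = s_start + (s_end - s_start) * t / (steps - 1)
        gx = (multiscale_loss(img, cx + eps, cy, scale)
              - multiscale_loss(img, cx - eps, cy, scale)) / (2 * eps)
        gy = (multiscale_loss(img, cx, cy + eps, scale)
              - multiscale_loss(img, cx, cy - eps, scale)) / (2 * eps)
        half = scale / 2  # keep the whole crop inside the image
        cx = clamp(cx - lr * gx, half, 1 - half)
        cy = clamp(cy - lr * gy, half, 1 - half)
    return cx, cy, scale


def aggregate_crops(img, n_runs=5):
    """(iii) Aggregate several runs from different initialisations (here: per-parameter median)."""
    runs = [optimize_crop(img, seed) for seed in range(n_runs)]
    return tuple(statistics.median(run[k] for run in runs) for k in range(3))
```

Calling `aggregate_crops(img)` on a grayscale image stored as a list of rows returns a `(cx, cy, scale)` triple in normalized coordinates. In a real setting the toy loss would be replaced by the losses of the pre-trained captioning and aesthetic networks, and the gradients would come from automatic differentiation through the bilinear sampler.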

