Article

Segmentation mask and feature similarity loss guided GAN for object-oriented image-to-image translation

Journal

Information Processing & Management
Volume 59, Issue 3

Publisher

Elsevier Sci Ltd
DOI: 10.1016/j.ipm.2022.102926

Keywords

Image-to-image translation; Object transfiguration; GAN

Funding

  1. National Natural Science Foundation of China [62072074, 62076054, 62027827, 61902054, 62002047]
  2. Sichuan Science and Technology Innovation Platform and Talent Plan [2020JDJQ0020, 2022JDJQ0039]
  3. Sichuan Science and Technology Support Plan [2020YFSY0010, 2022YFQ0045, 2022YFS0220, 2019YJ0636, 2021YFG0131]
  4. Cloud Technology Endowed Professorship

Abstract

Although image-to-image translation has been studied extensively, existing methods are limited when translating between instances of significantly different shapes across domains. This paper proposes a novel approach, hereafter referred to as ObjectVariedGAN, to handle such geometric translation. Large and significant shape changes are common in image-to-image translation, especially in object transfiguration; we therefore focus on synthesizing the desired results while maintaining the shape of the foreground object, without requiring paired training data. Specifically, the proposed approach learns the mapping between a source domain and a target domain whose object shapes differ significantly. A feature similarity loss is introduced to encourage the generative adversarial networks (GANs) to capture the structural attributes of objects (e.g., object segmentation masks). Additionally, to allow training on unaligned datasets, a cycle-consistency loss is combined with a context preserving loss. Our approach feeds the generator with the source image concatenated with its instance segmentation mask and guides the network to generate the desired output in the target domain. To verify the effectiveness of the proposed approach, extensive experiments are conducted on pre-processed examples from the MS-COCO dataset. A comparative summary of the findings demonstrates that ObjectVariedGAN outperforms competing approaches in terms of Inception Score, Fréchet Inception Distance, and human cognitive preference.
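To make the composite objective concrete, the sketch below shows one way the four loss terms named in the abstract (adversarial, cycle-consistency, feature similarity, and context preserving) could be combined in PyTorch. This is a minimal illustration under assumptions, not the authors' implementation: the function names, tensor shapes, and lambda weights are hypothetical, and the feature similarity term is approximated here as a masked L1 distance between encoder features.

```python
# Hypothetical sketch of the composite generator objective; all names,
# shapes, and weights are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def adversarial_loss(d_fake_logits):
    # Non-saturating generator loss: push D's logits on fakes toward "real".
    return F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))

def cycle_consistency_loss(x_src, x_cyc):
    # L1 penalty between the source image and its round-trip reconstruction,
    # which is what permits training on unaligned (unpaired) datasets.
    return F.l1_loss(x_cyc, x_src)

def feature_similarity_loss(feat_src, feat_fake, mask):
    # Encourage the translated foreground to keep the structural attributes
    # of the source object: compare encoder features inside the instance
    # segmentation mask, downsampled to the feature resolution.
    m = F.interpolate(mask, size=feat_src.shape[-2:], mode="nearest")
    return F.l1_loss(feat_fake * m, feat_src * m)

def context_preserving_loss(x_src, x_fake, mask):
    # Keep the background (everything outside the object mask) unchanged.
    bg = 1.0 - mask
    return F.l1_loss(x_fake * bg, x_src * bg)

def generator_objective(x_src, mask, x_fake, x_cyc,
                        feat_src, feat_fake, d_fake_logits,
                        lam_cyc=10.0, lam_feat=1.0, lam_ctx=10.0):
    # Weighted sum of the four terms; the lambda weights are assumed values.
    return (adversarial_loss(d_fake_logits)
            + lam_cyc * cycle_consistency_loss(x_src, x_cyc)
            + lam_feat * feature_similarity_loss(feat_src, feat_fake, mask)
            + lam_ctx * context_preserving_loss(x_src, x_fake, mask))
```

In a CycleGAN-style setup the same objective would be applied symmetrically to the reverse mapping, with the instance segmentation mask concatenated to the source image at the generator's input, as the abstract describes.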
