
CIPS-3D++: End-to-End Real-Time High-Resolution 3D-Aware GANs for GAN Inversion and Stylization

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Journal

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Volume 45, Issue 10, Pages 11502-11520

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TPAMI.2023.3285648

Keywords

Generative adversarial networks; GANs; 3D-aware; inversion; stylization

Automated Summary

This article presents CIPS-3D++, an upgraded style-based GAN that targets robust, high-resolution, and efficient 3D-aware image generation. It takes CIPS-3D, a rotation-invariant and robust model, as its base and adds geometric regularization and upsampling operations to achieve high-resolution, high-quality image generation and editing.

Abstract

Style-based GANs achieve state-of-the-art results for generating high-quality images, but lack explicit and precise control over camera poses. Recently proposed NeRF-based GANs have made great progress towards 3D-aware image generation. However, these methods either rely on convolution operators, which are not rotationally invariant, or utilize complex yet suboptimal training procedures to integrate both NeRF and CNN sub-structures, yielding un-robust, low-quality images with a large computational burden. This article presents an upgraded version called CIPS-3D++, aiming at high-robust, high-resolution and high-efficiency 3D-aware GANs. On the one hand, our basic model CIPS-3D, encapsulated in a style-based architecture, features a shallow NeRF-based 3D shape encoder as well as a deep MLP-based 2D image decoder, achieving robust image generation/editing with rotation-invariance. On the other hand, our proposed CIPS-3D++, inheriting the rotational invariance of CIPS-3D, together with geometric regularization and upsampling operations, encourages high-resolution, high-quality image generation/editing with great computational efficiency. Trained on raw single-view images, without any bells and whistles, CIPS-3D++ sets new records for 3D-aware image synthesis, with an impressive FID of 3.2 on FFHQ at the 1024 × 1024 resolution. In the meantime, CIPS-3D++ runs efficiently and enjoys a low GPU memory footprint so that it can be trained end-to-end on high-resolution images directly, in contrast to previous alternate/progressive methods. Based on the infrastructure of CIPS-3D++, we propose a 3D-aware GAN inversion algorithm named FlipInversion, which can reconstruct the 3D object from a single-view image. We also provide a 3D-aware stylization method for real images based on CIPS-3D++ and FlipInversion. In addition, we analyze the problem of mirror symmetry suffered in training, and solve it by introducing an auxiliary discriminator for the NeRF network. Overall, CIPS-3D++ provides a strong base model that can serve as a testbed for transferring GAN-based image editing methods from 2D to 3D.
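
The two-stage design described in the abstract (a shallow NeRF-based 3D shape encoder followed by a deep MLP-based 2D image decoder, with no convolutions) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, layer sizes, camera model, and sampling range are all assumptions made for illustration, the weights are random and untrained, and style modulation, geometric regularization, and upsampling are omitted. It only shows that each pixel is rendered independently from its own camera ray, so the camera pose (here a hypothetical `yaw` angle) is an explicit input.

```python
import math
import random

def make_mlp(sizes, rng):
    """Random, untrained weights for a small MLP: a list of (W, b) layers."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers

def mlp(layers, x):
    """Apply the MLP to one vector: ReLU between layers, linear last layer."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def render_pixel(origin, direction, nerf, decoder, n_samples=8, near=0.5, far=1.5):
    """Shallow NeRF -> volume-rendered feature -> deep per-pixel MLP -> RGB."""
    delta = (far - near) / n_samples
    feature = [0.0] * (len(nerf[-1][1]) - 1)   # NeRF output = 1 density + features
    transmittance = 1.0
    for k in range(n_samples):
        t = near + (k + 0.5) * delta
        point = [o + t * d for o, d in zip(origin, direction)]
        out = mlp(nerf, point)
        sigma = max(0.0, out[0])                # density must be non-negative
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weight = transmittance * alpha
        feature = [f + weight * v for f, v in zip(feature, out[1:])]
        transmittance *= 1.0 - alpha
    rgb = mlp(decoder, feature)                 # 2D decoder sees one pixel only
    return [1.0 / (1.0 + math.exp(-c)) for c in rgb]

def pixel_ray(row, col, size, yaw):
    """Ray for one pixel of a toy pinhole camera orbiting the origin at radius 1."""
    origin = [math.sin(yaw), 0.0, -math.cos(yaw)]
    forward = [-origin[0], 0.0, -origin[2]]
    right = [math.cos(yaw), 0.0, math.sin(yaw)]
    u = (col + 0.5) / size - 0.5
    v = 0.5 - (row + 0.5) / size
    d = [forward[0] + u * right[0], v, forward[2] + u * right[2]]
    norm = math.sqrt(sum(c * c for c in d))
    return origin, [c / norm for c in d]

def render_image(nerf, decoder, size=4, yaw=0.0):
    """Render every pixel independently; no convolution mixes neighbours."""
    return [[render_pixel(*pixel_ray(r, c, size, yaw), nerf, decoder)
             for c in range(size)] for r in range(size)]

rng = random.Random(0)
nerf = make_mlp([3, 16, 1 + 8], rng)          # shallow: one hidden layer
decoder = make_mlp([8, 16, 16, 16, 3], rng)   # deeper: three hidden layers
img = render_image(nerf, decoder, size=4, yaw=0.3)
single = render_pixel(*pixel_ray(2, 1, 4, 0.3), nerf, decoder)
```

Because the decoder is applied per pixel, rendering one pixel in isolation (`single`) gives exactly the value found at the same position in the full image (`img[2][1]`). In this toy setting that is all the sketch demonstrates; it says nothing about image quality or the reported FID.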
