A survey on generative adversarial network-based text-to-image synthesis

NEUROCOMPUTING (2021)

Journal

NEUROCOMPUTING

Volume 451, Pages 316-336

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2021.04.069

Keywords

Deep learning; Generative adversarial network (GAN); Text-to-image synthesis; Scene layout

Funding

National Key Research and Development Plan of China [2020AAA0108903, 2017YFB1300205]
National Natural Science Foundation of China [61573213, 61803227, 61603214, 61673245]
Natural Science Foundation of Shandong Province [2018GGX101039, ZR2020MD041, ZR2020MF077]

Automated Summary

Text-to-image synthesis is a new challenge in the field of image synthesis. With the development of deep learning and the application of GANs, significant progress has been achieved in this area. The input to GAN-based text-to-image synthesis includes general text descriptions, scene layouts, and dialog text, and research focuses on improving text information utilization, network structure, and output control conditions.

Abstract

Text-to-image synthesis is a new challenge in the field of image synthesis. In earlier research, the task was mainly approached by aligning words and images through retrieval based on sentences or keywords. With the development of deep learning, especially the application of deep generative models, image synthesis has made promising progress. Generative adversarial networks (GANs) are among the most significant generative models and have been successfully applied in computer vision, natural language processing, and other fields. In this paper, we review and summarize recent research in GAN-based text-to-image synthesis and trace the development of classic and advanced models. The input to GAN-based text-to-image synthesis is no longer limited to the general text description used in earlier studies; it also includes scene layout and dialog text. The typical structure of each category is elaborated. General text-based image synthesis is the most common form of text-to-image synthesis, and it is subdivided into three groups based on improvements in text information utilization, network structure, and output control conditions. Through the survey, a detailed and logical overview of the evolution of GAN-based text-to-image synthesis is presented. Finally, the challenging problems and the future development of text-to-image synthesis are discussed. © 2021 Elsevier B.V. All rights reserved.
