Article

Improving the quality of generative models through Smirnov transformation

Journal

INFORMATION SCIENCES
Volume 609, Issue -, Pages 1539-1566

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.07.066

Keywords

Generative adversarial network; Smirnov transformation; Jaccard index; generative model

Funding

  1. European Union [833685, 101015857]
  2. H2020 Societal Challenges Programme [833685]

This paper proposes a novel activation function based on the Smirnov probabilistic transformation to improve the quality of generated data in Generative Adversarial Networks (GANs). Unlike previous works, this activation function is applicable to any type of data distribution and can be seamlessly integrated into GAN training processes. Experimental results demonstrate that using this new activation function can significantly enhance the quality of generated data in GANs.
Solving the convergence issues of Generative Adversarial Networks (GANs) is one of the most outstanding problems in generative models. In this work, we propose a novel activation function to be used as the output of the generator agent. This activation function is based on the Smirnov probabilistic transformation and is specifically designed to improve the quality of the generated data. In sharp contrast to previous works, our activation function provides a more general approach that deals not only with the replication of categorical variables but with any type of data distribution (continuous or discrete). Moreover, our activation function is differentiable and can therefore be seamlessly integrated into the back-propagation computations during GAN training. To validate this approach, we first evaluate our proposal on two different data sets: a) an artificially rendered data set containing a mixture of discrete and continuous variables, and b) a real data set of flow-based network traffic data containing both normal connections and cryptomining attacks. In addition, three publicly available data sets were added to the evaluation to generalize the obtained results. To evaluate the fidelity of the generated data, we analyze the results both in terms of statistical quality measures and in terms of the use of these synthetic data to feed a nested machine learning-based classifier. The experimental results show that a Wasserstein GAN (WGAN) tuned with this new activation function clearly outperforms both a naive mean-based generator and a standard WGAN. The quality of the generated data allows real data to be fully substituted with synthetic data for training the nested classifier without a significant fall in the obtained accuracy. (c) 2022 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
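
The Smirnov transformation mentioned in the abstract is commonly known as inverse transform sampling: a value u drawn uniformly from (0, 1) is mapped through the inverse cumulative distribution function of a target variable, so the result follows the target distribution. The following is a minimal NumPy sketch of that general idea only, not the authors' activation function. It assumes a piecewise-linear empirical inverse CDF fitted to one feature, and a sigmoid that squashes a raw generator output into (0, 1). The names `fit_empirical_quantile`, `smirnov_transform`, and `sigmoid`, as well as the synthetic "real" data, are hypothetical and chosen for illustration.

```python
import numpy as np


def fit_empirical_quantile(samples):
    """Return probability levels and sorted samples describing an
    empirical inverse CDF (quantile function) of one feature."""
    xs = np.sort(np.asarray(samples, dtype=float))
    ps = np.linspace(0.0, 1.0, len(xs))
    return ps, xs


def smirnov_transform(u, ps, xs):
    """Map values u in [0, 1] through the piecewise-linear inverse CDF.
    Piecewise linearity keeps the mapping differentiable almost everywhere."""
    return np.interp(u, ps, xs)


def sigmoid(z):
    """Squash raw outputs into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Stand-in for a real feature: a bimodal mixture (assumption for illustration).
real = np.concatenate([rng.normal(-2.0, 0.5, 5000),
                       rng.normal(3.0, 1.0, 5000)])
ps, xs = fit_empirical_quantile(real)

# Stand-in for raw generator outputs. The sigmoid of a standard logistic
# variable is uniform on (0, 1), which is what the transform expects.
raw = rng.logistic(size=10000)
u = sigmoid(raw)

# Transformed outputs follow the empirical distribution of `real`.
fake = smirnov_transform(u, ps, xs)
```

In this sketch the transformed values can never leave the range of the observed data, which is one reason such a mapping can help a generator reproduce discrete or multimodal features. How the paper's activation function is actually constructed and trained is not specified in the abstract.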
