☆ 4.4 Article

Data Synthesis based on Generative Adversarial Networks

PROCEEDINGS OF THE VLDB ENDOWMENT (2018)

Journal

PROCEEDINGS OF THE VLDB ENDOWMENT

Volume 11, Issue 10, Pages 1071-1083

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.14778/3231751.3231757

Categories

Computer Science, Information Systems; Computer Science, Theory & Methods

Funding

National Research Council of Science & Technology (NST) - Korea government (MSIP) [CRC-15-05-ETRI]


Abstract

Privacy is an important concern for our society where sharing data with partners or releasing data to the public is a frequent occurrence. Some of the techniques that are being used to achieve privacy are to remove identifiers, alter quasi-identifiers, and perturb values. Unfortunately, these approaches suffer from two limitations. First, it has been shown that private information can still be leaked if attackers possess some background knowledge or other information sources. Second, they do not take into account the adverse impact these methods will have on the utility of the released data. In this paper, we propose a method that meets both requirements. Our method, called table-GAN, uses generative adversarial networks (GANs) to synthesize fake tables that are statistically similar to the original table yet do not incur information leakage. We show that the machine learning models trained using our synthetic tables exhibit performance that is similar to that of models trained using the original table for unknown testing cases. We call this property model compatibility. We believe that anonymization/perturbation/synthesis methods without model compatibility are of little value. We used four real-world datasets from four different domains for our experiments and conducted in-depth comparisons with state-of-the-art anonymization, perturbation, and generation techniques. Throughout our experiments, only our method consistently shows balance between privacy level and model compatibility.
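The abstract's notion of model compatibility can be illustrated with a small sketch: train one model on the original table and another on a synthetic table, then compare both on the same unseen test rows. Everything below is an assumption made for illustration and is not taken from the paper: the toy two-feature table, the nearest-centroid classifier, and the function names are hypothetical, and the "synthetic" table is simply a fresh sample from the same toy distribution standing in for a generator's output. It is not table-GAN.

```python
import random


def make_table(n, seed):
    """Hypothetical toy table: two numeric features plus a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else -2.0
        features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
        rows.append((features, label))
    return rows


def fit_centroids(rows):
    """Nearest-centroid classifier: store the per-class feature means."""
    sums, counts = {}, {}
    for x, y in rows:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def model_compatibility_gap(original, synthetic, test):
    """Compare a model trained on original rows with one trained on
    synthetic rows, both scored on the same held-out test rows."""
    acc_orig = accuracy(fit_centroids(original), test)
    acc_syn = accuracy(fit_centroids(synthetic), test)
    return acc_orig, acc_syn, abs(acc_orig - acc_syn)


original = make_table(500, seed=0)
# Stand-in for a generator's output (assumption); NOT produced by table-GAN.
synthetic = make_table(500, seed=1)
test = make_table(500, seed=2)

acc_orig, acc_syn, gap = model_compatibility_gap(original, synthetic, test)
print(f"original: {acc_orig:.3f}  synthetic: {acc_syn:.3f}  gap: {gap:.3f}")
```

A small gap suggests that the synthetic table supports roughly the same downstream model quality as the original. The abstract sets this against the privacy level, which this sketch does not measure.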
