Synthetic Data

Publisher

ANNUAL REVIEWS
DOI: 10.1146/annurev-statistics-040720-031848

Keywords

full synthesis; partial synthesis; differential privacy; statistical inference; multiple imputation; Bayesian inference

Synthetic data sets are an attractive framework to provide widespread access to data for analysis while mitigating privacy and confidentiality concerns. This article aims to review various methods for generating and analyzing synthetic data sets, inferential justification, limitations of the approaches, and future research directions.
Demand for access to data, especially data collected using public funds, is ever growing. At the same time, concerns about disclosing the identities of respondents and the sensitive information they provide are leading data collectors to limit access to data. Synthetic data sets are generated to emulate certain key information found in the actual data while still supporting valid statistical inferences. They therefore offer an attractive framework for providing widespread access to data for analysis while mitigating privacy and confidentiality concerns. The goal of this article is to review various approaches for generating and analyzing synthetic data sets, their inferential justification, their limitations, and directions for future research.
