Article

Bayesian Copula Density Deconvolution for Zero-Inflated Data in Nutritional Epidemiology

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 116, Issue 535, Pages 1075-1087

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/01621459.2020.1782220

Keywords

Copula; Density deconvolution; Measurement error; Nutritional epidemiology; Zero inflated data

Funding

  1. NSF [DMS1613156, CCF-1934904]
  2. National Cancer Institute [R01CA194391, U01-CA057030]

Estimating the densities of long-term average intakes of different dietary components in nutritional epidemiology is challenging because these variables cannot be measured directly. This study proposes a Bayesian semiparametric solution and demonstrates its effectiveness through simulation experiments. Compared to other methods, the approach provides more realistic estimates of consumption patterns for episodically consumed dietary components.
Estimating the marginal and joint densities of the long-term average intakes of different dietary components is an important problem in nutritional epidemiology. Since these variables cannot be directly measured, data are usually collected in the form of 24-hr recalls of the intakes, which show marked patterns of conditional heteroscedasticity. Significantly compounding the challenges, the recalls for episodically consumed dietary components also include exact zeros. The problem of estimating the density of the latent long-term intakes from their observed measurement-error-contaminated proxies is then a problem of deconvolution of densities with zero-inflated data. We propose a Bayesian semiparametric solution to the problem, building on a novel hierarchical latent variable framework that translates the problem to one involving continuous surrogates only. Crucial to accommodating important aspects of the problem, we then design a copula-based approach to model the involved joint distributions, adopting different modeling strategies for the marginals of the different dietary components. We design efficient Markov chain Monte Carlo algorithms for posterior inference and illustrate the efficacy of the proposed method through simulation experiments. Applied to our motivating nutritional epidemiology problems, compared to other approaches, our method provides more realistic estimates of the consumption patterns of episodically consumed dietary components. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
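The data setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or code: the gamma distribution for the latent intake, the multiplicative lognormal error, the consumption probability `1 - exp(-x)`, and the function name `simulate_recalls` are all assumptions chosen only to reproduce the two features the abstract names, namely conditional heteroscedasticity and exact zeros.

```python
import numpy as np


def simulate_recalls(n_subjects=2000, n_recalls=4, sigma=0.6, seed=0):
    """Simulate zero-inflated 24-hr recalls (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Latent long-term average intake per subject (assumed gamma-distributed).
    x = rng.gamma(shape=2.0, scale=1.0, size=n_subjects)

    # Multiplicative lognormal error with mean 1, so that for a consumed
    # recall E[W | X] = X and Var[W | X] grows with X**2
    # (conditional heteroscedasticity).
    z = rng.normal(size=(n_subjects, n_recalls))
    w_positive = x[:, None] * np.exp(sigma * z - 0.5 * sigma**2)

    # Episodic consumption: the chance of consuming on a recall day is
    # assumed to increase with the latent intake.
    p_consume = 1.0 - np.exp(-x)
    consumed = rng.random((n_subjects, n_recalls)) < p_consume[:, None]

    # Non-consumption days are recorded as exact zeros.
    w = np.where(consumed, w_positive, 0.0)
    return x, w


if __name__ == "__main__":
    x, w = simulate_recalls()
    print("share of exact zeros:", np.mean(w == 0.0))
    # Naive subject-level averages of the recalls are a distorted proxy
    # for the latent intake, which is what motivates deconvolution.
    print("variance of latent intake:", x.var())
    print("variance of naive recall means:", w.mean(axis=1).var())
```

Comparing a histogram of `w.mean(axis=1)` with one of `x` shows how the naive averages misrepresent the latent density, which is the gap a deconvolution method is meant to close.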
