Article

DataSHIELD: resolving a conflict in contemporary bioscience - performing a pooled analysis of individual-level data without sharing the data

Journal

International Journal of Epidemiology
Volume 39, Issue 5, Pages 1372-1382

Publisher

Oxford University Press
DOI: 10.1093/ije/dyq111

Keywords

Pooling; analysis; meta-analysis; individual-level; study-level; generalized linear model; GLM; ethico-legal; ELSI; identification; disclosure; distributed computing; bioinformatics; information technology; IT

Funding

  1. Genome Canada
  2. Genome Quebec
  3. European Framework 6 [LSHG-CT-2006-518418]
  4. Medical Research Council [G9815508, G0601625, G0501942]
  5. Wellcome Trust [086160/Z/08/A]
  6. Leverhulme Research Fellowship [RF/9/RFG/2009/0062]
  7. Leicester Biomedical Research Unit in Cardiovascular Science (National Institute for Health Research)
  8. British Heart Foundation Studentship [FS/06/040]

Abstract

Methods: Data aggregation through anonymous summary statistics from harmonized individual-level databases (DataSHIELD) provides a simple approach to analysing pooled data that circumvents this conflict. This is achieved via parallelized analysis and modern distributed computing and, in one key setting, takes advantage of the properties of the updating algorithm for generalized linear models (GLMs).

Results: The conceptual use of DataSHIELD is illustrated in two different settings.

Conclusions: As the study of the aetiological architecture of chronic diseases advances to encompass more complex causal pathways (e.g. to include the joint effects of genes, lifestyle and environment), sample size requirements will increase further and the analysis of pooled individual-level data will become ever more important. An aim of this conceptual article is to encourage others to address the challenges and opportunities that DataSHIELD presents, and to explore potential extensions, for example to its use when different data sources hold different data on the same individuals.
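The GLM setting mentioned in the Methods can be illustrated with a minimal sketch. It assumes a logistic regression with an intercept and one covariate, fitted by Fisher scoring (the iteratively reweighted least squares update). At each iteration every study computes its own score vector and information matrix at the current coefficients and returns only those sums; the coordinating centre adds them up and takes one update step. Because both quantities are sums over individuals, the result should match a fit on the physically pooled data up to rounding. All function names, the simulated data and the two-parameter model are hypothetical and are not taken from the paper or from any DataSHIELD software.

```python
import math
import random


def simulate_study(n, rng):
    """Hypothetical study data: one covariate x and a binary outcome y."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + x))) else 0
          for x in xs]
    return xs, ys


def local_summaries(xs, ys, beta):
    """Runs inside one study; returns only aggregate statistics."""
    b0, b1 = beta
    score = [0.0, 0.0]
    info = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # fitted probability
        w = p * (1.0 - p)                           # IRLS weight
        r = y - p                                   # residual
        score[0] += r
        score[1] += r * x
        info[0][0] += w
        info[0][1] += w * x
        info[1][0] += w * x
        info[1][1] += w * x * x
    return score, info


def solve_2x2(a, b):
    """Solve a 2x2 linear system a * step = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(a[1][1] * b[0] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def fit_logistic(studies, n_iter=25, tol=1e-10):
    """Coordinating centre: sum the per-study summaries, then update."""
    beta = [0.0, 0.0]
    for _ in range(n_iter):
        score = [0.0, 0.0]
        info = [[0.0, 0.0], [0.0, 0.0]]
        for xs, ys in studies:
            s, i = local_summaries(xs, ys, beta)
            score = [score[k] + s[k] for k in range(2)]
            info = [[info[r][c] + i[r][c] for c in range(2)]
                    for r in range(2)]
        step = solve_2x2(info, score)
        beta = [beta[k] + step[k] for k in range(2)]
        if max(abs(d) for d in step) < tol:
            break
    return beta


# Three simulated studies that never exchange individual-level rows.
studies = [simulate_study(200, random.Random(seed)) for seed in range(3)]
beta_distributed = fit_logistic(studies)
```

In this sketch the only values leaving a study per iteration are two score components and a 2x2 matrix. Whether such summaries are non-disclosive in a given setting is a separate question that the sketch does not address.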
