Bias due to participant overlap in two-sample Mendelian randomization

GENETIC EPIDEMIOLOGY (2016)

Journal

GENETIC EPIDEMIOLOGY

Volume 40, Issue 7, Pages 597-608

Publisher

WILEY

DOI: 10.1002/gepi.21998

Keywords

aggregated data; instrumental variables; Mendelian randomization; summarized data; weak instrument bias

Funding

British Heart Foundation [SP/09/002] Funding Source: Medline
European Research Council [268834] Funding Source: Medline
Medical Research Council [MC_UU_00002/7, G0800270] Funding Source: Medline
Wellcome Trust [100114] Funding Source: Medline
Medical Research Council [G0800270, MC_UU_12013/9, MC_UU_12013/1] Funding Source: researchfish
European Research Council (ERC) [268834] Funding Source: European Research Council (ERC)
MRC [MC_UU_12013/9, G0800270, MC_UU_12013/1] Funding Source: UKRI

Abstract

Mendelian randomization analyses are often performed using summarized data. The causal estimate from a one-sample analysis (in which data are taken from a single data source) with weak instrumental variables is biased in the direction of the observational association between the risk factor and outcome, whereas the estimate from a two-sample analysis (in which data on the risk factor and outcome are taken from non-overlapping datasets) is less biased and any bias is in the direction of the null. When using genetic consortia that have partially overlapping sets of participants, the direction and extent of bias are uncertain. In this paper, we perform simulation studies to investigate the magnitude of bias and Type 1 error rate inflation arising from sample overlap. We consider both a continuous outcome and a case-control setting with a binary outcome. For a continuous outcome, bias due to sample overlap is a linear function of the proportion of overlap between the samples. So, in the case of a null causal effect, if the relative bias of the one-sample instrumental variable estimate is 10% (corresponding to an F parameter of 10), then the relative bias with 50% sample overlap is 5%, and with 30% sample overlap is 3%. In a case-control setting, if risk factor measurements are only included for the control participants, unbiased estimates are obtained even in a one-sample setting. However, if risk factor data on both control and case participants are used, then bias is similar with a binary outcome as with a continuous outcome. Consortia releasing publicly available data on the associations of genetic variants with continuous risk factors should provide estimates that exclude case participants from case-control samples.
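The linear relationship described in the abstract for a continuous outcome can be illustrated with a short Python sketch. The function names are hypothetical. The use of 1/F as the relative bias of the one-sample estimate is an assumption, inferred from the abstract's example (a relative bias of 10% for an F parameter of 10) rather than taken from a formula stated here. The sketch applies only to the continuous-outcome setting under a null causal effect.

```python
def one_sample_relative_bias(f_parameter: float) -> float:
    """Approximate relative bias of a one-sample IV estimate.

    Assumption: relative bias is roughly 1 / F, matching the abstract's
    example in which F = 10 corresponds to a relative bias of 10%.
    """
    if f_parameter <= 0:
        raise ValueError("f_parameter must be positive")
    return 1.0 / f_parameter


def overlap_relative_bias(overlap: float, f_parameter: float) -> float:
    """Relative bias when a proportion `overlap` of participants is shared.

    Per the abstract, bias for a continuous outcome scales linearly with
    the proportion of overlap: 0 gives a two-sample analysis and 1 gives
    a one-sample analysis.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie between 0 and 1")
    return overlap * one_sample_relative_bias(f_parameter)


if __name__ == "__main__":
    # Reproduce the abstract's worked example for an F parameter of 10.
    for overlap in (0.0, 0.3, 0.5, 1.0):
        bias = overlap_relative_bias(overlap, f_parameter=10.0)
        print(f"overlap = {overlap:.0%}, relative bias = {bias:.1%}")
```

With an F parameter of 10, the sketch gives relative biases of 3% at 30% overlap and 5% at 50% overlap, matching the figures quoted in the abstract. It does not model the small bias toward the null that the abstract attributes to the non-overlapping two-sample case.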
