Contextualizing selection bias in Mendelian randomization: how bad is it likely to be?

Int J Epidemiol. 2019 Jun 1;48(3):691-701. doi: 10.1093/ije/dyy202.

Abstract

Background: Selection bias affects Mendelian randomization investigations when selection into the study sample depends on a collider between the genetic variant and confounders of the risk factor-outcome association. However, the relative importance of selection bias for Mendelian randomization compared with other potential biases is unclear.
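The collider mechanism described in the Background can be illustrated with a minimal simulation sketch. It is not the simulation design used in the study. The function name, effect sizes, allele frequency and logistic selection model are all illustrative assumptions. In the sketch, the genetic variant and the confounder are independent in the full population, but become correlated once selection depends on the risk factor, which is a common effect of both.

```python
import numpy as np


def collider_selection_demo(n=200_000, selection_effect=2.0, seed=0):
    """Correlation of variant and confounder before and after selection.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genetic variant (allele count)
    u = rng.normal(size=n)                          # confounder, independent of g
    x = 0.5 * g + u + rng.normal(size=n)            # risk factor: collider of g and u

    # Probability of selection rises with the risk factor (logistic model).
    p_select = 1.0 / (1.0 + np.exp(-selection_effect * x))
    selected = rng.random(n) < p_select

    corr_full = np.corrcoef(g, u)[0, 1]
    corr_selected = np.corrcoef(g[selected], u[selected])[0, 1]
    return corr_full, corr_selected


if __name__ == "__main__":
    full, sel = collider_selection_demo()
    print(f"corr(G, U), full population: {full:.3f}")
    print(f"corr(G, U), selected sample: {sel:.3f}")
```

In the selected sample, the variant is no longer independent of the confounder, which violates a core instrumental variable assumption.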

Methods: We performed an extensive simulation study to assess the impact of selection bias on a typical Mendelian randomization investigation. We considered inverse probability weighting as a potential method for reducing selection bias. Finally, we investigated whether selection bias may explain a recently reported finding that lipoprotein(a) is not a causal risk factor for cardiovascular mortality in individuals with previous coronary heart disease.
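As a rough illustration of inverse probability weighting in this setting, the sketch below compares an unweighted ratio (Wald) estimate in the selected sample with a weighted one. It assumes a true causal effect of zero and weights each selected individual by the inverse of their true selection probability, which corresponds to a correctly specified selection model. The data-generating model, the parameter values and the helper names are assumptions made for illustration and are not the authors' simulation code.

```python
import numpy as np


def weighted_cov(a, b, w):
    """Weighted covariance of a and b with non-negative weights w."""
    a_mean = np.average(a, weights=w)
    b_mean = np.average(b, weights=w)
    return np.average((a - a_mean) * (b - b_mean), weights=w)


def ratio_estimates_under_selection(n=400_000, selection_effect=1.0, seed=1):
    """Return (unweighted, IPW) ratio estimates in a selected sample.

    The true causal effect of the risk factor on the outcome is zero.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genetic variant
    u = rng.normal(size=n)                          # confounder
    x = 0.5 * g + u + rng.normal(size=n)            # risk factor
    y = u + rng.normal(size=n)                      # outcome: no effect of x

    # Selection depends on the risk factor through a logistic model.
    p_select = 1.0 / (1.0 + np.exp(-selection_effect * x))
    selected = rng.random(n) < p_select
    gs, xs, ys = g[selected], x[selected], y[selected]

    # Unweighted ratio estimate: cov(G, Y) / cov(G, X) in the selected sample.
    ones = np.ones(gs.size)
    unweighted = weighted_cov(gs, ys, ones) / weighted_cov(gs, xs, ones)

    # IPW ratio estimate using the true selection probabilities as the model.
    w = 1.0 / p_select[selected]
    ipw = weighted_cov(gs, ys, w) / weighted_cov(gs, xs, w)
    return unweighted, ipw


if __name__ == "__main__":
    unweighted, ipw = ratio_estimates_under_selection()
    print(f"unweighted ratio estimate: {unweighted:.3f}")
    print(f"IPW ratio estimate:        {ipw:.3f}")
```

In practice the selection probabilities are unknown and must be estimated, so the weights are only as good as the selection model used to derive them.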

Results: Selection had a severe impact on bias and Type 1 error rates in our simulation study, but only when selection effects were large. For moderate effects of the risk factor on selection, bias was generally small and Type 1 error rate inflation was not considerable. Inverse probability weighting ameliorated bias when the selection model was correctly specified, but increased bias when selection bias was moderate and the model was misspecified. In the example of lipoprotein(a), strong genetic associations and strong confounder effects on selection mean the reported null effect on cardiovascular mortality could plausibly be explained by selection bias.

Conclusions: Selection bias can adversely affect Mendelian randomization investigations, but its impact is likely to be less than that of other biases. Selection bias is substantial when the effects of the risk factor and confounders on selection are particularly large.

Keywords: causal inference; collider bias; instrumental variables; inverse probability weighting; selection bias.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cardiovascular Diseases / genetics
  • Cardiovascular Diseases / mortality
  • Causality
  • Computer Simulation
  • Confounding Factors, Epidemiologic
  • Coronary Disease / epidemiology
  • Coronary Disease / genetics
  • Humans
  • Lipoprotein(a) / genetics
  • Lipoprotein(a) / metabolism
  • Mendelian Randomization Analysis*
  • Risk Factors
  • Selection Bias*

Substances

  • Lipoprotein(a)