MVMRmode: Introducing an R package for plurality valid estimators for multivariable Mendelian randomisation

Benjamin Woolf; Dipender Gill; Andrew J Grant; Stephen Burgess

doi:10.1101/2023.01.09.23284345

Abstract

Background Mendelian randomisation (MR) is the use of genetic variants as instrumental variables. Mode-based estimators (MBE) are one of the most popular types of estimators used in univariable-MR studies. However, because there are no plurality valid regression estimators, there are no existent modal estimators for multivariable-MR.

Methods We use the residual method for multivariable-MR to introduce two multivariable modal estimators: multivariable-MBE, which uses IVW to create residuals fed into a traditional plurality valid estimator, and multivariable-CM which instead has the residuals fed into the contamination mixture method. We then use Monte-Carlo simulations to explore the performance of these estimators when compared to existing ones and re-analyse the data used by Grant and Burgess (2021) looking at the causal effect of intelligence, education, and household income on Alzheimer’s disease as an applied example.

Results In our simulation, we found that multivariable-MBE was generally too variable to be much use. Multivariable-CM produced more precise estimates on the other hand. Multivariable-CM performed better than MR-Egger in almost all settings, and Weighted Median under balanced pleiotropy. However, it underperformed Weighted Median when there was a moderate amount of directional pleiotropy. Our re-analysis supported the conclusion of Grant and Burgess (2021), that intelligence had a protective effect on Alzheimer’s disease, while education, and household income do not have a causal effect.

Conclusions Here we introduced two, non-regression-based, plurality valid estimators for multivariable MR. Of these, “multivariable-CM” which uses IVW to create residuals fed into a contamination-mixture model, performed the best. This method uses a plurality of variants valid assumption, and appears to provided precise and unbiased estimates in the presence of balanced pleiotropy and small amounts of directional pleiotropy. We developed the MVMRmode R package (available from https://github.com/bar-woolf/MVMRmode/wiki) to facilitate the use of this estimator. We hope this will further enable the future triangulation of univariable MR studies which have used plurality valid estimators with multivariable MR designs.

Background

Mendelian randomisation (MR) is an increasingly popular method for causal inference in epidemiology which uses the random assignment of genetic variants at birth to justify the assumptions of an Instrumental variables analysis (1,2). In a traditional MR study, genetic variants (typically single-nucleotide polymorphisms, SNPs) which robustly associate (typically at genome-wide significance) with an exposure of interest are selected as instruments (3). Because of the easy accessibility of Genome-Wide Association Study (GWAS) summary statistics for many epidemiological traits, MR is often implemented using summary data, in a so-called ‘two-sample MR’ analysis (4). In such a setting, the effect of the exposure on the outcome is estimated using a Wald ratio as the variant-outcome association divided by the genotype-exposure association. When there are multiple variants, their effects are generally combined using an inverse variance weighted (IVW) meta-analysis.

On top of requiring a robust genotype-exposure association, instrumental variables analysis requires that there are no variant-outcome confounders, and that the variant can only cause the outcome via the exposure. The first of these assumptions is justified by Mendel’s laws of independent and random segregation. However, the second assumption is less plausible due to pleiotropy (the association of most variants with multiple traits). Pleiotropy can occur for two reasons: Firstly, if the exposure causes many other traits, then the genetic variants which associate with it should also associate with these other traits. This type of pleiotropy (often called vertical pleiotropy) is required for MR to work. However, the second type of pleiotropy (horizontal pleiotropy) occurs when the genetic variants independently cause two phenotypes. A second advantage of two-sample MR is that it allows for the implementation of ‘pleiotropy robust’ estimators (5). These methods generally allow for some variants to be pleiotropic by modifying the assumptions of the instrumental variables framework. One of the first methods proposed for doing this is MR-Egger. IVW can be conceptualised as a weighted intercept-free regression of the variant-outcome associations on the variant-exposure associations. MR-Egger fits the same model as IVW but with an intercept. This model is robust to pleiotropy if the instrument strength is independent of the strength of the direct, pleiotropic, effect (called the InSIDE assumption) (5).

A recent systematic review of two-sample MR studies found that the most frequently implemented pleiotropy robust estimators were MR-Egger, weighted median, and weighted mode (7). Weighted Median will provide valid estimates if at least half the variants are valid instruments, and so is called a ‘majority valid’ estimator. Weighted mode makes the ZEro Modal Pleiotropy Assumption (ZEMPA), i.e. that there is zero pleiotropy in the modal estimand of the causal effect (8). ZEMPA is plausible because we should expect the causal effects for variants which are valid instruments to be similar, but each invalid variant to have its own unique pleiotropic bias (9). If the unique paths are independent of each other, then so too should the biases they exert on invalid variants. Thus, valid variants should have clustered effect estimates, while invalid variants should create heterogeneity. Hence, in settings where there are some valid instruments, we should expect the most common effect estimated to be the valid causal parameter. Here in, we call this type of estimator, which will produce valid estimates when a plurality of SNPs are valid, ‘plurality valid’ estimators.

Estimating modes directly from observed data can be difficult because no two estimates are ever exactly equal. Therefore, the most common observation at a given level of precision may be very different from the true mode. Traditional MBEs avoid this dilemma by smoothing the observed distribution using a parametric kernel-density-smoothed function. This converts the observed estimates into a probability density distribution, and then select the mode of this distribution. An alternative plurality valid estimator comes from the contamination mixture method (17).

The contamination mixture method uses a maximum likelihood approach, assuming the variant specific Wald ratios are normally distributed (17). It produces a consistent estimator of the causal effect under the plurality valid (ZEMPA) assumption. The advantages of the contamination mixture method are that it does not require the parametric assumptions of the kernel-density function, is more computationally efficient, and generally produces more precise estimates with potentially asymmetric confidence intervals (17).

Multivariable MR (MVMR) is an extension of MR to allow for the simultaneous modelling of the effect of multiple exposures on an outcome (10). The effects of each exposure in an MVMR model are the direct effects of the exposure on the outcome conditional on the other exposures. This has resulted in MVMR being applied as a method for mediation analyses (11), but it is also used to adjust for known biases in an MR model (12–14). MVMR modifies the three instrumental Variables assumptions so that the variant is a valid instrument if: 1) the variant is robustly associated with at least one exposure, 2) there are no variant-outcome confounders, 3) the variant can only cause the outcome via one or more of the exposures.

MVMR was originally introduced using a residual-based method, in which the effect of a second exposure on the outcome was removed from the variant-outcome association, and the effect of the second exposure on the exposure was removed from the variant-exposure association (13). These modified associations were then used as the input to a traditional MR estimator. However, given the analogy between IVW and weighted regression, two-sample MVMR is typically implemented as a type of multiple regression, in which the variant-outcome associations for the variants which associate with either exposure of interest are regressed on the variant-exposure associations in an intercept-free linear regression, inversely weighted by the variance in the variant-outcome association. MR-Egger can also be implemented by allowing for a non-zero regression intercept, and weighted median can be implemented using weighted quantile regression (15).

However, we are not aware of an existing method for doing mode-based regression, and there are no existing estimators for MVMR which make a plurality valid-type assumption like ZEMPA. The absence of a plurality valid estimator could hamper the comparison of univariable and multivariable MR studies. Here, we, therefore, introduce and validate a framework for implementing plurality valid estimators in two-sample MVMR.

Methods

Theoretical background

Let β_y be a vector of genetic variant-outcome associations and β_x be a vector of genetic variant-exposure associations. We shall denote with subscript i the ith element of any vector, which relates to the ith genetic variant. We assume the genetic variants are independently distributed. In practice, we do not observe β_y and β_x, but may obtain estimates, for example from GWAS. We denote the vectors of association estimates by and , and suppose that these are related according to the model proposed by Bowden et al. (16): where the elements of ε are independent with . The scalar represents the casual effect of the exposure on the outcome. The vector α represents pleiotropic effects, with α_i = 0 if the ith variant is a valid instrument. A plurality valid estimator should produce consistent estimates of θ provided that a plurality of the α_i are zero, i.e. under the ZEMPA assumption.

Now suppose we have estimates for two exposures, denoted by x₁ and x₂, and we extend (1) as follows: If we take the residuals from regressing (without an intercept) both sides of (2) on , then we have where are the residuals from regressing on are the residuals from regressing on are the residuals from regressing α on , and are the residuals from regressing ε on (where each of these regressions is performed without an intercept).

Because we have now reformulated the equation for the variant-outcome association so that it is in terms of a univariable regression model, and can be used as the inputs to a traditional univariable mode-based estimator. When more than one exposure is of interest, then this process can be iterated for each exposure.

It follows that a plurality valid estimator for the estimate θ from (3) will produce a valid estimate provided that a plurality of the values of are zero. This seems likely to be the case if a plurality of the elements of α are zero and the non-zero elements are distributed around zero (i.e., balanced pleiotropy).

In settings with only two exposures, the residuals could be obtained through univariable MR of the outcome on the second exposure, and of the first exposure on the second exposure. Where there are more than two exposures, an existing multivariable MR method could be used instead be used to create residuals. Here we explore two types of plurality valid estimators. Firstly, we explore an estimator which uses a regression model to create the residuals fed into a traditional mode-based estimator (MBE) (8), which we dub ‘multivariable-MBE’. This regression model could be created using any of the existing MVMR-estimators. Here we model the residuals using IVW (i.e. intercept-free linear regression). Although ultimately arbitrary, we focused on IVW, rather than another type of MR estimator, because it provides the most intuitive to understand validity conditions. Since the contamination mixture method has several advantages, discussed above, we also implemented this method using both the contamination mixture method. This ‘multivariable-CM’ estimator uses IVW to create residuals fed into a contamination mixture model.

Deriving a standard error multivariable-MBE and multivariable-CM

Assuming we have strong instruments (i.e. the first MR assumption is valid) we can use the first order approximation for the standard error of the Wald ratio that is typically used in two-sample MR studies. In a traditional univariable model this is defined as: Where SE_y,i is the standard error of the ith variant-outcome association estimate.

In effect, this standard error is assuming that the variant-exposure association is measured with sufficient precision that we can assume that it contributes no error to the estimate of the causal effect. Under this assumption, the process of creating residuals will not increase the random error in the standard error of the Wald ratio. Hence, we model the standard error of the ith Wald ratio estimate as:

Simulation study

We report our simulation study using the ADEMP (aims, data-generating mechanisms, estimands, methods, and performance measures) approach (18).

Aims

We ran a simple simulation study to assess the performance of our plurality valid estimators when compared to other MVMR estimators.

Data-generating mechanisms

We broadly simulate a setting in which there are two putative causal exposures for a single outcome. In the primary simulation we explore a setting in which the second exposure is pleiotropic (Figure 1), and where either both or neither of the exposures have a casual association with the outcome. We then explore how well the methods do under varying amounts of balanced and directional pleiotropy.

Figure 1.

Directed Acyclic Graphs of the simulation data generative models. E and E2 are the first and second exposures respectively, GRS is the genetic liability to the exposures, and O is the outcome, and C is a confounder.

More formally, we simulated 200 single nucleotide polymorphisms (SNPs, which are common genetic variants) as independent and identically distributed binomial variables with the following parameters: We additionally simulated the SNP effects on the exposures as independent and identically distributed normal variables For settings in which we simulated pleiotropy (Figures 1.2A and 1.2B), the pleiotropic SNP effects were simulated as: Each simulation was repeated with BETA being set to either 0 or -0.03 to represent balanced and directional pleiotropy respectively. SE was always set to 0.1.

We then simulated a confounder as a normally distributed variable with the following parameters: We then defined the first exposure as: where ε is an error term such that ε₁ ∼ N(0, 1²).

The second exposure was defined as: where ε is an error term such that ε₃ ∼ N(0, 1²).

When both exposures had null effects on the outcome (Figures 1.1B and 1.2B), the outcome was defined as: where ε is an error term such that ε₄ ∼ N(0, 1²). P could take the value of 0, 20, 40, or 80 to represent pleiotropic effects for 0, 10%, 20% or 40% of SNPs.

When both exposures had non-null effects on the outcome (Figures 1.1A and 1.2A),the outcomes were defined as: GWAS summary statistics for each exposure variable were estimated from linear regression models. Each genetic association with each exposure, and the outcome, were estimated from a unique sample of 200,000 participants with no sample overlap with the other GWASs.

Estimands

The causal effects of each exposure on the outcome.

Methods

We compare five methods for estimating the causal effect of the exposure on the outcome: IVW (intercept free multiple regression of the variant-outcome associations on the variant-exposure associations weighted by the inverse variance in the variant-outcome association), MR-Egger (multiple regression of the variant-outcome associations on the variant-exposure associations weighted by the inverse variance in the variant-outcome association), Weighted Median (quantile regression of the variant-outcome associations on the variant-exposure associations weighted by the inverse variance in the variant-outcome association), multivariable-MBE (using IVW to create the residuals and an MBE to estimate the causal effect), and multivariable-CM (using IVW to create the residuals and the contamination mixture method estimate the causal effect). IVW, MR-Egger, and weighted median were chosen because they appear to be some of the most widely used estimators which use different assumptions.

Performance measure

The primary performance measures were mean bias, 95% CI width, and the percentage of times that the confidence intervals include zero. When there is no causal effect, this will represent the type-2 error rate. When there is a casual effect, it measures one minus the type-1 error rate. In additional analyses we also explore the standard deviation of the effect estimate (overall 1000 simulations), and coverage for the causal effect of the exposure on the outcome over the 1000 iterations.

Applied example

We re-analysed the applied example (on the effect of intelligence, education, and household income on Alzheimer’s disease) from Grant and Burgess’ (2021) paper on pleiotropy robust estimators for MMVR (19). This had previously been studied by Davies et al and Anderson et al. (20,21). Anderson et al., in particular, had shown that a multivariable model was important for accounting for the collinearity between intelligence and education. Grant and Burgess then added household income to explore how the models worked with an additional risk factor.

Here we re-analysed the data used by Grant and Burgess (2021). They used 213 genetic variants from Davies et al. as instruments. These instruments had been clumped to ensure independence from each other and all had F statistics greater than 10, although the mean conditional F statistics ranged between 1.5 and 2.5. They used the Hill et al GWAS of intelligence (n = 199,242 male and female European ancestry individuals) (22), Okbay et al GWAS of years of education (n = 293,723 male and female European ancestry individuals) (23), and the Neale Lab UK Biobank GWAS of household income (n = 337,199 male and female European ancestry individuals) as sources of exposure data (24). Since household income is an ordinal categorical variable, the genetic variant associations represent the increase in log odds of being in a higher income category per extra effect allele. Grant and Burgess (2021) additionally used Lambert et al. as a source of Alzheimer’s data (n = 74,046 male and female European ancestry individuals) (25). More information on the data sources can be found in the original publications.

We implemented our two novel estimators, as well as IVW, MR-Egger, and MR-Median. Since the genetic associations with education and intelligence were in the same direction, the MR-Egger estimates can be interpreted as being oriented in the direction of either of these exposures.

Results

Simulation

Table 1 presents the results for the primary performance measures (bias and 95% CI width) of the simulations from the settings in which both exposures cause the outcome, while in Table 2 neither exposure exerts a causal effect on the outcome. The mean conditional F statistic for Exposure 1 was around 197, and 186 for Exposure 2 (Supplementary Table 1).

View this table:

Table 1.

Primary results for setting where both exposures cause the outcome, and exposure 2 is pleiotropic.

View this table:

Table 2.

Primary results for setting where neither exposure causes the outcome, and exposure 2 is pleiotropic.

Bias

In both Table 1 and Table 2, all methods performed well in the no-bias setting. The small amount of bias observed (0.1% - 0.5%) is explicable by weak instrument bias (Supplementary Table 1) and the variability in the estimates (Supplementary Table 2 and 3). When there was balanced pleiotropy, the multivariable-MBE seemed to underperform the non-plurality valid estimators while the multivariable -CM estimator appeared to do slightly better. Multivariable-CM was comparatively unbiased by even large amounts of balanced pleiotropy. However, moderate amounts of directional pleiotropy were sufficient to bias estimates more than the Median estimator. For example, in the setting where both exposures are causal and there was 40% directional pleiotropy, the first and second exposure estimates were biased by -0.055 and -0.008 respectively for the Median estimator, but 0.073 and 0.054 for multivariable-CM. Multivariable-MBE was more biased than multivariable-CM in all settings. For example, using the same simulation as above, multivariable-MBE was biased by 0.253 and -0.113 in the estimates for exposure 1 and 2 respectively.

95% CI Width

The multivariable-MBE had the widest 95% CIs of all the estimators. For example, in the no bias simulation, the 95% CI widths were five to ten time larger than for the other estimators. The non-plurality valid estimators generally had similarly wide 95% CI. Multivariable-CM generally had tighter 95% CI than the other estimators.

Coverage and power

Since it had wide 95% CI, multivariable-MBE unsurprisingly had a low type-1 error rate (the 95% CI included the null in all settings > 98% when there was no association), but a high type-2 (the 95% CI included the null up to 35% of the time in settings where there was a true association). Multivariable-CM conversely had a very low type-2 error rate (the 95% CI never included the null when there was a true association). Multivariable-CM had a type-1 error rate at the nominal level (5%) for the 0% and 10% balance pleiotropy scenarios. In contrast, the Median estimator had type-1 error rates well below the nominal level in these scenarios. The type-1 error rates for Multivariable-CM were above the nominal level from 20% balanced pleiotropy, and for all levels of directional pleiotropy.

Additional outcomes

Standard deviation of the effect estimates across the 1000 simulations

The SD of effect estimates between the multivariable-CM estimator and the non-plurality valid estimators were similar in the no-bias setting and when there was balanced pleiotropy (Supplementary Tables 2 and 3). However, multivariable-MBE had much wider SD, possibility because MBE produces less precise estimates than the contamination mixture method. In addition, all the plurality valid estimators had larger standard deviations when there was directional pleiotropy.

Coverage

Although all the estimators achieved 95% coverage when neither exposure was causals and there was no bias (Supplementary Table 3), surprisingly, except for Weighted Median and Multivariable-MBE, most estimators did not achieve at least 95% coverage when both exposures were causal (Supplementary Table 2). This might be because Weighted Median and Multivariable-MBE had the widest CI width (Table 1 and 2) and all estimators were being effected by weak-instrument bias.

Applied example

As with Grant and Burgess (2021), the pleiotropy robust estimators provided consistent estimates of the effects of education, intelligence, and household income on Alzheimer’s disease (Table 3). All estimators concluded a null effect of education on Alzheimer’s, conditional on the other exposures. However, they all implied a negative effect on intelligence, although the 95% CI for MR-Egger and multivariable-MBE included the null hypotheses. All estimators estimated a log odds ratio of household income around 0.3, but again with 95% CI which included zero. As the original study concluded “[t]he consistency of the findings give strength to the assertion that intelligence has a causally protective effect on Alzheimer’s disease, conditional on years of education and household income. However, there is no evidence of a direct effect of years of education or household income on Alzheimer’s disease.”

Discussion

Here we introduce two plurality valid estimators for multivariable Mendelian randomisation. Unlike most existing estimators, these use residual methods rather than multivariable regression models to produce the final effect estimates. We then used simulations with varying amounts of directional and balanced pleiotropy, as well as a re-analysis of the effect of intelligence, years of education, and household income on Alzheimer’s disease to compare the relative performance of our estimators with each other and existing estimators for MVMR.

As with previous analyses, our estimators implied that intelligence has a protective effect on Alzheimer’s disease, while years of education and household income do not. This has two important implications, firstly that as the years of mandatory education increase, there should not be a corresponding increase in Alzheimer’s. Secondly, our results imply that public health interventions to boost intelligence, beyond additional years of education, may be useful in reducing the burden of Alzheimer’s, although further research would be needed to confirm this hypothesis.

Of the two plurality valid estimators, multivariable-CM, which uses IVW to create the residuals fed into a contamination mixture model, overall performed the best. It generally performed at least as well, if not better, than MR-Egger and IVW in terms of bias and precision in all settings. Indeed, when there was balanced pleiotropy, it was both more precise and less biased than IVW. However, in settings with moderate-to-high amounts of directional pleiotropy it was a lot more biased than Weighted median. Indeed, the high precision of the CM estimates is probably detrimental in this setting as it resulted in lower coverage than the other methods. The divergence in performance between balanced and directional settings is probably, as discussed in the methods section, due to the multivariable-CM method assuming balanced pleiotropy. Hence, we would expect the method to perform better under situations where the distribution of Wald ratios with directional pleiotropy is similar to the assumed model with balanced pleiotropy, such as when the absolute amount of directional bias is small. Thus, while we think it can help triangulate results between a univariate and multivariable setting by allowing the use of a plurality valid estimator in both analyses, or between multiple multivariable estimators, we cannot recommend using it alone unless there is a priori evidence that there should be no directional pleiotropy. Multivariable-MBE was sufficiently imprecise that they might be uninformative in practice. We developed the MVMRmode R package (available from https://github.com/bar-woolf/MVMRmode/wiki) to help facilitate the use of the estimators.

Our simulations are not without limitations. Firstly, although pleiotropy can vary continuously between studies, we explore only discrete amounts of this biases. This could potentially mask non-linearities in the performance of pleiotropy robust estimators for MVMR in the presence of these biases. In addition, all our simulations assume linearity and homogeneity (i.e. no effect modification or interaction) of the effects of the risk factors on the outcomes. Finally, although multivariable-CM and multivariable-MBE could be implemented using estimates other than IVW to create residuals, here we have implemented it explicitly using IVW because the interpretation of the validity assumption using the other estimators is unclear.

In summary, here we introduce a framework for implementing plurality valid estimators for multivariable Mendelian randomisation in the absence of modal regression. Of these, the multivariable-CM estimator, which uses IVW to create residuals then fed into a contamination mixture method, appeared to perform the best. Although it performed very well with large amounts of balanced pleiotropy, it underperformed methods like Weighted median when there was directional pleiotropy. We hope these estimators (available from https://github.com/bar-woolf/MVMRmode/wiki) will further enable the future triangulation of univariable MR studies which have used plurality valid estimators with multivariable MR designs.

Data Availability

The R code used in this study is available from https://doi.org/10.17605/OSF.IO/8DZKU. We have also created an R package available from github (https://github.com/bar-woolf/MVMRmode/wiki) to facilitate the implementation of our proposed method. All data produced in the present study are available upon reasonable request to the authors.

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

All authors consent to this work being published.

Availability of data and materials

Competing interests

DG is employed part-time by Novo Nordisk. The other authors declare no conflicts of interest.

Funding

Benjamin Woolf is funded by an Economic and Social Research Council (ESRC) South West Doctoral Training Partnership (SWDTP) 1+3 PhD Studentship Award (ES/P000630/1).

Authors’ contributions

BW conceived of the study. All authors contributed to the design and writing of the manuscript.

Acknowledgements

This work was carried out using the computational facilities of the Advanced Computing Research Centre, University of Bristol - http://www.bris.ac.uk/acrc/.

References

1.↵
Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human Molecular Genetics. 2014 Sep 15;23(R1):R89–98.
OpenUrl CrossRef PubMed Web of Science
2.↵
Davey Smith G, Holmes MV, Davies NM, Ebrahim S. Mendel’s laws, Mendelian randomization and causal inference in observational data: substantive and nomenclatural issues. Eur J Epidemiol. 2020 Feb 1;35(2):99–111.
OpenUrl CrossRef
3.↵
Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2017 Oct;26(5):2333–55.
OpenUrl CrossRef PubMed
4.↵
Hartwig FP, Davies NM, Hemani G, Smith GD. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. International Journal of Epidemiology. 2016 Dec;45(6):1717–26.
OpenUrl CrossRef PubMed
5.↵
Slob EAW, Burgess S. A comparison of robust Mendelian randomization methods using summary data. Genetic Epidemiology. 2020;44(4):313–29.
OpenUrl PubMed
6.
Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International Journal of Epidemiology. 2015 Apr 1;44(2):512–25.
OpenUrl CrossRef PubMed
7.↵
Woolf B, Di Cara N, Moreno-Stokoe C, Skrivankova V, Drax K, Higgins JPT, et al. Investigating the transparency of reporting in two-sample summary data Mendelian randomization studies using the MR-Base platform. International Journal of Epidemiology. 2022 Apr 6;dyac074.
8.↵
Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017 Dec;46(6):1985–98.
OpenUrl CrossRef PubMed
9.↵
Bowden J, Hemani G, Davey Smith G. Invited Commentary: Detecting Individual and Global Horizontal Pleiotropy in Mendelian Randomization—A Job for the Humble Heterogeneity Statistic? Am J Epidemiol. 2018 Dec 1;187(12):2681–5.
OpenUrl PubMed
10.↵
Sanderson E, Spiller W, Bowden J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Statistics in Medicine. 2021;40(25):5434–52.
OpenUrl CrossRef PubMed
11.↵
Carter AR, Sanderson E, Hammerton G, Richmond RC, Davey Smith G, Heron J, et al. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur J Epidemiol. 2021 May 1;36(5):465–78.
OpenUrl CrossRef PubMed
12.↵
Schooling CM, Lopez PM, Yang Z, Zhao JV, Au Yeung SL, Huang JV. Use of Multivariable Mendelian Randomization to Address Biases Due to Competing Risk Before Recruitment. Frontiers in Genetics [Internet]. 2021 [cited 2022 Apr 6];11. Available from: https://www.frontiersin.org/article/10.3389/fgene.2020.610852
13.↵
Burgess S, Thompson SG. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol. 2015 Feb 15;181(4):251–60.
OpenUrl CrossRef PubMed
14.↵
Woolf B. mesrument error and MR. 2021 Dec 13 [cited 2022 Apr 23]; Available from: https://osf.io/yxzwc/
15.↵
Rees JMB, Wood AM, Burgess S. Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy. Stat Med. 2017 Dec 20;36(29):4705–18.
OpenUrl CrossRef PubMed
16.↵
Bowden J, Del Greco M F, Minelli C, Davey Smith G, Sheehan N, Thompson J. A framework for the investigation of pleiotropy in two-sample summary data Mendelian randomization. Stat Med. 2017 May 20;36(11):1783–802.
OpenUrl CrossRef PubMed
17.↵
Burgess S, Foley CN, Allara E, Staley JR, Howson JMM. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun. 2020 Jan 17;11(1):376.
OpenUrl PubMed
18.↵
Morris TP, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019;38(11):2074–102.
OpenUrl CrossRef PubMed
19.↵
Grant AJ, Burgess S. Pleiotropy robust methods for multivariable Mendelian randomization. Stat Med. 2021 Nov 20;40(26):5813–30.
OpenUrl
20.↵
Anderson EL, Howe LD, Wade KH, Ben-Shlomo Y, Hill WD, Deary IJ, et al. Education, intelligence and Alzheimer’s disease: evidence from a multivariable two-sample Mendelian randomization study. International Journal of Epidemiology. 2020 Aug 1;49(4):1163–72.
OpenUrl CrossRef PubMed
21.↵
Davies NM, Hill WD, Anderson EL, Sanderson E, Deary IJ, Davey Smith G. Multivariable two-sample Mendelian randomization estimates of the effects of intelligence and education on health. Teare MD, Franco E, Burgess S, editors. eLife. 2019 Sep 17;8:e43990.
OpenUrl
22.↵
Hill WD, Marioni RE, Maghzian O, Ritchie SJ, Hagenaars SP, McIntosh AM, et al. A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Mol Psychiatry. 2019 Feb;24(2):169–81.
OpenUrl
23.↵
Okbay A, Beauchamp JP, Fontana MA, Lee JJ, Pers TH, Rietveld CA, et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016 May;533(7604):539–42.
OpenUrl CrossRef PubMed
24.↵
Rapid GWAS of thousands of phenotypes for 337,000 samples in the UK Biobank [Internet]. Neale lab. [cited 2022 Jul 18]. Available from: http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank
25.↵
Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat Genet. 2013 Dec;45(12):1452–8.
OpenUrl CrossRef PubMed

View the discussion thread.

Posted January 11, 2023.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Epidemiology

Subject Areas

All Articles

Addiction Medicine (405)
Allergy and Immunology (714)
Anesthesia (209)
Cardiovascular Medicine (2986)
Dentistry and Oral Medicine (337)
Dermatology (254)
Emergency Medicine (446)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (1056)
Epidemiology (12845)
Forensic Medicine (12)
Gastroenterology (838)
Genetic and Genomic Medicine (4650)
Geriatric Medicine (428)
Health Economics (735)
Health Informatics (2959)
Health Policy (1076)
Health Systems and Quality Improvement (1097)
Hematology (394)
HIV/AIDS (937)
Infectious Diseases (except HIV/AIDS) (14172)
Intensive Care and Critical Care Medicine (858)
Medical Education (430)
Medical Ethics (116)
Nephrology (478)
Neurology (4439)
Nursing (239)
Nutrition (653)
Obstetrics and Gynecology (819)
Occupational and Environmental Health (741)
Oncology (2310)
Ophthalmology (655)
Orthopedics (260)
Otolaryngology (328)
Pain Medicine (285)
Palliative Medicine (85)
Pathology (504)
Pediatrics (1205)
Pharmacology and Therapeutics (511)
Primary Care Research (506)
Psychiatry and Clinical Psychology (3823)
Public and Global Health (7036)
Radiology and Imaging (1557)
Rehabilitation Medicine and Physical Therapy (930)
Respiratory Medicine (926)
Rheumatology (447)
Sexual and Reproductive Health (450)
Sports Medicine (387)
Surgery (494)
Toxicology (60)
Transplantation (213)
Urology (186)

[1] 1.↵
Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human Molecular Genetics. 2014 Sep 15;23(R1):R89–98.
OpenUrl CrossRef PubMed Web of Science

[2] 2.↵
Davey Smith G, Holmes MV, Davies NM, Ebrahim S. Mendel’s laws, Mendelian randomization and causal inference in observational data: substantive and nomenclatural issues. Eur J Epidemiol. 2020 Feb 1;35(2):99–111.
OpenUrl CrossRef

[3] 3.↵
Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2017 Oct;26(5):2333–55.
OpenUrl CrossRef PubMed

[4] 4.↵
Hartwig FP, Davies NM, Hemani G, Smith GD. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. International Journal of Epidemiology. 2016 Dec;45(6):1717–26.
OpenUrl CrossRef PubMed

[5] 5.↵
Slob EAW, Burgess S. A comparison of robust Mendelian randomization methods using summary data. Genetic Epidemiology. 2020;44(4):313–29.
OpenUrl PubMed

[6] 6.
Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International Journal of Epidemiology. 2015 Apr 1;44(2):512–25.
OpenUrl CrossRef PubMed

[7] 7.↵
Woolf B, Di Cara N, Moreno-Stokoe C, Skrivankova V, Drax K, Higgins JPT, et al. Investigating the transparency of reporting in two-sample summary data Mendelian randomization studies using the MR-Base platform. International Journal of Epidemiology. 2022 Apr 6;dyac074.

[8] 8.↵
Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017 Dec;46(6):1985–98.
OpenUrl CrossRef PubMed

[9] 9.↵
Bowden J, Hemani G, Davey Smith G. Invited Commentary: Detecting Individual and Global Horizontal Pleiotropy in Mendelian Randomization—A Job for the Humble Heterogeneity Statistic? Am J Epidemiol. 2018 Dec 1;187(12):2681–5.
OpenUrl PubMed

[10] 10.↵
Sanderson E, Spiller W, Bowden J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Statistics in Medicine. 2021;40(25):5434–52.
OpenUrl CrossRef PubMed

[11] 11.↵
Carter AR, Sanderson E, Hammerton G, Richmond RC, Davey Smith G, Heron J, et al. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur J Epidemiol. 2021 May 1;36(5):465–78.
OpenUrl CrossRef PubMed

[12] 12.↵
Schooling CM, Lopez PM, Yang Z, Zhao JV, Au Yeung SL, Huang JV. Use of Multivariable Mendelian Randomization to Address Biases Due to Competing Risk Before Recruitment. Frontiers in Genetics [Internet]. 2021 [cited 2022 Apr 6];11. Available from: https://www.frontiersin.org/article/10.3389/fgene.2020.610852

[13] 13.↵
Burgess S, Thompson SG. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol. 2015 Feb 15;181(4):251–60.
OpenUrl CrossRef PubMed

[14] 14.↵
Woolf B. mesrument error and MR. 2021 Dec 13 [cited 2022 Apr 23]; Available from: https://osf.io/yxzwc/

[15] 15.↵
Rees JMB, Wood AM, Burgess S. Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy. Stat Med. 2017 Dec 20;36(29):4705–18.
OpenUrl CrossRef PubMed

[16] 16.↵
Bowden J, Del Greco M F, Minelli C, Davey Smith G, Sheehan N, Thompson J. A framework for the investigation of pleiotropy in two-sample summary data Mendelian randomization. Stat Med. 2017 May 20;36(11):1783–802.
OpenUrl CrossRef PubMed

[17] 17.↵
Burgess S, Foley CN, Allara E, Staley JR, Howson JMM. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun. 2020 Jan 17;11(1):376.
OpenUrl PubMed

[18] 18.↵
Morris TP, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019;38(11):2074–102.
OpenUrl CrossRef PubMed

[19] 19.↵
Grant AJ, Burgess S. Pleiotropy robust methods for multivariable Mendelian randomization. Stat Med. 2021 Nov 20;40(26):5813–30.
OpenUrl

[20] 20.↵
Anderson EL, Howe LD, Wade KH, Ben-Shlomo Y, Hill WD, Deary IJ, et al. Education, intelligence and Alzheimer’s disease: evidence from a multivariable two-sample Mendelian randomization study. International Journal of Epidemiology. 2020 Aug 1;49(4):1163–72.
OpenUrl CrossRef PubMed

[21] 21.↵
Davies NM, Hill WD, Anderson EL, Sanderson E, Deary IJ, Davey Smith G. Multivariable two-sample Mendelian randomization estimates of the effects of intelligence and education on health. Teare MD, Franco E, Burgess S, editors. eLife. 2019 Sep 17;8:e43990.
OpenUrl

[22] 22.↵
Hill WD, Marioni RE, Maghzian O, Ritchie SJ, Hagenaars SP, McIntosh AM, et al. A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Mol Psychiatry. 2019 Feb;24(2):169–81.
OpenUrl

[23] 23.↵
Okbay A, Beauchamp JP, Fontana MA, Lee JJ, Pers TH, Rietveld CA, et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016 May;533(7604):539–42.
OpenUrl CrossRef PubMed

[24] 24.↵
Rapid GWAS of thousands of phenotypes for 337,000 samples in the UK Biobank [Internet]. Neale lab. [cited 2022 Jul 18]. Available from: http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank

[25] 25.↵
Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat Genet. 2013 Dec;45(12):1452–8.
OpenUrl CrossRef PubMed

MVMRmode: Introducing an R package for plurality valid estimators for multivariable Mendelian randomisation

Abstract

Background

Methods

Theoretical background

Deriving a standard error multivariable-MBE and multivariable-CM

Simulation study

Aims

Data-generating mechanisms

Estimands

Methods

Performance measure

Applied example

Results

Simulation

Bias

95% CI Width

Coverage and power

Additional outcomes

Standard deviation of the effect estimates across the 1000 simulations

Coverage

Applied example

Discussion

Data Availability

Declarations

Ethics approval and consent to participate

Consent for publication

Availability of data and materials

Competing interests

Funding

Authors’ contributions

Acknowledgements

References

Citation Manager Formats

Subject Area