A new robust and accurate two-sample Mendelian randomization method with a large number of genetic variants

Lei Zhang; Jun-Jie Niu; Xian-Mei He; Xiao Zheng; Qi-Gang Zhao; Xiu-Juan Yu; Li Luo; Hai-Gang Ren; Yu-Fang Pei

doi:10.1101/2024.04.16.24305933

Abstract

Horizontal pleiotropy can significantly confound causal estimates in Mendelian randomization (MR) studies, particularly when numerous instrumental variables (IVs) are employed. In this study we propose a novel statistical method, Mendelian Randomization analysis based on Z-scores (MRZ), to conduct robust and accurate MR analysis in the presence of pleiotropy. MRZ models the IV-outcome association z-score as a mixture distribution, separating the causal effect of the exposure on the outcome from pleiotropic effects specific to each IV. By classifying IVs into distinct categories (valid, uncorrelated pleiotropic, and correlated pleiotropic), MRZ constructs a likelihood function to estimate both causal and pleiotropic effects. Simulation studies demonstrate MRZ’s robustness, power, and accuracy in identifying causal effects under diverse pleiotropic scenarios and overlapped samples. In a bidirectional MR analysis of appendicular lean mass (ALM) and four lipid traits using both the UK Biobank (UKB)-internal datasets and the UKB-Global Lipids Genetics Consortium (GLGC) joint datasets, MRZ consistently identified a causal effect of ALM on total cholesterol (TC) and low-density lipoprotein cholesterol (LDL-C). Conversely, existing methods often detected mutual causal relationship between lipid traits and ALM, highlighting their susceptibility to confounding by horizontal pleiotropy. A randomized controlled experiment conducted in mice validated the absence of causal effect of TC on ALM, corroborating the MRZ findings and further emphasizing its resilience against pleiotropic biases.

Introduction

Determining whether a modifiable exposure causes a particular disease outcome is crucial for understanding disease mechanisms and guiding prevention and clinical interventions. However, traditional observational analyses often face limitations in establishing such causal effects due to potential confounding by unobserved variables or reverse causation [1]. Mendelian randomization (MR) provides a solution by offering a statistical approach to infer causation from observational data while circumventing both unobserved confounding and reverse causation [1, 2]. MR utilizes genetic variation associated with an exposure as an instrumental variable (IV) to investigate the causal effect of the exposure on an outcome [2].

MR has proven highly successful in uncovering causal relationships across a wide range of epidemiological conditions and diseases, establishing itself as a popular approach in modern genetic epidemiology [3]. The rapid growth of genome-wide association studies (GWASs) further accelerates the application of MR by providing a wealth of data. With larger sample sizes, modern GWASs generate numerous IVs suitable for MR analysis. For instance, the latest human height GWAS identified over 11,000 independent association signals [4], enhancing the power and accuracy of MR estimation.

However, many current MR analyses are challenged by a potentially severe confounding factor known as horizontal pleiotropy [5]. Horizontal pleiotropy violates the exclusive restriction assumption underlying the MR principle, casting doubt on the validity of MR analysis [6-8]. This issue becomes more pronounced as the number of IVs increases, as chance can lead to an increase in shared heritability factors between exposure and outcome. Therefore, effective correction for confounding due to horizontal pleiotropy is essential, especially in large sample settings.

Considerable efforts in the literature have addressed the pleiotropic effect and sought to mitigate its adverse impact on MR analysis [9, 10]. In a two-sample setting, multiple popular statistical methods accommodating pleiotropic IVs have been proposed and widely utilized [11-37]. However, under certain conditions, some methods may fail to adequately correct for horizontal pleiotropy or accurately detect true causal effects [9].

In this study, with the aim of conducting robust and accurate MR studies in the presence of horizontal pleiotropy, we introduce a novel statistical method. This method explicitly distinguishes causal effect from pleiotropic effect and estimates both within the maximum likelihood framework. Through simulation studies, we demonstrate that our proposed method effectively corrects for pleiotropic effects across various confounding scenarios while maintaining statistical power at a comparable level. As an application, we investigate the bidirectional causal effects of four lipid traits and appendicular lean mass (ALM) using data from the UK biobank (UKB) cohort and summary statistics from the Global Lipids Genetics Consortium (GLGC) [38]. Finally, we conduct a randomized controlled experiment in mice to address controversial findings revealed by our proposed method compared to alternative methods.

Results

Outline of the proposed method

The diagram illustrating the proposed method is presented in Figure 1. In this context, we consider the causal effect, denoted as r, of a continuous exposure trait X on a continuous outcome trait Y. Our objective is to estimate this causal effect using a collection of independent IVs that exhibit an association with X. These IVs are categorized into three distinct types based on their pleiotropic effects on the outcome and the correlation of these pleiotropic effects with the IV-exposure effects at corresponding IVs:

Figure 1. Diagram of the proposed method.

Four elements are involved: IV G, exposure X, outcome Y, and confounder U. X and Y are represented by their association z-scores with G. There are three potential pathways from G to Y, depending on G’s pleiotropic status: valid (purple), uncorrelated pleiotropic (blue) and correlated pleiotropic (red). The causal effect Δ (green) exists in all pathways. After removing the causal effect Δ from z_Y (lower panel), the residual outcome z-score contains information on pleiotropic effects only. Depending on in which pathway G is, has a pathway-specific distribution (right panel), which is used to differentiate valid and pleiotropic IVs.

Valid IVs: These IVs do not exert horizontal pleiotropic effects on the outcome. They serve as reliable indicators of the exposure’s effect on the outcome without introducing additional confounding factors.
Uncorrelated pleiotropic IVs: This category comprises IVs that exhibit horizontal pleiotropic effects on the outcome. However, these effects are not correlated with the corresponding IV-exposure effects.
Correlated pleiotropic IVs: In this category, IVs demonstrate horizontal pleiotropic effects on the outcome that are correlated with the IV-exposure effects. This correlation violates the Instrument Strength Independent of Direct Effect (INSIDE) assumption [39].

In our analysis, we used genetic association z-scores, denoted as z_X and z_Y, to estimate r. The outcome z_Y follows a normal distribution with a mean μ_γ and a variance of one, i.e. Z_Y∼N(μ_γ,1), where μ_γ represents the effect of the IV on the outcome. The causal effect of X on Y can be interpreted as shifting the mean μ_γ by an IV-specific offset denoted as , where N and N are the sample sizes of the exposure and outcome, respectively. Importantly, this shift applies uniformly across all IVs regardless of whether they are valid or pleiotropic. Consequently, if we subtract this offset Δ from z_Y, we obtain a residual z-score This residual z-score no longer contains any information about the causal effect.

We formulated at different types of IVs to follow distinct distributions.

1. Valid IVs: at valid IVs asymptotically follows a standard normal distribution, i.e., 2. Uncorrelated pleiotropic IVs: At uncorrelated pleiotropic IVs, follows a normal distribution with a fixed mean parameter μ₁ and an enlarged variance, i.e., , where μ₁ and represent the mean and variance of the uncorrelated pleiotropic effects, respectively.

3. Correlated pleiotropic IVs: at correlated pleiotropic IVs follows another normal distribution. This distribution is characterized by a mean parameter related to z_X and an enlarged variance, i.e., , where μ_a and represent the mean and variance of the pleiotropic correlation, respectively. μ₂ and represent the mean and variance of the residual pleiotropic effects after adjusting for the pleiotropic correlation.

These formulations allow us to account for the different distributions of residual z-scores at various types of IVs, thereby capturing the distinct effects of horizontal pleiotropy on the outcome.

Given the presence of multiple independent IVs, we employed the maximum likelihood approach to estimate the parameters from the data. We assumed that the proportions of valid IVs, uncorrelated pleiotropic IVs, and correlated pleiotropic IVs are (1-τ), τ (1-ρ), and τ ρ, respectively. Here, τ and ρ, both fallowing within the range [0,1], represent the proportion of total pleiotropic IVs and the relative proportion of correlated pleiotropic IVs, respectively. Our model contains nine parameters in total: . We constructed a likelihood function of θ as follows Maximizing L with respect to θ provides the maximum likelihood estimate of θ. From this estimate, we can derive both the estimated causal effect, denoted as , and the distribution of pleiotropic effects.

To assess the statistical significance of the estimates, we employed the likelihood ratio test (LRT) approach. Specifically, we examined the significance of using a one-degree-of-freedom (df) chi-squared test. This test compares the maximized likelihood value with that obtained for a reduced likelihood function under the setting r=0. Additionally, we evaluated the existence of pleiotropic IVs using an 8-df chi-squared test. This test compares the maximized likelihood value with that obtained for another reduced likelihood function under the setting τ=0, in which r is the only model parameter.

The distribution of

The distribution of is central to our proposed method, Mendelian Randomization analysis using Z-scores (MRZ), as it forms the basis for distinguishing between valid and pleiotropic IVs. Among the two parameters shaping the distribution of , variance plays a crucial role in distinguishing between these categories. Through simulations, we investigated the distribution of and evaluated the impact of three key factors on its variance: the sample size of the outcome (N_Y), the variance of the pleiotropic effect , and the frequency of positive pleiotropic IVs (f_p). Here, was defined as the IV-attributable portion of outcome variance, that is, IV-specific outcome heritability. As expected, at valid IVs consistently conforms to a standard normal distribution (Figure 2). Conversely, at pleiotropic IVs, the variance of exceeds one and escalates with increasing N_Y and/or. Specifically, when all pleiotropic IVs align in the same direction as the IV-exposure effects (f_p=1), the variance of aligns well with theoretical expectations, which is . This variance amplifies when pleiotropic IVs consist of a mixture of positive and negative effects, reaching its maximum when the positive and negative pleiotropic IVs are evenly balanced (f_p=0.5).

Figure 2. The distribution of outcome z-scores at valid and pleiotropic IVs.

The influence of three factors on the distribution of outcome z-score was evaluated: outcome sample size (N_Y=1e5 or 2e5), the variance of IV-specific heritability to the outcome , and the proportion of positive pleiotropic IVs (fp=0.5, 0.7 or 1.0). The IV-specific heritability was drawn from an exponential distribution with a mean 5e-5 or 1e-4 so that its variance or (1e-4)². The exposure sample size was set to be N_X =2e5. The causal effect of X on Y was set to be r=0.05. z_Y is IV-outcome z-score. is the residual of z_Y after removing the causal effect. is the theoretical variance assuming fp=1.0. At valid IVs, both z_Y and conform to a standard normal distribution. At pleiotropic IVs, the variance of exceeds one and escalates with increasing N_Y and/or. When f_p=1.0, the variance of aligns well with , and amplifies when f_p<1.0, peaking when the positive and negative pleiotropic IVs are evenly balanced (f_p=0.5). z_Y has a similar distribution to .

Our simulations also revealed a striking resemblance between the distributions of and z_Y. This similarity arises from the positive nature of all z_X by definition, leading to a consistent shift of z_Y towards in the same direction across all IVs. Another fact reinforcing this similarity is the typically modest magnitude of the shift Δ, attributed to the small values of r in most real applications (e.g., <0.2). Consequently, we approximated the variance of by studying z_Y as if no causal effect were present. By assuming an exponential distribution for pleiotropic effects, we estimated the distribution’s sole parameter [40]. Subsequently, we estimated the variance of through Monto-Carlo sampling of . While this estimate assumes all pleiotropic effects align in the same direction and may thus underestimate the true variance, it provides a practical lower boundary from which the optimization of the likelihood function starts to work.

Detecting pleiotropic effects

When all IVs are valid, MRZ does not detect any pleiotropy in both null and causal simulations (P<0.05). Notably, it is more conservative than MRPRESSO [19] and MREGGER [22], both of which identify pleiotropy and horizontal pleiotropy in approximately 5% of iterations.

In scenarios where pleiotropic IVs are present, both MRZ and MRPRESSO demonstrate remarkable efficacy, presenting 100% power in detecting pleiotropy even when the proportion of pleiotropic IVs is as low as 10% (Figure 3A). Conversely, MREGGER exhibits significantly lower power in discerning the directionality of pleiotropic effects. Even at the highest proportion (60%), MR-EGGER’s power rate is only 55.5%.

Figure 3. The performance of MRZ in detecting pleiotropic effects.

A total of 200 IVs and 1000 iterations were simulated. A, The proportion of pleiotropic IVs ranges from 10%-60%. The relative proportion of correlated pleiotropic IVs is 50%. The power of detecting pleiotropy was declared at α =.0.05 level. B, Presented is the mean estimated proportion of pleiotropic IVs. Error bar is its standard deviation. Four scenarios were simulated, correlated pleiotropic effects and no causal effect (correlated_null), correlated pleiotropic effects and causal effect (correlated_causal), uncorrelated pleiotropic effects and no causal effect (uncorrelated_null), and uncorrelated pleiotropic effects and causal effect (uncorrelated_causal). C, In this simulation, the proportion of pleiotropic IVs was fixed at 60%, while the relative proportion of correlated pleiotropic IVs ranges from 10%-60%. Both causal and null scenarios were simulated. Presented is the mean estimated proportion of correlated pleiotropic IVs. The error bar is its standard deviation. D, the per-IV correlated pleiotropic effect was defined as the product of the proportion of correlated pleiotropic IVs and the mean correlated pleiotropic effect. The IV-specific heritability to the confounder was drawn from an exponential distribution with a mean 1e-4, 5e-5, or 0, respectively. These different settings correspond to different magnitudes of correlated pleiotropic effects.

The proportion of pleiotropic IVs estimated by MRZ exhibits a high correlation with the actual proportion, irrespective of the presence of a causal effect or the correlation status of pleiotropic effects (Figure 3B). However, a slight overestimate of the mean proportion by a relative proportion of approximately 10% is observed across all scenarios.

In instances where correlated pleiotropic IVs are present, MRZ’s ability to accurately estimate their proportion is unsatisfactory (Figure 3C). This suggests that uncorrelated and correlated pleiotropic IVs are indistinguishable by MRZ. Nevertheless, the per-IV mean correlated pleiotropic effect, defined as τρμ_a, exhibits a perfect linear relationship with the proportion of correlated pleiotropic IVs (Figure 3D). The slope of this linear trend remains consistent between null and causal simulations, indicating minimal influence of the causal effect on the trend. Furthermore, the slope increases with stronger correlated pleiotropic effects and diminishes when no correlated pleiotropic IVs are included. Therefore, MRZ demonstrates the capability to capture correlated pleiotropic effects by jointly modeling the proportion of correlated pleiotropic IVs and the magnitudes of their effects.

Causal effect: type-I error rate

Through a series of null simulations, we evaluated the type-I error rate of MRZ in testing causal effect. For comparative purposes, we incorporated 14 existing two-sample MR methods into our analysis. These include MREGGER [22], IVW [23], weighted-median (W-median) [21], weighted-mode (W-mode) [20], MRPRESSO [19], contamination mixture (CMix) [16], MRMix [18], MRAID [11], MRcML [12], GRAPPLE [13], MRLASSO [17], MRRAPS [14], MRROBUST [17], and CAUSE [15]. It’s worth noting that there are other promising methods not included in our analysis, some of which are extensively discussed elsewhere [9].

In the absence of pleiotropic effects, MRZ, along with other methods, effectively maintains type-I error rates close to the desired level of 5% (Table 1). However, W-mode (0.2%) and CAUSE (0.1%) demonstrate conservativeness, exhibiting lower type-I error rates.

View this table:

Table 1. Performance of various methods with all valid IVs.

In scenarios where pleiotropic effects are present, MRZ consistently maintains the correct type-I error rate across various proportions of pleiotropic IVs, irrespective of whether the pleiotropic effects are uncorrelated or correlated (Table 2). This robust performance also holds true under balanced pleiotropic effects (Supplemental Table 1). Among alternative methods, W-mode and CAUSE tend to be overly conservative, resulting in significantly reduced type-I error rates in all scenarios (Table 2). MREGGER shows validity only when pleiotropic effects are uncorrelated but exhibits an inflated type-I error rate of up to 28.1% when the pleiotropic effects are correlated. MRMix generally performs well under low to modest proportions of pleiotropic IVs but shows an inflated error rate reaching up to 21.6% under higher proportions. All the other alternative methods have inflated type-I error rates that increase with increased proportion of pleiotropic IVs, whereas the inflation is more severe under correlated pleiotropy than under uncorrelated pleiotropy. The type-I error rates for certain methods can reach nearly 100% in some extreme scenarios, rendering them invalid at all under such conditions. Even when the pleiotropic effects are balanced, the inflation of type-I error rates is still observed for most alternative methods, especially in correlated pleiotropic scenarios (Supplemental Table 1).

View this table:

Table 2. Type-I error rates of various methods under pleiotropic effects.

Further simulation studies involving a smaller set of 100 IVs (Supplemental Table 2) or a larger set of 500 IVs (Supplemental Table 3) reaffirm the validity of MRZ. Among alternative methods, W-mode and CAUSE exhibit inflated type-I error rates under high proportions of pleiotropic IVs when the number of IVs is 100 and 500, respectively, rendering them invalid (Supplemental Tables 2-3). Therefore, our simulations reveal MRZ’s efficacy in rectifying horizontal pleiotropy across diverse confounding scenarios, where existing methods often lose validity.

Notably, MRZ demonstrates strong robustness against sample overlap due to its effective control of horizontal pleiotropy. It maintains a valid type-I error rate even when exposure and outcome samples completely overlap (Supplemental Table 4), making it suitable for scenarios involving a single biobank-scale dataset such as the UKB cohort.

Causal effect: power and effect size

We conducted a series of causal simulations to examine the power and effect size of various methods. In scenarios without pleiotropic effects, nearly all alternative liberal methods—those prone to type-I error under pleiotropic scenarios—demonstrate remarkably high power rates almost reaching 100% (Table 1). MRZ notably achieves a power rate of 99.9%, positioning it among the most powerful methods. Conversely, W-mode exhibits the lowest power rate (10.4%), followed by MRMix (25.9%) and CAUSE (80.0%). Regarding effect size estimation, most methods, including MRZ, estimate mean effect sizes close to the true value of 0.050. MRZ displays one of the lowest mean errors (MEs, 0.008) among all methods. In contrast, W-mode (0.033) and MRMix (0.025) exhibit larger variations.

In the presence of horizontal pleiotropy, MRZ exhibits a decline in power as the proportion of pleiotropic IVs increases, as expected. This decline is similar in both positive and negative causal effect settings (Supplemental Table 5), indicating minimal impact of pleiotropic effects on MRZ’s power across diverse scenarios. Even with the highest proportion of 60% pleiotropic IVs, MRZ’s power rate maintains between 40%-58%.

Alternative methods demonstrate scenario-dependent performance, lacking a universal trend across all settings. The two overly conservative methods CAUSE and W-mode, along with MREGGER, MRMix and MRAID, generally exhibit declining power with increased proportion of pleiotropic IVs (Supplemental Table 5). For other alternative methods, two distinct trends are observed depending on the relative directions of the causal effect and the pleiotropic effects. When they align, these methods maintain high power rates at nearly 100% at all proportions of pleiotropic IVs. Conversely, when they oppose, these methods experience rapid decline in power rates with increased proportion of pleiotropic IVs (Supplemental Table 5). These conflicting trends highlight potential confounding effects of uncorrected pleiotropy on power estimation.

To ensure a fair comparison of power rates among methods despite varying type-I error rates, we corrected each method’s raw power rate by its type-I error rate at the corresponding null setting, assuming a correct type-I error rate of 5%. Like MRZ, all alternative methods have a decrease in their corrected power rates when the proportion of pleiotropic IVs increases (Figure 4). Among the methods, MRZ generally maintains the highest corrected power rate across various proportions of pleiotropic IVs. This improvement is particularly notable when the causal effect opposes the pleiotropic effects, in which cases uncorrected pleiotropic effects counteract or even reverse the true causal effect, resulting in a severe loss of power for alternative methods.

Figure 4. Corrected power rate of various methods for testing causal effect.

A total of 200 IVs were simulated. The sample sizes for both exposure and outcome samples were 2e5. The 200 IVs explained approximately a portion of 5.8% of the exposure variance. A positively dominated pleiotropic setting was simulated, in which 70% pleiotropic IVs were simulated to have positive pleiotropic effects while the remaining 30% were simulated to have negative pleiotropic effects. In addition, half pleiotropic IVs were simulated to be correlated with the corresponding IV-exposure effects. The causal effect r was set to be 0.05 (positive effect) or -0.05 (negative effect). The statistical significance was declared at α =.0.05 level. Raw power rate was corrected by calibrating it with the type-I error rate estimated at the corresponding null condition, using the equation

Uncorrected pleiotropic effects not only affect the power to detect the causal effect but also influence estimated effect size. MRZ’s estimated effect sizes align with the true value (0.05 or -0.05) across all scenarios (Figure 5), regardless of the proportion of pleiotropic IVs or the direction of the causal effect. Conversely, for all alternative methods, including W-mode and CAUSE, distinct trends are observed based on the relative directions of the causal effect to pleiotropic effects. When the directions align, mean effect sizes of all alternative methods tend to increase with the proportion of pleiotropic IVs, whereas they decrease when the directions oppose. At the highest proportion of 60% pleiotropic IVs, the decrease is so severe that the estimated effect sizes from all alternative methods are opposite to the true size.

Figure 5. Estimated causal effect size of various methods.

The causal effect r was set to be 0.05 (positive effect) or -0.05 (negative effect). The mean effect size across 1,000 iterations is presented.

Additional simulation studies with 100 and 500 IVs (Supplemental Figure 1) reveal an increased trend of MRZ’s power rate with an increasing number of IVs, underscoring the critical importance of including a substantial number of IVs for robust and powerful causal inference.

When modeling no pleiotropic effects, the proposed test statistic T₁ closely approximates the causal effect size estimated by the IVW test across all simulated scenarios (Supplemental Figure 2). This suggests that the IVW test provides a reasonable anchor for MRZ to adjust its estimated effect size when analyzing un-standardized summary statistics, such as case-control data.

Running time

The computation process of MRZ primarily focuses on optimizing the likelihood function twice: once for the alternative hypothesis and once for the null hypothesis. Despite involving a high dimension of nine parameters, the derivative-free optimization algorithm nmkb that we employed offers an efficient solution. MRZ completes the optimization within seconds even on datasets containing 500 IVs. However, it’s important to acknowledge that there is no assurance that the optimization algorithm will converge to its global maximum under such a high-dimensional parameter space. Therefore, it’s recommended to conduct repeated optimizations with varying initial parameter settings to enhance the likelihood of obtaining a robust solution. In this study, a total of 20 repeats were performed to ensure the reliability of the results.

Real data analysis

As an application, we conducted a bidirectional MR study examining the relationship between ALM and four circulating lipid traits (high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), total cholesterol (TC), and triglycerides (TG)). We utilized two data sources: the UKB internal cohort and the summary statistics released by GLGC.

In the UKB-internal analysis, we randomly divided the entire UKB cohort into two independent sub-samples (UKB_S1 and UKB_S2). For each exposure-outcome pair (e.g., ALM-TC), one sub-sample served as the exposure sample while the other served as the outcome sample. This process was repeated twice by reversing the roles of the two sub-samples so that two independent sets of summary statistics were generated [41]. In the UKB-GLGC joint analysis, we used the entire UKB cohort to generate GWAS summary statistics for ALM, while GLGC data (excluding UKB participants) provided summary statistics for the lipid traits. Consequently, we generated three datasets that are mutually independent in exposure and/or outcome, allowing for cross-validation of results.

To assess robustness against sample overlap, we conducted an MR analysis using both exposure and outcome summary statistics derived from the entire UKB cohort, resulting in complete sample overlap.

Observational analyses reveal significant correlations between raw ALM values and all lipid traits in the UKB cohort, and these correlations persist after adjusting for age and sex in both ALM and lipid traits (Supplemental Table 6). In the MR settings, the number of eligible IVs ranges from 85 to 719 across all exposure-outcome pairs (Supplemental Tables 7-8). Strong IV-exposure associations are observed, with R² values ranging from 0.05 to 0.10 and F-statistics ranging from 52.1 to 188.8.

At a significance level of 6.25×10⁻³ (0.05/(2×4)), forward MRZ analysis identified a negative association between ALM and both TC and LDL-C (Table 3). This association is statistically significant in all three datasets for both traits, strengthening the evidence of causality. The estimated effect sizes are consistent across datasets for both TC (−0.079, -0.083 and -0.074) and LDL-C (−0.054, -0.063 and -0.037). Notably, the proportion of pleiotropic IVs was estimated to be modest to high for both traits, being around 30% and 40% in the UKB-internal datasets (261-268 IVs).

View this table:

Table 3. Bidirectional causal effects of ALM and lipid traits identified by MRZ.

Forward MR analyses using MRZ did not detect significant associations between ALM and either HDL-C or TG in any of the datasets, suggesting no causal effects on both traits. Further reverse MR analyses did not identify significant associations for any lipid traits with ALM, indicating no reverse causal effect. Analysis on the completely overlapped whole UKB cohort yielded similar results (Supplemental Table 9), confirming MRZ’s robustness against sample overlap.

We compared the MRZ results with those from alternative methods. In the forward MR analyses, all alternative methods identified the same negative causal effect of ALM on TC in one or more datasets (Supplemental Figure 3). However, in the reverse MR analyses, all alternative methods revealed a same negative causal effect of TC on ALM in at least one dataset (Supplemental Figure 3). The results of mutual causality at the same direction are observed from alternative methods on the other three lipid traits too (Supplemental Figures 4-6). Observing mutually reinforcing causal effects at the same direction suggests a high likelihood of pleiotropy, making it difficult to definitively determine the true causal relationship. Therefore, leaving aside the true causal relationships, none of the alternative methods can produce results that are free of pleiotropic effects and that are self-validated across all datasets.

Experimental validation

Given the unidirectional causal link identified by MRZ versus the disputed bidirectional link suggested by alternative methods, there are conflicting views on the causal effect of lipid traits on ALM. To address this controversy empirically, we conducted an in vivo randomized controlled experiment using a mouse model. The experiments involved intervention with TC and compared to normal controls. The results reveal a significant increase in circulating TC levels in mice fed TC for a period of 8 weeks (N=12) compared to normal controls (N=12, Wilcoxon rank test P=3.59 ×10⁻⁵, Figure 6), indicating successful implementation of the TC intervention. Additionally, significant changes are observed in HDL-C (P=1.24 ×10⁻¹⁴), LDL-C (P=5.98 ×10⁻⁹), and TG (P=5.98 ×10⁻⁵) levels.

Figure 6. The randomized controlled experiment intervened by TC in mice.

Male C57BL/6 mice were randomly allocated to control group (N=12) and intervention group (N=12), followed by intervention by TC diet in the TC group for 8 weeks. For each trait, the difference of measurement between the endpoint and the baseline was analyzed. The comparison was made between the control group and the intervention group using the Wilcoxon rank test. ***, P<0.001; **, P<0.001; NS, non-significant (P>0.05).

As the outcome, we observed a significant increase in total body mass (P=1.64 ×10⁻³) and fat body mass (P=1.03 ×10⁻⁴) in the TC intervention group. However, there is no significant difference in lean body mass (P=0.80), suggesting no causal effect of lipid traits on ALM. These findings are consistent with the results obtained from MRZ analyses, but differ from the conclusions drawn by most alternative methods. This highlights MRZ’s resilience against horizontal pleiotropic effects, to which existing methods are more susceptible.

Discussion

Under the prevailing polygenic genetic architecture of complex traits, pleiotropy is anticipated to be a common occurrence [42]. While theoretical assertions suggest that perfectly balanced pleiotropic effects could cancel out bias [22], achieving such equilibrium is impractical. Current MR practices often employ sensitivity analyses like the MREGGER intercept test to evaluate pleiotropy balance [22]. However, our simulation studies revealed that the MREGGER intercept test lacks sufficient power to detect even substantial pleiotropic imbalances. Thus, relying solely on these methods cannot ensure that empirical MR analyses are conducted under balanced pleiotropic conditions. Even under balanced pleiotropy, our simulation studies demonstrated that the type-I error rate of certain methods may inflate as the proportion of pleiotropic IVs increases. The presence of correlated pleiotropic effects exacerbates this issue. Therefore, robust methods to address horizontal pleiotropy are crucial to ensure the validity of MR analyses.

Our proposed method, MRZ, categorizes IVs into three distinct groups reflecting potential pathways from IVs to the outcome, as depicted in the classical MR rationale diagram. By assuming a general normality distribution for underlying pleiotropic effects, we derived a precise distribution for IV-outcome z-scores specific to each category. In simulated scenarios encompassing diverse pleiotropic settings, MRZ demonstrates enhanced control of type-I error rate as well as more precise and powerful causal effect estimation than existing methods. Notably, the normality assumption assumed by MRZ was not met in our simulations. However, we did not observe inflated type-I error rates or imprecise effect estimates, demonstrating the robustness of MRZ against data distribution.

Multiple sophisticated models have been proposed to address horizontal pleiotropy in MR analyses, with some examples provided in this study. While these methods demonstrate excellent performance under specific conditions, our simulation studies reveal limitations when the number of IVs reaches the hundreds. In such scenarios, certain alternative methods struggle to effectively correct for pleiotropic effects. This challenge arises because pleiotropic effects are often systematic, meaning they are not simply averaged out with an increasing number of IVs. In fact, the presence of systematic pleiotropy can worsen as the number of IVs grows.

Among alternative methods, MRMix shares a similar strategy of categorizing IVs based on their potential causal effects [18]. However, MRMix utilizes four categories reflecting all combinations of the presence or absence of direct effects between IVs and both the exposure and the outcome. This classification allows MRMix to include IVs with no association to the exposure. In contrast, MRZ assumes all IVs to be associated with the exposure, aligning with the standard practice of screening and filtering IVs in empirical MR analyses to ensure this condition.

Another key distinction between MRZ and MRMix lies in the data modeling strategy. MRMix assumes both IV-exposure and IV-outcome effects are random variables following a bivariate normal distribution. Conversely, MRZ treats the IV-exposure effect as a fixed value and models only the IV-outcome effect as a random variable with a distribution conditional on the former. This approach in MRZ eliminates the requirement for a normality assumption on the IV-exposure effect, making it more robust to non-normality, a frequent characteristic in non-infinitesimal genetic models.

Another comparable method, CMix, employs distinct probability functions to model valid and pleiotropic IVs as well. Key disparities between MRZ and CMix include the following: 1) MRZ categorizes pleiotropic IVs into uncorrelated and correlated groups, offering a more detailed classification compared to CMix, which treats all pleiotropic IVs collectively; 2) Both methods assume a normal distribution for effect sizes of pleiotropic IVs. However, MRZ allows the mean parameter of this distribution to vary freely, while CMix constrains it to zero but allows for an expanded variance; 3) MRZ integrates uncertainty regarding IV validity into its likelihood function without explicitly designating each IV as valid or pleiotropic. Conversely, CMix explicitly assigns a label to each IV based on its estimated probability of being valid or pleiotropic under each parameter setting. We argue that incorporating IV uncertainty may provide valuable insights into the distribution of pleiotropic IVs. Our simulation studies demonstrate that MRZ can effectively estimate the proportion of pleiotropic IVs, thereby supporting its efficacy.

Skeletal muscle and circulating lipids play a critical role in regulating energy balance [43-45]. They are intricately linked and share common genetic background [46, 47]. Our real data analysis revealed a high degree of pleiotropy between ALM and all the lipid traits we examined. This pleiotropy poses a significant challenge for most existing MR methods. These methods tend to have inflated type-I error rates when the proportion of pleiotropic IVs is modest to high. As evidence, all alternative methods discovered a causal effect of TC on ALM in at least one dataset. However, this finding contradicted the results of controlled experiments where TC levels were directly manipulated. In contrast, MRZ yielded a different result that aligned with the experiment results. This highlights the importance of robust methods for accurate causal inference in MR studies, especially when pleiotropy is a major concern.

The observed causal effect of ALM on TC and LDL-C in our real data analysis is supported by nearly all alternative methods. Several plausible biological mechanisms could underlie this causality: 1) Metabolic rate and energy balance: Lean mass contributes significantly to basal metabolic rate [48, 49], which in turn can influence lipid metabolism and cholesterol levels. 2) Insulin sensitivity: Muscle tissue plays a crucial role in glucose metabolism and insulin sensitivity [50]. Higher lean mass is often associated with improved insulin sensitivity and glucose uptake [50], which in turn lead to changes in TC and LDL-C levels [51]; 3) Anti-inflammatory effects of muscle: Muscle secretes beneficial hormones such as interleukin-6 (IL-6) and irisin [52, 53], which have anti-inflammatory properties. These hormones can influence lipid metabolism and cholesterol levels [54, 55].

It should be notable that ALM is usually altered by other modifiable factors such as exercise and diet. In this case, they are in the same causal pathway, and it is unclear whether the changes in circulating TC and LDL-C are directly caused by changes in ALM or if ALM acts as a mediator of some modifiable factor. Further investigation is warranted to elucidate the biological mechanism underlying this causal relationship.

Certain limitations exist in the proposed method. Firstly, it does not account for certain biases common in MR analyses, such as weak instrument bias [56] and winner’s curse bias [57]. These biases can potentially be mitigated by implementing stringent IV filtering, such as applying a more rigorous significance threshold [57]. Secondly, optimizing a high-dimensional likelihood function poses challenges in converging to its global solution. To enhance the likelihood of reaching the global maximum, it is advisable to perform multiple optimizations with varying initial parameters. However, this approach increases computational burden and does not guarantee attainment of the global maximum. Thirdly, the proposed method requires both exposure and outcome sample sizes as input, unlike other methods that do not have this requirement. This necessity arises because z-scores are meaningful only within the context of a specific sample size. If exposure and outcome sample sizes are uniform so that their ratios are constant across all IVs, then the detailed sample sizes are indeed unnecessary because the estimated effect size remains unchanged after calibrating to the effect size estimated by the IVW test. However, if sample sizes vary across IVs, then providing this information is essential for unbiased estimation by the proposed method. Furthermore, in scenarios involving case-control studies with imbalanced case-to-control ratios, employing an effective sample size rather than a raw sample size is preferable for accurate estimation [58].

In summary, we have proposed a novel two-sample MR method that demonstrates robustness against horizontal pleiotropic effects, while also offering accuracy and power across a wide range of scenarios. By applying this method to investigate the relationship between ALM and lipid traits, we identified a negative causal effect of ALM on TC and LDL-C. Our proposed method serves as a valuable alternative to existing MR methods, particularly in the analysis of large-scale biobank datasets with numerous IVs. With its ability to provide reliable and precise causal inference, our method contributes to advancing the field of MR analysis and enhances our understanding of complex biological relationships.

Online methods

Basic model

We assumed a continuous exposure trait X, a continuous outcome trait Y, and a bi-allelic SNP G taking values between 0, 1, and 2. For ease of presentation, we assumed X and Y are standardized so that their variances are one. Assumed that G is associated with X and serves as its IV and that the associations of G with both X and Y were tested under an additive mode of inheritance.

The basic phenotype model for MR analysis was formulated as following , where β₁ measures the effect of G on X, r measures the causal effect of X on Y, and ε_X and ε_Y are independently and normally distributed random errors. Here, we used the term r for a causal effect of X on Y because it is equivalent to the correlation coefficient for two standardized phenotypes.

In the above model, the parameter r is the focus of MR analysis. In a typical two-sample MR analysis, regression coefficients of both X and Y on G are available from GWAS analyses, denoted by and . Then an unbiased estimator of r would be [23] Below, we modeled the estimation problem using the genetic association z-score, which is defined as the regression coefficient divided by its standard error. Specifically, , where and are standard errors of and .

The squared z-score statistic, termed s_X or s_Y, is commonly used to test the genetic association between G and X or Y. When there is no association, the s statistic asymptotically follows a 1-df central chi-squared distribution. Accordingly, the z-score asymptotically follows a standard normal distribution. In contrast, when genetic association exists, the s statistic asymptotically follows a 1-df non-central chi-squared distribution characterized by a non-central parameter (NCP) λ, and the z-score asymptotically follows a normal distribution with variance one but with a non-zero mean parameter μ where μ²=λ, i.e., In our previous study, we proved that the NCP parameter λ is a function of sample size N and SNP effect size h² [40]. Specifically, , where ln() represents natural logarithm transformation, and h² is the portion of phenotypic variance explained by the SNP, e.g., the SNP-specific heritability. Specifically, we had , so that the mean parameters for X and Y have the following forms Clearly, the mean parameter μ_Y is completely determined by μ_X and their causal effect r.

In the above formula, the parameter μ_X is unknown and could be estimated from the GWAS summary statistics with the following formula , where f is minor allele frequency (MAF) of the IV. However, in practical released GWAS summary statistics X may not be standardized so that its variance var(X) is unknown. In this case, recall that Therefore, we replaced μ_X by its maximum likelihood estimation, which is z_X, and re-wrote the form of μ_Y Given Δ, the probability of observing z_Y is simply the density of a normal distribution N(Δ, 1). Equivalently, if we let , then follows a standard normal distribution N(0, 1)

Estimating r using z-scores

When multiple independent IVs are available, we built the estimation model within the maximum likelihood framework. Assumed that there are a total of M independent IVs. Let Z_X = (Z_X1,Z_X2,Z_XM)’s and Z_Y = (Z_Y1,Z_Y2,Z_YM)′ be corresponding z-score vectors.

We constructed the likelihood function of r as Maximizing the likelihood function L(r) yielded the maximum likelihood estimate of r, denoted by , and the maximum likelihood, denoted by .

The significance of was evaluated using the LRT approach. Specifically, by restricting r=0, we obtained the null likelihood L₀. We then constructed the likelihood ratio test statistic as follows Under the null hypothesis of r=0, T₁ approximately follows a 1-df central chi-squared distribution, which is used to judge the significance of .

Modeling uncorrelated and correlated horizontal pleiotropy effects

We defined an IV to be valid if it has no horizontal pleiotropic effect. For a pleiotropic IV, the estimator is not necessarily an unbiased estimator of r anymore. To see this, we re-wrote the phenotype model for Y as , where β₂ measures the pleiotropic effect of G on Y that is not mediated by X. In this model, is an unbiased estimator of the following parameter instead Depending on the direction of β₂, could overestimate or underestimate the true effect r.

In the presence of pleiotropy, the association of G with Y is a mixture of the X-mediated causal effect and the pleiotropic effect. Accordingly, the mean parameter μ_Y is a mixture of two components (Supplemental Notes) so that , where is a parameter defined by pleiotropy driven heritability In the above formula, the mean parameter μ_Y is composed of two components: the first one is the causal effect, and the second one is the pleiotropic effect. The causal effect is fixed for not only valid IVs but also pleiotropic IVs. In contrast, the pleiotropic effect exists at pleiotropic IVs only and may vary depending on the strength of the IV-outcome effect, and was therefore modeled as a random effect. After removing the causal effect, the residual outcome z-score contains information about the pleiotropic effect only. Depending on whether it is correlated to z_X or not, the pleiotropic effect is further classified into two types. In the first type, the pleiotropic effect is directly on the outcome, so it is uncorrelated to z_X. In this case, μ_p was modeled as a z_X-independent normally distributed effect In the second type, the pleiotropic effect is influenced by some unmeasured confounder that is correlated to both the exposure and the outcome. μ_p in this type is correlated to z_X and the INSIDE assumption is violated [39]. Accordingly, it was modeled as a z_X-related normally distributed effect , where a_m measures the correlation between and Z_Xm at the mth IV. We added the subscript m in to emphasize that a_m is IV-specific whose magnitude depends on the magnitude of the pleiotropy effect at the mth IV. To account for the variation of a_m, it was further modeled as a normally distributed random effect where μ_a and measure mean pleiotropic correlation and its variance, respectively.

Until now, we have built a hierarchical model towards the mean parameter of distribution. We aimed to obtain its marginal distribution. To integrate out the intermediate dummy variables, we introduced the following lemma,

Lemma I, If a random variable X follows a normal distributio, and if its mean parameter μ is another random variable following a second normal distribution , then the marginal distribution of X is again a normal distribution of the form (Supplemental Notes).

Applying this lemma, we first integrated out a_m to obtain the marginal distribution of We then in turn integrated out μ_punc and μ_pcar to obtain the marginal distribution of and Taken together, for the three types of IVs (valid, uncorrelated pleiotropic and correlated pleiotropic), the residual z-score after removing the causal effect r has a type-specific distribution One key inference from the above distributions is that at different types has distinct variances. Specifically, at valid IVs has an exact variance of one. at pleiotropic IVs has a greater variance. This feature may make the three types of IVs distinguishable.

Likelihood function under horizontal pleiotropy

We assumed that the proportions of valid IVs, uncorrelated pleiotropic IVs and correlated pleiotropic IVs are(1-τ), τ (1-ρ), and τ ρ, where τ and ρ ∈[0,1] measure the proportion of pleiotropic IVs and the relative proportion of correlated pleiotropic IVs, respectively. Our model contains nine parameters in total: . We constructed a likelihood function of θ as follows , where the probability densities were defined by equation (4).

Maximizing L regarding θ yields the maximum likelihood estimate of θ, denoted by , and the maximum likelihood, denoted by. We maximized L using the function nmkb in the R package dfoptim, which implements a derivative-free Nelder–Mead algorithm for high-dimensional function.

Test of causal effect

The causal effect is tested against the null hypothesis of r=0 by means of the LRT approach. Specifically, by restricting r=0 in (5), we constructed the null likelihood function, enoted by L₀, which contains eight parameters. Maximizing L₀ over its sample space yields the maximum likelihood under the null hypothesis, denoted by . The LRT statistic was then constructed as follows Under the null hypothesis of r=0, T₂ approximately follows a 1-df central chi-squared distribution, which was used to declare the significance of .

Test of pleiotropic effects

The existence of pleiotropic effects was tested against the null hypothesis of τ=0 using the same LRT approach. Under the null hypothesis, the likelihood function contains only one free parameter r. Therefore, we used a 8-df central chi-squared distribution to declare the significance of the estimate .

In implementation, we will first test pleiotropic effects at the significance level P<0.05. If there is evidence of pleiotropic effects, then we will use T₂ to test the causal effect; otherwise, we will use T₁ instead.

Unstandardized phenotypes

Since is estimated based on z-scores, its test significance remains unchanged regardless of whether exposure or outcome is standardized. However, its magnitude may differ from the original effect size when unstandardized exposure or outcome is analyzed. The ratio of to the original effect size is a constant C that is determined by the unit of exposure and outcome under analysis. In certain scenarios, it is desirable to obtain the original effect size, such as to recover the odds ratio for case-control data types.

To recover the original effect size, we estimated C by anchoring our test to a comparable alternative test. As demonstrated in the Results section, our proposed test statistic T₁, which assumes no pleiotropic effects, yields an effect size approximately equal to the IVW test, which also assumes no pleiotropic effects. Consequently, we conducted both the T₁ test and the IVW test on the same dataset and estimated C by calculating the ratio of their effect sizes.

Simulation studies

We performed a series of simulation studies to evaluate the performance of the proposed method. We simulated one continuous exposure X, one continuous outcome Y and one continuous confounder U. We simulated a set of bi-allelic SNP as IVs for the exposure. We studied a two-sample context so that the exposure and outcome summary statistics were generated from two separate samples. Parameter settings were as follows:

The number of IVs was set to be M=200.
Both exposure and outcome sample sizes were set to be N_X=N_Y=200,000.
The causal effect of X on Y was set to be r=0 (null), 0.05 (positive effect), or -0.05 (negative effect). In the latter two cases, X explained r²=0.25% of Y’s variance.
Confounder effect. In the case of confounding effect, the effect of the confounder U was set to explain 20% of phenotypic variance in both X and Y. The effect directions of U on both X and Y were set to be the same.
Proportion of pleiotropic IVs to the outcome (τ). The proportion of pleiotropic IVs τ varied from 10% to 60% at an increment of 10%. The default direction of the pleiotropic effects was assumed to be positively dominated, in which 70%: 30% of pleiotropic IVs were simulated to have positive: negative pleiotropic effects. In the case of balanced pleiotropic effects, 50%: 50% of pleiotropic IVs were simulated to have positive: negative pleiotropic effects.
Relative proportion of correlated IVs (ρ). From the pleiotropic IVs simulated in step 5), a fraction of ρ = 0.5 IVs were simulated to also have a pleiotropic effect on U. For ease of presentation, the direction of the pleiotropic effects on U was set to be the same as that on Y.
IV effects on exposure. We assumed a non-infinitesimal genetic architecture for IV-exposure effects. A proportion of 10% of total IVs were simulated to have a large effect each explaining 0.2% of X’s variance. The portion of explained variance for each of the remaining 90% IVs was drawn from an exponential distribution with a mean 1.0 10⁻⁴, a level comparable to the majority of GWAS findings. The 200 IVs explained approximately a total of 5.8% of X’s variance.
IV pleiotropic effects on outcome and confounder. The portion of the variance in both Y and U explained by each pleiotropic IV was drawn from an exponential distribution with a mean 1.0 × 10⁻⁴.

Each scenario was simulated with 1,000 iterations. At each simulation iteration, the MAF of each IV was drawn from a uniform distribution uni(0.05, 0.5), and the IV genotypes at both exposure and outcome samples were simulated using PLINK [59] assuming the Hardy-Weinberger equilibrium. After the phenotypes were simulated, PLINK was invoked to test genetic association in both the exposure and the outcome samples.

Comparison with existing methods

We evaluated and compared the performance of the proposed method with 14 existing two-sample MR methods, including IVW [23], MREGGER [22], weighted-median (W-median) [21], weighted-mode (W-mode) [20], MRPRESSO [19], the contamination mixed model (CMix) [16], MRMix [18], MRAID [11], MRcML [12], GRAPPLE [13], MRLASSO [17], MRRAPS [14], MRROBUST [17], and CAUSE [15]. Comparison criteria included type-I error rate, power rate and estimated effect size. The statistical significance was declared at the nominal level α=0.05. In power rate estimation, only effects in the same direction as the true effect were considered as potentially successful hits. Effects in the opposite direction were not considered regardless of their statistical significance.

Because various methods may possess different type-I error rates, comparing raw power rates may be unfair. To ensure a fair comparison of power rates, the raw power rate was corrected by calibrating it with the type-I error rate estimated at the corresponding null condition. This correction was only for comparison purpose. It allows all methods to operate under the assumption of possessing a correct type-I error rate of 0.05 CAUSE requires a substantial number of background SNPs spanning the genome to estimate its model parameters. To meet this requirement, we generated an additional random set of 100,000 independent SNPs that are associated with neither the exposure nor the outcome. This combination of 100,000 background SNPs and the IVs was used for estimating nuisance parameters (step 1), while only the IVs were used for causal inference (step 2). For all other methods, only the IVs were used for causal inference.

Due to their high computational demands, CAUSE and MRPRESSO were assessed across 200 iterations.

Real data application

Lean body mass is an important physiological index. Low ALM, coupled with diminished muscle strength and reduced physical performance, serves as a defining criterion for the onset of sarcopenia [60], which is a critical condition that can significantly impair function, lead to physical disability, and is a major modifiable risk factor for frailty in older adults [61, 62]. Lipids are another cluster of metabolites related to energy balance [63, 64]. Disorders of lipid metabolism can co-occur with the loss of skeletal muscle mass [65, 66]. However, the mutual relationship between ALM and lipid traits has not been well-studied. Uncovering their causal relationships is thus needed to facilitate the prediction and intervention of sarcopenia.

As a real application, we conducted a comprehensive bidirectional MR study between ALM and four lipid traits, including HDL-C, LDL-C, TC, and TG using two data sources: the UKB internal cohort and the summary statistics released by GLGC. The study (project number 41542) was covered by general ethical approval for the UKB study, and was approved by the Northwest Centre for Research Ethics Committee (11/NW/0382). All participants provided informed consent.

The study design is displayed in Supplemental Figure 7. In brief, we performed both the UKB-internal analysis and the UKB-GLGC joint analysis. In the UKB-internal analysis, we randomly divided the entire UKB cohort into two independent sub-samples (UKB_S1 and UKB_S2). For each exposure-outcome pair (e.g., ALM-TC), one sub-sample served as the exposure sample, while the other served as the outcome sample. This process was repeated by reversing the roles of the two sub-samples so that two independent sets of summary statistics were generated [41]. In the UKB-GLGC analysis, the whole UKB cohort was used to generate GWAS summary statistics for ALM, while the GLGC data (excluding UKB participants) provided summary statistics for the lipid traits. In total, for each exposure-outcome pair, we generated three datasets that are mutually independent in exposure and/or outcome, so that the results from them could cross-validate each other.

To evaluate the influence of sample overlap, we also applied a MR analysis in which both the exposure and the outcome summary statistics were derived from the whole UKB cohort, so that both samples completely overlapped.

The details of the analysis are described in Supplemental Notes. In brief, the GWAS of UKB samples was performed with BOLT-LMM [67]. Genome-wide significant (p<5.0 × 10⁻⁸) SNPs were selected from the exposure sample, followed by clumping (LD r²=0.01 and window size=500 kb) to select eligible IVs using TwoSampleMR R package. In the UKB-GLGC joint analysis, GWAS of ALM was conducted in the whole UKB cohort, while the GWAS summary statistics of lipid traits were derived from the released GLGC results including no UKB participants. In the UKB-overlapping analysis, the GWAS summary statistics of both ALM and lipid traits were derived from the whole UKB cohort. The QC and MR analysis procedure are the same across various analyses. However, in analyses using UKB samples only (UKB-internal and UKB-overlapping), palindromic IVs were not excluded, while in the UKB-GLGC joint analysis, palindromic IVs were excluded to avoid strand orientation error.

Mouse-model experiments

The experimental procedures and treatments conducted in this study were ethically reviewed and approved by the Animal Care Ethical Committee of Soochow University, Suzhou, China, ensuring compliance with animal welfare guidelines and regulations.

Male C57BL/6 mice, aged 8 weeks and weighing 20 ± 5 g, were procured from Shanghai Lingchang Biotechnology Co., LTD. These mice were housed in a controlled environment with a temperature of 23 ± 1 °C and a 12:12-hour light-dark cycle (lights on from 07:00 to 19:00), and they had free access to food and water throughout the experiment. Following a 1-week acclimatization period, the mice were randomly assigned to two groups: the control group (N=12) and the TC group (N=12). Standard feed was obtained from Double Lion Experimental Animal Feed Technology Co., LTD, Suzhou. The control group received a standard diet consisting of 71% normal diet food, 20% protein, 4% fat, and 5% fiber. In contrast, the TC group received a high-cholesterol diet comprising 68.3% normal diet food, 1.3% cholesterol, 18.4% lard, and 12% protein. This dietary intervention continued for a duration of 8 weeks.

Body composition analysis, including measurements of total body mass, fat mass, lean mass, and fluid content, was conducted on live animals without the use of anesthesia. This analysis was performed using small animal MRI equipment (Minispec LF50 body composition analyzer, Bruker, Billerica, MA, USA). To conduct the measurements, each mouse was placed in a specially designed plastic holder tailored for mice, without the need for sedation or anesthesia. Subsequently, the holder containing the mouse was positioned within the measuring space of the MRI system. To ensure the accuracy of the results, measures were taken to prevent the mice from moving within the holder during the scanning process. Each scan lasted approximately 2 minutes, during which the MRI equipment captured detailed data on the body composition of the mice.

Blood biochemistry analysis was conducted using a Hitachi 7100 clinical chemistry analyzer in accordance with the manufacturer’s guidelines. Approximately 500 μL of plasma was collected from each mouse and transferred to a gel tube containing lithium heparin. Subsequently, the plasma samples were centrifuged at 5000 rpm using a refrigerated centrifuge set at 4°C for 15 minutes to obtain 160–200 μL of serum. In cases where the volume of serum obtained was insufficient for analysis, it was diluted with deionized water at a ratio of 1:2 to ensure proper loading for analysis. This process ensured accurate and reliable blood biochemistry measurements for each sample.

For each trait, the difference of measurement between the endpoint and the baseline was analyzed. The comparison was made between the control group and the intervention group using the Wilcoxon rank test in R package.

Competing interests

We declare that none of the authors have competing financial or non-financial interests.

Data availability

Access to UK Biobank data can be obtained by application to UK Biobank (https://www.ukbiobank.ac.uk/).

Code availability

All analyses have been performed using publicly available software or custom codes. PLINK (v1.90b6.5, https://www.cog-genomics.org/plink/) and BOLT-LMM (v2.3.2, https://alkesgroup.broadinstitute.org/BOLT-LMM/) were used to perform association analysis in the simulated data and in the UK Biobank data, respectively. The MR analyses were performed using R (v.4.3.2, https://cran.r-project.org/). The R package TwoSampleMR (v0.5.7, https://github.com/MRCIEU/TwoSampleMR) was used to implement the IVW, MREGGER, W-median, and W-mode methods. MRAID (https://github.com/yuanzhongshang/MRAID), CAUSE (v1.2.0, https://github.com/jean997/cause), MRcML (https://github.com/xue-hr/MRcML/), GRAPPLE (https://github.com/jingshuw/GRAPPLE), MRMix (https://github.com/gqi/MRMix), mr.raps (https://github.com/qingyuanzhao/mr.raps), MR-PRESSO (https://github.com/rondolab/MR-PRESSO) were used to implement respective methods. MendelianRandomization (v0.9.0, https://cran.r-project.org/web/packages/MendelianRandomization/index.html) was used to implement the CMix, MRLASSO and MRROBUST methods [68]. The R package dfoptim (v2023.1.0, https://cran.r-project.org/web/packages/dfoptim/index.html) was used to optimize high-dimensional likelihood function.

Acknowledgments

This research was conducted using the UK Biobank resource under application number 41542. Special thank was given to an anonymous reviewer whose constructive comments advised us to model correlated pleiotropic effects. This study was partially supported by the funding from National Natural Science Foundation of China (32170670 to YFP, 81902181 to JJN), the Project of MOE Key Laboratory of Geriatric Diseases and Immunology (JYN202401 to HGR), and Suzhou Basic Research Program (Medical Application Basic Research) (SKY2023147 to JJN).

References

1.↵
Smith, G.D. and S. Ebrahim, ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol, 2003. 32(1): p. 1–22.
OpenUrl CrossRef PubMed Web of Science Google Scholar
2.↵
Burgess, S., et al., Use of Mendelian randomisation to assess potential benefit of clinical intervention. BMJ, 2012. 345: p. e7325.
OpenUrl FREE Full Text Google Scholar
3.↵
Burgess, S., et al., Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res, 2019. 4: p. 186.
OpenUrl Google Scholar
4.↵
Yengo, L., et al., A saturated map of common genetic variants associated with human height. Nature, 2022. 610(7933): p. 704–712.
OpenUrl Google Scholar
5.↵
Solovieff, N., et al., Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet, 2013. 14(7): p. 483–95.
OpenUrl CrossRef PubMed Google Scholar
6.↵
Ebrahim, S. and G. Davey Smith, Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum Genet, 2008. 123(1): p. 15–33.
OpenUrl CrossRef PubMed Web of Science Google Scholar
7.
Hemani, G., J. Bowden, and G. Davey Smith, Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet, 2018. 27(R2): p. R195–R208.
OpenUrl PubMed Google Scholar
8.↵
Holmes, M.V., M. Ala-Korpela, and G.D. Smith, Mendelian randomization in cardiometabolic disease: challenges in evaluating causality. Nat Rev Cardiol, 2017. 14(10): p. 577–590.
OpenUrl CrossRef PubMed Google Scholar
9.↵
Sanderson, E., et al., Mendelian randomization. Nat Rev Methods Primers, 2022. 2.
Google Scholar
10.↵
Slob, E.A.W. and S. Burgess, A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol, 2020. 44(4): p. 313–329.
OpenUrl PubMed Google Scholar
11.↵
Yuan, Z., et al., Likelihood-based Mendelian randomization analysis with automated instrument selection and horizontal pleiotropic modeling. Sci Adv, 2022. 8(9): p. eabl5744.
OpenUrl Google Scholar
12.↵
Xue, H., X. Shen, and W. Pan, Constrained maximum likelihood-based Mendelian randomization robust to both correlated and uncorrelated pleiotropic effects. Am J Hum Genet, 2021. 108(7): p. 1251–1269.
OpenUrl CrossRef PubMed Google Scholar
13.↵
Wang, J., et al., Causal inference for heritable phenotypic risk factors using heterogeneous genetic instruments. PLoS Genet, 2021. 17(6): p. e1009575.
OpenUrl Google Scholar
14.↵
Zhao, Q., et al., Statistical Inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. The Annals of Statistics, 2020. 48(3): p. 1742–1769.
OpenUrl Google Scholar
15.↵
Morrison, J., et al., Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet, 2020. 52(7): p. 740–747.
OpenUrl PubMed Google Scholar
16.↵
Burgess, S., et al., A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun, 2020. 11(1): p. 376.
OpenUrl PubMed Google Scholar
17.↵
Rees, J.M.B., et al., Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates. PLoS One, 2019. 14(9): p. e0222362.
OpenUrl CrossRef PubMed Google Scholar
18.↵
Qi, G. and N. Chatterjee, Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects. Nat Commun, 2019. 10(1): p. 1941.
OpenUrl PubMed Google Scholar
19.↵
Verbanck, M., et al., Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet, 2018. 50(5): p. 693–698.
OpenUrl CrossRef PubMed Google Scholar
20.↵
Hartwig, F.P., G. Davey Smith, and J. Bowden, Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol, 2017. 46(6): p. 1985–1998.
OpenUrl CrossRef PubMed Google Scholar
21.↵
Bowden, J., et al., Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol, 2016. 40(4): p. 304–14.
OpenUrl CrossRef PubMed Google Scholar
22.↵
Bowden, J., G. Davey Smith, and S. Burgess, Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol, 2015. 44(2): p. 512–25.
OpenUrl CrossRef PubMed Google Scholar
23.↵
Burgess, S., A. Butterworth, and S.G. Thompson, Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol, 2013. 37(7): p. 658–65.
OpenUrl CrossRef PubMed Google Scholar
24.
Jiang, L., et al., Constrained instruments and their application to Mendelian randomization with pleiotropy. Genet Epidemiol, 2019. 43(4): p. 373–401.
OpenUrl Google Scholar
25.
Cho, Y., et al., Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat Commun, 2020. 11(1): p. 1010.
OpenUrl CrossRef Google Scholar
26.
Sanderson, E., et al., An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol, 2019. 48(3): p. 713–727.
OpenUrl CrossRef PubMed Google Scholar
27.
Burgess, S. and S.G. Thompson, Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol, 2015. 181(4): p. 251–60.
OpenUrl CrossRef PubMed Google Scholar
28.
Zhu, Z., et al., Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun, 2018. 9(1): p. 224.
OpenUrl CrossRef PubMed Google Scholar
29.
Bowden, J., et al., Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol, 2019. 48(3): p. 728–742.
OpenUrl CrossRef PubMed Google Scholar
30.
Foley, C.N., et al., MR-Clust: clustering of genetic variants in Mendelian randomization with similar causal estimates. Bioinformatics, 2021. 37(4): p. 531–541.
OpenUrl Google Scholar
31.
Berzuini, C., et al., A Bayesian approach to Mendelian randomization with multiple pleiotropic variants. Biostatistics, 2020. 21(1): p. 86–101.
OpenUrl CrossRef Google Scholar
32.
Xu, S., W.K. Fung, and Z. Liu, MRCIP: a robust Mendelian randomization method accounting for correlated and idiosyncratic pleiotropy. Brief Bioinform, 2021. 22(5).
Google Scholar
33.
Cheng, Q., et al., MR-LDP: a two-sample Mendelian randomization for GWAS summary statistics accounting for linkage disequilibrium and horizontal pleiotropy. NAR Genom Bioinform, 2020. 2(2): p. lqaa028.
OpenUrl Google Scholar
34.
Zhu, X., et al., An iterative approach to detect pleiotropy and perform Mendelian Randomization analysis using GWAS summary statistics. Bioinformatics, 2021. 37(10): p. 1390–1400.
OpenUrl Google Scholar
35.
Grant, A.J. and S. Burgess, An efficient and robust approach to Mendelian randomization with measured pleiotropic effects in a high-dimensional setting. Biostatistics, 2022. 23(2): p. 609–625.
OpenUrl Google Scholar
36.
Howey, R., et al., Bayesian network analysis incorporating genetic anchors complements conventional Mendelian randomization approaches for exploratory analysis of causal relationships in complex data. PLoS Genet, 2020. 16(3): p. e1008198.
OpenUrl CrossRef Google Scholar
37.↵
Iong, D., Q. Zhao, and Y. Chen, A latent mixture model for heterogeneous causal mechanisms in Mendelian randomization. The Annals of Applied Statistics, 2024. 18(2): p. 966–990.
OpenUrl Google Scholar
38.↵
Graham, S.E., et al., The power of genetic diversity in genome-wide association studies of lipids. Nature, 2021. 600(7890): p. 675–679.
OpenUrl Google Scholar
39.↵
Burgess, S. and S.G. Thompson, Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol, 2017. 32(5): p. 377–389.
OpenUrl CrossRef PubMed Google Scholar
40.↵
Zhang, L., et al., A new method for estimating effect size distribution and heritability from genome-wide association summary results. Hum Genet, 2016. 135(2): p. 171–84.
OpenUrl Google Scholar
41.↵
Denault, W.R.P., et al., Cross-fitted instrument: A blueprint for one-sample Mendelian randomization. PLoS Comput Biol, 2022. 18(8): p. e1010268.
OpenUrl Google Scholar
42.↵
Watanabe, K., et al., A global overview of pleiotropy and genetic architecture in complex traits. Nat Genet, 2019. 51(9): p. 1339–1348.
OpenUrl CrossRef PubMed Google Scholar
43.↵
Frontera, W.R. and J. Ochala, Skeletal muscle: a brief review of structure and function. Calcif Tissue Int, 2015. 96(3): p. 183–95.
OpenUrl CrossRef PubMed Google Scholar
44.
Zurlo, F., et al., Skeletal muscle metabolism is a major determinant of resting energy expenditure. J Clin Invest, 1990. 86(5): p. 1423–7.
OpenUrl CrossRef PubMed Web of Science Google Scholar
45.↵
Havel, P.J., Update on adipocyte hormones: regulation of energy balance and carbohydrate/lipid metabolism. Diabetes, 2004. 53 Suppl 1: p. S143–51.
OpenUrl Abstract/FREE Full Text Google Scholar
46.↵
Vella, C.A., et al., Skeletal muscle area and density are associated with lipid and lipoprotein cholesterol levels: The Multi-Ethnic Study of Atherosclerosis. J Clin Lipidol, 2020. 14(1): p. 143–153.
OpenUrl CrossRef Google Scholar
47.↵
Pickrell, J.K., et al., Detection and interpretation of shared genetic influences on 42 human traits. Nat Genet, 2016. 48(7): p. 709–17.
OpenUrl CrossRef PubMed Google Scholar
48.↵
Luke, A. and D.A. Schoeller, Basal metabolic rate, fat-free mass, and body cell mass during energy restriction. Metabolism, 1992. 41(4): p. 450–6.
OpenUrl CrossRef PubMed Web of Science Google Scholar
49.↵
Johnstone, A.M., et al., Factors influencing variation in basal metabolic rate include fat-free mass, fat mass, age, and circulating thyroxine but not sex, circulating leptin, or triiodothyronine. Am J Clin Nutr, 2005. 82(5): p. 941–8.
OpenUrl Abstract/FREE Full Text Google Scholar
50.↵
Merz, K.E. and D.C. Thurmond, Role of Skeletal Muscle in Insulin Resistance and Glucose Uptake. Compr Physiol, 2020. 10(3): p. 785–809.
OpenUrl Google Scholar
51.↵
Hirano, T., Pathophysiology of Diabetic Dyslipidemia. J Atheroscler Thromb, 2018. 25(9): p. 771–782.
OpenUrl CrossRef PubMed Google Scholar
52.↵
Welc, S.S. and T.L. Clanton, The regulation of interleukin-6 implicates skeletal muscle as an integrative stress sensor and endocrine organ. Exp Physiol, 2013. 98(2): p. 359–71.
OpenUrl CrossRef PubMed Web of Science Google Scholar
53.↵
Brenmoehl, J., et al., Irisin is elevated in skeletal muscle and serum of mice immediately after acute exercise. Int J Biol Sci, 2014. 10(3): p. 338–49.
OpenUrl CrossRef PubMed Google Scholar
54.↵
Nakagomi, A., et al., Relationships between the serum cholesterol levels, production of monocyte proinflammatory cytokines and long-term prognosis in patients with chronic heart failure. Intern Med, 2014. 53(21): p. 2415–24.
OpenUrl PubMed Google Scholar
55.↵
Hardardottir, I., C. Grunfeld, and K.R. Feingold, Effects of endotoxin and cytokines on lipid metabolism. Curr Opin Lipidol, 1994. 5(3): p. 207–15.
OpenUrl CrossRef PubMed Google Scholar
56.↵
Burgess, S., N.M. Davies, and S.G. Thompson, Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol, 2016. 40(7): p. 597–608.
OpenUrl CrossRef PubMed Google Scholar
57.↵
Jiang, T., et al., An empirical investigation into the impact of winner’s curse on estimates from Mendelian randomization. Int J Epidemiol, 2023. 52(4): p. 1209–1219.
OpenUrl Google Scholar
58.↵
Winkler, T.W., et al., Quality control and conduct of genome-wide association meta-analyses. Nat Protoc, 2014. 9(5): p. 1192–212.
OpenUrl CrossRef PubMed Google Scholar
59.↵
Purcell, S., et al., PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet, 2007. 81(3): p. 559–75.
OpenUrl CrossRef PubMed Google Scholar
60.↵
Cruz-Jentoft, A.J., et al., Sarcopenia: revised European consensus on definition and diagnosis. Age Ageing, 2019. 48(1): p. 16–31.
OpenUrl CrossRef PubMed Google Scholar
61.↵
Giles, J.T., et al., Association of body composition with disability in rheumatoid arthritis: impact of appendicular fat and lean tissue mass. Arthritis Rheum, 2008. 59(10): p. 1407–15.
OpenUrl CrossRef PubMed Web of Science Google Scholar
62.↵
Janssen, I., S.B. Heymsfield, and R. Ross, Low relative skeletal muscle mass (sarcopenia) in older persons is associated with functional impairment and physical disability. J Am Geriatr Soc, 2002. 50(5): p. 889–96.
OpenUrl CrossRef PubMed Web of Science Google Scholar
63.↵
Arner, P., et al., Dynamics of human adipose lipid turnover in health and metabolic disease. Nature, 2011. 478(7367): p. 110–3.
OpenUrl CrossRef PubMed Web of Science Google Scholar
64.↵
Arner, P., et al., Adipose lipid turnover and long-term changes in body weight. Nat Med, 2019. 25(9): p. 1385–1389.
OpenUrl CrossRef Google Scholar
65.↵
Jiang, Y., et al., The association of lipid metabolism and sarcopenia among older patients: a cross-sectional study. Sci Rep, 2023. 13(1): p. 17538.
OpenUrl Google Scholar
66.↵
Kim, G. and J.H. Kim, Impact of Skeletal Muscle Mass on Metabolic Health. Endocrinol Metab (Seoul), 2020. 35(1): p. 1–6.
OpenUrl Google Scholar
67.↵
Loh, P.R., et al., Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet, 2015. 47(3): p. 284–90.
OpenUrl CrossRef PubMed Google Scholar
68.↵
Yavorska, O.O. and S. Burgess, MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol, 2017. 46(6): p. 1734–1739.
OpenUrl CrossRef PubMed Google Scholar

Posted April 18, 2024.

Download PDF

Author Declarations

Supplementary Material

Data/Code

Citation Tools

Get QR code

Tweet Widget

Subject Area

Epidemiology

Reviews and Context

Comment

TRIP Peer Reviews

Community Reviews

Automated Services

Blogs/Media

Author Videos

Subject Areas

All Articles

Addiction Medicine (411)
Allergy and Immunology (725)
Anesthesia (214)
Cardiovascular Medicine (3093)
Dentistry and Oral Medicine (349)
Dermatology (261)
Emergency Medicine (463)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (1097)
Epidemiology (13026)
Forensic Medicine (13)
Gastroenterology (861)
Genetic and Genomic Medicine (4846)
Geriatric Medicine (447)
Health Economics (750)
Health Informatics (3058)
Health Policy (1104)
Health Systems and Quality Improvement (1128)
Hematology (410)
HIV/AIDS (958)
Infectious Diseases (except HIV/AIDS) (14334)
Intensive Care and Critical Care Medicine (882)
Medical Education (453)
Medical Ethics (120)
Nephrology (500)
Neurology (4610)
Nursing (244)
Nutrition (687)
Obstetrics and Gynecology (844)
Occupational and Environmental Health (764)
Oncology (2389)
Ophthalmology (675)
Orthopedics (268)
Otolaryngology (333)
Pain Medicine (301)
Palliative Medicine (87)
Pathology (515)
Pediatrics (1239)
Pharmacology and Therapeutics (519)
Primary Care Research (522)
Psychiatry and Clinical Psychology (3962)
Public and Global Health (7191)
Radiology and Imaging (1603)
Rehabilitation Medicine and Physical Therapy (954)
Respiratory Medicine (943)
Rheumatology (459)
Sexual and Reproductive Health (477)
Sports Medicine (403)
Surgery (512)
Toxicology (65)
Transplantation (221)
Urology (189)

Comments

medRxiv aims to provide a venue for anyone to comment on a medRxiv preprint. Comments are moderated for offensive or irrelevant content (this can take ~24 h). Please avoid duplicate submissions and read our Comment Policy before commenting. The content of a comment is not endorsed by medRxiv.

medRxiv aims to inform readers about online discussion of this preprint occurring elsewhere. The content at the links below is not endorsed by either medRxiv or the preprint's authors.

Community reviews for this article:

There are no community reviews for this paper.

Automated Evaluations

Certain services provide automated analysis of preprints. Analyses invited by the authors are displayed at the top of this tab. Those done independently of authors are shown underneath . None of these analyses is endorsed by medRxiv.

Automated Evaluations:

There are no automated evaluations for this paper.

[1] 1.↵
Smith, G.D. and S. Ebrahim, ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol, 2003. 32(1): p. 1–22.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[2] 2.↵
Burgess, S., et al., Use of Mendelian randomisation to assess potential benefit of clinical intervention. BMJ, 2012. 345: p. e7325.
OpenUrl FREE Full Text Google Scholar

[3] 3.↵
Burgess, S., et al., Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res, 2019. 4: p. 186.
OpenUrl Google Scholar

[4] 4.↵
Yengo, L., et al., A saturated map of common genetic variants associated with human height. Nature, 2022. 610(7933): p. 704–712.
OpenUrl Google Scholar

[5] 5.↵
Solovieff, N., et al., Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet, 2013. 14(7): p. 483–95.
OpenUrl CrossRef PubMed Google Scholar

[6] 6.↵
Ebrahim, S. and G. Davey Smith, Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum Genet, 2008. 123(1): p. 15–33.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[7] 7.
Hemani, G., J. Bowden, and G. Davey Smith, Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet, 2018. 27(R2): p. R195–R208.
OpenUrl PubMed Google Scholar

[8] 8.↵
Holmes, M.V., M. Ala-Korpela, and G.D. Smith, Mendelian randomization in cardiometabolic disease: challenges in evaluating causality. Nat Rev Cardiol, 2017. 14(10): p. 577–590.
OpenUrl CrossRef PubMed Google Scholar

[9] 9.↵
Sanderson, E., et al., Mendelian randomization. Nat Rev Methods Primers, 2022. 2.
Google Scholar

[10] 10.↵
Slob, E.A.W. and S. Burgess, A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol, 2020. 44(4): p. 313–329.
OpenUrl PubMed Google Scholar

[11] 11.↵
Yuan, Z., et al., Likelihood-based Mendelian randomization analysis with automated instrument selection and horizontal pleiotropic modeling. Sci Adv, 2022. 8(9): p. eabl5744.
OpenUrl Google Scholar

[12] 12.↵
Xue, H., X. Shen, and W. Pan, Constrained maximum likelihood-based Mendelian randomization robust to both correlated and uncorrelated pleiotropic effects. Am J Hum Genet, 2021. 108(7): p. 1251–1269.
OpenUrl CrossRef PubMed Google Scholar

[13] 13.↵
Wang, J., et al., Causal inference for heritable phenotypic risk factors using heterogeneous genetic instruments. PLoS Genet, 2021. 17(6): p. e1009575.
OpenUrl Google Scholar

[14] 14.↵
Zhao, Q., et al., Statistical Inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. The Annals of Statistics, 2020. 48(3): p. 1742–1769.
OpenUrl Google Scholar

[15] 15.↵
Morrison, J., et al., Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet, 2020. 52(7): p. 740–747.
OpenUrl PubMed Google Scholar

[16] 16.↵
Burgess, S., et al., A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun, 2020. 11(1): p. 376.
OpenUrl PubMed Google Scholar

[17] 17.↵
Rees, J.M.B., et al., Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates. PLoS One, 2019. 14(9): p. e0222362.
OpenUrl CrossRef PubMed Google Scholar

[18] 18.↵
Qi, G. and N. Chatterjee, Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects. Nat Commun, 2019. 10(1): p. 1941.
OpenUrl PubMed Google Scholar

[19] 19.↵
Verbanck, M., et al., Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet, 2018. 50(5): p. 693–698.
OpenUrl CrossRef PubMed Google Scholar

[20] 20.↵
Hartwig, F.P., G. Davey Smith, and J. Bowden, Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol, 2017. 46(6): p. 1985–1998.
OpenUrl CrossRef PubMed Google Scholar

[21] 21.↵
Bowden, J., et al., Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol, 2016. 40(4): p. 304–14.
OpenUrl CrossRef PubMed Google Scholar

[22] 22.↵
Bowden, J., G. Davey Smith, and S. Burgess, Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol, 2015. 44(2): p. 512–25.
OpenUrl CrossRef PubMed Google Scholar

[23] 23.↵
Burgess, S., A. Butterworth, and S.G. Thompson, Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol, 2013. 37(7): p. 658–65.
OpenUrl CrossRef PubMed Google Scholar

[24] 24.
Jiang, L., et al., Constrained instruments and their application to Mendelian randomization with pleiotropy. Genet Epidemiol, 2019. 43(4): p. 373–401.
OpenUrl Google Scholar

[25] 25.
Cho, Y., et al., Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat Commun, 2020. 11(1): p. 1010.
OpenUrl CrossRef Google Scholar

[26] 26.
Sanderson, E., et al., An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol, 2019. 48(3): p. 713–727.
OpenUrl CrossRef PubMed Google Scholar

[27] 27.
Burgess, S. and S.G. Thompson, Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol, 2015. 181(4): p. 251–60.
OpenUrl CrossRef PubMed Google Scholar

[28] 28.
Zhu, Z., et al., Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun, 2018. 9(1): p. 224.
OpenUrl CrossRef PubMed Google Scholar

[29] 29.
Bowden, J., et al., Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol, 2019. 48(3): p. 728–742.
OpenUrl CrossRef PubMed Google Scholar

[30] 30.
Foley, C.N., et al., MR-Clust: clustering of genetic variants in Mendelian randomization with similar causal estimates. Bioinformatics, 2021. 37(4): p. 531–541.
OpenUrl Google Scholar

[31] 31.
Berzuini, C., et al., A Bayesian approach to Mendelian randomization with multiple pleiotropic variants. Biostatistics, 2020. 21(1): p. 86–101.
OpenUrl CrossRef Google Scholar

[32] 32.
Xu, S., W.K. Fung, and Z. Liu, MRCIP: a robust Mendelian randomization method accounting for correlated and idiosyncratic pleiotropy. Brief Bioinform, 2021. 22(5).
Google Scholar

[33] 33.
Cheng, Q., et al., MR-LDP: a two-sample Mendelian randomization for GWAS summary statistics accounting for linkage disequilibrium and horizontal pleiotropy. NAR Genom Bioinform, 2020. 2(2): p. lqaa028.
OpenUrl Google Scholar

[34] 34.
Zhu, X., et al., An iterative approach to detect pleiotropy and perform Mendelian Randomization analysis using GWAS summary statistics. Bioinformatics, 2021. 37(10): p. 1390–1400.
OpenUrl Google Scholar

[35] 35.
Grant, A.J. and S. Burgess, An efficient and robust approach to Mendelian randomization with measured pleiotropic effects in a high-dimensional setting. Biostatistics, 2022. 23(2): p. 609–625.
OpenUrl Google Scholar

[36] 36.
Howey, R., et al., Bayesian network analysis incorporating genetic anchors complements conventional Mendelian randomization approaches for exploratory analysis of causal relationships in complex data. PLoS Genet, 2020. 16(3): p. e1008198.
OpenUrl CrossRef Google Scholar

[37] 37.↵
Iong, D., Q. Zhao, and Y. Chen, A latent mixture model for heterogeneous causal mechanisms in Mendelian randomization. The Annals of Applied Statistics, 2024. 18(2): p. 966–990.
OpenUrl Google Scholar

[38] 38.↵
Graham, S.E., et al., The power of genetic diversity in genome-wide association studies of lipids. Nature, 2021. 600(7890): p. 675–679.
OpenUrl Google Scholar

[39] 39.↵
Burgess, S. and S.G. Thompson, Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol, 2017. 32(5): p. 377–389.
OpenUrl CrossRef PubMed Google Scholar

[40] 40.↵
Zhang, L., et al., A new method for estimating effect size distribution and heritability from genome-wide association summary results. Hum Genet, 2016. 135(2): p. 171–84.
OpenUrl Google Scholar

[41] 41.↵
Denault, W.R.P., et al., Cross-fitted instrument: A blueprint for one-sample Mendelian randomization. PLoS Comput Biol, 2022. 18(8): p. e1010268.
OpenUrl Google Scholar

[42] 42.↵
Watanabe, K., et al., A global overview of pleiotropy and genetic architecture in complex traits. Nat Genet, 2019. 51(9): p. 1339–1348.
OpenUrl CrossRef PubMed Google Scholar

[43] 43.↵
Frontera, W.R. and J. Ochala, Skeletal muscle: a brief review of structure and function. Calcif Tissue Int, 2015. 96(3): p. 183–95.
OpenUrl CrossRef PubMed Google Scholar

[44] 44.
Zurlo, F., et al., Skeletal muscle metabolism is a major determinant of resting energy expenditure. J Clin Invest, 1990. 86(5): p. 1423–7.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[45] 45.↵
Havel, P.J., Update on adipocyte hormones: regulation of energy balance and carbohydrate/lipid metabolism. Diabetes, 2004. 53 Suppl 1: p. S143–51.
OpenUrl Abstract/FREE Full Text Google Scholar

[46] 46.↵
Vella, C.A., et al., Skeletal muscle area and density are associated with lipid and lipoprotein cholesterol levels: The Multi-Ethnic Study of Atherosclerosis. J Clin Lipidol, 2020. 14(1): p. 143–153.
OpenUrl CrossRef Google Scholar

[47] 47.↵
Pickrell, J.K., et al., Detection and interpretation of shared genetic influences on 42 human traits. Nat Genet, 2016. 48(7): p. 709–17.
OpenUrl CrossRef PubMed Google Scholar

[48] 48.↵
Luke, A. and D.A. Schoeller, Basal metabolic rate, fat-free mass, and body cell mass during energy restriction. Metabolism, 1992. 41(4): p. 450–6.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[49] 49.↵
Johnstone, A.M., et al., Factors influencing variation in basal metabolic rate include fat-free mass, fat mass, age, and circulating thyroxine but not sex, circulating leptin, or triiodothyronine. Am J Clin Nutr, 2005. 82(5): p. 941–8.
OpenUrl Abstract/FREE Full Text Google Scholar

[50] 50.↵
Merz, K.E. and D.C. Thurmond, Role of Skeletal Muscle in Insulin Resistance and Glucose Uptake. Compr Physiol, 2020. 10(3): p. 785–809.
OpenUrl Google Scholar

[51] 51.↵
Hirano, T., Pathophysiology of Diabetic Dyslipidemia. J Atheroscler Thromb, 2018. 25(9): p. 771–782.
OpenUrl CrossRef PubMed Google Scholar

[52] 52.↵
Welc, S.S. and T.L. Clanton, The regulation of interleukin-6 implicates skeletal muscle as an integrative stress sensor and endocrine organ. Exp Physiol, 2013. 98(2): p. 359–71.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[53] 53.↵
Brenmoehl, J., et al., Irisin is elevated in skeletal muscle and serum of mice immediately after acute exercise. Int J Biol Sci, 2014. 10(3): p. 338–49.
OpenUrl CrossRef PubMed Google Scholar

[54] 54.↵
Nakagomi, A., et al., Relationships between the serum cholesterol levels, production of monocyte proinflammatory cytokines and long-term prognosis in patients with chronic heart failure. Intern Med, 2014. 53(21): p. 2415–24.
OpenUrl PubMed Google Scholar

[55] 55.↵
Hardardottir, I., C. Grunfeld, and K.R. Feingold, Effects of endotoxin and cytokines on lipid metabolism. Curr Opin Lipidol, 1994. 5(3): p. 207–15.
OpenUrl CrossRef PubMed Google Scholar

[56] 56.↵
Burgess, S., N.M. Davies, and S.G. Thompson, Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol, 2016. 40(7): p. 597–608.
OpenUrl CrossRef PubMed Google Scholar

[57] 57.↵
Jiang, T., et al., An empirical investigation into the impact of winner’s curse on estimates from Mendelian randomization. Int J Epidemiol, 2023. 52(4): p. 1209–1219.
OpenUrl Google Scholar

[58] 58.↵
Winkler, T.W., et al., Quality control and conduct of genome-wide association meta-analyses. Nat Protoc, 2014. 9(5): p. 1192–212.
OpenUrl CrossRef PubMed Google Scholar

[59] 59.↵
Purcell, S., et al., PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet, 2007. 81(3): p. 559–75.
OpenUrl CrossRef PubMed Google Scholar

[60] 60.↵
Cruz-Jentoft, A.J., et al., Sarcopenia: revised European consensus on definition and diagnosis. Age Ageing, 2019. 48(1): p. 16–31.
OpenUrl CrossRef PubMed Google Scholar

[61] 61.↵
Giles, J.T., et al., Association of body composition with disability in rheumatoid arthritis: impact of appendicular fat and lean tissue mass. Arthritis Rheum, 2008. 59(10): p. 1407–15.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[62] 62.↵
Janssen, I., S.B. Heymsfield, and R. Ross, Low relative skeletal muscle mass (sarcopenia) in older persons is associated with functional impairment and physical disability. J Am Geriatr Soc, 2002. 50(5): p. 889–96.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[63] 63.↵
Arner, P., et al., Dynamics of human adipose lipid turnover in health and metabolic disease. Nature, 2011. 478(7367): p. 110–3.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[64] 64.↵
Arner, P., et al., Adipose lipid turnover and long-term changes in body weight. Nat Med, 2019. 25(9): p. 1385–1389.
OpenUrl CrossRef Google Scholar

[65] 65.↵
Jiang, Y., et al., The association of lipid metabolism and sarcopenia among older patients: a cross-sectional study. Sci Rep, 2023. 13(1): p. 17538.
OpenUrl Google Scholar

[66] 66.↵
Kim, G. and J.H. Kim, Impact of Skeletal Muscle Mass on Metabolic Health. Endocrinol Metab (Seoul), 2020. 35(1): p. 1–6.
OpenUrl Google Scholar

[67] 67.↵
Loh, P.R., et al., Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet, 2015. 47(3): p. 284–90.
OpenUrl CrossRef PubMed Google Scholar

[68] 68.↵
Yavorska, O.O. and S. Burgess, MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol, 2017. 46(6): p. 1734–1739.
OpenUrl CrossRef PubMed Google Scholar

A new robust and accurate two-sample Mendelian randomization method with a large number of genetic variants

Abstract

Introduction

Results

Outline of the proposed method

The distribution of

Detecting pleiotropic effects

Causal effect: type-I error rate

Causal effect: power and effect size

Running time

Real data analysis

Experimental validation

Discussion

Online methods

Basic model

Estimating r using z-scores

Modeling uncorrelated and correlated horizontal pleiotropy effects

Likelihood function under horizontal pleiotropy

Test of causal effect

Test of pleiotropic effects

Unstandardized phenotypes

Simulation studies

Comparison with existing methods

Real data application

Mouse-model experiments

Data Availability

Competing interests

Data availability

Code availability

Acknowledgments

References

Subject Area

Citation Manager Formats

A new robust and accurate two-sample Mendelian randomization method with a large number of genetic variants

Abstract

Introduction

Results

Outline of the proposed method

The distribution of <img class="highwire-embed" alt="Embedded Image" src="https://www.medrxiv.org/sites/default/files/highwire/medrxiv/early/2024/04/18/2024.04.16.24305933/embed/inline-graphic-15.gif"/>

Detecting pleiotropic effects

Causal effect: type-I error rate

Causal effect: power and effect size

Running time

Real data analysis

Experimental validation

Discussion

Online methods

Basic model

Estimating r using z-scores

Modeling uncorrelated and correlated horizontal pleiotropy effects

Likelihood function under horizontal pleiotropy

Test of causal effect

Test of pleiotropic effects

Unstandardized phenotypes

Simulation studies

Comparison with existing methods

Real data application

Mouse-model experiments

Data Availability

Competing interests

Data availability

Code availability

Acknowledgments

References

Subject Area

Follow this preprint

The distribution of