Inferring genetic variant causal network by leveraging pleiotropy
=================================================================

* Martin Tournaire
* Asma Nouira
* Yves Rozenholc
* Marie Verbanck

## Abstract

Genetic variants have been robustly associated with multiple traits through genome-wide association studies (GWAS) over the past two decades. However, pinpointing the true causal genetic variant and its biological mechanism is still a considerable challenge. Recently, much concern has been raised about the weak overlap of expression quantitative trait loci and DNA methylation with GWAS variants, even though these very same molecular phenotypes have been routinely used to interpret GWAS variants. Therefore, we propose to take the opposite approach to conventional methods and to infer variant causal networks by leveraging pleiotropy. We introduce PRISM (Pleiotropic Relationships to Infer the SNP Model), which aims to distinguish between true direct effects and pleiotropic effects in order to infer a causal network for each genetic variant. The fundamental principle of PRISM is to reassess GWAS associations to test for the consistency of a given variant-trait effect in the pleiotropic context of the other traits. PRISM clusters significant genetic variant effects into 3 categories: trait-mediated, confounder-mediated, and direct effects. By cross-referencing the information on all traits, a causal network is built for each genetic variant. On simulations, PRISM was able to recover direct effects with high precision in complex networks of traits. Then, we applied PRISM to a set of 61 heritable traits and diseases, using GWAS summary statistics from the UK Biobank. Interestingly, direct effects represent less than 13% of total significant effects, while vertical and confounding effects represent 43% and 44% respectively. Direct variants were largely enriched in per-variant heritability compared to GWAS-significant variants and pleiotropic variants.
Pathways from direct variants showed higher enrichment than pathways from GWAS variants. PRISM was able to pinpoint direct variants mapped to more trait-specific genes than GWAS, and the PRISM gene-trait network appeared disentangled and more relevant compared to the GWAS gene-trait network. Finally, we could show the concordance of the causal networks inferred by PRISM with networks reported in the literature for a panel of validated variants.

## Introduction

Over the past 20 years, genome-wide association studies (GWASs) have established a myriad of associations between genetic variants and human traits1,2. GWAS is a method that measures and statistically tests the association between several million genetic variants and one trait of interest. According to the GWAS catalog inventory3, 6,868 publications and 619,964 unique genetic variant-trait associations have been reported as of May 2024. However, GWAS suffers from many limitations and biases. First, pleiotropy, *i.e.* a single genetic element affecting more than one trait, is pervasive in the human genome4–6. Because of pleiotropy, genetic variants are often associated with multiple traits in GWAS7. Second, linkage disequilibrium (LD), referring to the non-random association of alleles at different loci on a chromosome, makes candidate causal variants within a genomic locus indistinguishable from one another8. Third, the biological mechanisms underlying the associations are rarely resolved: a systematic review reported only 309 experimentally validated non-coding GWAS variants9. Therefore, precisely linking true causal genetic variants to the complex traits they affect proves to be a tremendous challenge10, hence the need for computational approaches complementary to GWASs.

The most prominent method to pinpoint causal variants is fine-mapping, which aims at distinguishing between causal genetic variants and non-causal variants, using LD reference panels and genomic annotations10.
Simply put, the objective is to attribute the real effect to a minimal subset of top variants in LD within a locus (ideally a single variant), and to disqualify the other variants as LD effects. However, fine-mapping heavily relies on annotations to discriminate between causal and LD effects11–13. Yet much concern has recently been raised about the fact that genetic variants from GWASs associated with complex diseases overlap very little with annotations like molecular quantitative trait loci (QTL), particularly expression QTL (eQTL)14,15. GWAS and cis-eQTL variants even seem systematically and structurally different16. Likewise, GWAS and DNA methylation do not find the same causal genes17.

Here, we take the opposite view to traditional fine-mapping methods, with the objective of severing our approach from annotations. Instead, the idea is to take advantage of the omnipresence of pleiotropy to disentangle the variant-trait associations obtained from GWAS. Apart from LD, we think that these associations can be explained by 3 distinct underlying biological mechanisms: 1) direct effect, when the association is caused by a true causal direct effect from variant to trait, 2) vertical pleiotropy, when the association can be explained by a direct effect of the variant on another trait that is causally related to the trait of interest, and 3) confounding pleiotropy, when it can be explained by a confounding factor between traits.

Therefore, to disentangle the variant-trait associations, we propose to leverage pleiotropic relationships between traits by rerouting an integrative Mendelian randomization (MR) method. MR is used to infer the causality of an exposure trait X on an outcome trait Y. MR uses genetic variants that are robustly associated with the exposure of interest as instrumental variables, and tests whether the effects of the variants on the exposure result in proportional effects on the outcome.
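
As a purely illustrative aside, the proportionality idea behind MR can be sketched with the standard inverse-variance-weighted (IVW) estimator, which combines per-variant effect sizes into one estimate of the causal effect of X on Y. This is a generic textbook estimator and not the integrative model used by PRISM; the function name `ivw_estimate` and all toy effect sizes below are hypothetical.

```python
from typing import Sequence


def ivw_estimate(
    beta_x: Sequence[float],
    beta_y: Sequence[float],
    se_y: Sequence[float],
) -> float:
    """Inverse-variance-weighted MR estimate of the causal effect of X on Y.

    beta_x: variant effects on the exposure X
    beta_y: variant effects on the outcome Y
    se_y:   standard errors of the variant effects on Y
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of beta_y on beta_x through the origin,
    # with weights 1 / se_y^2.
    num = sum(bx * by / se**2 for bx, by, se in zip(beta_x, beta_y, se_y))
    den = sum(bx**2 / se**2 for bx, se in zip(beta_x, se_y))
    return num / den


# Hypothetical toy data: three valid instruments whose effects on Y are
# exactly proportional to their effects on X (true causal effect = 0.5).
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.5 * bx for bx in beta_x]
se_y = [0.01, 0.02, 0.015]
estimate = ivw_estimate(beta_x, beta_y, se_y)

# Adding one variant that also affects Y outside of X (pleiotropy)
# pulls the estimate away from the true value.
biased = ivw_estimate(beta_x + [0.10], beta_y + [0.30], se_y + [0.01])
print(estimate, biased)
```

With valid instruments the estimate recovers the proportionality constant, whereas a single pleiotropic variant is enough to bias it, which is the sensitivity to pleiotropy that motivates integrative MR methods.
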
Classical MR relies on three assumptions: 1) the genetic variants must be strongly associated with the exposure X, 2) the genetic variants must not directly affect the outcome Y, and 3) the genetic variants must not affect the confounder U of the exposure-outcome relationship. It has been shown that traditional MR massively suffers from pleiotropy, which violates the assumptions and biases the results17. This inspired multiple integrative MR methods (LHC-MR18, CAUSE19, MR-CUE20) that take pleiotropy into account and infer relationships between traits. However, MR only uses genetic variants as instrumental variables, and no conclusions are drawn on individual variant-trait associations. We choose to reroute MR to focus on the relationships between variants and traits.

Here, we propose PRISM, which stands for Pleiotropic Relationships to Infer the SNP Model, a genome-wide method to disentangle variant-trait effects from GWAS into vertical, confounding, or direct effects. To do so, PRISM re-examines variant-trait effects from GWAS through the prism of other traits. Concretely, PRISM runs a pairwise MR model that integrates confounding, across all studied traits. Then, PRISM predicts significant variant-trait effects that are consistent regardless of the trait context. Finally, PRISM reconstructs the network of variants and traits, and assigns a label to all significant variant-trait effects. To assess the performance of PRISM, we simulated GWAS summary statistics for a complex network of traits. We processed 61 heritable traits from UK Biobank through PRISM and disentangled the effects of ∼4 million variants on these traits.

## Results

### PRISM disentangles the associations of genetic variants from GWASs by leveraging pleiotropy

For each genetic variant, PRISM takes GWAS summary statistics for multiple traits as input and infers a causal network derived from the predicted direct and pleiotropic effects of that variant (Fig. 1). Each significant variant-trait effect is labeled.
We hypothesize that the observed associations in GWAS can be attributed to 3 distinct underlying biological mechanisms: 1) confounding pleiotropy occurs when the association is explained by a confounding factor between traits, 2) vertical pleiotropy arises when the variant affects another trait that is itself causally related to the trait of interest, and 3) a direct effect occurs when the association is due to a true causal direct effect from the variant on the trait. Therefore, variant-trait associations in GWAS are induced by true causal direct effects and their pleiotropic ripple effects. To comprehensively disentangle these associations, PRISM analyzes variant-trait effects across multiple contexts. Specifically, PRISM evaluates the effect of all genetic variants on a trait X relative to all other traits. It systematically assesses each variant effect on X relative to trait A, then to trait B, and so forth. By aggregating information across these various contexts, PRISM identifies and predicts significant variant-trait effects, subsequently labeling them based on their nature. Thus, the “confounding pleiotropy” label indicates that the variant was identified as having an effect on trait X only through a confounder shared with another trait. In contrast, the “vertical pleiotropy” label denotes that the variant effect on trait X is only mediated through another trait causally related to X. Conversely, a “direct effect” means that the variant is not flagged for pleiotropy, indicating that its effect on X is not mediated by any other factor. Once this procedure is completed for all traits, the obtained direct and pleiotropic labels are used to construct a comprehensive causal network for each variant and the traits it impacts.

![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F1)

Figure 1.
Overview of PRISM | **A)** 3 distinct mechanisms underlying GWAS associations: 1) direct effect, 2) vertical pleiotropy, i.e. effect mediated by a causal relationship between two traits, 3) confounding pleiotropy, i.e. effect mediated by a confounding factor. **B)** PRISM aims at disentangling GWAS associations into direct effects, hypothesized to convey true causal effects, and pleiotropic effects (vertical and confounding). **C)** Each variant-trait association is evaluated in a model of trait X in different contexts, relative to trait A, then to trait B, and so on. Each model takes into account a unique latent confounder U. **D)** The main idea behind PRISM is to assess the effects of all studied genetic variants on a trait X, through the prism of multiple observations of these effects in multiple contexts. Then, by cross-referencing the information and combining all traits, a full causal network can be built for each variant.

### PRISM accurately detects direct effects in simulations

We constructed a complex pleiotropic network consisting of 15 simulated traits and 100,000 variants to generate GWAS summary statistics (see Online Methods and Fig. 8). This network was simulated under multiple scenarios, encompassing a wide array of parameters (Supplementary Table 1). We modulated the polygenicity and the heritability of traits, the strength of causal effects between traits, and the strength of the confounders across scenarios. Then, we processed the simulated data through PRISM, and compared the true variant-trait effects with those predicted by PRISM. We calculated the precision and recall to assess the performance of predicting each type of pleiotropy. These metrics were computed as follows: $\mathrm{precision} = \frac{TP}{TP + FP}$ and $\mathrm{recall} = \frac{TP}{TP + FN}$, where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives. As shown in Fig. 2 and Supplementary Fig. 
1, we found that PRISM achieved very high precision and recall in scenarios characterized by highly heritable traits with low polygenicity and low pervasive confounding effects, regardless of causal relationships between traits. We prioritized precision over recall, even in scenarios featuring high polygenicity and high targeted confounding effects.

![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F2)

Figure 2. Precision of PRISM predictions for significant variant-trait effects, on simulations. The y-axis represents precision in different conditions. The x-axis represents the set of 15 simulated traits (see Online Methods and Fig. 8). Significant effects are defined with $P < 5 \times 10^{-9}$, which is slightly less strict than the PRISM recommended threshold. Bars are colored according to predicted direct and pleiotropic labels. A bar below zero means that PRISM predicted 0 effect in this category. Eight scenarios are represented across facets, with varying parameters. Polygenicity represents the proportion of variants with a direct effect on each trait. Effect on B4 represents the proportion of effect passed to B4, for all traits with a non-zero vertical effect on B4. High targeted confounding means that few variants (0.01) have an effect on the confounder U, but with a magnitude of effect rivaling direct effects. Low pervasive confounding means that a large proportion (5) of variants have an effect on the confounder, but with low magnitude.

### PRISM complements traditional fine-mapping methods

PRISM adopts a fundamentally different approach compared to traditional fine-mapping. Instead of distinguishing between variants in LD, PRISM predicts whether variant-trait effects are direct or caused by pleiotropy.
We evaluated the compatibility of PRISM with standard fine-mapping methods, such as SuSiE21,22, CARMA12, and FINEMAP23. To compare PRISM and these fine-mapping methods, we focused on a specific trait, B4, which is the most central trait in our simulated network. We selected a highly heritable and polygenic scenario (scenario 25, Supplementary Table 1) and focused on true causal variants. First, we assessed PRISM’s precision in identifying true causal variants as direct effects, and compared this to the precision of the fine-mapping methods in prioritizing true causal variants within the fine-mapped set. In this scenario, PRISM achieved a precision of 95%, outperforming the fine-mapping methods, which had precision rates ranging from 86% to 89%. Furthermore, both PRISM and traditional fine-mapping methods were mainly misled by high targeted confounding variant-trait effects. However, fine-mapping methods are designed to identify a limited number of variants within a locus, whereas PRISM operates on a genome-wide scale. Hence, PRISM exhibited a recall of 51%, significantly higher than the recall of less than 1% observed for the fine-mapping methods. These findings indicate that for highly polygenic traits, PRISM is more efficient at identifying causal variants across the genome than fine-mapping methods, which focus on specific loci.

### PRISM reassesses GWAS variant-trait associations and distinguishes between direct and pleiotropic effects

PRISM tests and labels variants for direct and pleiotropic (*i.e.* vertical and confounding) effects. Contrary to GWAS, an effect is considered PRISM significant if it remains consistent across multiple assessments in different contexts. Furthermore, comparing p-values from GWAS and PRISM provides insight into the specificity of PRISM, as PRISM significant variants are generally GWAS or sub-GWAS significant. We processed 61 heritable traits with GWAS summary statistics from UK Biobank using PRISM (Supplementary Table 2). In Fig. 
3, p-values from both methods are shown for all 947 significant variants for coronary heart disease (CHD). Interestingly, most variants were labeled with vertical or confounding pleiotropy, generally related to lipids, indicating that the majority of variant effects on CHD are mediated by lipids. Moreover, the most significant variants according to GWAS were labeled with confounding effects. Notably, the few variants labeled with a direct effect map to the same non-coding gene, CDKN2B-AS1. A recent study24 highlights its potential role in CHD by acting as an RNA sponge, which could explain why PRISM did not trace the effect to any other trait in the framework. Therefore, PRISM tests and assesses genetic variant effects, providing further insights into the underlying biological mechanisms leading to observed associations in GWAS.

![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F3)

Figure 3. P-values from the UK Biobank GWAS on coronary heart disease (CHD) (x-axis) and PRISM (y-axis) for 947 genetic variants. Each dot represents a significant variant according to GWAS or PRISM, colored according to PRISM predicted labels: direct (blue), vertical (green) and confounding (red).

### PRISM reveals that most observed associations in GWAS summary statistics are caused by relationships between traits

Upon examining all variant-trait effects, we observed that direct effects represent less than 13% of all PRISM significant effects, while confounding and vertical pleiotropy represent 44% and 43% respectively. Since PRISM leverages pleiotropy, we aimed to characterize the pleiotropic nature of the associations discovered in GWAS. Indeed, GWASs often reveal numerous observed pleiotropic associations, where a genetic variant demonstrates significant associations with at least two traits.
Specifically, we identified 170,433 variants exhibiting observed pleiotropic effects. Therefore, using PRISM, we assessed the number of variants with pleiotropic effects, *i.e.* variants having an effect on two traits, among GWAS variant-trait associations. Naturally, confounding and vertical effects directly provide us with pleiotropic effects. For the direct effects, however, we defined horizontal pleiotropy (also called true pleiotropy or uncorrelated pleiotropy) as a genetic variant having a direct effect on two traits. Using PRISM on the 61 highly heritable traits, we found that confounding and vertical pleiotropy are responsible for respectively 66.8% and 33% of the observed pleiotropy in GWAS (Supplementary Fig. 2). Thus, horizontal pleiotropy, *i.e.* true multiple direct causal effects from a genetic variant, is found to be extremely rare and represents 0.2% of the observed pleiotropy. This finding underscores the complexity of genetic effects on multiple traits and highlights the effectiveness of PRISM in elucidating the underlying mechanisms of pleiotropic associations in GWAS findings.

### PRISM direct variants are significantly enriched in per-variant heritability compared to GWAS variants and vertical/confounding variants

In GWAS, variant heritability measures the proportion of phenotypic variance of a trait explained by all measured variant associations. Thus, the per-variant heritability is the contribution of a specific genetic variant to the overall heritability. We applied stratified LD-score regression25 (s-LDSC) to calculate per-variant heritability enrichment across labeled variant-trait effects grouped into four categories: three categories (direct, vertical, confounding) for PRISM significant variant effects, and one category for GWAS significant associations under the same threshold as PRISM. It is worth mentioning that these categories are not mutually exclusive, since the majority of PRISM significant variant effects are also GWAS-significant.
Only 45 of the 61 traits had enough variants in all four categories to be processed by s-LDSC. Across most traits, we observe a consistent order in the enrichment (Fig. 4). Indeed, direct variants demonstrate higher enrichment in per-variant heritability compared to GWAS variants, which in turn exhibit higher enrichment than vertical variants, followed by confounding variants. This supports the hypothesis that indirect effects on traits are diluted compared to direct effects.

![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F4.medium.gif)

[Figure 4.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F4)

Figure 4. Enrichment in per-variant heritability for each category, calculated with LD-score regression. Enrichment was calculated for PRISM significant variants, specific to each trait, and for genome-wide significant variants in UK Biobank.

### PRISM mapped eGenes are found in relevant tissues

Standard follow-up analysis in GWAS involves checking whether the variants identified are eQTLs in tissues linked to the studied trait. We would expect eQTLs corresponding to direct variants to be present in tissues directly linked to the studied trait, and eQTLs corresponding to pleiotropic variants to be linked to tissues indirectly influencing this trait. For CHD, we investigated the enrichment of relevant tissues in the expression of genes associated with eQTL variants identified by PRISM. We retrieved eQTLs and their corresponding eGenes through GTEx (Fig. 5). We did not find any eQTL variant labeled as direct. However, we observed 3 enrichment peaks above 2, each providing valuable insight. The liver tissue is enriched in vertical pleiotropy, suggesting a lipid-mediated effect from the liver. The brain anterior cortex is enriched in confounding pleiotropy, which hints at CHD comorbidity with obesity, linked to brain functions.
Similarly, lymphocyte cells are enriched in confounding pleiotropy, hinting at a potential association between CHD and obesity within the inflammatory system. These examples show that PRISM categories correspond to more or less direct mechanisms influencing the studied traits.

![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F5.medium.gif)

[Figure 5.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F5)

Figure 5. Enrichment of eGenes, from different tissues, mapped to genetic variants, according to the label of variants. The x-axis represents all tissues. The y-axis represents the enrichment of eGenes. The trait studied is I25 of UK Biobank, chronic ischaemic heart disease (CHD). The eGenes were retrieved from GTEx.

### PRISM produces coherent results on a panel of gold-standard variants

The ultimate aim of PRISM is to produce causal networks for genetic variants. To confirm the reliability of PRISM, we compared the networks it generates with evidence from the scientific literature obtained with methods completely distinct from GWAS. For some networks of interest, we therefore examined whether the variant-trait effects identified and labeled by PRISM had already been reported. For example, variants rs7528419/rs629301/rs646776, residing in the SORT1 locus coding for the sortilin protein, are strongly associated with coronary heart disease (CHD) in GWAS26. According to PRISM (Fig. 6), these variants display a vertical effect on CHD but a direct effect on apolipoprotein B (apoB). Essentially, PRISM suggests that these variants affect CHD only through their direct effect on apoB, which has a causal effect on CHD. Recent studies have indeed demonstrated that sortilin restricts secretion of apoB27 and that apoB is an excellent marker of cardiovascular risk28. Another example of a validated network is rs2282679/rs2298850, mapped to gene GC, which is validated for vitamin D levels.
Kew et al.29 also highlight that this gene is involved in white blood cell and neutrophil accumulation. Remarkably, these findings align with the 3 direct effects identified by PRISM coming from these variants (Supplementary Fig. 3).

![Figure 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F6.medium.gif)

[Figure 6.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F6)

Figure 6. PRISM causal network of variants rs7528419/rs629301/rs646776. These variants are represented as a black triangle. Arrows represent causal effects. Red arrows are effects of variants through a confounder, represented as a red square, meaning confounding pleiotropy. Green arrows are effects of variants through traits, represented as circles colored by general category, meaning vertical pleiotropy. Blue arrows are direct causal effects from variants to traits. Variants rs7528419/rs629301/rs646776 are involved in multiple mechanisms, but the one we are most interested in is the apoB-mediated effect on CHD.

### PRISM pinpoints direct variants that are mapped to more trait-specific genes than GWAS

Traditional pipelines for GWAS analysis typically involve examining genes mapped to significant variants, and their associated pathways. In our study, we performed an annotation analysis to map functionally annotated variants to genes based on their physical positions in the genome, and employed expression quantitative trait locus (eQTL) mapping. For this purpose, we used the FUMA platform30, a tool for conducting comprehensive functional annotation analyses. Supplementary Fig. 4 illustrates the strong connectivity of traits through GWAS-mapped genes. Upon examining genes mapped to PRISM direct variants, we observed far fewer common genes among traits.
This observation suggests that genes mapped to direct variants may be more relevant, resulting in a simpler and more biologically realistic variant-gene-trait network obtained thanks to PRISM. To further investigate the biological significance of traits, we conducted an enrichment analysis based on DisGeNET31 pathways using genes mapped to PRISM direct and GWAS variants. The goal was to identify enriched pathways and compare their significance between PRISM direct and GWAS mapped pathways. As shown in Supplementary Fig. 5, PRISM direct pathways are significantly more enriched than GWAS pathways for 10 traits, while the opposite is true for only 4 traits. Next, we employed bipartite network analysis to compare the centrality measures of gene-trait networks constructed from PRISM direct and GWAS variants. In Fig. 7, the PRISM network demonstrated lower degree and closeness, indicating fewer links between traits, but higher betweenness, suggesting the presence of more central traits. To ensure a fair comparison despite there being fewer PRISM direct genes than GWAS genes, we used two distinct approaches. First, we randomly removed genes from the GWAS network while retaining the edges linked to the selected genes, which resulted in centrality measurements similar to those of the full GWAS network, as seen in the black points on the graph. Second, we removed both genes and edges to create GWAS sub-networks with the same number of nodes and edges as the PRISM direct network, represented by the red points forming a plateau. Remarkably, the differences observed above persisted even with similarly-sized networks. This suggests that the network created from PRISM direct variants is denoised and untangled, discarding redundant and biased relations induced by pleiotropy.

![Figure 7.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F7.medium.gif)

[Figure 7.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F7)

Figure 7. 
Centrality measures of PRISM direct bipartite gene-trait network and GWAS bipartite gene-trait network. The three sub-plots show respectively the degree, the betweenness, and the closeness metrics, represented on the y-axis. The x-axis represents the traits, colored by category, as the metrics are specific to a trait in the network. Green dots correspond to the PRISM direct network. Grey dots correspond to the GWAS network. Blue dots correspond to the average of multiple networks with randomly removed genes, to have the same number of genes as PRISM. Red dots correspond to the average of multiple networks with randomly removed genes and edges, to have the same number of genes and edges as PRISM.

![Figure 8.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/06/03/2024.06.01.24308193/F8.medium.gif)

[Figure 8.](http://medrxiv.org/content/early/2024/06/03/2024.06.01.24308193/F8)

Figure 8. Architecture of the network of simulated traits. The simulated traits are represented as circles. Four horizontal variants are represented as triangles. Green arrows depict vertical relationships between traits. Green and red arrows depict vertical relationships between traits that can also be seen as confounding pleiotropy. Network pleiotropy is not shown here, but all pairs of traits are affected by a confounder U with an effect on both of them. The only exception is E0, which has no relationship with any other trait.

## Discussion

We developed PRISM (Pleiotropic Relationships to Infer the SNP Model), which aims at reassessing the associations of genetic variants with traits from GWAS to distinguish between direct effects and pleiotropic effects. *In fine*, PRISM infers a causal network for each genetic variant. PRISM takes the opposite view to current methods such as fine-mapping, and does not resort to any annotation but instead leverages pleiotropy, which is pervasive in the human genome.
We assessed the performance of PRISM on a simulated complex pleiotropic trait network and compared it with several fine-mapping methods. We found that PRISM has high precision to predict direct and pleiotropic variant-trait effects, but limited power to detect significant variant-trait effects for highly polygenic traits. We applied PRISM to a set of 61 heritable traits and diseases from UK Biobank. PRISM was able to build a causal network for 371,551 unique variants. PRISM showed that most GWAS associations were caused by pleiotropy: less than 13% of variant-trait effects were direct. Direct variants predicted by PRISM were significantly enriched in per-variant heritability compared to GWAS-significant variants and pleiotropic variants. Comparing pathways from PRISM direct variants and GWAS variants, the enrichment was stronger for PRISM. PRISM was able to pinpoint direct variants mapped to more trait-specific genes than GWAS, and the PRISM gene-trait network appeared disentangled and more pertinent compared to the GWAS gene-trait network. PRISM inferred relevant variant causal networks; for a panel of validated variants, we could show the concordance of the causal networks inferred by PRISM with those validated in the literature.

However, the proposed method has a number of limitations. To begin with, PRISM is highly dependent on the traits processed. Pleiotropy is relative to the traits in the network, and so are the pleiotropic labels. Depending on the shape of the network, some effects can be justifiably classified as vertical or confounding pleiotropy. Adding or removing traits, especially central traits, can have an impact on whether or not the variants are significant, and on their type of pleiotropy. On simulations, we applied PRISM to the same complex network of traits and purposefully omitted a central trait from input data.
We saw that vertical and confounding variants induced by this central trait were predicted as direct for the other traits, because PRISM was unable to establish the mediation link with the omitted central trait.

In addition, PRISM predictions can be limited by three factors. First, PRISM heavily relies on LHC-MR calculated parameters to assess relationships between traits and to infer direct and pleiotropic labels. Second, PRISM is geared towards precision in identifying direct variant-trait effects, so power to detect significant effects, as well as confounding and vertical precision, are more limited. This is a deliberate choice since we aim to favor the identification of direct effects. Third, the precision and recall were calculated from summary statistics simulated from a pleiotropic network of heritable traits. Similarly, on real data, we only included heritable traits. Traits that are not heritable might disrupt the inferred causal network. Interpreting results that include low heritability traits, which will be mechanically disadvantaged in comparison with highly heritable traits, must be done with caution. That said, we did simulate networks without any pleiotropy, and PRISM was able to identify direct effects correctly. Traits with very simple genetic architecture (*e.g.* monogenic diseases) can probably be processed without any issue.

PRISM takes an orthogonal approach to traditional fine-mapping methods, and aims not to distinguish between variants in linkage disequilibrium (LD), but to assess whether the observed effect of a variant on a trait can be explained by pleiotropy, *i.e.* by another trait or a confounding factor. We chose not to clump genetic variants when presenting the biological results, because we do not want a true causal variant effect to be eliminated in favor of an LD effect. We suggest using a fine-mapping method, like SuSiE, to decide between multiple candidate causal variants in LD.
However, LD is taken into account when calculating the probabilities that genetic variants have causal effects. Variants with large LD scores are penalized, as they are expected to mechanically have larger effect sizes32.

The comparison between PRISM and fine-mapping methods is debatable, as they differ in aim and approach. We wanted to make this comparison for two reasons. First, we wanted to show that PRISM has comparable precision in identifying causal variant-trait effects. Second, we wanted to highlight the compatibility of these orthogonal approaches. Using them together, to find the true causal variant in a locus and to obtain information about its pleiotropy, should provide highly pertinent knowledge for future investigation of this putative causal variant.

The mathematical model of PRISM implies that direct effects result in stronger effect sizes in GWAS summary statistics. It is therefore legitimate to ask whether direct variants are simply the most significant variants with the strongest effect sizes, while confounding and vertical variants are weaker significant variants. Although direct variants tend to have higher initial Z-scores from GWAS, the original Z-scores of the different pleiotropy categories are much more intertwined (Supplementary Fig. 6). For simple molecular traits like lipids or biomarkers, direct variants do tend to have higher Z-scores. However, complex traits like respiratory or metabolic traits do not follow this trend. In summary, undiluted direct effects should have stronger effect sizes than pleiotropic effects, but we believe that PRISM is capable of distinguishing between subtle biological mechanisms.

To conclude, at the trait level, PRISM can be used to better understand the genetic architecture of a trait and the relationships between traits. At the variant level, PRISM can help to understand the specific genetic effect of a variant on multiple traits.
We strongly believe that building the causal network of genetic variants is a priority to guide in vitro and in vivo follow-up studies, as well as medical applications such as genomic medicine or genome editing.

## Methods

### General principle of the Pleiotropic Relationships to Infer the SNP Model (PRISM) method

The aim of PRISM is to infer the causal network of genetic variants, using GWAS summary statistics. To do so, PRISM reassesses variant-trait effects from GWASs and distinguishes between direct effects and pleiotropic effects. PRISM is based on an integrative pairwise Mendelian randomization (MR) model, extended to obtain pleiotropic information at the level of genetic variants.

Many genetic variants are significantly associated with multiple traits in GWAS. PRISM is designed to check these associations and distinguish between the direct effects and the pleiotropic effects that lead to these observed associations. To do so, PRISM analyzes the pairwise causal relationship between traits using LHC-MR. Then, PRISM cross-references the information obtained for each trait individually, *i.e.* for a given trait, across all the pairs of traits that include the trait of interest, to describe the relationships between all the variants present and the trait.

From a computational point of view, PRISM is divided into two pipelines, a pairwise pipeline and a traitwise pipeline, as described in Supplementary Fig. 7:

* The pairwise pipeline processes traits two by two. Each unique pair of traits is first handled by LHC-MR to produce a set of trait-level *θ* parameters describing the relationships between the two traits as well as a latent confounding factor. Then, the components of the Gaussian mixture model, parameterized by the LHC-MR estimates, are used to compute posterior probabilities describing the impact of the variants on these traits.
* The traitwise pipeline processes traits one by one.
For one trait, the probabilities calculated previously are gathered and tested against each other, variant by variant. For significant variants, direct and pleiotropic labels stem from the probabilities and parameters obtained earlier.

PRISM derives its robustness from the high number of observations of the effect of variant *k* on trait *A* (*i.e.* in all pairs containing *A*). As such, a high number *T* of traits with available GWAS summary statistics must be selected. Each trait is processed *T* − 1 times, paired with all the other traits. This represents ![Graphic][3] pairs of traits. To fulfill the assumptions of the paired Student's t-test used by PRISM, we recommend including at least 31 traits, so that each test relies on at least 30 paired observations.

### PRISM pairwise pipeline

For *m* genetic variants and for each pair of traits *X* and *Y*, the pairwise pipeline extracts a matrix **S** using their summary statistics.

![Formula][4]

Each row of the matrix corresponds to a variant. For variant *k*, ![Graphic][5] is a score that corresponds to having no effect on anything in the *X*-*Y* network, and ![Graphic][6] are the scores that correspond to having an effect on *X*, *Y*, and the latent confounder *U*, respectively. This matrix is computed in the following steps:

* The pair of traits *X* and *Y* is processed by LHC-MR, using their GWAS summary statistics. A standardized effect is computed from the t-statistic and the sample size for each variant *k*, ![Graphic][7], with ![Graphic][8] and ![Graphic][9]. Only genetic variants with computable p-values and present in the HapMap3 framework are taken into account, as this is also a limitation of LHC-MR. A set of trait-level *θ* parameters is obtained.
* From these parameters stem the bivariate Gaussian mixture model.
The eight components are:

  * (0) No association
  * (1) Associated with X
  * (2) Associated with U
  * (3) Associated with Y
  * (4) Associated with X and U
  * (5) Associated with X and Y
  * (6) Associated with U and Y
  * (7) Associated with X, Y and U

  For a genetic variant *k* in group *i*, we consider ![Graphic][10], whose variance-covariance matrix ![Graphic][11] depends on the model parameters *θ*. The probability that any variant *k* comes from the Gaussian component *i* is (from Bayes' theorem):

  ![Formula][12]

  with ![Graphic][13] the variance-covariance matrix of the Gaussian distribution *i*, *ϕ* the joint probability density function of the bivariate normal distribution, and ![Graphic][14] the proportion of variants in each class a priori, derived from the *θ* parameters. These probabilities are stored and will be used to decide between direct and pleiotropic effects.
* Then we translate the probabilities that variant *k* comes from the Gaussian component *i* into scores for having no effect (denoted *O*), or an effect on *X*, *Y* or *U*:

  ![Formula][15]

  ![Graphic][16] are *m* vectors of dimension 4 that contain these scores, one for each variant *k*.

  ![Formula][17]

  **S** is a matrix of dimensions *m* × 4 stemming from the concatenation of all ![Graphic][18] vectors.

### PRISM traitwise pipeline

We detailed the process to obtain **S** for two complex traits *X* and *Y*, using their GWAS summary statistics. The idea of PRISM is to apply this process to many traits, paired with each other exhaustively. Therefore, ![Graphic][19] different **S** matrices were obtained from the pairwise pipeline. Then, for each variant *k*, we extract the *Ŝ* scores of each trait. For example, for trait *A*, we extract all ![Graphic][20] from all **S** matrices containing *A*. So we get ![Graphic][21] and ![Graphic][22]. These vectors ![Graphic][23] and ![Graphic][24] are collections of *T* − 1 observations of the effect of *k* on trait *A* and of having no effect, respectively.
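
To make the Bayes step of the pairwise pipeline concrete, the following Python sketch computes, for one variant, the posterior probability of each of the eight components from its pair of standardized effects on *X* and *Y*. It is only an illustration under simplifying assumptions: the function names, the prior proportions in `PRIORS`, and the `toy_covariance` construction (additive variances for *X*, *Y* and *U* plus unit noise) are hypothetical and do not reproduce the LHC-MR parameterization of the variance-covariance matrices.

```python
import math

# Component index -> nodes the variant acts on (same order as components 0 to 7).
COMPONENTS = ["", "X", "U", "Y", "XU", "XY", "UY", "XYU"]


def toy_covariance(members, var_x=4.0, var_y=4.0, var_u=4.0, noise=1.0):
    """Hypothetical 2x2 covariance of the (X, Y) effects for one component.

    Assumption for illustration only: an effect on X (resp. Y) adds variance to
    the X (resp. Y) effect, and an effect on the confounder U adds variance to
    both effects and a covariance between them.
    """
    shared = var_u if "U" in members else 0.0
    vx = noise + (var_x if "X" in members else 0.0) + shared
    vy = noise + (var_y if "Y" in members else 0.0) + shared
    return ((vx, shared), (shared, vy))


def bivariate_normal_pdf(x, y, cov):
    """Density at (x, y) of a zero-mean bivariate normal with covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form v^T cov^{-1} v, using the closed-form inverse of a 2x2 matrix.
    quad = (d * x * x - (b + c) * x * y + a * y * y) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def component_posteriors(effect_x, effect_y, priors, covariances):
    """Bayes' theorem: prior times density, normalized over all components."""
    weighted = [
        prior * bivariate_normal_pdf(effect_x, effect_y, cov)
        for prior, cov in zip(priors, covariances)
    ]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all component densities underflowed to zero")
    return [w / total for w in weighted]


# Hypothetical a priori proportions of variants in each component (sum to 1).
PRIORS = [0.90, 0.03, 0.02, 0.03, 0.005, 0.005, 0.005, 0.005]
COVARIANCES = [toy_covariance(members) for members in COMPONENTS]

null_like = component_posteriors(0.0, 0.0, PRIORS, COVARIANCES)
x_like = component_posteriors(6.0, 0.1, PRIORS, COVARIANCES)
print("most probable component, null-like variant:",
      max(range(8), key=null_like.__getitem__))
print("most probable component, X-like variant:",
      max(range(8), key=x_like.__getitem__))
```

With these toy values, a variant with no visible effect on either trait receives its largest posterior on component 0, whereas a variant with a large effect on *X* only receives its largest posterior on component 1. In PRISM itself, the covariance matrices and prior proportions come from the *θ* parameters estimated by LHC-MR rather than from arbitrary constants.
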
### Statistical test of the variant-trait effect consistency

Next, we compare these values using a paired Student's t-test. We define ![Graphic][25] as the average of all ![Graphic][26] and ![Graphic][27] as the average of all ![Graphic][28]. The hypotheses of the test are ![Graphic][29] and ![Graphic][30]. Applying this test to every variant *k*, we obtain one p-value per variant. A variant *k* is significant if p-value ![Graphic][31]. This threshold corresponds to a Bonferroni correction.

Then, a significant variant is flagged with vertical pleiotropy on trait *A* if any trait *B* is causal to trait *A*, and ![Graphic][32] in the (*A*, *B*) causal inference model. This means that the probability that the genetic variant *k* has an effect only on trait *B* is higher than all other possibilities. A significant variant is flagged with confounding pleiotropy on trait *A* if, in at least one causal inference model involving ![Graphic][33]. This means that the probability that the genetic variant *k* has an effect via a confounder is higher than all other possibilities. The remaining significant variants are considered as having a direct effect on *A*. A significant variant having a direct effect on more than one trait is considered horizontally pleiotropic.

### Inference and deconvolution of the variant causal network

For each variant, we derive the causal model as a graph with the variant and the traits as nodes, and the pleiotropic relations inferred by PRISM (*i.e.* direct, vertical, or confounding effects) as edges. We then deconvolute the obtained causal graph by removing vertical edges between the variant and a given trait when conditioning on all the traits involved in other vertical edges. As a concrete example, let us consider three traits A, B, and C.
The three following vertical effects are reported: 1) a vertical effect of the variant on trait B through trait A, 2) a vertical effect of the variant on trait C through trait B, and 3) a vertical effect of the variant on trait C through trait A. We deconvolute the causal graph by considering that the variant has an effect on trait A, which has an effect on trait B, which in turn has an effect on trait C. In other words, we remove the arrows from the variant to traits B and C, replacing them with arrows from trait A to trait B and from trait B to trait C, respectively. In addition, when a variant has a vertical effect on different traits mediated by the same causal trait, we average the effect of the variant on the causal vertical trait and remove all duplicated edges.

### Simulation framework

We started by creating a realistic network of 15 traits (Fig. 8). We created 32 different scenarios, with various parameters for heritability, polygenicity, and causal relationships. We chose to simulate 100,000 variants, a number high enough to approach a genome-wide model while remaining computationally tractable given the number of traits and scenarios. The standardized effects of all these variants on all the traits were simulated, taking into account the relationships between the traits. First, for each trait *X*, we randomly selected genetic variants to have a true direct effect on trait *X* and on the confounders between trait *X* and all the other traits *Y*. For the selected genetic variants, the true direct effects were drawn from a Gaussian distribution whose parameters depend on the trait and the scenario. Then, the true effects were relayed to the other traits through vertical and confounding pleiotropy. Additionally, the effects were propagated according to the LD structure of each variant. The LD structure that we used is derived from 1000 Genomes33 LD data from chromosome 1.
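
The first steps of this procedure (selection of direct variants, Gaussian direct effects, relay through a causal link) can be illustrated for a single pair of traits with the short Python sketch below. It is a deliberately reduced toy model and not the simulation code used in this study: the function name `simulate_pair`, all parameter names and default values are hypothetical, and confounders and LD propagation are left out.

```python
import random


def simulate_pair(n_variants=2000, prop_x=0.05, prop_y=0.05, sd_x=0.1,
                  sd_y=0.1, causal_x_to_y=0.5, noise_sd=0.01, seed=0):
    """Simulate variant effects on traits X and Y with a causal link X -> Y.

    Hypothetical toy model: no confounder and no LD between variants.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        # Randomly select the variants that carry a true direct effect,
        # and draw that effect from a zero-mean Gaussian distribution.
        direct_x = rng.gauss(0.0, sd_x) if rng.random() < prop_x else 0.0
        direct_y = rng.gauss(0.0, sd_y) if rng.random() < prop_y else 0.0
        # Relay the effect on X to Y through the causal link (vertical pleiotropy).
        true_x = direct_x
        true_y = direct_y + causal_x_to_y * direct_x
        if direct_y != 0.0:
            label_y = "direct"
        elif true_y != 0.0:
            label_y = "vertical"
        else:
            label_y = "none"
        variants.append({
            "true_x": true_x,
            "true_y": true_y,
            "label_y": label_y,
            # Observed effects are the true effects plus an error term.
            "observed_x": true_x + rng.gauss(0.0, noise_sd),
            "observed_y": true_y + rng.gauss(0.0, noise_sd),
        })
    return variants


simulated = simulate_pair()
counts = {}
for variant in simulated:
    counts[variant["label_y"]] = counts.get(variant["label_y"], 0) + 1
print(counts)
```

The `label_y` field plays the role of the true label against which predicted labels can be compared. Setting `causal_x_to_y` to 0 removes every vertical effect on *Y*, which mimics a pair of traits without any causal relationship.
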
Results from simulated data are always clumped, in order to enable interpretation and comparison between the true labels used in data generation and the labels calculated by PRISM. We chose to simulate data with small independent LD blocks. Simulating data without any linkage disequilibrium would be a pitfall, as LD is pervasive and makes any genetic result challenging to interpret. Large LD blocks comprising hundreds of variants would be more realistic, but computationally unaffordable for extensive simulations. Finally, an error term is added. The LD score of a variant was calculated as 1 plus the sum of its LD with all other variants (1 + ∑LD). This gives us, for all genetic variants, standardized effects on all traits and LD scores to be processed by PRISM.

In scenario 26, for trait B4, we computed precision and recall for CARMA, FINEMAP, and SuSiE. For computational reasons, Z-scores and the LD matrix were split into 10 mutually independent segments of approximately 10,000 variants each. No annotation was supplied to any method.

### Collection of genome-wide association summary statistics (GWAS)

We retrieved publicly available GWAS summary statistics data for 61 heritable traits and diseases from UK Biobank round 2 (Supplementary Table 2). Only HapMap3 variants were selected for the analysis, as LHC-MR is limited to these variants. Low-confidence variants and variants with a minor allele frequency below 0.05 were excluded. All analyses and results use genome build hg19.

### Availability of the method and results

PRISM is implemented in R and a user-friendly tutorial can be found on GitHub. As long as GWAS summary statistics are available and the studied variants are mapped in HapMap3, PRISM can be applied to any network of traits of interest to distinguish direct variant-trait effects from vertical and network variant-trait effects. PRISM results are easily accessible through an online user-friendly interface.
We developed a Shiny interface, freely available online, to display PRISM results on our network of 61 highly heritable traits. Results can be visualized at the trait level. It is also possible to display the causal network of a genetic variant of interest.

## Supporting information

Supplementary material [supplements/308193_file10.pdf]

## Data Availability

PRISM is implemented in R, with a user-friendly tutorial on GitHub: [https://github.com/martintnr/PRISM](https://github.com/martintnr/PRISM)

PRISM results on the network of 61 highly heritable traits can be browsed through the online Shiny interface: [https://verbam01.shinyapps.io/PRISM/](https://verbam01.shinyapps.io/PRISM/)

* Received June 1, 2024.
* Revision received June 1, 2024.
* Accepted June 3, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory. The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. Loos, R. J. F. 15 years of genome-wide association studies and no signs of slowing down. Nat Commun 11, 5900 (2020).
2. Uffelmann, E. et al. Genome-wide association studies. Nat Rev Methods Primers 1, 1–21 (2021).
3. MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 45, D896–D901 (2017).
4. Chesmore, K., Bartlett, J. & Williams, S. M. The ubiquity of pleiotropy in human disease. Hum Genet 137, 39–44 (2018).
5. Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M. & Smoller, J. W. Pleiotropy in complex traits: Challenges and strategies. Nat Rev Genet 14, 483–495 (2013).
6. Gratten, J. & Visscher, P. M. Genetic pleiotropy in complex traits and diseases: Implications for genomic medicine. Genome Med 8, 78 (2016).
7. Jordan, D. M., Verbanck, M. & Do, R. HOPS: A quantitative score reveals pervasive horizontal pleiotropy in human genetic variation is driven by extreme polygenicity of human traits and diseases. Genome Biology 20, 222 (2019).
8. Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying Causal Variants at Loci with Multiple Signals of Association. Genetics 198, 497–508 (2014).
9. Alsheikh, A. J. et al. The landscape of GWAS validation; systematic review identifying 309 validated non-coding variants across 130 human diseases. BMC Medical Genomics 15, 74 (2022).
10. Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet 19, 491–504 (2018).
11. Fisher, V., Sebastiani, P., Cupples, L. A. & Liu, C.-T. ANNORE: Genetic fine-mapping with functional annotation. Human Molecular Genetics 31, 32–40 (2022).
12. Yang, Z. et al. CARMA is a new Bayesian model for fine-mapping in genome-wide association meta-analyses. Nat Genet 55, 1057–1065 (2023).
13. Kichaev, G. et al. Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies. PLOS Genetics 10, e1004722 (2014).
14. Umans, B. D., Battle, A. & Gilad, Y. Where Are the Disease-Associated eQTLs? Trends in Genetics 37, 109–124 (2021).
15. Connally, N. J. et al. The missing link between genetic association and regulatory function. eLife 11, e74970 (2022).
16. Mostafavi, H., Spence, J. P., Naqvi, S. & Pritchard, J. K. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat Genet 55, 1866–1875 (2023).
17. Battram, T., Gaunt, T. R., Relton, C. L., Timpson, N. J. & Hemani, G. A comparison of the genes and genesets identified by GWAS and EWAS of fifteen complex traits. Nat Commun 13, 7816 (2022).
18. Darrous, L., Mounier, N. & Kutalik, Z. Simultaneous estimation of bi-directional causal effects and heritable confounding from GWAS summary statistics. medRxiv 2020.01.27.20018929 (2020) doi:10.1101/2020.01.27.20018929.
19. Morrison, J., Knoblauch, N., Marcus, J. H., Stephens, M. & He, X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nature Genetics 1–8 (2020) doi:10.1038/s41588-020-0631-4.
20. Cheng, Q., Zhang, X., Chen, L. S. & Liu, J. Mendelian randomization accounting for complex correlated horizontal pleiotropy while elucidating shared genetic etiology. Nat Commun 13, 6490 (2022).
21. Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A Simple New Approach to Variable Selection in Regression, with Application to Genetic Fine Mapping. Journal of the Royal Statistical Society Series B: Statistical Methodology 82, 1273–1300 (2020).
22. Zou, Y., Carbonetto, P., Wang, G. & Stephens, M. Fine-mapping from summary data with the 'Sum of Single Effects' model. PLOS Genetics 18, e1010299 (2022).
23. Benner, C. et al. FINEMAP: Efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
24. Xie, F., Wang, D. & Cheng, M. CDKN2B-AS1 may act as miR-92a-3p sponge in coronary artery disease. Minerva Cardiol Angiol 72, (2024).
25. ReproGen Consortium et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat Genet 47, 1228–1235 (2015).
26. Hartmann, K., Seweryn, M. & Sadee, W. Interpreting coronary artery disease GWAS results: A functional genomics approach assessing biological significance. PLOS ONE 17, e0244904 (2022).
27. Conlon, D. M. et al. Sortilin restricts secretion of apolipoprotein B-100 by hepatocytes under stressed but not basal conditions. J Clin Invest 132, (2022).
28. Sniderman, A. D. et al. Apolipoprotein B Particles and Cardiovascular Disease: A Narrative Review. JAMA Cardiology 4, 1287–1295 (2019).
29. Kew, R. R. The Vitamin D Binding Protein and Inflammatory Injury: A Mediator or Sentinel of Tissue Damage? Front Endocrinol (Lausanne) 10, 470 (2019).
30. Watanabe, K., Taskesen, E., Van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun 8, 1826 (2017).
31. Piñero, J. et al. DisGeNET: A comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res 45, D833–D839 (2017).
32. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics 47, 291 (2015).
33. Devuyst, O. The 1000 Genomes Project: Welcome to a New World. Perit Dial Int 35, 676–677 (2015).

[1]: /embed/inline-graphic-1.gif
[2]: /embed/inline-graphic-2.gif
[3]: /embed/inline-graphic-3.gif
[4]: /embed/graphic-9.gif
[5]: /embed/inline-graphic-4.gif
[6]: /embed/inline-graphic-5.gif
[7]: /embed/inline-graphic-6.gif
[8]: /embed/inline-graphic-7.gif
[9]: /embed/inline-graphic-8.gif
[10]: /embed/inline-graphic-9.gif
[11]: /embed/inline-graphic-10.gif
[12]: /embed/graphic-10.gif
[13]: /embed/inline-graphic-11.gif
[14]: /embed/inline-graphic-12.gif
[15]: /embed/graphic-11.gif
[16]: /embed/inline-graphic-13.gif
[17]: /embed/graphic-12.gif
[18]: /embed/inline-graphic-14.gif
[19]: /embed/inline-graphic-15.gif
[20]: /embed/inline-graphic-16.gif
[21]: /embed/inline-graphic-17.gif
[22]: /embed/inline-graphic-18.gif
[23]: /embed/inline-graphic-19.gif
[24]: /embed/inline-graphic-20.gif
[25]: /embed/inline-graphic-21.gif
[26]: /embed/inline-graphic-22.gif
[27]: /embed/inline-graphic-23.gif
[28]: /embed/inline-graphic-24.gif
[29]: /embed/inline-graphic-25.gif
[30]: /embed/inline-graphic-26.gif
[31]: /embed/inline-graphic-27.gif
[32]: /embed/inline-graphic-28.gif
[33]: /embed/inline-graphic-29.gif