Abstract
Venous thromboembolism (VTE) occurs in up to one third of patients with COVID-19. VTE and COVID-19 may share a common genetic architecture, which has not been comprehensively investigated. In this study, we leveraged summary-level data from the latest release of the COVID-19 host genetics consortium and UK Biobank to study the genetic commonality between COVID-19 traits and VTE. We found a positive genetic correlation between COVID-19 hospitalization and VTE (rg = 0.2320, P-value = 0.0092). The cross-trait analysis identified shared genetic loci between VTE and COVID-19 traits, including 8 for severe COVID-19, 11 for COVID-19 hospitalization, and 7 for SARS-CoV-2 infection. We identified seven novel mapped genes (LINC00970, TSPAN15, ADAMTS13, F5, DNAJB4, SLC39A8 and OBSCN) that were enriched for expression in the lung tissue and in coagulation- and immune-related pathways. Eight genetic loci were found to share causal variants between COVID-19 and VTE; these are localized in the ABO, ADAMTS13 and FUT2 gene regions. Bi-directional Mendelian randomization analysis did not suggest a causal relationship between VTE and COVID-19 traits. Our study advances the understanding of the shared genetic etiology of COVID-19 and VTE at the molecular and functional levels.
Introduction
Coronavirus disease 2019 (COVID-19) has led to a worldwide pandemic, with venous thromboembolism (VTE) as a serious comorbidity. Thrombotic events occur in up to one-third of patients with COVID-19 and are associated with a higher risk of mortality 1. More pronounced thrombotic symptoms seem to be associated with a severe disease course, increasing the risk of death by up to 18-fold 2. The cytokine storm and excessive inflammation caused by SARS-CoV-2 infection are hypothesized to lead to systemic coagulation dysfunction 3-6. Accordingly, many institutional evidence-based guidelines support the use of anticoagulants for thromboprophylaxis in COVID-19 patients 7.
Although observational studies have reported a high incidence of VTE in COVID-19 patients 3,8-11, the underlying mechanisms remain unclear, and meta-analyses comparing the overall risk of VTE in COVID-19 and non-COVID-19 cohorts have given conflicting results 12,13.
Large-scale genome-wide association studies (GWASs) have identified multiple human genomic regions that may influence the risk of COVID-19 outcomes (severe COVID-19, COVID-19 hospitalization, and SARS-CoV-2 infection) 14-18, including some genes associated with coagulation and thrombosis, such as the ABO and ACE2 genes. However, to the best of our knowledge, no large-scale genome-wide studies have systematically reported shared genetic architecture between COVID-19 and VTE.
To fill this research gap, we performed a large-scale cross-trait analysis for VTE and three COVID-19 related traits to estimate their genetic correlations and shared genetic components using publicly available summary level data from GWASs of VTE and COVID-19 in the populations of multiple ancestries, and to provide novel insights into the underlying shared genetic etiology between VTE and COVID-19. We further applied two-sample bi-directional Mendelian randomization methods on VTE and three COVID-19 outcomes to investigate their causal associations.
Methods
Study population
The GWAS summary statistics for COVID-19 of European ancestry were provided by the COVID-19 host genetics initiative round 7 (https://www.covid19hg.org/results/, release date: April 08, 2022). We studied three related traits of COVID-19: (1) Severe COVID-19, defined as COVID-19-confirmed individuals with very severe respiratory symptoms or those who died from the disease (up to 13,769 cases and 1,072,442 controls); (2) COVID-19 hospitalization defined as individuals who were hospitalized for related infection symptoms, with laboratory-confirmed SARS-CoV-2 infection (up to 32,519 cases and 2,062,805 controls); (3) SARS-CoV-2 infection defined as all individuals who reported positive (laboratory diagnosis, physician diagnosis or self-report) for SARS-CoV-2 infection (up to 122,616 cases and 2,475,240 controls).
We used summary statistics on the GWAS of VTE (3,900 cases and 369,592 controls) in the UK Biobank of European ancestry. Participating cohorts of COVID-19 and VTE were described in detail in the Supplementary Note.
Linkage disequilibrium score regression (LDSC) analysis
LDSC analysis was conducted to assess the heritability of a single trait and the genetic correlation between two traits. The analysis was conducted using the LDSC software based on the GWAS summary statistics. LD scores from the 1000 Genomes (1000G) European-ancestry panel were used as the reference 20. An estimate of the heritability or of the genetic correlation can be obtained by regressing the χ2 statistics or the products of z-scores, respectively, on LD scores. LDSC can also correct for the inflation of test statistics caused by polygenicity 17.
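The regression described above can be illustrated with a minimal sketch. It is not the LDSC software: it uses unweighted ordinary least squares instead of the weighted regression with block-jackknife standard errors that LDSC applies, and the function names (`ols`, `ldsc_h2`, `ldsc_rg`) and argument names are hypothetical. It assumes the standard LD score regression expectations, in which the slope of the χ2 statistics on LD scores equals N·h2/M and the slope of the z-score products equals sqrt(N1·N2)·ρg/M, where N is the sample size and M the number of SNPs.

```python
import math


def ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Heritability from the slope of chi-square statistics on LD scores.

    Assumed model: E[chi2_j] = (N * h2 / M) * l_j + intercept.
    Returns (h2, intercept).
    """
    slope, intercept = ols(ld_scores, chi2)
    return slope * n_snps / n_samples, intercept


def ldsc_rg(z1, z2, ld_scores, n1, n2, n_snps, h2_1, h2_2):
    """Genetic correlation from the slope of z-score products on LD scores.

    Assumed model: E[z1_j * z2_j] = (sqrt(N1 * N2) * rho_g / M) * l_j + intercept.
    """
    products = [a * b for a, b in zip(z1, z2)]
    slope, _ = ols(ld_scores, products)
    rho_g = slope * n_snps / math.sqrt(n1 * n2)  # genetic covariance
    return rho_g / math.sqrt(h2_1 * h2_2)
```

A regression intercept above 1 in `ldsc_h2` would point to inflation from sources other than polygenicity, which is how the method separates the two.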
Partitioned LDSC analysis
We performed partitioned LDSC analysis to estimate the genetic correlation between two traits within each of the following 12 functional categories 21: conserved region, DNaseI digital genomic footprinting region (DGF), DNase I hypersensitivity sites (DHS), fetal DHS, H3K4me1, H3K4me3, H3K9ac and H3K27ac, intron region, super enhancers, transcription factor-binding site (TFBS) and transcribed region. The re-calculated LD scores of the SNPs classified in each specific annotation category allowed us to find out which functional categories accounted for the majority of the overall genetic correlation.
Multi-Trait Analysis of GWAS (MTAG)
We applied MTAG to identify novel loci with strong signals for COVID-19, and to detect shared genetic variants for VTE and three COVID-19 traits 22-24. MTAG can improve the effect estimates for each COVID-19 trait by incorporating the weighted sum of GWAS estimates for VTE. We also used MTAG to conduct a cross-trait meta-analysis, which uses a sample-size-weighted, fixed-effect model together with genetic covariance modeling from all sources to combine evidence of association between individual variants and both VTE and COVID-19.
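To make the idea of a sample-size-weighted, fixed-effect combination concrete, the sketch below combines per-trait z-scores for a single variant. It is deliberately simpler than MTAG: it ignores the genetic covariance modeling and any sample overlap between the two GWASs, so it should be read as an illustration of the weighting scheme only. The function names are hypothetical; the thresholds in `is_shared_locus` are the ones reported with the cross-trait results (Pmeta < 5×10−8 and single-trait P < 0.05).

```python
import math


def two_sided_p(z):
    """Two-sided P-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def meta_z(z_scores, sample_sizes):
    """Sample-size-weighted combination of z-scores (weights = sqrt(N)).

    Assumes independent samples; MTAG additionally models genetic
    covariance and overlap, which this sketch does not.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def is_shared_locus(p_meta, p_trait1, p_trait2,
                    genome_wide=5e-8, nominal=0.05):
    """Shared-locus rule: genome-wide significant in the meta-analysis
    and at least nominally significant in each single trait."""
    return p_meta < genome_wide and p_trait1 < nominal and p_trait2 < nominal
```

For example, `two_sided_p(meta_z([z_vte, z_covid], [n_vte, n_covid]))` would give the combined P-value for one variant, where the four inputs are placeholders for the per-trait z-scores and sample sizes.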
Fine-mapping and co-localization analysis
For the significant shared loci between two traits, we used a Bayesian fine-mapping algorithm to identify a 99% credible set of causal variants within 500 kb of the index SNP at each shared locus 25. We then conducted co-localization analysis to check whether the COVID-19 and VTE association signals co-localized at shared loci, by calculating the posterior probability that the two traits share the same causal variant (P(H4)) 26. If P(H4) was greater than 0.5, we labelled the locus as harboring a co-localized genetic variant.
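A credible set of this kind can be sketched as follows. The sketch assumes a single causal variant per locus and uses Wakefield's approximate Bayes factor with a prior effect standard deviation of 0.2; both are illustrative assumptions, not details confirmed for the algorithm cited above, and the function names are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.2):
    """Wakefield's log approximate Bayes factor for association.

    prior_sd is an assumed prior standard deviation of the true effect.
    """
    v = se ** 2
    w = prior_sd ** 2
    r = w / (v + w)
    z = beta / se
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def credible_set(snps, coverage=0.99):
    """Smallest set of SNPs whose posterior probabilities sum to >= coverage.

    snps: list of (snp_id, beta, se) tuples for one locus.
    Assumes exactly one causal variant in the region.
    """
    labfs = [log_abf(beta, se) for _, beta, se in snps]
    top = max(labfs)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in labfs]
    total = sum(weights)
    ranked = sorted(
        ((w / total, snp_id) for w, (snp_id, _, _) in zip(weights, snps)),
        reverse=True,
    )
    chosen, cumulative = [], 0.0
    for posterior, snp_id in ranked:
        chosen.append(snp_id)
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen


def is_colocalized(pp_h4, threshold=0.5):
    """Labelling rule from the text: P(H4) greater than 0.5."""
    return pp_h4 > threshold
```

In practice the SNP list would be restricted to variants within 500 kb of the index SNP before calling `credible_set`.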
Tissue-specific enrichment analysis (TSEA) and over-representation enrichment analysis
To test whether the shared genes are overexpressed in a specific tissue, we performed TSEA using the HUGO Gene Nomenclature Committee (HGNC) names of the genes that correspond to the shared loci in the cross-trait meta-analysis. We used the R package ‘deTS’, which uses GTEx RNA-seq data and the ENCODE panel as reference panels and calculates the corresponding z-score for each tissue 27. We also used the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt) to assess over-representation of the same shared genes in Gene Ontology (GO) biological processes 28.
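Over-representation analysis is usually based on a hypergeometric test, and the sketch below assumes that formulation; the exact settings used by the toolkit (reference gene set, multiple-testing correction) are not stated here and are not reproduced. The function name and arguments are hypothetical.

```python
import math


def ora_p_value(n_background, n_in_term, n_query, n_overlap):
    """Upper-tail hypergeometric P-value for over-representation.

    n_background: genes in the reference set
    n_in_term:    reference genes annotated to the GO term
    n_query:      genes in the query list (e.g. shared genes)
    n_overlap:    query genes annotated to the GO term
    Returns P(X >= n_overlap) when n_query genes are drawn at random.
    """
    total = math.comb(n_background, n_query)
    upper = min(n_in_term, n_query)
    tail = sum(
        math.comb(n_in_term, k) * math.comb(n_background - n_in_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

A small P-value indicates that the term contains more of the query genes than expected by chance; with many GO terms tested, a multiple-testing correction would still be needed.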
Mendelian randomization (MR) analysis
We extracted independent genome-wide significant loci for the COVID-19 related traits (severe COVID-19, COVID-19 hospitalization and SARS-CoV-2 infection) and for VTE using the genome-wide significance threshold P-value < 5×10−8, clumped at an LD threshold of r2 = 0.01 with a clumping window of 500 kb. Using these SNPs as instruments, we conducted two-sample bidirectional MR analysis to assess the causal association between each COVID-19 related trait and VTE. An effect estimate with P-value < 0.05 in the main inverse variance weighted (IVW) analysis was considered supportive evidence 29, and we used the weighted median 30, MR-Egger 31, MR-PRESSO 32 and MR-RAPS 33 methods as sensitivity analyses to compare estimates across MR models.
Details of the methods are provided in the Supplementary Note.
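The main IVW estimate can be illustrated with a short sketch. It assumes a fixed-effect IVW model with first-order weights (the standard error of the SNP-exposure association is ignored) and uncorrelated instruments; the function name and argument names are hypothetical, and the sensitivity methods listed above are not implemented.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted MR estimate.

    Each SNP contributes a Wald ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    Returns (estimate, standard_error, two_sided_p).
    """
    weights = [bx * bx / (se * se)
               for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(bx * by / (se * se)
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    total_weight = sum(weights)
    estimate = numerator / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    p_value = math.erfc(abs(estimate / standard_error) / math.sqrt(2.0))
    return estimate, standard_error, p_value
```

For the bidirectional design, the same function would be called twice, once with each COVID-19 trait as the exposure and VTE as the outcome, and once with the roles reversed.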
Results
Genetic correlation of VTE with COVID-19 related traits
The heritability (h2) estimated by LDSC suggested that severe COVID-19, COVID-19 hospitalization, SARS-CoV-2 infection and VTE are heritable (P < 0.05; Supplementary Table 1). The estimated genetic correlations of VTE with all three COVID-19 traits were positive, but only the correlation with COVID-19 hospitalization was statistically significant (rg = 0.2320, P-value = 0.0092); those with severe COVID-19 (rg = 0.0573, P-value = 0.5087) and SARS-CoV-2 infection (rg = 0.1753, P-value = 0.0731) were not (Figure 1 and Supplementary Table 2). Partitioned LDSC analysis further found that severe COVID-19, COVID-19 hospitalization and SARS-CoV-2 infection were associated with VTE in 5 (fetal DHS, H3K4me1, H3K9ac, H3K27ac and TFBS), 11 (all except super enhancer regions) and 10 (all except conserved and super enhancer regions) of the 12 functional categories, respectively (Figure 2).
Multi-Trait Analysis of GWAS (MTAG)
Based on the updated HGI GWAS meta-analysis, there were 45 genome-wide significant (P < 5×10−8) and uncorrelated (r2 < 0.01) loci for severe COVID-19, 46 for COVID-19 hospitalization and 26 for SARS-CoV-2 infection (Supplementary Table 3), and 12 for VTE in the GWAS from UK Biobank (Supplementary Table 4). MTAG incorporating information from the GWAS of VTE did not identify additional genome-wide significant loci for the COVID-19 traits.
As shown in Table 1, we identified 8 shared genetic loci associated with both VTE and severe COVID-19, 11 with both VTE and COVID-19 hospitalization and 7 with both VTE and SARS-CoV-2 infection (Pmeta < 5×10−8; single-trait P < 0.05). Figure 3 displays the Manhattan plots of these results. In line with previous studies 14-17,34,35, we identified the ABO and FUT2 genes, which contribute to both VTE and COVID-19. Notably, we identified seven novel associated genes that have not been reported previously: LINC00970 and six protein-coding genes (TSPAN15, ADAMTS13, F5, DNAJB4, SLC39A8 and OBSCN).
For severe COVID-19 with VTE, the strongest association signals were localized to the LINC00970 gene (index SNP: rs114101204 for severe COVID-19 and VTE, Pmeta=1.44×10−37) at locus 1q24.2. For COVID-19 hospitalization and SARS-CoV-2 with VTE, the strongest association signals were localized on or near the ABO gene (index SNP: rs11244061 for COVID-19 hospitalization, Pmeta=3.64×10−28; rs550057 for SARS-CoV-2 infection, Pmeta=1.56×10−51) at locus 9q34.2.
Fine-mapping and colocalization analysis identify shared causal variants
For each of the shared loci of VTE with COVID-19, Supplementary Tables 5-7 list all SNPs within 500 kb of the index variants that fall in the 99% credible sets. Co-localization analysis showed that the genetic loci sharing causal variants are located in the ABO (index SNPs: rs11244061, rs138683771, rs8176686, rs550057, rs9411367, rs71503180) and ADAMTS13 (index SNP: rs149181677) genes on chromosome 9 and the FUT2 (index SNP: rs492602) gene on chromosome 19 (Table 1).
GTEx tissue enrichment analysis and over-representation enrichment analysis of shared genes
The shared genes identified for severe COVID-19 and COVID-19 hospitalization with VTE were significantly enriched for expression in the lung tissue, whereas no highly enriched tissues were found for SARS-CoV-2 infection and VTE (Figure 4). GO analysis highlighted several significant biological processes shared between the three COVID-19 traits and VTE, such as “calcium-mediated signaling”, “second-messenger-mediated signaling”, “chemokine-mediated signaling pathway”, “response to chemokine” and “response to type I interferon” (Supplementary Tables 8-10).
Mendelian randomization analysis
We used 45 SNPs for severe COVID-19, 46 SNPs for COVID-19 hospitalization and 26 SNPs for SARS-CoV-2 infection as instruments, with the GWAS of VTE as the outcome. Genetically predicted COVID-19 was not associated with VTE (Figure 5). In the reverse direction, using 12 SNPs as instruments, genetically predicted VTE was likewise not associated with any of the three COVID-19 traits (Figure 6). These results were robust to different MR methods (Supplementary Figures 1 and 2).
Discussion
In this study, we investigated the shared genetic etiology and causality between three COVID-19 related traits and VTE based on the latest data from the COVID-19 HGI and UK Biobank. We identified 8 genome-wide shared loci with VTE for severe COVID-19, 11 for COVID-19 hospitalization and 7 for SARS-CoV-2 infection; the mapped genes were enriched mainly in the lung tissue and in coagulation- and immune-related pathways. Bi-directional MR did not provide strong evidence of a causal relationship.
Our LDSC analysis showed a significant positive genetic correlation between VTE and COVID-19 hospitalization. Although LDSC analysis did not find a significant overall genetic correlation between severe COVID-19 and VTE, partitioned LDSC analysis found that the genetic correlations of severe COVID-19 with VTE were positive and significant in some functional categories, including TFBS, fetal DHS, H3K4me1, H3K9ac and H3K27ac, which are associated with the control of transcription and the status of cis-regulatory elements such as promoters and enhancers within gene regulatory regions 21,36-38.
To our knowledge, we are the first to investigate the shared genetic etiology between COVID-19 related traits and VTE using cross-trait meta-analysis. We identified shared loci in several genes (LINC00970, ABO, ADAMTS13, TSPAN15, NAPSA, IFNAR2, F5, WNT3, DNAJB4, SLC39A8, OBSCN, SLC6A20, FUT2) and co-localized genetic loci in three genes (ABO, ADAMTS13 and FUT2). Consistent with previous reports, we confirmed the role of ABO, NAPSA and IFNAR2 in severe COVID-19 and COVID-19 hospitalization 14,15,18,39,40, WNT3 in COVID-19 hospitalization 41, and SLC6A20 in SARS-CoV-2 infection 15. Notably, we found several novel genes, including ADAMTS13, TSPAN15, DNAJB4, SLC39A8 and OBSCN, which have not been reported in GWAS of COVID-19.
Interestingly, we found a strong signal shared by severe COVID-19 and VTE at locus 1q24.2 in LINC00970, close to the NME7 gene. Locus 1q24.2 has previously been found to be associated with VTE 42, but has not been linked to severe COVID-19. This locus (1q24.2) also includes a shared genetic variant in F5, which encodes the plasma procoagulant factor V (FV) involved in thrombin activation; impaired downregulation of FV is key to thrombosis 43. Similarly, ABO, FUT2 and ADAMTS13 are well-established VTE related genes 42,44 and were found in the colocalization analysis. Growing evidence supports that thrombophilia is closely related to COVID-19 susceptibility and severity 45,46. Multiple studies have reported that the ABO gene is associated with severe COVID-19 and SARS-CoV-2 infection 14-16,34, possibly by regulating thrombosis 14,47. This is also supported by our colocalization results. The FUT2 gene, a fucosyltransferase gene involved in ABO blood group antigen synthesis, was recently found to be associated with critical COVID-19 17. ADAMTS13, which encodes the protein ADAMTS13, was identified as a novel shared gene in our study. ADAMTS13 function might be affected by ABO blood group 48, and a previous MR study showed that genetically predicted ADAMTS13 is associated with severe COVID-19 49.
Moreover, the association signals shared by severe COVID-19 and COVID-19 hospitalization with VTE were also mapped to the IFNAR2 gene at 21q22.11 and the NAPSA gene at 4q24. The IFNAR2 gene encodes an interferon receptor subunit that mediates the early host immune response to viral infection 18. The NAPSA gene is associated with damage-associated transient progenitors promoted by inflammation 39,50. Other shared genes are known to be associated with innate antiviral defenses (SLC6A20) 14,51, immune cell signaling regulation (WNT3, TSPAN15) 52,53, inflammatory lung injury (SLC39A8) 54 or tumor activation (TSPAN15, DNAJB4, OBSCN) 55-57. These results imply that immune function may be related to both COVID-19 and VTE.
The findings from the enrichment analysis also suggest that immune function might be involved in the shared etiology. We found several shared pathways related to immune function, such as calcium-mediated signaling and chemokine- and interferon-related responses. The role of calcium signaling in platelets for hemostasis and thrombosis has been previously reported 58,59. Calcium signaling is also of paramount importance in immune cells 60,61, and chemokines and interferons play a major role in activating host immune and inflammatory responses 62-64. This evidence suggests that the common pathways between COVID-19 and VTE may relate to immunity, endothelial cell function, and coagulation. Our TSEA showed that shared genes for severe COVID-19 and COVID-19 hospitalization with VTE were mainly enriched for expression in the lung tissue. Consistently, pulmonary vascular endothelial injury and immunothrombosis (most of which occur within the lung microvessels) are key drivers of severe events after SARS-CoV-2 infection, such as acute respiratory failure and ARDS 45,65.
Our study also adds to the evidence regarding the causal relationship between VTE and COVID-19 traits. Interestingly, using bi-directional MR, we found null associations between VTE and COVID-19 traits. Previously, a cohort study in the UK Biobank (n=312,378) showed that VTE is a risk factor for severe COVID-19 66. However, conventional observational studies might be subject to residual confounding, for example by socioeconomic position or health status. In contrast to our findings, a previous MR study using COVID-19 HGI release 5 data found that genetically predicted VTE was associated with a higher risk of COVID-19 hospitalization and SARS-CoV-2 infection. Although a null association in MR studies might be due to a lack of power, we used data from COVID-19 HGI release 7, which has double the sample size of release 5, so the null association is less likely to be due to insufficient power. Meanwhile, our MR analysis did not find that genetically predicted COVID-19 was associated with an increased risk of VTE. Partly consistent with our results, a meta-analysis of cohort studies found no difference in VTE risk among people with (n=3,060) and without COVID-19 (n=38,708) 12.
Our study is the first to use large-scale genetic data to explore the shared genetic architecture between COVID-19 related traits and VTE, providing timely evidence and novel insights into the genetic etiology shared by the two diseases. The GWAS summary-level data for COVID-19 were obtained from COVID-19 HGI release 7, the largest and latest GWAS of COVID-19 available, which doubles the sample size of previous versions and improves reliability and robustness. Applying the analysis to large, publicly available data from the COVID-19 HGI and the UK Biobank enabled us to examine the association in a well-powered study in a cost-efficient manner.
We also acknowledge several limitations of our study. First, the GWAS data for COVID-19 and VTE used in this study were derived from populations of European ancestry, so the associations may not be applicable to other ancestries. Second, although the contributing studies conducted study-specific quality control, misclassification of COVID-19 might exist. Third, the use of summary statistics prevented us from assessing sex- and age-specific genetic effects. Fourth, the GWASs of COVID-19 related traits might have been conducted in different time periods, so there might be differences in the SARS-CoV-2 strain. However, our study does not aim to assess the association with specific strains of SARS-CoV-2. Finally, although our study provides evidence of genetic correlation and genetic overlap between COVID-19 and VTE, the underlying biological mechanisms are still unclear, and further studies are needed for validation.
In conclusion, our findings provide novel evidence of genetic correlations between severe COVID-19, COVID-19 hospitalization, SARS-CoV-2 infection and VTE, and highlight their common genetic architecture, with shared genes closely related to coagulation and immunity. Our work contributes to the understanding of COVID-19 and VTE etiology, and opens up new insights into the prevention and comorbidity management of COVID-19.
Data Availability
Genetic associations with VTE were from the UK Biobank GWAS results provided by Lee Lab (https://www.leelabsg.org/resources). Genetic associations with severe COVID-19, COVID-19 hospitalization and SARS-CoV-2 infection were obtained from the COVID-19 host genetics consortium GWAS meta-analyses round 7, downloaded from https://www.covid19hg.org/results/r7/.
Statements and Declarations
Funding
The authors reported no funding received for this study.
Competing Interests
The authors have no competing interests.
Author Contributions
The draft was written by Xin Huang. Data analyses were performed by Minhao Yao and Peixin Tian. Xin Huang and Minhao Yao interpreted the results. Zhonghua Liu and Jie V. Zhao developed the study conception, directed the study’s analytic strategy and contributed to the critical revision of the manuscript for important intellectual content. All authors read and approved the final manuscript.
Ethics approval
The study is an analysis using publicly available summary data that does not require ethical approval.
Acknowledgments
This research was conducted using the summary statistics from COVID-19 host genetics consortium and UK Biobank. The authors would like to thank all participants in the study and investigators for sharing the valuable data.