ABSTRACT
Diastole is the sequence of physiological events that occur in the heart during ventricular filling and principally depends on myocardial relaxation and chamber stiffness. Abnormal diastolic function is related to many cardiovascular disease processes and is predictive of health outcomes, but its genetic architecture is largely unknown. Here, we use machine learning cardiac motion analysis to measure diastolic functional traits in 39,559 participants of UK Biobank and perform a genome-wide association study. We identified 9 significant, independent loci near genes that are associated with maintaining sarcomeric function under biomechanical stress and genes implicated in the development of cardiomyopathy. Age, sex and diabetes were independent predictors of diastolic function and we found a causal relationship between ventricular stiffness and heart failure. Our results provide novel insights into the genetic and environmental factors influencing diastolic function that are relevant for identifying causal relationships and tractable targets in heart failure.
Main
Diastole is not a passive phase of the cardiac cycle, but is a complex sequence of inter-related physiological processes dependent on myocardial relaxation, stiffness and recoil, that are modulated by loading conditions, heart rate, and contractile function1,2. Diastolic function therefore plays a central role in determining left ventricular filling and stroke volume with dysfunction shown to be a predictor of major adverse cardiovascular events and all cause mortality3. Decline in diastolic function is also a hallmark of cardiac ageing which occurs through multiple pro-fibrotic and energetic pathways4,5. While several candidate genes have been implicated in various systolic function phenotypes through genome wide association studies (GWAS)6,7, the genetic architecture of diastolic function and causal associations with disease are largely unknown. Efforts to better define the molecular mechanisms of diastolic dysfunction could enable the development of innovative therapies for many cardiovascular disease states8.
Pre-clinical models of diastolic dysfunction are associated with alterations in left ventricular stiffness on atomic force microscopy that occur at the level of the cardiomyocyte sarcomere as well as due to extracellular matrix protein expansion9. Such tissue level changes can be assessed at macroscopic scale in human populations through analysis of diastolic mechanics. Here we use data from participants in UK Biobank with cardiac magnetic resonance imaging (CMR)10, and apply deep learning computer vision techniques for precision motion analysis to derive image-based phenotypes of diastolic function11,12. In a GWAS of diastolic traits we identify associated loci that map to genes involved in actin assembly, cardiac myocyte survival, and heart failure phenotypes. We also describe the relationship between diastolic function and cardiovascular risk factors, and identify potential causal relationships with disease through Mendelian randomisation.
Results
Study Overview
We analysed CMR data from 39,559 participants in UK Biobank using machine learning segmentation and motion tracking to measure three validated parameters of diastolic function - radial and longitudinal peak early diastolic strain rate (PDSRrr and PDSRll) (Figure 1), and maximum body surface area-indexed left atrial volume (LAVmaxi)13. A flow chart of the analysis steps is depicted in Extended Data Fig 1. Baseline characteristics of the population are shown in Extended Data Table 1. The population was partitioned into discovery and validation sets by pre-defined criteria. To assess the association between these diastolic function traits and other clinical measurements, we further considered a broad selection of 30 imaging and 110 non-imaging phenotypes that included biophysical data and circulating biomarkers (Supplementary Data 1). Independent GWASs were undertaken for each image-derived phenotype and heritability estimated. We used a phenome-wide association study (PheWAS) to identify multiple phenotypes associated with a polygenic risk score (PRS) for diastolic function. Potentially causal associations were examined using 2-sample Mendelian randomisation (MR). The results are reported in accordance with GWAS reporting guidelines and a checklist is provided as Supplementary Information14.
Imaging and non-imaging phenotype associations
Strain rates declined with age and were lower in men (P < 10−16 for both associations) (Figure 2), but no univariable association was observed between age and LAVmaxi (Extended Data Fig 2). Multiple linear regression analysis was used to develop a model for predicting each diastolic trait from demographic, haemodynamic and cardiovascular risk factors (Figure 3a, Extended Data Fig 3). In this multivariable analysis strain rate and left atrial volumes were negatively associated with age, sex and pulse rate in the full model (P < 10−16 for all associations). Significant associations were also observed for body surface area (BSA) and systolic blood pressure (SBP). Diabetes also added significantly to the associations with the diastolic function traits in the model (PDSRll: P = 2.36 × 10−8; PDSRrr: P = 9.98 × 10−6; LAVmaxi: P = 1.04 × 10−3).
To investigate the association between a comprehensive set of image-derived measures of atrial, ventricular and aortic function with a broader range of non-imaging phenotypes we used regression analysis with variable selection to fit a sparse model of predictors (Extended Data Fig 4). For this analysis, we excluded 7,936 subjects with incomplete data. The final model described the relationship between 12 imaging phenotypes, 5 non-imaging phenotypes and 2 serum biomarkers (Figure 3b) (see Supplementary Material for further information). Sex was associated with several systolic and diastolic parameters, including aortic function, as well as higher serum triglycerides in men. C-reactive protein (CRP), a circulating biomarker of inflammation, showed a positive relationship with serum triglycerides, but we found no circulating biomarkers independently associated with diastolic function. We found that reduced peak diastolic strain rates were associated with reduced LAVmaxi. Left atrial function was related to indicators of right ventricular function emphasising their functional interdependence15.
Genetic architecture of diastolic function traits
Genome-wide association analyses of diastolic function traits
The SNP-based heritability (i.e. the proportion of variance per trait explained by all considered SNPs) was 12% for PDSRll, 13% for PDSRrr, and 21% for LAVmaxi. The observed genetic correlation between the diastolic function traits was 0.22 (SE 0.07) between PDSRll and LAVmaxi, 0.12 (SE 0.08) between PDSRrr and LAVmaxi, and 0.85 (SE 0.04) between PDSRll and PDSRrr.
Within the discovery set, we identified 5 independent loci (LAVmaxi, 1; PDSRrr, 3; PDSRrr, 1) reaching genome-wide significance (P = 5·10−8; Supplementary Fig 3), which were also significant in the validation dataset also (P < 0.05/5). Considering the full dataset, the number of significant independent loci increased to 9 with 2 additional loci associating with PDSRrr, 1 additional with LAVmaxi, and 1 additional with PDSRll (Figure 4).
Variant annotation
Summary information for the 9 loci identified using the full dataset is presented in Table 1 (further information can be found in Supplementary Material, Supplementary Fig 5 and Supplementary Table 1). The closest gene to each locus is depicted, with further variant to gene mapping presented as the “likely gene” given by evidence of a functional effect on a gene (details in Supplementary Material), additional heart-related phenotype associations, or a previously reported mechanism linking the gene to diastolic function. Through this approach we were able to highlight several structural genes associated with diastolic function that also have a known role in myocardial contractility (e.g. TTN, PLN, GJA1), and in the functional maintenance and stress-response of the cytoskeleton (eg FHOD3, BAG3)16. Moreover, we were also able to identify a novel link between the NPR3 locus and left atrial volume. The signal co-localizes with a previously discovered association with blood pressure traits (systolic, diastolic and mean arterial blood pressure). The C-allele of the lead SNP (rs1173727) at this locus increases NPR3 expression, and is associated with increased blood pressure and LAVmaxi, and a nominally significant increase in risk of heart failure (Supplementary Material). The NPR3 gene encodes the C-type natriuretic peptide receptor, which has a high tractability score (Table 1), making it an attractive therapeutic target.
NPR3 accounts for almost 95% of the natriuretic peptide receptor (NP) population17. It binds all 3 members of the natriuretic peptide pathways, natriuretic peptides A, B and C (ANP, BNP and CNP) and has been ascribed both a clearance role for these peptides, regulating their circulating levels, and a signalling role for CNP18. To further elucidate the biological relationship between the lead NPR3 SNP and heart failure, we first sought to study in more detail individuals with predicted loss-of-function variants in NPR3 or its ligands but, within the group with whole-exome sequencing data, the numbers of such individuals were limited (NPR3 n=5, nNPPA n=2, nNPPB n=2, nNPPC n=0).
We next examined the relationship between common variants in NPR3 and genes encoding other proteins in the natriuretic peptide pathway with traits linked to the lead SNP, rs1173727. Full results of this systematic assessment are shown in Supplementary Fig 6, and an abridged version is provided in Extended Data Fig 6. We identified additional genetic variants in NPR3 with robust effects on other traits influenced by the LAVmaxi associated variant rs1173727. Hierarchical clustering of the variant-trait effects showed that the observed effects for the NPR3 variant overlap with both the NPPA & NPPB genes and the effects of genetic variation in NPPC or NPR2. In brief, we were unable to distinguish between the two proposed roles for NPR3 in the context of this study.
Potential causes and consequences of diastolic function
Creation of polygenic risk scores (PRS and PheWAS)
PRSs for each diastolic function trait consisted of 20 SNPs for PDSRrr, 15 SNPs for PDSRll, and 8 for LAVmaxi. The PRS explained 1.5 % of the variability of PDSRrr, 1.1 % of PDSRll and 0.2 % of LAVmaxi. There was good agreement between the distribution of the PRS in the UK Biobank participants with and without CMR indicating no systematic bias in genetic architecture (Supplementary Fig 8). The Pearson correlation coefficient for the PRS for PDSRll and PDSRrr was 0.35 whereas the correlation coefficient between LAVmaxi and PDSRll or PDSRrr, respectively, was much lower (< 0.01). PheWAS was undertaken and we considered traits that have been previously associated with cardiac phenotypes in the literature, but in addition included an unbiased selection of phenotypes for exploration. In total, we considered 71 quantitative phenotypes and 63 (binary) disease endpoints (Supplementary Data 1). Out of these, 31 phenotypes were significantly associated (Padj<0.05) with at least one of the diastolic function PRSs after leave-one-out cross validation (Figure 5). Some of the identified PheWAS associations are consistent with the phenotype correlation analysis (e.g. pulse rate and blood pressure). We also confirmed associations between diastolic function and previously reported biomarkers of heart failure (e.g. SHBG20 and IGF-121). Furthermore, we identified an association of PDSRrr to heart failure, cardiomyopathy and dilated cardiomyopathy, implicating diastolic function in cardiovascular endpoints.
Mendelian randomization
We used Mendelian Randomization (MR) to identify causal bi-directional relationships between diastolic function and cardiovascular outcomes (heart failure, diabetes, pulse rate, and diastolic blood pressure) which we chose based on clinical plausibility and the findings of the phenotype correlation analysis. As we hypothesize LAVmaxi to be downstream of PDSRrr and PDSRll, reflecting that atrial volume is consequent on ventricular filling22, we focussed on PDSRrr and PDSRll while results for LAV maxi are reported for completeness in Supplementary Table 9 and in Supplementary Fig 10. We identified a potential causal, bi-directional association between pulse rate and both PDSRrr and PDSRll (Figure 6, Supplementary Table 7, Supplementary Table 8), supporting findings from pre-clinical models suggesting that selective heart rate reduction can affect diastolic function24. We also identified a likely causal relationship between lower PDSRrr (stiffer ventricle) and increased risk of heart failure Supplementary Fig 9, which was further corroborated using GWAS summary results25 from the HERMES consortium (Supplementary Table 10), which is a GWAS meta-analysis from 47309 heart failure cases and 930014 controls. The magnitude of the effect observed in the MR analysis is consistent with the observational epidemiological estimate, derived from correlating PDSRrr with incident heart failure (Extended Data Fig 5). We found no causal relationship between longitudinal PDSRll and heart failure, and neither was one observed in our epidemiological analysis (Extended Data Fig 5). Diastolic dysfunction is frequently present in diabetic patients26, but although we observed a difference in PDSRrr between diabetics and non-diabetics (Figure 3), we could not confirm a causal relationship between diabetes as an exposure with either PDSRll or PDSRrr as outcomes (Supplementary Table 7, Supplementary Table 8).
Discussion
Diastole is a complex series of molecular, biophysical and electro-mechanical processes that initiate contractile deactivation and promote efficient ventricular filling. Impairment of these coordinated mechanisms may lead to diastolic dysfunction which is associated with the presence of multiple cardiovascular risk factors leading to reduced quality of life and higher mortality27,28. Here, we used deep learning cardiac motion analysis to perform the first reported GWAS of diastolic function traits with the aim of determining tractable causative mechanisms. We found that diastolic function was a heritable trait with associations in loci related to myofilament mechanics, protein synthesis during mechanical stress and regulation of cardiac contractility. Furthermore, we find a role for endothelium-derived signalling in diastolic function that is a potential therapeutic target29. Lastly, through Mendelian randomisation we observe a causal relationship between genetically-determined diastolic function and heart failure outcomes.
A decline in diastolic function is a feature of the ageing heart and we found that age was a strong independent predictor of diastolic function, with a greater decrease present in males. Outcome studies have suggested that this is a prognostically-benign feature of healthy ageing that is not related to adverse effects of cardiac senescence4,30,31. Changes in titin protein phosphorylation, myocardial redox state and impairment of nitric oxide signalling have been proposed as potential mechanisms32, and clinical studies indicate that age-related myocardial fibrosis, cardiomyocyte hypertrophy, and reduced microvascular density, may be a consequence rather than an initiating cause of diastolic dysfunction33.
We found that diabetes was associated with impaired diastolic function and this relationship was independent of age, BSA, and systolic blood pressure. Increased myocardial stiffness is recognised as one of the earliest, and potentially reversible, manifestations of myocardial dysfunction in diabetes34. Several underlying mechanisms related to insulin resistance have been proposed that include altered cardiac energetics and accumulation of advanced glycation end products that promote ventricular stiffness35. However, our findings did not conclusively point to a causal relationship between diabetes and diastolic function using Mendelian randomisation. We observed a unidirectional causal relationship of PDSRrr on heart failure risk, as well as associations with cardiovascular endpoints and circulating biomarkers of heart failure through PheWAS. Longitudinal cohort studies have suggested that persistence or progression of diastolic dysfunction is a risk factor for subsequent heart failure36, and our findings suggest that ventricular stiffness is a substrate for the evolution of mixed aetiology heart failure. In contrast, our findings show that variation in left atrial volume (LAV maxi) is likely a consequence of myocardial diastolic function which is not itself causally related to heart failure risk.
Our study provides insights into the biological basis of diastolic function with potential implications for therapy development.
We identified common variants within genes implicated in cardiomyopathies (e.g. BAG3, FHOD3, PLN), suggesting sarcomere homeostasis during mechanical stress may affect diastolic function in both health and disease37. A further example is the identification of the gene of a key regulator of cardiac diastolic function, phospholamban (PLN), which modulates sarcoplasmic reticulum calcium-ATPase activity38. We also identified a potential therapeutic target through the association of DNA variants at the locus of NPR3 influencing diastolic function and risk of heart failure. Previous studies have highlighted its role in blood pressure control39. NPR3 has been described as a clearance receptor for ANP, BNP and CNP, while more recently as a functional signalling receptor for CNP has become apparent29. A recent mechanistic study in animal models29 identified a role for this receptor in mediating the cardioprotective effects of cardiomyocyte and fibroblast-released CNP. While we cannot exclude a signalling function for NPR3, the direction of effect in our genetic analysis is such that reduced NPR3 gene expression is associated with reduced LAVmaxi, blood pressure and heart failure risk. This is consistent with its role as a clearance receptor, where reduced expression would enhance exposure to the beneficial effects of ANP and BNP and outweigh loss of CNP activity. Inhibition of natriuretic peptide degradation via inhibition of neutral endopeptidase is already an approved licensed treatment for heart failure, validating the pathway as a therapeutic target. Our observations support pharmacological inhibition of NPR3 as an additional approach to managing this condition.
This analysis has some limitations. UK Biobank is a large-cross sectional study that is subject to selection bias and latent population stratification40, however risk factor associations appear to be broadly generalisable41. The population is predominantly European and further work is required to explore diastolic traits and outcomes in people of diverse ancestries. Echocardiography has been the cornerstone of assessing diastolic function by characterising features of ventricular relaxation, stiffness and recoil2. However, feature-tracking CMR has excellent agreement with speckle-tracking echocardiography42 and invasive measures of diastolic function43.
In conclusion, we found that diastolic function is a heritable trait that is causally upstream of heart failure. Associated common variants are related to genes that maintain functional homeostasis under biomechanical stress. We also identify a gene encoding an atrial natriuretic peptide receptor as a novel therapeutic target for modulating aspects of diastolic function.
Methods
All analyses in this study can be found here https://github.com/ImperialCollegeLondon/diastolic_genetics/, and were conducted with R version > 3.6.0.
Participants
For UK Biobank, approximately 500,000 community-dwelling participants aged 40–69 years were recruited across the United Kingdom between 2006 and 201044. All subjects provided written informed consent for participation in the study, which was also approved by the National Research Ethics Service (11/NW/0382). Our study was conducted under terms of access approval number 28807 and 40616. A range of available data were included in this study comprising genotyping arrays and whole exome sequencing, cardiac imaging, health-related diagnoses, and biological samples.
There are 488,252 genotyped participants of which 200,640 have whole exome sequencing. We partitioned 39,559 participants with both CMR imaging and genotyping array data into two tranches by date of release from UK Biobank providing a discovery dataset of 26,893 participants and a validation dataset of 12,666 participants.
Imaging protocol
A standardised CMR protocol was followed to assess cardiac structure and function using two-dimensional retrospectively-gated cine imaging on a 1.5T magnet (Siemens Healthineers, Erlangen, Germany). A contiguous stack of images in the left ventricular short-axis plane from base to apex was acquired, with long axis cine imaging in the two and four chamber views. Each cine sequence had 50 cardiac phases with an acquired temporal resolution of 31 ms10. Transverse cine imaging was also performed in the ascending and descending thoracic aorta. Images were curated on open-source databases45,46. All imaging phenotypes used for the analysis underwent quality control assessment11. Participants also underwent a resting 12 lead electrocardiogram which was automatically analysed using proprietary software (CardioSoft, GE Healthcare).
Cardiac image analysis
Segmentation of the short-axis and long-axis cine images in UK Biobank was made using fully convolutional networks, a type of deep learning neural network, which predict a pixel-wise image segmentation by applying a number of convolutional filters onto each input image for feature extraction and classification12. The accuracy of image segmentation on the UK Biobank dataset is equivalent to expert human readers47. End-diastolic volume, end-systolic volume, stroke volume, and ejection fraction were determined for both ventricles. Left ventricular myocardial mass was calculated from the myocardial volume assuming a density of 1.05 g.ml −1. Left atrial volume was calculated from the segmented images using the biplane area–length formula , where A2Ch and A4Ch indicate the atrial area on the two and four-chamber cines respectively, and L indicates the longitudinal diameter averaged across two views. Measurements were indexed to body surface area (BSA) according to the Du Bois formula: 0.20247 (Weight0.425) (Height0.725), with weight in kg and height in m. The heart was divided into 16 standardised anatomical segments, excluding the true apex, according to American Heart Association nomenclature48.
The aorta was segmented on the cine images using a spatio-termporal neural network49. The maximum and minimum cross-sectional areas were derived from the segmentation and distensibility calculated using estimates of central blood pressure obtained using peripheral pulse-wave analysis (Vicorder, Wuerzburg, Germany)11.
Motion tracking was performed on the cine images using non-rigid image registration between successive frames (in GitHub repository ukbb_cardiac)50,51. To reduce the accumulation of registration errors motion tracking was performed in both forward and backward directions from the end-diastolic frame and an average displacement field calculated11. This motion field was then used to warp the segmentation contours from end-diastole onto successive adjacent frames. Circumferential (Ecc) and radial (Err) strains were calculated on the short axis cines by the change in length of respective line segments (Figure 1A) as , where dir represents the direction, Ldir the length of a line segment along this direction and ΔLdir its change over time. Motion tracking was also performed on the long-axis four-chamber cines to derive longitudinal (Ell) strain. Peak strain for each segment and global peak strain were then calculated (Figure 1B). Strain was measured from slices acquired at basal, mid-ventricular, and apical levels. For comparison between each component absolute strain values are reported52,53. Strain rate was estimated as the first derivative of strain and peak early diastolic strain rate in radial (PDSRrr) and longitudinal (PDSRll) directions was detected using an algorithm to identify local maxima (in GitHub repository peak_detection) (Figure 1C).
Non-imaging phenotypes
In total we consider 110 non-imaging cardiovascular-related phenotypes in UK Biobank participants for the phenotype regression analysis and the genetic analysis. These phenotypes contain information acquired by touch screen questionnaire, interview, biophysical measurement, hospital episode statistics, primary care data and biochemical analysis of venous blood. Details of how each phenotype was acquired are available on the UK Biobank Showcase (http://biobank.ctsu.ox.ac.uk/crystal/). It should be noted that the biochemical markers used here were acquired at the initial assessment visit that preceded imaging assessment. Also, note that not all phenotypes were used in both the phenotype and the genetic analysis (e.g., due to lack of available data at the imaging visit). We refer to the Supplementary Material both for details on the definition of the considered phenotypes and for information on the inclusion of specific phenotypes for each analysis.
Statistical significance testing and multiplicity control
We consider in all phenotype analysis a P-value < 0.05 as significant. Where not stated otherwise, we control the false discovery rate with the Benjamini-Hochberg adjustment54. Significance thresholds and decision criteria for GWAS significant loci and causality assessment (Mendelian Randomization) are described in the respective sections and/or in the Supplementary Material.
Phenotype Association Analysis
Continuous variables are expressed as mean ± standard deviation (SD). Differences in continuous variables between groups were performed using Student’s t-test. Univariable and multiple linear regression analysis was used to explore the phenotype relationship between each diastolic parameter and cardiovascular risk factors. To identify relationships between diastolic function and a broader range of imaging and non-imaging phenotypes, including circulating biomarkers, we used the least absolute shrinkage and selection operator (LASSO)55 with stability selection, to optimise the model coefficients. We then ran regression diagnostics on the model with the selected variables, to exclude a possible collinearity inappropriately influencing our model (see Supplementary Material for details on the phenotype analysis and LASSO analysis procedure).
Genotyping and sample QC
Genotyping of UK Biobank participants has been described elsewhere in detail56. Briefly, UK Biobank genotyping for 488,252 subjects was performed on the UK BiLEVE or UK Biobank Axiom arrays. Imputation was based on the HaplotypeReference Consortium panel and the UK10K+1000 Genomes panel56. In this study, UK Biobank Imputation V3 (in GRCh37 coordinates) were used. Whole exome sequencing (WES) was performed on data released in 2020 collected from 200,640 UK Biobank participants57. The sequencing methods and variant calling procedures have been described in detail58. In the present study, genotypes in their released PLINK-format files are utilized, and samples were restricted to the European population. Quality control of the genetic data was performed as recommended by UK Biobank (see Supplementary Material for details on the procedure and number of excluded samples).
GWAS analysis
For the genetic analysis, there were 34,242 participants of European ancestry (see Supplementary Material for criteria) providing a discovery dataset of 23,321 participants and a validation set of 10,924 participants. GWAS analyses for the three diastolic function traits and additional quantitative traits of interest (as described for the causality assessment) were performed with BOLT-LMM (version 2.3.2) which accounts for ancestral heterogeneity, unknown population structure, and sample relatedness59,60. GWAS analyses were adjusted for imaging traits for the first ten genetic principal components, sex, age at time of MRI, the genotyping array and the MRI assessment center and for non-imaging quantitative traits for the first ten principal components, sex, age at measurement of the trait and the genotyping array. GWAS analyses for clinical endpoints of interest (binary endpoints) were conducted with PLINK261 and adjusted for the first ten principal components, sex, age at baseline and the genotyping array. Post-GWAS filtering removed any SNPs with a Hardy-Weinberg equilibrium p-value < 0.05 and MAF < 0.005.
Assessment of shared genetic architecture
For the assessment of shared genetic architecture between diastolic function traits, LD score regression (LDSC (LD SCore) v1.0.1, PMID 25642630) was used to obtain a genetic correlation score between each pair of traits.
Variant annotations
Lead variants for each locus were assigned causal genes, where possible, using a combination of variant annotations and additional functional genomic data sources (colocalisation). Each lead variant was systematically tested for any evidence of functional consequence using VEP. In addition, QTL evidence was extensively searched using Open Targets Genetics62. Where eQTL data was available for the locus, the full summary statistics were downloaded to assess colocalisation (see Suppplementary Material).
Variant Effect Predictor (VEP)63 and Loss-of-Function Transcript Effect Estimator (LOFTEE)64 plugin were applied on all genomic variants of WES data. In the present study, we considered the genomic variants predicted by LOFTEE with high-confidence label “HC”, non-dubious (no “LoF flag” such as variants that located in poorly conserved exons, or splice variants that affect NAGNAG sites or non-canonical splice regions), and minor allele frequency < 0.05, as a Loss-of-function (LoF) mutation.
LoF association analysis
A Loss-of-Function (LoF) carrier indicator was created for each WES sample and each of the human protein-coding genes based on the collapsed information of LoF annotations. A subject was considered as an LoF carrier of the gene if there was at least one LoF mutation (based on methods in the variant annotation section), and a non-carrier if there was none. We then conducted the association test between LoF carrier indicator and the three diastolic function imaging phenotypes. Linear regression was performed with the adjustment of sex, age at time of MRI, and the top ten genetic principal components. The association results were further filtered as those with at least two carriers and the endpoint available. The association was considered significant after multiple testing correction at α = 0.05 (FDR, calculated for three diastolic function traits). We identified 18,660 participants with both whole exome sequencing data and CMR imaging data.
Polygenic risk scores (PRS)
Candidate variants for PRS for the three diastolic function traits (LAVmaxi, PDSRll, PDSRrr) were obtained based on the respective GWAS (full imaging cohort) results by performing clumping (PLINK1.9,65) using an LD threshold of R2 = 0.1 (in a window of 1000kb) and considering all SNPs with P < 10−6. Candidate variants were included in multivariate linear modelling evaluated on the European subset of the full imaging cohort with the first ten genetic principal components, age at MRI, sex, genotyping array and the MRI center as additional covariates and the respective diastolic function trait as dependent variables. The diastolic function traits were scaled to standard deviation (sd) 1 prior to the model estimation - therefore, a unit change in the PRS score represents a change of 1 sd unit in the respective diastolic function trait. PRS estimates per individual were then calculated by multiplying the observed genotype with the estimated beta from the multivariate linear model for each SNP and summing these values up. Missing genotypes were imputed using a mean imputation. The variance explained for the PRS is measured by R2, estimated in a linear regression with the PRS as only variable and the respective diastolic function trait as endpoint.
Next, we conducted a PheWAS using the obtained PRS (see above and Supplementary Material for a full definition of included phenotypes in the PheWAS). Evaluation of the PRS were performed in the European non-imaging cohort, i.e. an independent set of subjects compared to the PRS construction set. Only results are shown that are significant after multiple testing correction at α = 0.05 (FDR, calculated per diastolic function trait) and, as a sensitivity analysis, for which all leave-one-SNP out cross validations analysis lead to a significant result at α = 0.05 after multiple testing correction (FDR) for the number of considered phenotypes. The latter condition is supposed to exclude spurious results which are only driven by one single variant. Leave-one-SNP out cross-validation is performed by excluding one SNP from the list of candidate variants, then re-estimating the PRS and performing the PheWAS as described above. For the leave-one-SNP out cross-validation, FDR adjustment is performed per combination of diastolic trait and phenotype, considering the number of included SNPs.
Mendelian randomization
For exploring the causes and consequences of diastolic function parameters, we used a bidirectional Mendelian randomization (MR) approach, i.e. two MR analysis are performed: first, an MR analysis using the first chosen trait as exposure is conduced and secondly a MR analysis using the selected second trait is run. By considering both results, evidence can be gathered for a one-directional causal relationship, a bi-directional causal relationship or no causal relationship at all66. We performed this analysis taking into account one diastolic and one non-diastolic function trait and for that, we selected non-diastolic function traits of interest by taking into account the results from the observational correlation analysis and clinical expertise. This approach lead to the consideration of 5 traits (Heart failure, Diabetes - considering Type I and Type II separately-, pulse rate, diastolic blood pressure) as potential causes or consequences of changes in diastolic function parameters.
We established a workflow for the MR analysis which briefly described in this section. Full details are provided in the Supplementary Material. Genetic instrumental variables were selected from the UK Biobank GWAS results generated -as described above- via clumping with PLINK1.9 as described for the PRS approach. The candidate SNP set prior to clumping was restricted to the intersection between the SNP sets of the pair of GWAS results (hypothesised causal trait GWAS and hypothesised consequence trait GWAS).
All MR analysis are based on the point estimates and standard deviations obtained from the respective GWAS. We follow a similar approach to van Oort et al.23 by using inverse-variance weighted method (IVW) as the main analysis and applying several other MR methods for ensuring robustness of the obtained results as sensitivity analyses. We used weighted median-based methods67, MR-PRESSO68 and MR-Egger69. Consistent effect estimates across the different methods improves our confidence in a truly causal effect. We consider an association as “potential causal” if the main analysis indicates a causal relationship (P < 0.01), at least two of the sensitivity analyses indicate at least a suggestive causal relationship (P < 0.05) and none of the sensitivity analyses indicate associations with inconsistent effect directionality, i.e. none of the methods showed a suggestive association with conflicting directionality (P < 0.05). No explicit multiplicity adjustment is performed for MR experiments. For “potential causal” associations, we next conducted a supplementary sensitivity analysis using published GWAS results as described in the Supplementary Material - if published GWAS data was available.
All analysis, which involved diastolic and non-diastolic function traits, were conducted in a two-sample approach, i.e., the diastolic function trait GWAS was calculated in the full imaging cohort and the non-diastolic function trait GWAS was calculated in the non-imaging cohort.
For comparison of the effect estimates from the MR-analysis to the observed correlation of diastolic function measurement and disease status, we restricted the analysis population to subjects which were disease-free at the CMR visit. We then fitted a logistic regression model by coding subjects who experienced a first event of the selected disease during follow-up time as 1 and event-free subjects during follow-up as 0. As covariates, we included age at CMR visit, gender, diabetes status, diastolic blood pressure and BMI. Note that this analysis was only performed for relationships judged as potential causal and that involves a disease endpoint (and not a quantitative measurement like pulse rate).
NPR3 Pathway Analysis
In order to increase our understanding of the association of NPR3 with LAVmaxi and to further characterize the role of natriuretic peptides, we looked for additional genetic associations within genes of the natriuretic peptide pathway (so in addition to NPR3 - NPR1, NPR2, NPPA, NPPB, and NPPC). We conducted GWAS using BOLT-LMM for all imaging traits listed in Extended Data Table 1 as described above, as well as any non-imaging traits associated with rs1173727 (the lead variant for NPR3) across the 4 loci (NPPA and NPPB share the same locus). The GWAS summary statistics were filtered to a 1MB window around each gene (for NPPA/B, the gene used for centering was NPPA). Across these summary statistics, we performed clumping with a p-value threshold of 10−5 and R2 < 0.1.
For the identified tag SNPs and associated variants in LD from the clumping analysis, we then tested which of these variants we could confidently link to the natriuretic gene in the locus. If any variant was classified as missense, we selected that variant directly. For eQTL variants, we used colocalisation analysis to link these SNPS to the natriuretic genes in each locus. Relevant eQTL and pQTL data was used (eQTL summary statistics were taken from eQTL Catalog70 and pQTL data from Sun et al.71) and SNPs with only a clear association with the gene of interest and traits of interest were kept (i.e. p < 1−4 for association with gene or protein expression, P < 10−5 for association with the trait, and H12 > 0.5 was used as a threshold for the co-localization analysis).
Hierarchical clustering was then performed on the —log(P) × β values with the β s aligned to have a negative sign on the diastolic blood pressure. Extended Data Fig 6 shows all SNPs and traits with a genome-wide significant association. The SNPs and traits with suggestive associations (P < 10−5) are shown in the Supplementary Material (Supplementary Fig 6).
Data Availability
All raw and derived data in this study is available from UK Biobank (http://www.ukbiobank.ac.uk/). GWAS summary level data are publicly available through the GWAS catalogue.
Code Availability
The analysis code is freely available on GitHub (DOI: 10.5281/zenodo.4767044).
Data Availability
All raw and derived data in this study is available from UK Biobank (http://www.ukbiobank.ac.uk/). GWAS summary level data are publicly available through the GWAS catalogue (accession number GCP000170).
Author contributions
M.T. and J.M. performed the formal analyses and co-wrote the paper; W.B, N.S., A.de M. and D.R. performed the image analysis; K.A.McG., J.S.W., H.V.M., L.Z., F.S. and C.B. performed or interpreted the genetic analyses; A.MacN. performed the GWAS; M.R.W. interpreted the pharmacology findings; D.F.F. and D.P.O’R conceived the study, managed the project and revised the manuscript. All authors reviewed the final manuscript.
Competing interests
The authors declare no competing interests.
Extended Data Figures
Extended Data Table
Acknowledgments
The study was supported by Bayer AG; Medical Research Council (MC-A658-5QEB0); National Institute for Health Re- search (NIHR) Imperial College Biomedical Research Centre; British Heart Foundation (NH/17/1/32725, RG/19/6/34387, RE/18/4/34215); Wellcome Trust (107469/Z/15/Z); Academy of Medical Sciences (SGL015/1006); Mason Medical Research Trust grant; and the Engineering and Physical Sciences Research Council (EP/P001009/1). H.V.M. also receives funding from the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory.
Footnotes
↵* These authors jointly supervised this work