Abstract
Metabolomic platforms using nuclear magnetic resonance (NMR) spectroscopy can now rapidly quantify many circulating metabolites which are potential biomarkers of cardiovascular disease (CVD). Here, we analyse ∼170,000 UK Biobank participants (5,096 incident CVD cases) without a history of CVD and not on lipid-lowering treatments to evaluate the potential for improving 10-year CVD risk prediction using NMR biomarkers in addition to conventional risk factors and polygenic risk scores (PRSs). Using machine learning, we developed sex-specific NMR scores for coronary heart disease (CHD) and ischaemic stroke, then estimated their incremental improvement of 10-year CVD risk prediction when added to guideline-recommended risk prediction models (i.e., SCORE2) with and without PRSs. The risk discrimination provided by SCORE2 (Harrell’s C-index = 0.718) was similarly improved by addition of NMR scores (ΔC-index 0.011; 95% CI: 0.009, 0.014) and PRSs (ΔC-index 0.009; 95% CI: 0.007, 0.012), which offered largely orthogonal information. Addition of both NMR scores and PRSs yielded the largest improvement in C-index over SCORE2, from 0.718 to 0.737 (ΔC-index 0.019; 95% CI: 0.016, 0.022). Concomitant improvements in risk stratification were observed in categorical net reclassification index when using guideline-recommended risk categorisation, with net case reclassification of 13.04% (95% CI: 11.67%, 14.41%) when adding both NMR scores and PRSs to SCORE2. Using population modelling, we estimated that targeted risk-reclassification with NMR scores and PRSs together could increase the number of CVD events prevented per 100,000 screened from 201 to 370 (ΔCVDprevented: 170; 95% CI: 158, 182) while essentially maintaining the number of statins prescribed per CVD event prevented. Overall, we show that combining NMR scores and PRSs with SCORE2 moderately enhances prediction of first-onset CVD, and could have substantial population health benefit if applied at scale.
Introduction
Circulating biomarkers play a central role in cardiovascular disease (CVD) risk scores recommended by clinical guidelines to identify high-risk individuals for CVD prevention1–3. Total cholesterol and high-density lipoprotein (HDL) cholesterol are routinely measured and used alongside demographic and lifestyle risk factors to assess 10-year risk of CVD using risk scores such as SCORE24. Efforts to improve CVD risk prediction models have considered additional circulating biomarkers5, such as C-reactive protein (CRP)6,7, as well as incorporating polygenic risk scores (PRSs) to account for genetic predisposition8–10. While PRSs have shown potential to enhance CVD risk screening11–14, addition of individual CVD biomarkers has thus far shown limited overall incremental benefit15–17.
High-throughput nuclear magnetic resonance (NMR) spectroscopy has enabled rapid and simultaneous quantification of several biomarkers from a single human blood plasma sample18,19. These include cholesterols and other lipids in lipoprotein sub-fractions, fatty acids, ketone bodies, amino acids, glycolysis metabolites and glycoprotein acetyls (GlycA)20,21. NMR metabolic biomarker data has been quantified in numerous cohorts over the last decade, helping derive new insights into the genetic determinants, molecular pathogenesis, and epidemiology of CVD22.
Several studies have investigated the utility of biomarker combinations from NMR platforms to improve prediction of CVD23–26; however, they have focused on multi-disease prediction, used outdated clinical risk prediction scores, and have not investigated improvements relative to clinically relevant guideline-recommended risk thresholds.
Here, we utilize NMR biomarker data in UK Biobank to assess whether NMR biomarkers, in isolation or combination (i.e., NMR scores), can improve 10-year CVD risk prediction when added to the SCORE2 risk model, which is recommended by the European Society of Cardiology (ESC) 2021 guidelines for CVD prevention3. We further assess whether incremental improvements in CVD risk prediction are meaningful at ESC 2021 recommended risk thresholds for treatment consideration3. In addition, we compared the improvement in risk prediction provided by NMR scores to that provided by PRSs11 and also assessed the PRSs and NMR scores combined. Finally, we modelled the potential public health benefits for CVD prevention if these models were applied to the UK primary care population according to the ESC 2021 guidelines for statin initiation. A schematic of the overall study is given in Figure 1.
Results
Characteristics of study participants
Of the 502,207 participants enrolled in UK Biobank consenting to electronic health record linkage, 168,517 participants met the inclusion criteria for this study (Figure S1), namely: participants who were eligible for 10-year CVD risk assessment according to the ESC 2021 guidelines for CVD prevention3, had plasma NMR spectroscopy biomarker data available (with <5% missingness), had complete data on risk factors needed to compute SCORE2, were not taking lipid lowering medications, and had data on PRSs (Methods). Participants were eligible for 10-year CVD risk assessment at baseline if they were 40 to 69 years of age and were apparently healthy3; i.e. with no prior history of established atherosclerotic cardiovascular disease, diabetes mellitus, chronic kidney disease, or familial hypercholesterolemia. During the 1,641,935 person-years at risk (median [5th, 95th percentile] follow-up of 10.0 [8.5–10.0] years), 5,096 CVD cases were recorded. Baseline cohort characteristics are detailed in Table 1.
Incremental CVD risk discrimination with individual biomarkers
Improvements in CVD risk discrimination were assessed by differences in C-index (ΔC-index) beyond SCORE2 alone for each of the 249 NMR biomarkers (Table S1) and 28 clinical chemistry biomarkers (Table S2). The ΔC-index was assessed for each biomarker separately using sex-stratified Cox proportional hazards models for 10-year CVD risk with the biomarker as a predictor and the SCORE2 linear predictor as an offset term (Methods). The sex-stratified C-index for SCORE2 alone was 0.718 (95% confidence interval [CI]: 0.711, 0.724) in the analysis sample of 168,517 participants (5,096 CVD cases).
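As a rough illustration of the discrimination metric used here, the following sketch computes Harrell's C-index by direct pairwise comparison and takes the difference between two sets of risk predictions (ΔC-index). It is not the analysis code used in this study: the function name `harrell_c`, the toy data, and the simplified handling of tied event times (tied pairs are skipped) are all assumptions made for illustration. The optional `strata` argument restricts comparisons to pairs within the same stratum, mirroring the sex-stratified C-index.

```python
from itertools import combinations

def harrell_c(times, events, risk, strata=None):
    """Harrell's C-index for right-censored data (simple O(n^2) version).

    Illustrative only: pairs with tied follow-up times are skipped, and
    pairs are compared only within the same stratum when strata are given.
    """
    n = len(times)
    if strata is None:
        strata = [0] * n
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(n), 2):
        if strata[i] != strata[j] or times[i] == times[j]:
            continue
        # `a` is the participant with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time is an observed event.
        if not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    return concordant / comparable

# Toy data (hypothetical): follow-up years, event indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
score2_only = [0.9, 0.3, 0.5, 0.1]
score2_plus_biomarker = [0.9, 0.6, 0.5, 0.1]

c_base = harrell_c(times, events, score2_only)            # 4 of 5 comparable pairs concordant
c_new = harrell_c(times, events, score2_plus_biomarker)   # 5 of 5 comparable pairs concordant
delta_c = c_new - c_base
```

In practice an optimised library routine would be preferable for a cohort of this size, since the pairwise loop scales quadratically with the number of participants.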
Based on a false discovery rate (FDR) adjusted bootstrap P-value < 0.05 (Methods) we observed statistically significant improvement in C-index for 35 of the 277 biomarkers (Table S3A). Improvements in sex-stratified C-index over SCORE2 (ΔC-index) were modest (Figure 2A). The largest ΔC-index observed for any biomarker was with addition of cystatin-C measured by clinical biochemistry assay (Figure 2A), with ΔC-index of 0.006 (95% CI: 0.004, 0.008; Table S3A). The largest ΔC-index observed for any of the 249 NMR biomarkers was with addition of albumin (Figure 2A), with ΔC-index of 0.005 (95% CI: 0.003, 0.006; Table S3A). Results were similar when analysing males and females separately, but with reduced power to detect statistically significant differences in ΔC-index (Table S3B,C).
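The FDR adjustment can be sketched as follows. This section does not name the specific FDR procedure, so the Benjamini-Hochberg step-up method shown here is an assumption, as are the function name and the toy P-values.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method, for illustration)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical bootstrap P-values for four biomarkers.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With these toy inputs only the first biomarker remains below the 0.05 threshold after adjustment, even though three raw P-values were below 0.05.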
NMR biomarker scores
Sex-specific NMR biomarker scores for CHD (4,054 cases; Table S4A) and ischaemic stroke (1,280 cases; Table S4B) were trained and tested in the 168,517 participants, then later combined (see next section) for 10-year CVD risk prediction (Figure S2; Methods). NMR biomarker scores were trained for CHD and stroke separately and later combined, as we previously found that combining PRSs for CHD and stroke led to improved prediction of 10-year CVD risk over a single PRS trained for CVD, due to its heterogeneity11. NMR scores were also trained in males and females separately to capture sex-specific differences in their concentrations27 and well-known differences in baseline survival between males and females3. NMR scores were trained using elastic-net penalised Cox proportional hazards regression28,29 in nested cross validation using the 106 non-derived NMR biomarkers as candidate predictors (Figure S3; Methods). The per-biomarker weights for computing the consensus optimal NMR scores after model training are given in Table S5.
All four consensus optimal NMR scores included all 106 non-derived biomarkers (Figure S3C, Table S5). The biomarker with the strongest contribution to the CHD NMR scores was GlycA, which explained 12.8% of the variance in the CHD NMR score in males, and 11.4% of the variance in the CHD NMR score in females (Table S5). The biomarker with the strongest contribution to the ischaemic stroke NMR scores was albumin, which explained 15.7% of the variance in the ischaemic stroke NMR score in males, and 27.1% of the variance in the ischaemic stroke NMR score in females (Table S5).
To avoid overestimation of prediction performance in downstream analyses, we used for each NMR score the aggregate of their predicted values across cross-validation test partitions (Figure S2A). Sex-specific pairwise correlations between SCORE2, the NMR scores, and PRSs are shown in Figure S4. Predicted NMR scores and PRSs were statistically significant independent predictors of 10-year CVD risk when fitting sex-stratified Cox proportional hazards regression with SCORE2 as an offset term (Figure 3A; Methods). Results were similar when fitting models with SCORE2 risk factors as independent predictor variables (Figure S6; Table S6B; Methods). Results were also similar when analysing males and females separately (Figure S5A, Figure S6, Table S6).
Incremental value of NMR biomarker scores and PRSs to 10-year CVD risk prediction
We assessed and compared three models to SCORE2 for 10-year CVD risk prediction in the 168,517 participants: (1) SCORE2 + CHD NMR score + ischaemic stroke NMR score, (2) SCORE2 + CHD PRS + ischaemic stroke PRS, and (3) SCORE2 + NMR scores + PRSs (Methods). NMR scores and/or PRSs were combined with SCORE2 as part of the cross-validation procedure described above (Figure S2). Per-score weightings for centring and adding NMR scores and/or PRSs to SCORE2 in new samples are given in Table S7.
Incremental discrimination for 10-year CVD risk for each model was assessed by differences in sex-stratified C-index from SCORE2 alone (Methods). We observed statistically significant improvement in ΔC-index for all three models (Figure 3B, Table S8). The ΔC-index with addition of NMR scores was 0.011 (95% CI: 0.009, 0.014; Table S8), almost double the ΔC-index observed for any single biomarker alone (Table S3A). This was also similar to the ΔC-index observed with addition of PRSs, which was 0.009 (95% CI: 0.007, 0.012; Table S8). Improvement in risk discrimination was greatest when adding both NMR scores and PRSs to SCORE2 (Figure 3B), with ΔC-index 0.019 (95% CI: 0.016, 0.022; Table S8), a relative gain of approximately 2.6% in C-index over SCORE2 alone (0.019/0.718), for a total absolute C-index of 0.737. Improvements in ΔC-index were greater in males than in females for all models, and the differences were most pronounced for models incorporating PRSs (Figure S5B, Table S8).
Incremental value in risk stratification using ESC 2021 risk thresholds for treatment consideration
Next, we assessed whether incremental improvements in CVD risk prediction were meaningful at clinically relevant risk thresholds. For each model, we calculated absolute 10-year CVD risk using formulae calibrated to the UK population4 (Methods) and stratified participants into categories of low risk, medium risk, and high risk (Table S9) using ESC 2021 recommended risk thresholds for treatment consideration3 (Methods). Distributions of predicted absolute risk are compared in Figure S7. Improvements in risk stratification over SCORE2 alone were then assessed using categorical net reclassification index (NRI) (Figure 3C, Table S10).
Statistically significant improvement in risk stratification over SCORE2 among incident CVD cases was observed for all three alternative models tested (Figure 3C, Table S10A). Improvements in case classification from NMR scores were more than twice as strong as those from PRSs. We observed a net case reclassification rate of 10.71% (95% CI: 9.33%, 12.08%) with addition of NMR scores, and 4.21% (95% CI: 3.08%, 5.34%; Table S10A) with addition of PRSs. Improvements in case classification were strongest with addition of both NMR scores and PRSs, with a net case reclassification rate of 13.04% (95% CI: 11.67%, 14.41%; Table S10A). Results were similar in sex-specific analyses (Figure S5C, Table S10B).
A modest, but statistically significant, inappropriate reclassification for non-cases was also observed for all three alternative models (Figure 3C, Table S10B). The net reclassification rate for non-cases was −2.51% (95% CI: −2.69%, −2.34%) with addition of NMR scores, −0.58% (95% CI: −0.74%, −0.41%) with addition of PRSs, and −2.90% (95% CI: −3.10%, −2.70%) with addition of both NMR scores and PRSs (Table S10B). In sex-specific analyses greater inappropriate reclassification of non-cases was observed in females than in males (Figure S5C, Table S10B).
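The two components of the categorical NRI reported above can be sketched as below. This is an illustrative implementation under stated assumptions rather than the code used in this study: risk categories are coded 0 (low), 1 (medium), and 2 (high), and the function name and toy data are hypothetical. For cases, moving to a higher category counts as appropriate reclassification; for non-cases, moving to a lower category does.

```python
def categorical_nri(old_cat, new_cat, is_case):
    """Return (net case reclassification, net non-case reclassification).

    Categories are assumed to be ordered integers (0 = low, 1 = medium, 2 = high).
    """
    up_case = down_case = n_case = 0
    up_non = down_non = n_non = 0
    for old, new, case in zip(old_cat, new_cat, is_case):
        if case:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_non += 1
            up_non += new > old
            down_non += new < old
    nri_cases = (up_case - down_case) / n_case
    nri_noncases = (down_non - up_non) / n_non
    return nri_cases, nri_noncases

# Toy example: four incident cases followed by four non-cases.
old = [1, 1, 2, 0, 1, 0, 1, 2]   # categories under the reference model
new = [2, 1, 2, 1, 1, 0, 2, 2]   # categories under the alternative model
case = [True, True, True, True, False, False, False, False]

nri_cases, nri_noncases = categorical_nri(old, new, case)
# Two of four cases move up: nri_cases = 0.5.
# One of four non-cases moves up: nri_noncases = -0.25.
```

A negative value for non-cases, as in the toy example and in the results above, indicates that more non-cases were moved to a higher risk category than to a lower one.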
Incremental value for CVD prevention with population-wide screening
Next, we estimated the incremental benefits to CVD prevention if applied at scale to the UK population. We simulated a hypothetical population of 100,000 adults 40–69 years of age representative of the general UK population using the age- and sex-structure of the UK population30 and previously published 10-year CVD incidence rates amongst CVD- and statin-free primary care patients11 (Figure S8, Methods). In total, the simulated population comprised 49,156 males (4,391 incident CVD cases) and 50,844 females (2,245 incident CVD cases) (Table S11).
To model the benefits of population-wide screening with each model, we stratified the simulated population into the low-, medium-, and high-CVD risk groups based on the proportions allocated to each category in UK Biobank by SCORE2 alone and the three alternative models adding NMR scores and/or PRSs (Figure S9A, Methods). We modelled statin initiation in the high-risk group, who based on their risk thresholds would be recommended for risk factor treatment by the ESC 2021 guidelines for CVD prevention3. The impact of statin initiation was modelled as preventing one in five simulated incident CVD events; assuming a 20% reduction in 10-year CVD risk31.
Incremental improvements in CVD prevention for each alternative model were assessed by differences from SCORE2 alone in (1) the number of people classified as high risk (ΔNhigh-risk); (2) the number of future CVD cases amongst the high-risk group (ΔCVDhigh-risk); (3) the number of future CVD events expected to be prevented by initiation of statins in the high-risk group (ΔCVDprevented); (4) the number needed to screen to prevent one CVD event (ΔNNS); and (5) the number of statins prescribed per CVD event prevented (ΔNNT) (Methods).
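The relationship between these quantities can be sketched with simple arithmetic, assuming (as described above) that statins prevent 20% of incident CVD events in the high-risk group. The function name and all input numbers below are hypothetical round figures chosen for illustration; they are not the study's estimates, and the formulas are a plain reading of the definitions rather than the exact procedure in Methods.

```python
def prevention_metrics(n_screened, n_high_risk, cases_high_risk,
                       statin_risk_reduction=0.20):
    """Events prevented, number needed to screen (NNS), and statins per event prevented (NNT)."""
    prevented = cases_high_risk * statin_risk_reduction
    return {
        "cvd_prevented": prevented,
        "nns": n_screened / prevented,
        "nnt": n_high_risk / prevented,
    }

# Hypothetical inputs per 100,000 screened (not the study's estimates).
reference = prevention_metrics(100_000, n_high_risk=4_600, cases_high_risk=1_000)
alternative = prevention_metrics(100_000, n_high_risk=7_820, cases_high_risk=1_700)

# Differences from the reference model, analogous to the delta metrics in the text.
delta = {key: alternative[key] - reference[key] for key in reference}
```

In this toy example the alternative model classifies more people as high risk and prevents more events, so the NNS falls, while the NNT is unchanged because the ratio of treated people to events prevented stays the same.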
Consistent with the categorical NRI analyses above, we observed statistically significant improvements in CVD prevention with addition of NMR scores and/or PRSs (Figure 4A). For all three models we observed a statistically significant increase in the ΔNhigh-risk, ΔCVDhigh-risk, and ΔCVDprevented, along with a statistically significant decrease in the ΔNNS (Figure 4A, Table S12A). The number of events prevented increased from 201 per 100,000 screened with SCORE2 alone, to 309 with addition of NMR scores (ΔCVDprevented: 108; 95% CI: 96, 120), to 246 with addition of PRSs (ΔCVDprevented: 45; 95% CI: 35, 56), and to 339 with addition of both NMR scores and PRSs (ΔCVDprevented: 139; 95% CI: 125, 153) (Table S12A). Importantly, our modelling indicated no statistically significant change in NNT: the number of statins prescribed per CVD event prevented was constant at 23 (Table S12A). Sex-specific analyses were similar (Figure S10A, Table S12A).
Incremental value for CVD prevention with targeted screening
Finally, we estimated the incremental benefits to CVD prevention if using NMR scores and/or PRSs for targeted risk-reclassification of those classified as medium-risk with SCORE2, for whom the ESC 2021 guidelines suggest considering, but do not explicitly recommend, risk factor treatment3. When applying SCORE2 alone to the simulated population (Figure S9A, Methods), there were 36,005 people predicted to be classified to the medium CVD risk category, which included 3,728 incident CVD cases (56% of all simulated events).
We re-stratified the medium-risk population based on proportions re-stratified into each risk category in UK Biobank by the three alternative models adding NMR scores and/or PRSs (Figure S9B, Methods). Incremental improvements in CVD prevention from targeted screening with NMR scores and/or PRSs were consistent with and stronger than those observed from population-wide screening (Figure 4B, Table S12B). The number of events prevented increased from 201 per 100,000 screened with SCORE2 alone, to 336 with additional targeted screening with NMR scores (ΔCVDprevented: 136; 95% CI: 125, 147), to 277 with additional targeted screening with PRSs (ΔCVDprevented: 77; 95% CI: 68, 86), and to 370 with additional targeted screening with NMR scores and PRSs combined (ΔCVDprevented: 170; 95% CI: 158, 182) (Table S12B). A small but not statistically significant increase in the number of statins prescribed per CVD event was observed for all three models (ΔNNT: 1; Table S12B). Improvements in CVD prevention were statistically significant in both males and females, with no statistically significant change in the number of statins prescribed per CVD event prevented (Figure S10, Table S12B).
Discussion
Determining the added value of biomarkers beyond total and HDL cholesterol for 10-year CVD risk prediction is an area of interest for enhancing CVD prevention3. Here, we investigated whether 10-year CVD risk prediction in UK Biobank participants eligible for screening could be improved, in comparison to the currently recommended SCORE23,4.
We found statistically significant improvements in 10-year CVD risk prediction from 35 of 277 biomarkers quantified either individually by clinical chemistry assays or simultaneously by plasma NMR spectroscopy. Although statistically significant due to the large sample size, the magnitude of these incremental improvements was modest. Combining NMR biomarkers into NMR scores almost doubled the gain in observed predictive performance (ΔC-index) as compared to any single NMR biomarker. NMR biomarker scores and PRSs offered largely orthogonal information and increased SCORE2 C-index to similar degrees.
Apart from the cholesterol component of CVD risk in SCORE24, the biomarkers yielding the strongest improvement in 10-year CVD risk prediction were related to inflammation; a well-studied target in CVD prediction and prevention research32–34. Cystatin-C is a biomarker of renal function and cardiovascular disease also known to be associated with increased inflammation35,36. Albumin was the biomarker yielding the second strongest improvement over SCORE2 and was also the strongest contributor to the ischaemic stroke NMR score. Hypoalbuminemia has been associated with increased risk of stroke in numerous epidemiological studies37–39 and is also a biomarker of inflammation40,41. The strongest contributor to the CHD NMR score was GlycA, an NMR signal quantifying the levels of multiple proteins with key roles in inflammation21,42 and a stronger biomarker of chronic inflammation than C-reactive protein43, which has been associated with CVD risk in multiple studies20,44,45.
Risk prediction models including NMR scores and/or PRSs also improved risk stratification of future CVD events when using risk thresholds recommended for clinical decision making by the ESC 2021 guidelines for CVD prevention. NMR scores improved net case reclassification to a greater extent than PRSs (10.71% vs 4.21%, respectively); however, when combined, NMR scores and PRSs improved net case reclassification by 13.04%. These results highlight the complementary nature of the information captured by PRSs and NMR scores. While PRSs capture the lifetime risks due to genetics8–10, NMR scores capture part of the dynamic component of risk conferred by lifestyle and environment23, which act on that genetic background.
When modelling the potential benefits for CVD prevention in the wider UK population eligible for 10-year CVD risk screening, we found that adding NMR scores and/or PRSs to SCORE2 significantly increased the number of people who would be recommended for statin initiation (following the ESC 2021 guidelines for risk factor treatment for CVD prevention3) and who would subsequently experience a CVD event. Importantly, the number of statins prescribed per CVD event prevented would stay constant. We estimated that adding NMR scores to SCORE2 would increase the number of CVD events prevented from 201 to 309 (per 100,000); adding PRSs to SCORE2 would increase the number of CVD events prevented to 246; and adding both NMR scores and PRSs to SCORE2 would increase the number of CVD events prevented to 339.
To increase the efficiency of screening, we also modelled the potential benefits of targeted follow-up screening in those at medium risk, for whom the ESC 2021 guidelines suggest considering, but do not explicitly recommend, risk factor treatment3. We estimated that, per 100,000 screened, targeted risk-reclassification with NMR scores would increase the number of CVD events prevented to 336; targeted risk-reclassification with PRSs would increase the number of CVD events prevented to 277; and targeted risk-reclassification with both NMR scores and PRSs would increase the number of CVD events prevented to 370. We also estimated that this targeted follow-up screening would essentially maintain the number of statins prescribed per CVD event prevented.
This study represents the largest population health assessment of metabolomic and genomic biomarkers for CVD to date. While our findings suggest that there are potential gains for CVD risk prediction and prevention, there are obvious challenges for validating clinical utility and potential implementation. Commercial providers of NMR biomarkers and PRSs exist, yet fidelity, scale, and cost frequently mean that real world benefits are less than those estimated in prospective cohort studies. Nevertheless, our results indicate that current technologies that can scale to populations (e.g. NMR metabolomics and genomics) have the capacity to moderately improve CVD risk prediction. Our results from clinical biochemistry assays also indicate substantial potential benefit for CVD risk prediction from proteins, which with few exceptions are not measurable by NMR spectroscopy22. Initial studies of plasma proteomics scores in UK Biobank have also shown promise for enhancing CVD risk prediction46, but larger sample sizes are needed to investigate the potential for improving risk stratification at clinically relevant decision-making thresholds. For the ultimate goal of primordial prevention, further studies are also needed to investigate the potential for circulating biomarkers for CVD risk prediction in younger adults47.
This study has several limitations. Ascertainment bias in UK Biobank means the analysis cohort is healthier than the general UK population48. The distribution of risk factors and biomarkers is likely wider in the general population eligible for 10-year CVD risk prediction, and thus the incremental improvement of NMR scores and PRSs may be higher than estimated in this study. On the other hand, the incremental benefits in CVD prevention are potentially overestimated when modelling the population eligible for 10-year CVD risk prediction with SCORE2 because no data were available on the age- and sex-structure of the CVD- and statin-free population, so our population was modelled using the age- and sex-structure of the wider UK population including those ineligible for screening. Nevertheless, despite these potential sources of bias, we believe our findings are robust as we have designed our analyses such that they do not differentially impact the different models for 10-year CVD risk prediction being compared. Finally, our study comprised middle-aged adults of almost entirely (>95%) European ancestries in the UK Biobank. Our observations may not generalise to other countries, healthcare systems, or other ancestry groups within the UK. Similarly, while all training and test sets were distinct, the NMR scores were all generated within UK Biobank; therefore, their (relative) performances may differ in other populations. Further studies are needed to evaluate the efficacy and cost-effectiveness of NMR scores and PRSs for improving 10-year CVD risk prediction in these settings.
In conclusion, our results indicate that incorporating scores of NMR metabolomic biomarkers into 10-year CVD risk prediction could enhance prediction of first-onset CVD. We further add to the growing body of evidence that PRSs can be used to enhance CVD risk prediction over conventional risk factors10,11 and show that improvements in 10-year CVD risk prediction from PRSs are orthogonal to, and can be combined with, NMR scores. Applied at scale, integrating NMR scores alongside PRSs with SCORE2 may have moderate population health benefit.
Methods
Study cohort
UK Biobank is a cohort comprised of ∼500,000 participants 35–75 years of age with written informed consent for health related research48,49. Participants were members of the UK population recruited through primary care lists who accepted invitation to attend one of 22 assessment centres across the UK between 2006 and 201048.
In this study we analysed a subset of 168,517 participants who at baseline assessment (1) consented for electronic health record linkage, (2) were eligible for 10-year CVD risk prediction with SCORE23,4, (3) were not prescribed statins or other lipid lowering medications, (4) had complete information on risk factors required for SCORE2 computation, (5) had NMR biomarker data (with <5% missingness), (6) had imputed genotypes, and (7) were eligible for joint analysis with PRSs. Figure S1 shows the sample exclusions at each step.
Sample exclusion criteria
Eligibility for 10-year CVD risk prediction with SCORE2 was determined following the ESC 2021 guidelines for CVD prevention in clinical practice3. UK Biobank participants were included if at baseline assessment they were (1) 40 years of age or older and less than 70 years of age, (2) did not have established atherosclerotic cardiovascular disease (ASCVD), (3) did not have diabetes mellitus, (4) did not have chronic kidney disease, and (5) did not have familial hypercholesterolemia.
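The five eligibility criteria above amount to a simple filter, sketched below. The `Participant` record and its field names are hypothetical and stand in for the disease-history flags derived as described in the following paragraphs; this is not the code used to construct the analysis cohort.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    # Hypothetical record; flags would be derived from self-report,
    # medication, and hospital record data.
    age: float
    ascvd: bool      # established atherosclerotic cardiovascular disease
    diabetes: bool   # diabetes mellitus
    ckd: bool        # chronic kidney disease
    fh: bool         # familial hypercholesterolemia

def eligible_for_score2(p: Participant) -> bool:
    """True if the participant meets all five baseline eligibility criteria."""
    return 40 <= p.age < 70 and not (p.ascvd or p.diabetes or p.ckd or p.fh)

cohort = [
    Participant(age=55, ascvd=False, diabetes=False, ckd=False, fh=False),
    Participant(age=70, ascvd=False, diabetes=False, ckd=False, fh=False),
    Participant(age=48, ascvd=False, diabetes=True, ckd=False, fh=False),
]
eligible = [p for p in cohort if eligible_for_score2(p)]
```

Only the first toy participant is retained: the second is excluded on age (70 is outside the range) and the third on prevalent diabetes.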
Disease history was determined using a combination of self-reported medical history (UK Biobank fields #6150, #4728, #20002, and #20004), prescription medications (fields #6153, #6177, and #20003), and retrospective linkage to hospital episode statistics (fields #41259, #41234, and #41149). Prevalent ASCVD included acute myocardial infarction, acute coronary syndromes, transient ischaemic attack, peripheral arterial disease, and history of revascularization procedures. A full list of International Classification of Diseases (ICD) codes and self-report codes used to define ASCVD are given in Table S13. Prevalent diabetes mellitus was determined using the Eastwood et al. algorithms50, and included participants with probable or possible type 1 or type 2 diabetes. Prevalent chronic kidney disease was determined using the UK Biobank algorithmically defined outcome for end-stage renal disease (field #42026).
Familial hypercholesterolemia (FH) was determined using the Dutch Lipid Clinic Network (DLCN) diagnostic criteria51 as described in the ESC 2021 guidelines for CVD prevention in clinical practice3. Participants were excluded where they had low density lipoprotein (LDL) cholesterol quantified by clinical biochemistry assay ≥ 8.5 mmol/L (8 points on the DLCN diagnostic score indicative of probable FH) (fields #30780 and #30786). All participants with possible clinical history used for the DLCN diagnostic criteria were excluded due to prevalent ASCVD. Physical examination and family history data relevant to the DLCN diagnostic criteria were not collected at UK Biobank assessment. Functional mutations in the LDLR, APOB, and PCSK9 genes were not assessed in this study.
Participants already prescribed statins or other lipid lowering medications for CVD prevention were excluded as the primary purpose of 10-year CVD risk prediction with SCORE2 is to identify high-risk individuals for statin initiation in apparently healthy adults3. These participants were identified as those who at baseline assessment self-reported cholesterol lowering medication on the touchscreen questionnaire on health and medical history (field #6153 for women and field #6177 for men) and/or had been prescribed any of 18 lipid lowering prescription medications for CVD prevention (field #20003). The list of qualifying prescription medications (Table S14) was determined by cross-referencing the list of medications present in UK Biobank with the British National Formulary52 chapter 2.12: Lipid-regulating drugs with the restriction that the drug indications must include prevention of cardiovascular diseases.
Participants were also excluded where it was not possible to predict 10-year CVD risk with SCORE2 due to missing quantitative risk factor information. Quantitative risk factors with missing data included systolic blood pressure (SBP) (missing data at baseline for all instances of fields #93 and #4080) and total cholesterol and high-density lipoprotein (HDL) cholesterol as measured by clinical biochemistry assays (fields #30690 and #30760 respectively).
Participants with >5% missing NMR biomarker data were excluded, as after removal of technical variation, this excess missing data primarily arose due to removal of outlier plates of non-biological origin (89% of missing values).
Finally, participants were excluded if they were used as part of the training for the PRSs for CHD (PGS000018)8 and ischaemic stroke (PGS000039)9 analysed in this study.
Electronic health records
UK Biobank participants were linked by UK Biobank to hospital inpatient admissions records (fields #41259, #41234, and #41149) for hospitals in England, Wales, and Scotland and to national death registry records (fields #40000, #40001, and #40002). All incident hospital events or death records were coded with ICD-10 codes (or OPCS-4 codes for surgical procedures). Hospital and death records follow-up was available up to 6th March 2018 for events occurring in hospitals in Wales, and to 2021 (beyond 10 years of follow-up) for events occurring in hospitals in England and Scotland.
Retrospective follow-up in hospital records was available from 27th July 1993 for events occurring in hospitals in England, 2nd December 1980 for events occurring in hospitals in Scotland, and 18th April 1991 for events occurring in hospitals in Wales, with median of 15.75 years retrospective follow-up (maximum 29.76 years). Retrospective hospital events were coded with a combination of ICD-10 and ICD-9 codes (or OPCS-4 and OPCS-3 codes for surgical procedures).
Participants with withdrawn consent for electronic health record linkage were identified from field #190 for sample exclusion.
NMR biomarker data quantification and quality control
NMR metabolite biomarker data was quantified in ∼275,000 randomly selected participants as previously described18,53. Briefly, NMR spectroscopy (Nightingale Health Plc.) was used to measure the absolute concentrations of 168 biomarkers and 81 biomarker ratios from non-fasting plasma samples (UK Biobank aliquot 3). Details on the identity of the 249 NMR metabolite biomarkers are provided in Table S1.
Technical variation was subsequently removed using a modified version of our previously described pipeline27 that has been updated to reflect our exploration of the additional ∼150,000 participants measured since the pipeline development. The updated pipeline is available in version 2.2.1 of the ukbnmr R package. Briefly, technical variation removed included (1) time between sample preparation and sample measurement, (2) systematic differences in biomarker concentrations in each shipping batch due to sample position on the 96-well shipment plate, (3) measurement drift over time, (4) inter-spectrometer differences, and (5) shipment plates with systematically extreme concentrations of non-biological origin. For further details see https://github.com/sritchie73/ukbnmr.
Clinical biochemistry assay quantification and quality control
Targeted blood biochemistry assays were quantified in all ∼500,000 participants as previously described54. Briefly, absolute concentrations of 30 circulating biomarkers were quantified from serum (29 biomarkers) or red blood cell samples (glycated haemoglobin; HbA1c) using 24 analysis methods across six analytical platforms from five manufacturers (AU5800, Beckman Coulter; AU5800, Randox; LIAISON XL, DiaSorin Ltd.; VARIANT II Turbo, Bio-Rad; and ADVIA 1800, Siemens). Missing data arising from biomarker concentrations above or below the limits of reportability or detection were replaced with the largest or smallest non-missing value of that biomarker, respectively, with a small offset of 0.0001 units. Further details on the 30 biomarkers can be found in Table S2.
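The replacement rule for out-of-range assay values can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: the function name, the "above"/"below" flag encoding, and the direction of the offset (added to the maximum, subtracted from the minimum) are assumptions; the text specifies only the offset size.

```python
OFFSET = 0.0001  # offset size stated in the text


def fill_out_of_range(values, flags):
    """Replace values missing due to reportability limits.

    values: list of floats, with None where the value is missing.
    flags:  list of "above", "below", or None (hypothetical encoding of
            why a value is missing; None means missing for another reason).
    """
    observed = [v for v in values if v is not None]
    largest, smallest = max(observed), min(observed)
    filled = []
    for value, flag in zip(values, flags):
        if value is not None:
            filled.append(value)
        elif flag == "above":
            # Assumption: offset is added so the value ranks above the maximum.
            filled.append(largest + OFFSET)
        elif flag == "below":
            # Assumption: offset is subtracted so the value ranks below the minimum.
            filled.append(smallest - OFFSET)
        else:
            filled.append(None)  # missing for other reasons: left as missing
    return filled
```

Values missing for reasons other than reportability limits are deliberately left untouched in this sketch.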
Genotyping, imputation, and polygenic risk scores
UK Biobank participants were genotyped on UK BiLEVE arrays and UK Biobank Axiom arrays and imputed to the 1000 Genomes, UK10K, and Haplotype Reference Consortium panels55 using human genome build GRCh3749. Here, we converted the data to PLINK2 probabilistic dosages56 for analyses.
PRSs for CHD and ischaemic stroke were obtained from the polygenic score (PGS) catalog57 (accessions PGS000018 and PGS000039 respectively) and computed in UK Biobank participants using the PLINK2 software56 (PLINK v2.00a3LM AVX2 Intel [2 Mar 2021]) linear scoring function applied to the probabilistic allele dosages. These PRSs were chosen for this study as they are the most predictive PRSs for CHD and ischaemic stroke that do not include GWAS summary statistics derived from UK Biobank participants in their model development.
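Conceptually, a linear scoring function applied to probabilistic dosages is a weighted sum of expected effect-allele counts. The Python sketch below illustrates the idea only; it is not the PLINK2 implementation, the variant identifiers and weights are hypothetical, and scaling conventions (for example sum versus average) may differ from those used in the study.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: dict mapping variant ID -> expected effect-allele count in [0, 2].
    weights: dict mapping variant ID -> per-allele effect size from the PRS.
    Variants missing from `dosages` are skipped (an assumption of this sketch).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical two-variant score for one participant.
example = polygenic_score({"rs1": 2.0, "rs2": 2.0}, {"rs1": 0.5, "rs2": -0.25})
```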
Cardiovascular risk factors
Age and sex at baseline assessment were obtained from UK Biobank fields #21003 and #31 respectively.
Systolic blood pressure (SBP) was measured using an automated digital device (OMRON) (field #4080) and/or a manual sphygmomanometer (field #93). In both cases, two measurements were taken several moments apart, and their average was used as a single representative measure of SBP.
Smoking status was defined as “current” or “other” based on self-reported current smoking (field #20116). Missing data (“don’t know”, “prefer not to answer”) were set to “other”.
Incident cardiovascular disease
Incident CVD events were defined following the definition used by the SCORE2 working group and ESC Cardiovascular risk collaboration4 to include fatal hypertensive disease (ICD-10 codes I10–I16), fatal ischaemic heart disease (ICD-10 codes I20–I25), fatal arrhythmias or heart failure (ICD-10 codes I46–I52, excluding I51.4), fatal cerebrovascular disease (ICD-10 codes I60–I69, excluding I60, I62, I67.1, I68.2, and I67.1), fatal atherosclerosis or abdominal aortic aneurysm (ICD-10 codes I70–I73), sudden death and death within 24 hours of symptom onset (ICD-10 codes R96.0 and R96.1), non-fatal myocardial infarction (ICD-10 I21–I23), and non-fatal stroke (ICD-10 codes I60–I69, excluding I60, I62, I67.1, I68.2, and I67.1).
Follow-up time for each participant was restricted to a maximum of 10 years, defined as the difference in years between baseline assessment and the earliest of the following records: (1) the date of the first CVD event, (2) the date of death, (3) the date lost to follow-up (fields #190 and #191; e.g. participant reported to NHS or UK Biobank as having left the UK), (4) the maximum follow-up date in hospital records from Wales—6th March 2018—for participants located in Wales at baseline assessment or inferred to have moved to Wales since baseline assessment (based on presence of hospital records from Wales before 2018 and none in England or Scotland after, or change in location for subsequent UK Biobank assessments), or (5) date of baseline assessment plus ten years. Cohort characteristics reported in Table 1 include details on the number of people with non-CVD related mortality or otherwise lost to follow-up prior to 10 years.
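The follow-up rule above amounts to taking the earliest of several candidate dates and capping at 10 years. A minimal Python sketch is given below; the function and argument names are hypothetical, a year is assumed to be 365.25 days, and a CVD event is counted when its date is the earliest record (ties with the date of death are resolved in favour of the event, as for a fatal CVD event).

```python
from datetime import date


def follow_up(baseline, cvd_event=None, death=None, lost=None,
              wales_censor=None, max_years=10.0):
    """Return (follow-up time in years, whether an incident CVD event occurred)."""
    def years(d):
        return (d - baseline).days / 365.25  # assumed year length

    # Candidate end-of-follow-up times; absent records are skipped.
    candidates = [years(d) for d in (cvd_event, death, lost, wales_censor)
                  if d is not None]
    candidates.append(max_years)
    time = min(candidates)
    # The participant is a case only if the CVD event is the earliest record.
    is_case = cvd_event is not None and years(cvd_event) == time
    return time, is_case
```

For example, a participant assessed on 1 January 2008 with a first CVD event on 1 January 2012 has four years of follow-up and is counted as a case, whereas a participant with no records is censored at 10 years.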
SCORE2
Sex-specific per-participant values for SCORE2 (linear predictors) were computed from age, smoking status, SBP, total cholesterol, and HDL cholesterol using formulae published by the SCORE2 working group and ESC Cardiovascular risk collaboration4. Each risk factor was transformed as described in Supplementary methods Table 2 of the SCORE2 publication4. Each transformed risk factor and risk factor × age interaction was then multiplied by the log hazard ratio obtained in the SCORE2 sensitivity analysis that excluded UK Biobank participants from the log hazard ratio estimation (obtained from Supplementary Table 8 of the SCORE2 publication4 under the “Excluding UK Biobank” heading). These log hazard ratios were used to prevent overestimation of the efficacy of SCORE2 for 10-year CVD risk prediction in this study (Figure S11), which could result in underestimation of any potential improvements in risk discrimination from addition of biomarkers or PRS.
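The structure of this calculation can be sketched in Python as below. The sketch is illustrative only: the centring and scaling constants are our recollection of the SCORE2 transformations and should be checked against the SCORE2 publication, and every log hazard ratio shown is a placeholder, not a published SCORE2 value.

```python
# Placeholder log hazard ratios for one sex: NOT the published SCORE2 values.
EXAMPLE_LOG_HR = {
    "age": 0.40, "smoking": 0.60, "sbp": 0.30, "tchol": 0.15, "hdl": -0.25,
    "smoking_x_age": -0.08, "sbp_x_age": -0.03,
    "tchol_x_age": -0.02, "hdl_x_age": 0.03,
}


def transform(age, current_smoker, sbp, tchol, hdl):
    """Centre and scale risk factors (constants assumed, see lead-in)."""
    return {
        "age": (age - 60.0) / 5.0,          # years
        "smoking": 1.0 if current_smoker else 0.0,
        "sbp": (sbp - 120.0) / 20.0,        # mmHg
        "tchol": tchol - 6.0,               # mmol/L
        "hdl": (hdl - 1.3) / 0.5,           # mmol/L
    }


def linear_predictor(x, log_hr):
    """Sum main effects and risk factor x age interactions."""
    main = sum(log_hr[k] * x[k] for k in x)
    interactions = sum(log_hr[k + "_x_age"] * x[k] * x["age"]
                       for k in x if k != "age")
    return main + interactions
```

Under these assumed transformations, a 60-year-old non-smoker at the reference values of each risk factor has a linear predictor of zero.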
The sex-stratified C-index for SCORE2 in the 168,517 study participants was computed directly from this SCORE2 linear predictor using the concordance function in the survival R package version 3.3-1. The 95% confidence interval was computed from the standard error, which was computed by the survival R package using the infinitesimal jackknife method.
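To make the quantity concrete, a stratified version of Harrell's C-index can be sketched in pure Python as follows. This is a simplified illustration and not the survival package's algorithm: pairs are compared only within the same stratum, counts are pooled across strata, pairs with tied follow-up times are skipped, and ties in the score count as half-concordant. These tie-handling choices are assumptions of the sketch.

```python
def stratified_c_index(time, event, score, stratum):
    """Harrell's C with comparisons restricted to pairs in the same stratum.

    A pair is comparable when the participant with the shorter follow-up
    time had an event; it is concordant when that participant also has the
    higher score. Runs in O(n^2), so suitable only for small examples.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # the earlier member of a comparable pair must be a case
        for j in range(n):
            if stratum[i] != stratum[j] or time[j] <= time[i]:
                continue
            comparable += 1
            if score[i] > score[j]:
                concordant += 1.0
            elif score[i] == score[j]:
                concordant += 0.5
    return concordant / comparable
```

Because males and females are never compared with each other, differences in baseline hazard between sexes do not contribute to the estimate.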
Incremental value in 10-year CVD risk prediction for individual biomarkers
Incremental improvement in 10-year CVD risk prediction for individual biomarkers beyond SCORE2 alone was assessed using differences in C-index from SCORE2 alone (ΔC-index) (Figure 2, Table S3). Incremental improvement in 10-year CVD risk was assessed for the 249 NMR biomarkers (Table S1) and 28 of the 30 clinical biochemistry assay biomarkers (Table S2): clinical biochemistry assays for HDL cholesterol and total cholesterol were not assessed here as they were used to compute the SCORE2 linear predictor (see above).
For each biomarker, we fit a sex-stratified Cox proportional hazards regression for 10-year CVD risk with the biomarker as an independent variable and SCORE2 as an offset term in the 168,517 study participants (5,096 incident CVD cases) (Figure 2, Table S3A). SCORE2 was treated as an offset term, rather than an independent variable, as we sought to develop scores that added biomarkers to the existing SCORE2 weights. Cox proportional hazards regressions were fit using the coxph function in the survival R package version 3.3-1. The hazard ratio, absolute C-index, and its standard error were obtained directly from the returned result for each SCORE2 + biomarker model (Table S3A). The standard error was computed by the survival R package using the infinitesimal jackknife method.
The ΔC-index was subsequently computed for each SCORE2 + biomarker model by subtracting the sex-stratified C-index for the SCORE2 linear predictor as described above (Figure 2, Table S3A). A bootstrap procedure with 1,000 bootstraps was used to estimate the standard error for the ΔC-index. Bootstrap resampling was performed using methods appropriate for right-censored data58, using the censboot function in the boot R package version 1.3-28. 95% confidence intervals and two-sided P-values were computed from the bootstrap standard error using the first order normal approximation method. P-values were corrected for multiple testing across the 277 tested biomarkers using Benjamini-Hochberg false-discovery rate (FDR) correction.
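The inference steps described here (normal-approximation confidence interval and two-sided P-value from a bootstrap standard error, followed by Benjamini-Hochberg adjustment) can be sketched in Python as below. The function names are hypothetical and the numbers in the usage example are illustrative, not study results.

```python
import math
from statistics import NormalDist


def normal_summary(delta, se, level=0.95):
    """Confidence interval and two-sided P-value under a normal approximation."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    z = delta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return delta - z_crit * se, delta + z_crit * se, p_value


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative values only.
lower, upper, p = normal_summary(delta=0.004, se=0.001)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```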
In addition to sex-stratified analysis, incremental improvements in C-index beyond SCORE2 alone were similarly assessed in sex-specific analysis (Table S3B–C); performed separately in the 72,441 male participants (3,208 incident CVD cases) (Table S3B) and in the 96,076 female participants (1,888 incident CVD cases) (Table S3C).
NMR biomarker score training
Four NMR biomarker scores were developed: (1) an NMR score in males for predicting 10-year risk of coronary heart disease (CHD), (2) an NMR score in females for predicting 10-year risk of CHD, (3) an NMR score in males for predicting 10-year risk of ischaemic stroke, and (4) an NMR score in females for predicting 10-year risk of ischaemic stroke.
Incident CHD was defined as fatal or non-fatal myocardial infarction (ICD-10 codes I21–I24 or I25.2) or by presence of major coronary surgery (ICD-10 codes Z95.1 or OPCS-4 codes K40–K46, K49, K50.1, K75). Incident ischaemic stroke was determined using the respective UK Biobank algorithmically defined outcome (field #42008). Follow-up time was defined for each endpoint separately; i.e., a non-fatal ischaemic stroke event prior to CHD was not counted as a competing risk when defining follow-up for incident CHD and vice versa. Case numbers and follow-up characteristics are reported in Table S4A for CHD and Table S4B for ischaemic stroke.
Each of the four NMR biomarker scores was developed using elastic-net penalised Cox proportional hazards regression28,29 in data from the 168,517 participants. A nested cross-validation procedure was used, comprising a 5-fold outer layer and 10-fold inner layer (Figure S2). For each iteration at the outer layer, NMR biomarker scores were trained in 4/5ths of the data, then predicted in the withheld 1/5th of the data (test fold) (Figure S2A). Model training in each iteration used 10-fold cross-validation (inner layer) for hyperparameter tuning of the elastic-net model (Figure S2B). Random allocation of participants to test folds was performed using the caret R package version 6.0-92 to balance the number of males and females, and incident cases within males and females, across test folds. Test folds at the inner cross-validation layer were balanced by incident CHD or ischaemic stroke cases respectively when training NMR scores for the respective endpoint. Test folds at the outer cross-validation layer were balanced by incident CVD cases as the outer cross-validation layer was also used for the training procedure used to combine NMR scores for CHD and stroke described in the next section.
For each of the five iterations of training, sex- and endpoint-specific NMR scores were trained using elastic-net penalised Cox proportional hazards regression using the glmnet R package version 4.1-6 (Figure S2A). Each NMR biomarker score was trained with the 106 non-derived NMR biomarkers with SCORE2 as an offset term. Biomarkers were classified as non-derived where they could not be computed by summing or dividing two or more other biomarkers27 (Table S1).
Prior to NMR biomarker score training, missing data in the 106 NMR biomarkers were imputed as glmnet could not handle missing data. Missing data were imputed a single time in the 168,517 participants using the impute R package version 1.70.0 with the K-nearest neighbours algorithm59. K was set to 20 based on the number of principal components that cumulatively explained >95% of the variation in the 106 non-derived biomarkers. Prior to imputation, >92% of participants had no missing NMR biomarker concentrations, >6% had only one biomarker missing, and the remaining <2% had 2–5 of 106 biomarkers missing. In total 0.1% of biomarker concentrations were imputed. Per-biomarker missingness rates are given in Table S1. Biomarker concentrations were standardised in males and females separately so that sex-specific coefficients fit by glmnet were comparable across biomarkers.
For each of the five training iterations, the optimal elastic-net fit was determined with a grid search of α and λ where the range of α was [0 (ridge regression), 0.1, 0.25, 0.5, 0.75, 0.9, and 1 (lasso regression)] and a sequence of 100 λ values was automatically determined by the glmnet function for each α (Figure S3A). β coefficients estimated for each biomarker for the optimal fit in each training data are shown in Figure S3C. The proportion of variance in the NMR score explained by each biomarker was calculated as the ratio of its β² to the sum of β² across all biomarkers.
To obtain a single representative score (i.e., for computing NMR scores in new samples), coefficients were averaged across the five training iterations (Figure S3C, Table S5). Proportions of variance explained in the NMR score by each biomarker were also averaged across the five training iterations (Table S5).
To avoid overfitting in our downstream analyses of individual NMR scores (Figure 3A, Figure S4, Figure S5A, Figure S6, Table S6), the aggregate of the predicted NMR scores in the five withheld test folds was used (Figure S2A).
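The two summary calculations described above, the share of each biomarker's squared coefficient and the averaging across the five training iterations, can be sketched in Python as follows. Biomarker names and coefficient values are hypothetical, and the sketch assumes every fold reports a coefficient for every biomarker (zero where the elastic-net excluded it).

```python
def variance_proportions(betas):
    """Share of each biomarker's squared coefficient in the total."""
    total = sum(b * b for b in betas.values())
    if total == 0.0:
        # All coefficients shrunk to zero: no variance to apportion.
        return {name: 0.0 for name in betas}
    return {name: b * b / total for name, b in betas.items()}


def average_over_folds(fold_values):
    """Average per-biomarker values across training iterations."""
    n_folds = len(fold_values)
    return {name: sum(fold[name] for fold in fold_values) / n_folds
            for name in fold_values[0]}


# Hypothetical coefficients from two training iterations.
folds = [{"bm1": 3.0, "bm2": 4.0}, {"bm1": 1.0, "bm2": 0.0}]
mean_beta = average_over_folds(folds)
mean_share = average_over_folds([variance_proportions(f) for f in folds])
```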
Independent associations of NMR scores, PRS, and risk factors with 10-year CVD risk
Two sets of multivariable Cox proportional hazards regressions were fit in the 168,517 participants to assess independent associations between predicted NMR scores, PRSs, and SCORE2 risk factors with 10-year CVD risk prediction (Figure 3A, Figure S5A, Table S6): (1) Multivariable Cox proportional hazards regression was fit with the CHD NMR score, stroke NMR score, CHD PRS, and stroke PRS as predictor variables and SCORE2 as an offset term (Figure 3A, Figure S5A, Table S6A); and (2) multivariable Cox proportional hazards regression was fit additionally with the individual SCORE2 risk factors as predictor variables instead of using SCORE2 as an offset term (Figure S6, Table S6B). Both multivariable models were fit in sex-stratified analysis and sex-specific (fit in the male and female participants separately) analyses (Figure 3A, Figure S5A, Figure S6, Table S6).
Combining NMR biomarker scores and/or PRSs with SCORE2
NMR scores and PRSs were combined with SCORE2 into three new scores for predicting 10-year CVD risk: (1) SCORE2 + NMR scores, (2) SCORE2 + PRSs, and (3) SCORE2 + NMR scores + PRSs.
Sex-specific combined linear predictors were created by adding zero-centred linear predictors for the CHD NMR score, stroke NMR score, CHD PRS, and stroke PRS to the SCORE2 linear predictor after multiplying each score by scaling factors weighting their relative contributions to 10-year CVD prediction over SCORE2 alone. Sex- and model-specific scaling factors were estimated using Cox proportional hazards regression fit for 10-year CVD risk with the relevant respective scores as independent predictor variables and SCORE2 as an offset term.
To avoid overfitting, offsets for score centring and scaling factors were calculated as part of the cross-validation procedure described above (Figure S2C). For downstream analyses (Figure 3B–C, Figure 4, Figure S5B–C, Figure S9, Table S8–S10, Table S12), the aggregate of the predicted values for the combined linear predictors across the five withheld test folds was used (Figure S2A). To obtain a single set of representative model coefficients for computing the combined scores in new samples, model coefficients were averaged across the five training iterations (Table S7).
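The combination step can be written compactly: each score is centred by its offset, multiplied by its scaling factor, and added to the SCORE2 linear predictor. The Python sketch below illustrates this under hypothetical score names, offsets, and scaling factors; in the study these quantities were estimated in the training folds.

```python
def combined_linear_predictor(score2_lp, scores, offsets, scaling):
    """SCORE2 linear predictor plus centred, scaled additional scores.

    scores, offsets, scaling: dicts keyed by score name. Only scores with
    a scaling factor are added, so the same function covers
    SCORE2 + NMR scores, SCORE2 + PRSs, and SCORE2 + NMR scores + PRSs.
    """
    return score2_lp + sum(scaling[name] * (scores[name] - offsets[name])
                           for name in scaling)


# Hypothetical values for one participant.
lp = combined_linear_predictor(
    score2_lp=1.0,
    scores={"chd_nmr": 0.5, "chd_prs": 1.2},
    offsets={"chd_nmr": 0.1, "chd_prs": 0.2},
    scaling={"chd_nmr": 0.5, "chd_prs": 0.25},
)
```

With SCORE2 entered as an offset when estimating the scaling factors, the SCORE2 contribution keeps a coefficient of one, which is why it appears unscaled here.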
Incremental value in 10-year CVD risk prediction for NMR biomarker scores and PRSs
Incremental improvement in 10-year CVD risk prediction beyond SCORE2 alone for each of the three combined scores (above) was assessed using differences in sex-stratified C-index in the 168,517 study participants (5,096 incident CVD cases) (Figure 3B, Table S8).
Sex-stratified C-indices and their infinitesimal jackknife standard errors were computed directly from the linear predictors for each of the three models using the concordancefit function in the survival package. The sex-stratified ΔC-index was computed as the difference between the C-index for each model and the C-index for SCORE2. 95% confidence intervals and P-values were calculated from the bootstrap standard error after estimating the ΔC-index in a bootstrap procedure with 1,000 bootstraps. Bootstrap resampling was performed using methods appropriate for right-censored data58, using the censboot function in the boot R package version 1.3-28.
In addition to sex-stratified analysis, incremental improvements in C-index beyond SCORE2 alone were similarly assessed in sex-specific analysis (Figure S5B, Table S8); performed separately in the 72,441 male participants (3,208 incident CVD cases) and in the 96,076 female participants (1,888 incident CVD cases).
Computation of absolute risks
Linear predictors for SCORE2, SCORE2 + NMR scores, SCORE2 + PRSs, and SCORE2 + NMR scores + PRSs were converted into predictions of absolute 10-year CVD risk using formulae developed by the SCORE2 working group and ESC Cardiovascular risk collaboration4.
First, uncalibrated absolute risks were calculated using sex-specific estimates of baseline survival, which they calculated as the median baseline survival across 44 cohorts (including UK Biobank)4. Second, these were converted into absolute 10-year risks calibrated to the UK population using their scaling factors for the low-risk European region (which included the UK) applied in their risk recalibration formula4. Distributions of absolute risk for each model are shown in Figure S8.
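The two-step conversion can be sketched in Python as below. The functional forms are our recollection of the SCORE2 formulae (a baseline survival raised to the exponentiated linear predictor, followed by a linear recalibration on the complementary log-log scale) and should be checked against the SCORE2 publication. The baseline survival and scaling factors used in the example are placeholders, not the published sex-specific values.

```python
import math


def uncalibrated_risk(linear_predictor, baseline_survival):
    """Absolute 10-year risk before regional recalibration (assumed form)."""
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


def calibrated_risk(risk, scale1, scale2):
    """Recalibrate on the complementary log-log scale (assumed form)."""
    cloglog = math.log(-math.log(1.0 - risk))
    return 1.0 - math.exp(-math.exp(scale1 + scale2 * cloglog))


# Placeholder parameters, for illustration only.
raw = uncalibrated_risk(linear_predictor=0.5, baseline_survival=0.96)
recalibrated = calibrated_risk(raw, scale1=-0.5, scale2=0.75)
```

Setting scale1 to 0 and scale2 to 1 leaves the risk unchanged, which is a convenient check that the recalibration step is implemented consistently.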
Incremental value in stratification at risk thresholds used for clinical decision making
Categorical net reclassification improvement (NRI) analysis60,61 was used to assess the incremental value of NMR scores and/or PRSs over SCORE2 for stratifying individuals based on risk thresholds used for clinical decision making.
Participants were stratified into categories of low risk, medium risk, and high risk based on their absolute 10-year CVD risk predicted by each model using risk thresholds recommended by the ESC 2021 guidelines for CVD prevention3 (Table S9). Participants <50 years of age were allocated to the low-risk group if their absolute 10-year CVD risk was < 2.5%, to the medium-risk group if their absolute 10-year CVD risk was ≥ 2.5% and < 7.5%, and to the high-risk group if their absolute 10-year CVD risk was ≥ 7.5%. Participants 50 years or older were allocated to the low-risk group if their absolute 10-year CVD risk was < 5%, to the medium-risk group if their absolute 10-year CVD risk was ≥ 5% and < 10%, and to the high-risk group if their absolute 10-year CVD risk was ≥ 10%.
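The age-dependent thresholds described above translate directly into a small classification function. The Python sketch below uses the thresholds stated in the text, with risk expressed as a proportion; the function name is hypothetical.

```python
def risk_category(age, risk):
    """Classify absolute 10-year CVD risk (as a proportion) by age group."""
    # Thresholds from the text: 2.5% / 7.5% under 50 years, 5% / 10% otherwise.
    low_cut, high_cut = (0.025, 0.075) if age < 50 else (0.05, 0.10)
    if risk < low_cut:
        return "low"
    if risk < high_cut:
        return "medium"
    return "high"
```

Note that the same predicted risk can fall into different categories depending on age: a 7.5% risk is high-risk at age 45 but medium-risk at age 50.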
The three models adding NMR scores and/or PRSs to SCORE2 were assessed in comparison to SCORE2 alone in categorical NRI analysis using the nricens R package version 1.6. Categorical NRI analysis was used to assess (1) the % of incident CVD cases correctly reclassified from a lower risk group into higher risk group, and (2) the % of non-cases correctly reclassified from a higher risk group into a lower risk group. Categorical NRI analysis was performed separately in all participants (Figure 3C), male participants (Figure S5C), and female participants (Figure S5C). Bootstrap resampling of the categorical NRI analysis was performed using the nricens R package, and 95% confidence intervals and P-values were subsequently calculated from the bootstrap standard error (Figure 3C, Figure S5C, Table S10).
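The two components of the categorical NRI can be illustrated with a simplified Python sketch. Unlike the nricens package, which accounts for censored follow-up, this sketch assumes a known binary case status for every participant; the function name and example data are hypothetical.

```python
RANK = {"low": 0, "medium": 1, "high": 2}


def categorical_nri(old, new, is_case):
    """Net reclassification among cases and among non-cases.

    old, new: risk categories under the reference and the new model.
    Returns (net proportion of cases moved up, net proportion of
    non-cases moved down).
    """
    case_net = noncase_net = n_cases = n_noncases = 0
    for before, after, case in zip(old, new, is_case):
        move = RANK[after] - RANK[before]
        direction = (move > 0) - (move < 0)  # +1 up, -1 down, 0 unchanged
        if case:
            n_cases += 1
            case_net += direction        # moving up is correct for cases
        else:
            n_noncases += 1
            noncase_net -= direction     # moving down is correct for non-cases
    return case_net / n_cases, noncase_net / n_noncases


nri_cases, nri_noncases = categorical_nri(
    old=["low", "medium", "high", "medium"],
    new=["medium", "medium", "medium", "low"],
    is_case=[True, True, False, False],
)
```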
Population simulation
For downstream analyses we simulated a hypothetical population of approximately 100,000 individuals with age- and sex-structure and expected 10-year CVD incidence rates representative of the UK population eligible for 10-year CVD risk screening with SCORE2 (Figure S8, Table S11).
Sex-specific CVD incidence rates expected in each five-year age-group (Figure S8) were obtained from Table A in the S2 Text in Sun et al 202111. These CVD incidence rates were obtained by Sun et al. from a random sample of 2.1 million people amongst 11.3 million CVD- and statin-free primary care patients 35–74 years of age who, between 2004–2017, attended any of 674 general practices opting into data linkage to the UK Clinical Practice Research Datalink (CPRD)62. The published per-1000-year CVD incidence rates in the five-years ahead (Table A in the S2 Text in Sun et al 202111; Figure S8) were converted into percentages of males and females expected to have an incident CVD event within 10 years (Figure S8) as 1 − exp(−rate/1000). Numbers of CVD- and statin-free individuals in each sex- and age-group in CPRD were not published4,11. Therefore, we derived the sizes of each age- and sex-group using the age- and sex-structure of the general UK population in the mid-2020 UK population estimates published by the Office for National Statistics30.
The simulated population was derived by multiplying by 100,000 the relative sizes of each age-group and sex among those 40–69 in the mid-2020 UK population estimate, then multiplying by the percentages expected to have incident CVD within 10 years derived from CPRD (Figure S8, Table S11).
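The construction of the simulated population can be sketched in Python as below, assuming a constant hazard within each age- and sex-group. The multiplication of the rate by a number of years is an assumption of this sketch, made so that a per-person-year rate maps onto a chosen horizon; setting years=1 gives 1 − exp(−rate/1000). The population share and incidence rate in the example are hypothetical, not CPRD or Office for National Statistics values.

```python
import math


def cumulative_risk(rate_per_1000_py, years=10.0):
    """Probability of an event within `years` under a constant hazard."""
    return 1.0 - math.exp(-rate_per_1000_py / 1000.0 * years)


def simulate_group(population_share, rate_per_1000_py, total=100_000):
    """Expected group size and expected incident CVD events in that group."""
    n_people = population_share * total
    return n_people, n_people * cumulative_risk(rate_per_1000_py)


# Hypothetical age- and sex-group: 10% of the population, 10 events
# per 1000 person-years.
n_people, n_events = simulate_group(0.10, 10.0)
```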
Incremental value for CVD prevention with population-wide screening
To assess the relative benefits for CVD prevention of each model if applied at scale to the UK population (Figure 4A, Table S12A) we stratified the simulated population into the low-, medium-, and high-risk groups based on the proportions allocated to each category in UK Biobank by SCORE2 alone (Figure S9A), SCORE2 + NMR scores, SCORE2 + PRSs, and SCORE2 + NMR scores + PRSs. We modelled statin initiation in the high-risk group, who based on their risk thresholds would be recommended for risk factor treatment by the ESC 2021 guidelines for CVD prevention3. The impact of statin initiation was modelled as preventing one in five simulated incident CVD events assuming a 20% reduction in 10-year CVD risk31.
Benefits for CVD prevention were quantified for each model using five statistics: (1) the number of people classified as high-risk (Nhigh-risk), (2) the number of incident CVD events amongst those classified as high-risk (CVDhigh-risk), (3) the expected number of events prevented by statin initiation in the high-risk group (CVDprevented), (4) the number of people needed to screen to prevent one CVD event (NNS; calculated as N/CVDprevented), and (5) the number of statins prescribed per CVD event prevented (NNT; calculated as Nhigh-risk/CVDprevented). Incremental benefits for CVD prevention were assessed by differences in these five statistics from SCORE2 alone (ΔNhigh-risk, ΔCVDhigh-risk, ΔCVDprevented, ΔNNS, and ΔNNT). 95% confidence intervals and P-values were calculated from the bootstrap standard error estimated for each statistic in a bootstrap procedure with 1,000 bootstraps. Bootstrap resampling was performed using methods appropriate for right-censored data58, using the censboot function in the boot R package version 1.3-28.
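The five statistics follow from three inputs: the number screened, the number classified as high-risk, and the number of incident events in the high-risk group. The Python sketch below illustrates the arithmetic with hypothetical inputs. NNT is computed here as the number of people classified as high-risk (and therefore initiated on statins) divided by events prevented, which is our reading of "statins prescribed per CVD event prevented".

```python
def prevention_stats(n_screened, n_high_risk, cvd_high_risk,
                     risk_reduction=0.20):
    """Summary statistics for statin initiation in the high-risk group."""
    # Statins assumed to prevent one in five events (20% risk reduction).
    prevented = cvd_high_risk * risk_reduction
    return {
        "N_high_risk": n_high_risk,
        "CVD_high_risk": cvd_high_risk,
        "CVD_prevented": prevented,
        "NNS": n_screened / prevented,   # screened per event prevented
        "NNT": n_high_risk / prevented,  # statins per event prevented
    }


# Hypothetical inputs, not study results.
stats = prevention_stats(n_screened=100_000, n_high_risk=10_000,
                         cvd_high_risk=1_000)
```

Incremental benefits relative to SCORE2 alone are then simple differences between the dictionaries returned for two models.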
Incremental value for CVD prevention with targeted screening
Incremental benefits to CVD prevention were further modelled when using SCORE2 + NMR scores, SCORE2 + PRSs, or SCORE2 + NMR scores + PRSs for targeted risk-reclassification of those classified as medium risk by SCORE2 alone (Figure 4B, Table S12B), for whom the ESC 2021 guidelines suggest considering, but do not explicitly recommend, risk factor treatment3. For targeted screening, the SCORE2-classified medium-risk subset of the simulated population was re-stratified into the low-, medium-, and high-risk groups based on proportions re-stratified into each category in UK Biobank by each alternative model (Figure S9B, Methods). After targeted screening, the high-risk group comprised those classified as high-risk either by SCORE2 alone or the alternative model incorporating NMR scores and/or PRSs. Benefits for CVD prevention were quantified as described above.
Ethics statement
This study was approved under UK Biobank Project 30418 and ethics approval was obtained from the North West Multi-Center Research Ethics Committee. UK Biobank participants provided written informed consent for health-related research48.
Data Availability
All data described are available through UK Biobank subject to approval from the UK Biobank access committee. See https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access for further details.
Code Availability
Code underlying this paper is available at https://github.com/sritchie73/UKB_NMR_CVD_prediction/. This repository and specific release for this paper are permanently archived by Zenodo at https://doi.org/10.5281/zenodo.10059234.
Competing Interests
During the course of this project P.S. became a full-time employee of GSK Plc. All significant contributions to this study were made prior to this role and GSK Plc had no input to the study. J.D. serves on scientific advisory boards for AstraZeneca, Novartis, and UK Biobank, and has received multiple grants from academic, charitable and industry sources outside of the submitted work. A.S.B. reports institutional grants from AstraZeneca, Bayer, Biogen, BioMarin, Bioverativ, Novartis, Regeneron and Sanofi. The remaining authors declare no competing interests.
Acknowledgements
The authors are grateful to UK Biobank for access to data to undertake this study (Project #30418).
Nightingale Health Plc is acknowledged for early access to the UK Biobank NMR biomarker data and discussions regarding sources of experimental variation.
This work was performed using resources provided by the Cambridge Service for Data Driven Discovery (CSD3) operated by the University of Cambridge Research Computing Service (www.csd3.cam.ac.uk), provided by Dell EMC and Intel using Tier-2 funding from the Engineering and Physical Sciences Research Council (capital grant EP/P020259/1), and DiRAC funding from the Science and Technology Facilities Council (www.dirac.ac.uk).
This work was supported by core funding from the Cambridge BHF Centre of Research Excellence (RE/18/1/34212) and BHF Chair Award (CH/12/2/29428).
S.C.R. and S.K. were funded by a British Heart Foundation (BHF) Programme Grant (RG/18/13/33946).
S.C.R. was also funded by the National Institute for Health and Care Research (NIHR) Cambridge BRC (BRC-1215-20014; NIHR203312) [*]. X.J. was funded by British Heart Foundation (CH/12/2/29428) and Wellcome Trust (227566/Z/23/Z). L.P. and P.S. were supported by a Rutherford Fund Fellowship from the Medical Research Council grant MR/S003746/1. Y.X. and M.I. were supported by the UK Economic and Social Research Council (ES/T013192/1). S.A.L. was supported by a Canadian Institutes of Health Research postdoctoral fellowship (MFE-171279). E.D.A. holds a NIHR Senior Investigator Award. J.D. holds a BHF Professorship and a NIHR Senior Investigator Award. M.I. is supported by the Munz Chair of Cardiovascular Prediction and Prevention and the NIHR Cambridge Biomedical Research Centre (NIHR203312).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
*The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.
Footnotes
Corrected typo in one of the corresponding author's email address in the manuscript and supplementary information files.