Abstract
Background Thoracic aortic disease is an important cause of morbidity and mortality in the US, and aortic diameter is a heritable contributor to risk. Could a polygenic prediction of ascending aortic diameter improve detection of aortic aneurysm?
Methods Deep learning was used to measure ascending thoracic aortic diameter in 49,939 UK Biobank participants. A genome-wide association study (GWAS) was conducted in 39,524 participants and leveraged to build a 1.1 million-variant polygenic score with PRScs-auto. Aortic diameter prediction models were built with the polygenic score (“AORTA Gene”) and without it. The models were tested in a held-out set of 4,962 UK Biobank participants and externally validated in 5,469 participants from Mass General Brigham Biobank (MGB), 1,298 from the Framingham Heart Study (FHS), and 610 participants from All of Us.
Results In each test set, the AORTA Gene model explained more of the variance in thoracic aortic diameter compared to clinical factors alone: 39.9% (95% CI 37.8-42.0%) vs 29.2% (95% CI 27.1-31.4%) in UK Biobank, 36.5% (95% CI 34.4-38.5%) vs 32.5% (95% CI 30.4-34.5%) in MGB, 41.8% (95% CI 37.7-45.9%) vs 33.0% (95% CI 28.9-37.2%) in FHS, and 34.9% (95% CI 28.8-41.0%) vs 28.9% (95% CI 22.9-35.0%) in All of Us. AORTA Gene had a greater AUROC for identifying diameter ≥4cm in each test set: 0.834 vs 0.765 (P=7.3E-10) in UK Biobank, 0.808 vs 0.767 in MGB (P=4.5E-12), 0.856 vs 0.818 in FHS (P=8.5E-05), and 0.827 vs 0.791 (P=7.8E-03) in All of Us.
Conclusions Genetic information improved estimation of thoracic aortic diameter when added to clinical risk factors. Larger and more diverse cohorts will be needed to develop more powerful and equitable scores.
Introduction
Thoracic aortic disease is an important cause of morbidity and mortality1, 2. Ascending thoracic aortic enlargement is a well-established risk factor for ascending aortic dissection3, 4. Indeed, the majority of dissections occur in individuals with aortic diameter ≥4cm, including more than 90% in the International Registry of Aortic Dissection and 100% in Kaiser4, 5. Contemporary guidelines do not describe a role for population screening of thoracic aortic aneurysm6, and universal imaging would likely be impractical.
These limitations have spurred the development of clinical risk scores to identify individuals who are likely to have aortic enlargement on confirmatory imaging7–9. These scores have been shown to explain approximately 30% of the variance in ascending aortic diameter, with an area under the receiver operator characteristic curve (AUROC) near 0.78 for detecting individuals with diameter ≥4cm. Population genetics studies suggest that ascending aortic diameter is highly heritable, with a proportion of variance attributable to common genetic factors of over 60%10. We therefore sought to assess whether incorporating polygenic risk might improve estimation of ascending aortic diameter over clinical factors alone.
Methods
Study design
Model development and internal validation were conducted in UK Biobank, with external validation within the All of Us cohort, the Mass General Brigham Biobank (MGB), and the Framingham Heart Study (FHS). This study was reported in accordance with the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement11.
Study protocols complied with the tenets of the Declaration of Helsinki. All UK Biobank participants provided written informed consent12, and only those participants who had not withdrawn consent as of March 7, 2023 were analyzed. The UK Biobank analyses were considered exempt by the UCSF Institutional Review Board (IRB), #22-37715. UK Biobank analyses were conducted under application #41664. Each All of Us biobank participant provided written informed consent13. The All of Us analyses were considered exempt by the UCSF IRB, #22-37715. The MGB study protocols were approved with a waiver of informed consent by the Mass General Brigham IRB. All FHS participants provided written informed consent, and the FHS imaging analyses were previously approved by the IRB of the Boston University Medical Center14. The FHS analyses were also approved by the MGB IRB.
Study populations
In UK Biobank participants, ascending aortic diameter was measured from magnetic resonance imaging (MRI) using a previously described deep learning model15. Model development and internal validation were conducted in UK Biobank. External validation of the models was pursued in MGB participants with ascending aortic diameter measured from TTE for clinical indications16; in FHS participants with ascending aortic diameter measured for research purposes from non-contrast computed tomography (CT)14; and in All of Us biobank participants who had clinical aortic diameter measurements from transthoracic echocardiography (TTE)13. In MGB, race or ethnicity was extracted from the electronic health record16 and the validation analyses were repeated in individuals identified as Black.
Modality-specific differences
The clinical approach to measuring ascending aortic diameter differs by modality. UK Biobank MRI-based aortic diameter measurements were derived from the blood pool diameter only, incorporating neither aortic wall. For TTE, used in the MGB and All of Us cohorts, standard measurements use the “leading edge to leading edge” approach, incorporating one wall. For non-contrast CT measurements, used in FHS, both walls are incorporated into the diameter.
Splitting the UK Biobank for training and validation
49,939 UK Biobank participants with MRI measurements of aortic diameter were randomly assigned a number from 0-999. Those from 0-799 contributed to the genome- wide association study (GWAS) used to produce the polygenic score (N=39,524). Those from 800-999 related within 3 degrees of kinship to the 0-799 group were excluded to avoid polygenic score overfitting (N=557). Those remaining from 0-899 contributed to the development of non-genetic models (N=44,420). Those from 800-899 were included in the derivation of weights for the polygenic score component of the AORTA Gene combined clinical and genetic model (N=4,896). Those ≥900 were analyzed for validation (N=4,962; Figure 1).
Thoracic aortic aneurysm in All of Us
Within All of Us, thoracic aortic aneurysm was defined as the first occurrence of SNOMED codes 433068007, 74883004, or 426948001; ICD10CM codes of I71.01, I71.1, I71.2, or I77.810; ICD9CM codes of 441.01, 441.1, 441.2, or 447.71; ICD10PCS codes of 02RX0JZ or 02RW0JZ; the ICD9Proc code 39.73; or CPT4 codes 33863, 75957, 33880, 75956, or 33881. Prevalent analyses were conducted based on the presence of disease labels prior to the time of enrollment. Time-to-event analyses were conducted beginning at the time of enrollment and censored on July 1, 2022 using the R survival package, excluding participants whose diagnoses occurred prior to enrollment. Any tabular data with an All of Us sample count <20 was modified in accordance with All of Us policies.
Genome-wide association study and polygenic scores
Common genetic contributions to ascending aortic diameter in the 39,524 GWAS participants were discovered using REGENIE v3.2.717 with imputed variants provided by UK Biobank18, 19. GWAS covariates included age, age2, sex, the MRI serial number, the genotyping array, and the first ten principal components of ancestry. The summary statistics were trimmed to the ∼1.1 million UK Biobank-specific variants within HapMap3 identified by the PRScs authors, and underwent Bayesian weighting by PRScs-auto on its default settings to produce polygenic score weights20.
The polygenic score was applied to all UK Biobank participants with imputed genetic data19. Variants with imputation quality <0.3 were removed.
The polygenic score was applied to all MGB participants with imputed genetic data16, 21 with imputation into the TOPMed panel22.
The polygenic score was applied to FHS participants with CT and genetic data available. For FHS, genotyping was done on the Affymetrix GeneChip 500K Array Set & 50K Human Gene Focused Panel (Affymetrix, Santa Clara, CA, USA). Variants with call rate <97% or Hardy–Weinberg equilibrium P<1E-06 were excluded. The remaining variants were then imputed to the 1000 Genomes Project phase I release 3 panel by MACH version 1.023, 24. Variants with imputation quality <0.3 were removed.
The polygenic score was applied to All of Us participants with whole genome sequencing data available in release “R7”25.
Within each set of participants, the polygenic score was residualized for the first 20 principal components of ancestry (except for FHS, where 10 were used due to availability). The polygenic score in each group was zero-centered and scaled by the standard deviation from the UK Biobank training set: for studies that applied the average polygenic score (UK Biobank, All of Us) the standard deviation was 3.4E-07, and for those that applied the summed polygenic score (MGB, FHS) the standard deviation was 0.38.
Ascending aortic diameter model training
Hierarchical group least absolute shrinkage and selection operator models were built to estimate ascending aortic diameter using the R package glinternet26. Three models were constructed with glinternet v1.0.11 using the following independent variables: (a) age, age2, and sex; (b) age, age2, sex, and polygenic score; (c) age, age2, body mass index (BMI), heart rate, systolic and diastolic blood pressure, height, weight, sex, and a history of diabetes, hypertension, or hyperlipidemia. A fourth model (d) was constructed by taking a linear combination of the model (c) and the polygenic score. The weighting for model (c) was estimated in all participants randomly assigned to 0-899, while the weighting for model (b) and the linear combination for model (d) were estimated in those assigned to 800-899 to avoid overlap with the GWAS cohort. For ease of reference, we label model (c) “AORTA Score” (AORTA: optimized regression for thoracic aneurysm) and model (d) “AORTA Gene”.
Statistical analysis
All statistical significance tests were two-sided, with significance defined as P < 0.05, and performed in R 4.2.3 unless otherwise stated. For visualization, scores were plotted against measured thoracic aortic diameter.
Outcomes
The primary outcome was correlation between the scores and ascending aortic diameter using linear models, expressed as R2 (variance explained). Secondary outcomes included tests of calibration, and the performance of the scores for identifying ascending aortic diameter ≥4.0cm; the clinical diagnosis of aneurysm was also tested in All of Us.
To assess calibration, each score was used as an independent variable in a linear model predicting aortic diameter that also had an intercept, permitting significance testing of both the slope and the intercept. Mean absolute error was also computed.
To assess the performance for identifying diameter ≥4cm, the AUROC was computed. Confusion matrices and their derived statistical measures were produced based on the presence or absence of aortic diameter ≥4cm and the presence or absence of predicted diameters above a fixed threshold determined in the training set for each score. The thresholds for the confusion matrix analysis were defined in the 4,896 participants in the training set.
AUROCs were compared using the DeLong test27. The continuous net reclassification index (NRI) was calculated following the formula of Pencina, et al28.
Data and code availability
Computed scores are returned to UK Biobank, where data are made available by UK Biobank to researchers from research institutions with genuine research inquiries, following IRB and UK Biobank approval. FHS data are made available to researchers with approved research applications. The dbGAP study accession number used for FHS validation was #phv00076329.v1.p5 for ascending aortic diameter. MGB data are available to MGB investigators; external collaboration requests can be initiated through https://biobank.massgeneralbrigham.org/for-researchers. The complete set of AORTA Score covariates and their weights are available as R programs at github.com/carbocation/genomisc and the polygenic score weights will be available on the Polygenic Score Catalog (pgscatalog.org).
Results
Clinical characteristics
Among the 4,896 UK Biobank participants whose data contributed to training the genetic models, 2,489 were women and 2,407 were men. The ascending aortic diameter was 3.05 ± 0.31cm for women and 3.34 ± 0.34cm for men. The 4,962 internal validation set participants had similar measurements (Table 1; Figure 1). The MGB validation set consisted of 5,469 participants with genetic data and aortic diameter measured from TTE. The FHS validation set consisted of 1,298 participants with genetic data and aortic diameter measured from non-contrast CT. The All of Us validation set consisted of 610 participants 40 years and older with genetic data and aortic diameter measured from TTE; 60% identified as female and 88% as having non-Hispanic white ethnicity.
Model validation for aortic diameter
Testing calibration, the slope of the AORTA Gene model was statistically consistent with one in UK Biobank (P=0.2), MGB (P=1.0), FHS (P=0.6), and All of Us (P=0.4) (Figure 2). The intercept was consistent with zero in UK Biobank (P=0.2), MGB (P=0.4), and All of Us (P=0.8), but not FHS (intercept estimate 0.26cm, P=0.01). The same pattern was observed for the clinical AORTA Score. In contrast, the model incorporating only age, sex, and the polygenic score had a calibration slope inconsistent with one in most cohorts outside of UK Biobank (eTable AA).
The AORTA Gene model’s mean absolute error (MAE) for ascending aortic diameter was 0.212cm (95% CI 0.207-0.217cm) in UK Biobank, 0.282cm (95% CI 0.275- 0.289cm) in MGB, 0.344cm (95% CI 0.331-0.358cm) in FHS, and 0.293cm (95% CI 0.273-0.313cm) in All of Us. The clinical AORTA Score had an inferior MAE in all cohorts: 0.232cm (95% CI 0.227-0.237cm) in UK Biobank, 0.291cm (95% CI 0.284- 0.298cm) in MGB, 0.351cm (95% CI 0.337-0.365cm) in FHS, and 0.306cm (95% CI 0.285-0.327cm) in All of Us. The age/sex/genetics model had an MAE of 0.226cm (95% CI 0.221-0.231cm) in UK Biobank, 0.293cm (95% CI 0.286-0.300cm) in MGB, 0.346cm (95% CI 0.332-0.360cm) in FHS, and 0.318cm (95% CI 0.296-0.340) in All of Us.
The AORTA Gene model explained 39.9% of the variance in ascending aortic diameter (99% CI 37.8-42.0%; P=2.6E-551) in the UK Biobank validation set, 36.5% (95% CI 34.4-38.5%, P=2.4E-541) in MGB, 41.8% (95% CI 37.7-45.9%; P=3.5E-154) in FHS, and 34.9% (95% CI 28.8-41.0%; P=1E-58) in All of Us (Figure 3). In comparison, the clinical AORTA Score explained 29.3% of the variance in UK Biobank (99% CI 27.1- 31.4%), 32.5% in MGB (95% CI 30.4-34.5%), 33.0% (95% CI 28.9-37.2%) in FHS, and 28.9% (95% CI 22.9-35.0%) in All of Us. The age/sex/genetics model explained 31.8% (95% CI 29.6-33.9%) of variance in UK Biobank, 33.2% (95% CI 31.2-35.3%) in MGB, 33.3% (95% CI 29.1-37.5%) in FHS, and 31.0% (95% CI 24.9-37.1%) in All of Us.
Model validation for a 4cm aortic diameter threshold
113 of the 4,962 internal validation set participants in UK Biobank had aortic diameter ≥4cm (2.3%), as did 335 of 5,469 MGB participants (6.1%), 116 of 1,298 FHS participants (8.9%), and 65 of 610 All of Us participants (10.7%).
The AUROC for detecting diameter ≥4cm for the AORTA Gene model was 0.834 (95% CI 0.798-0.869) in UK Biobank, 0.808 (95% CI 0.783-0.832) in MGB, 0.856 (95% CI 0.821-0.891) in FHS, and 0.827 (95% CI 0.776-0.878) in All of Us (Figure 4; eTable BB).
For the clinical AORTA Score, the AUROC was 0.765 (95% CI 0.723-0.807) in UK Biobank, 0.767 (95% CI 0.741-0.792) in MGB, 0.818 (95% CI 0.776-0.859) in FHS, and 0.791 (95% CI 0.735-0.846) for All of Us. The AUROC of AORTA Gene was significantly greater in all cohorts: P=7.3E-10 against a null hypothesis of no difference in UK Biobank, P=4.5E-12 in MGB, P=8.5E-05 in FHS, and P=7.8E-03 in All of Us (eTable CC).
The age/sex/genetics model had an AUROC of 0.805 (95% CI 0.767-0.843) in UK Biobank, 0.805 (95% CI 0.781-0.829) in MGB, 0.819 (95% CI 0.781-0.856) in FHS, and 0.813 (95% CI 0.757-0.870) in All of Us. The AORTA Gene AUROC was significantly greater in UK Biobank (P=0.03) and FHS (P=3.9E-03), but not in MGB (P=0.7) or All of Us (P=0.5) (eTable CC).
The continuous NRI was positive for AORTA Gene over the clinical AORTA Score and the age/sex/genetics model in all cohorts (eTable CC). Decomposed into its NRI(event) and NRI(nonevent) components, AORTA Gene had a greater NRI(event) in all cohorts compared to the AORTA Score and the age/sex/genetics model. However, its NRI(nonevent) value was lower than that of the AORTA Score in MGB and UK Biobank, and lower than that of the age/sex/genetics model in MGB and All of Us.
Score threshold demonstrations
To evaluate the consequences of various score thresholds whereby a portion of the population could be brought forward for confirmatory thoracic imaging, two thresholds were defined within the UK Biobank training set and applied in the validation sets: one targeted at the top 10% of the population, and the other designed to have a sensitivity of 90% for aortic diameter ≥4cm (eTable BB).
For the target top 10% AORTA Gene cutoff, 485 (9.8%) of 4,962 participants were above this threshold in UK Biobank, 640 (11.7%) of 5,469 in MGB, 99 (7.6%) of 1,295 in FHS, and 97 (15.9%) of 610 in All of Us. 59 of 113 UK Biobank participants with diameter ≥4cm were correctly classified as enlarged (sensitivity 52.2%), as were 162 of 335 in MGB (48.4%), 47 of 116 in FHS (40.5%), and 35 of 65 in All of Us (53.8%). The precision was 0.122 in UK Biobank, 0.253 in MGB, 0.475 in FHS, and 0.361 in All of Us, yielding respective F1 scores of 0.197, 0.332, 0.437, and 0.432 (eTable BB). A comparable threshold for the clinical AORTA Score flagged 484 (9.8%) of 4,962 UK Biobank participants, 722 (13.2%) of 5,469 in MGB, 92 (7.1%) of 1,295 in FHS, and 110 (18.0%) of 610 in All of Us. 41 of 113 UK Biobank participants with diameter ≥4cm were correctly classified as enlarged (sensitivity 36.3%) as were 143 of 335 in MGB (42.7%), 38 of 116 in FHS (32.8%), and 33 of 65 in All of Us (50.8%). The precision was 0.85 in UK Biobank, 0.198 in MGB, 0.413 in FHS, and 0.300 in All of Us, yielding respective F1 scores of 0.137, 0.271, 0.365, and 0.377 (eTable BB).
For the target 90% sensitive AORTA Gene cutoff, 1,737 (35.0%) of 4,962 UK Biobank participants were above this threshold, 1,943 (35.5%) of 5,469 in MGB, 336 (25.9%) of 1,295 in FHS, and 247 (40.5%) of 610 in All of Us. 92 of 113 UK Biobank participants with diameter ≥4cm were correctly classified as enlarged (sensitivity 81.4%), as were 258 of 335 in MGB (77.0%), 88 of 116 in FHS (75.9%), and 56 of 65 in All of Us (86.2%). The precision was 0.053 in UK Biobank, 0.133 in MGB, 0.262 in FHS, and 0.227 in All of Us, yielding respective F1 scores of 0.099, 0.227, 0.389, and 0.359 (eTable DD). A comparable threshold for the clinical AORTA Score flagged 2,006 (40.4%) of 4,962 UK Biobank participants, 2,194 (40.1%) of 5,469 in MGB, 405 (31.3%) of 1,295 in FHS, and 266 (43.6%) of 610 in All of Us. 87 of 113 UK Biobank participants with diameter ≥4cm were correctly classified as enlarged (sensitivity 77.0%), as were 264 of 335 in MGB (78.8%), 86 of 116 in FHS (74.1%), and 52 of 65 in All of Us (80.0%). The precision was 0.043 in UK Biobank, 0.120 in MGB, 0.212 in FHS, and 0.195 in All of Us, yielding respective F1 scores of 0.082, 0.209, 0.330, and 0.314 (eTable DD).
Thoracic aortic aneurysm diagnosis in All of Us
In the 164,789 All of Us participants aged 40 or older with genetic data, 1,904 had an electronic health record (EHR)-based diagnosis of thoracic aortic aneurysm prior to enrollment. AORTA Gene’s AUROC was 0.760 (95% CI 0.750-0.771) compared to 0.739 (95% CI 0.728-0.750) for the AORTA Score (eTable EE), P=2.4E-16 against a null hypothesis of no difference (eTable FF). For incident disease (1,632 cases diagnosed after enrollment), AORTA Gene’s AUROC was 0.748 (95% CI 0.737-0.759) compared to 0.729 (95% CI 0.718-0.741) for the AORTA Score, P=9.5E-10 against a null hypothesis of no difference (eTables EE-FF).
Analysis in Black individuals in MGB
In MGB, 340 individuals were identified as Black through the electronic health record16. In these individuals, AORTA Gene had an intercept consistent with zero (P=0.5) and a slope consistent with one (P=0.5). The MAE was 0.297cm (95% CI 0.272-0.321cm), and the model explained 34.8% of variance in ascending aortic diameter (95% CI 26.7- 43.0%). In comparison, for the clinical AORTA Score these values were 0.294cm (95% CI 0.269-0.319cm) and 34.3% (95% CI 26.2-42.5%) (eTable AA). The AUROC for detecting diameter ≥4cm for the AORTA Gene model was 0.858 (eTable BB), nominally greater than that of the clinical AORTA Score (AUROC 0.820, P=0.04 for the difference, eTable CC), with a positive continuous NRI (0.473).
Discussion
Ascending thoracic aortic diameter has a SNP heritability estimated to be greater than 60%10, which makes the incorporation of common-variant genetic data for the presymptomatic detection of individuals with aortic enlargement or aneurysm conceptually appealing. But whether aortic polygenic scores encode information not already latently captured through correlation with clinical covariates was unknown. Here, we observed that AORTA Gene, which incorporated both an aortic polygenic score and clinical covariates, improved estimation of ascending aortic diameter and identification of individuals with diameter ≥4cm in four cohorts—UK Biobank, MGB, FHS, and All of Us—when compared to a previously validated model for aortic diameter built from the same clinical covariates.
For the estimation of aortic diameter as a quantitative measure, a clinical model built from the same covariates as the AORTA Score9 explained 29.3% of the variance in ascending aortic diameter in UK Biobank, 28.9% in All of Us, 32.5% in MGB, and 33.0% in FHS. The AORTA Gene model that additionally incorporated the aortic diameter polygenic score increased those values to 39.9% in UK Biobank, 34.9% in All of Us, 36.5% in MGB, and 41.8% in FHS. These represent a 10.6% absolute improvement in explained variance (36% relative improvement) in UK Biobank, 4% (12.3% relative improvement) in MGB, 8.8% (27% relative improvement) in FHS, and 6% (21% relative improvement) in All of Us. These substantive improvements from the polygenic score over a clinical model are comparable in magnitude to the addition of key clinical predictors over age and sex alone9.
Incorporation of the polygenic score also improved identification of individuals with clinically relevant aortic enlargement (i.e., diameter ≥4cm). The clinical model had an AUROC of 0.765 in UK Biobank, 0.791 in All of Us, 0.767 in MGB, and 0.818 in FHS; these respectively improved to 0.834, 0.827, 0.808, and 0.856 for AORTA Gene. The AUROC is independent of the choice of score threshold, so we also demonstrated two example thresholds: one based on the goal of being 90% sensitive for diameter ≥4cm, and another based on a scenario where there are resources to image up to 10% of the population. In both scenarios, for all four cohorts, sensitivity and specificity were modestly improved with the incorporation of the polygenic score. Interestingly, we also observed that, compared to the comprehensive clinical score, a simpler model consisting only of age, sex, and the polygenic score performed similarly. Because such a model can be estimated for any future time from birth, this may have implications for attempts at early risk stratification before the onset of clinical risk factors. We also noted that, for the task of identifying All of Us participants with an EHR-based diagnosis of aneurysm, all models had attenuated performance—likely in part due to limited ascertainment of thoracic aortic disease in current clinical practice. Nevertheless, even for this task, the AORTA Gene model remained superior to the clinical score (P=9.5E- 10).
We also note two limitations of the current model that point to future directions: first, despite a heritability of nearly 60%, the current polygenic score for thoracic aortic diameter derived from a GWAS of 39,524 participants added approximately 10% to the variance already explained by a clinical model, suggesting that it accounted for approximately 17% of the heritability of aortic diameter. In contrast, polygenic scores for height—built from GWAS of millions of participants—are nearly saturated29. Larger GWAS sample sizes are required to increase the variance explained by polygenic estimates of ascending aortic diameter. Second, polygenic scores derived in largely European-ancestry populations do not capture the full range of human genetic diversity, limiting their utility in individuals with more diverse genetic identities30. To examine this, we evaluated 340 individuals in MGB identified as Black in the electronic health record16. In these individuals, the addition of the polygenic score did not improve the continuous estimation of aortic diameter compared to the clinical score (R2 34.8% for AORTA Gene vs 34.3% for the clinical AORTA Score), although it did nominally improve identification of individuals with diameter ≥4cm (AUROC 0.858 vs 0.820, P=0.04). We expect a key problem from incorporating genetic data to be one of inequity due to differential accrual of benefits to individuals with European genetic identities. Larger and more diverse GWAS are required to improve genetic predictions for people across diverse genetic ancestries.
Anticipation of the relevant clinical scenario for the application of polygenic scores is important for placing the current findings into context. The cost of acquiring a targeted set of images of the ascending thoracic aorta may be comparable that of obtaining genetic information. Therefore, we expect that the value of polygenic scores for primary prevention will be realized within the context of a healthcare system that incorporates genetic information as part of the prior probability across many diseases and risk factors, where the cost of one-time genotyping would be amortized over all downstream uses. In such a system, might there be a role, in principle, for the incorporation of a genetic estimator of thoracic aortic diameter into a comprehensive model to identify people who are likely to have ascending aortic enlargement? The results from the present analysis suggest so.
Limitations
While all participants were incorporated into analyses without exclusion for race, ethnicity, or ancestry, the participants in UK Biobank, MGB, FHS, and All of Us with thoracic imaging predominantly had genetic identities similar to that of Europeans. None of the models described in this manuscript are anticipated to be effective for identifying individuals with aortic enlargement driven by rare pathogenic variants, which are not captured in polygenic scores constructed from common variants. The models were derived and tested in individuals over the age of 40 years. The attenuation in variance explained in the external cohorts compared to UK Biobank may be, in part, a consequence of cryptic relatedness causing overfitting to the UK Biobank cohort; reducing this discrepancy is an area of interest for future efforts. Future efforts are also needed to understand whether aortic prediction models can improve outcomes.
Conclusion
Integrating genetic information into a validated clinical model improved estimation of ascending aortic diameter and the identification of individuals with thoracic aortic aneurysm.
Data Availability
Computed scores are returned to UK Biobank, where data are made available by UK Biobank to researchers from research institutions with genuine research inquiries, following IRB and UK Biobank approval. FHS data are made available to researchers with approved research applications. The dbGAP study accession number used for FHS validation was #phv00076329.v1.p5 for ascending aortic diameter. MGB data are available to MGB investigators; external collaboration requests can be initiated through https://biobank.massgeneralbrigham.org/for-researchers. Upon publication, the complete set of AORTA Score covariates and their weights will be available as R programs at github.com/carbocation/genomisc and the polygenic score weights will be available on the Polygenic Score Catalog (pgscatalog.org).
Sources of funding
Organizations that provided financial support had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Dr. Pirruccello is supported by National Institutes of Health (NIH) K08HL159346. Dr. Lin is supported by the NIH grant U01AG068221. Dr. Kany is supported by the Walter- Benjamin Fellowship from the Deutsche Forschungsgemeinschaft (521832260). Dr. Raghavan is supported by the John S. LaDue Memorial Fellowship in Cardiovascular Medicine from Harvard Medical School. Dr. Benjamin is supported by NIH grants R01HL128914, R01HL092577, R01HL141434, and U54HL120163; and American Heart Association (AHA) grant 18SFRN34110082. Dr. Lindsay is supported by the Fredman Fellowship for Aortic Disease and the Toomey Fund for Aortic Dissection Research and has received salary support from Bayer AG. Dr. Ellinor is supported by grants from the NIH (R01HL092577, 1R01HL157635), from the AHA Strategically Focused Research Networks (18SFRN34110082), and from the European Union (MAESTRIA 965286).
The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers: 1 OT2 OD026549; 1 OT2 OD026554; 1 OT2 OD026557; 1 OT2 OD026556; 1 OT2 OD026550; 1 OT2 OD 026552; 1 OT2 OD026553; 1 OT2 OD026548; 1 OT2 OD026551; 1 OT2 OD026555; IAA #: AOD 16037; Federally Qualified Health Centers: HHSN 263201600085U; Data and Research Center: 5 U2C OD023196; Biobank: 1 U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: 1 U24 OD023163; Communications and Engagement: 3 OT2 OD023205; 3 OT2 OD023206; and Community Partners: 1 OT2 OD025277; 3 OT2 OD025315; 1 OT2 OD025337; 1 OT2 OD025276. In addition, the All of Us Research Program would not be possible without the partnership of its participants.
From the Framingham Heart Study of the National Heart Lung and Blood Institute of the National Institutes of Health and Boston University School of Medicine, this project has been funded in whole or in part with Federal funds from the National Heart Lung and Blood Institute, Department of Health and Human Services, under Contract No. 75N92019D00031.
Disclosures
Dr. Ellinor has received sponsored research support from Bayer AG, IBM Health, Bristol Myers Squibb and Pfizer. Dr. Ellinor has also served on advisory boards or consulted for Bayer AG, MyoKardia and Novartis. The Broad Institute has filed for a patent on an invention from Drs. Ellinor, Lindsay, and Pirruccello related to a previous genetic risk predictor for aortic disease. Remaining authors report no disclosures.
Acknowledgments
JPP had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. JPP and PTE conceived of the study. JPP conducted the genetic analyses, developed the models, validated the models, and conducted bioinformatic analyses in the UK Biobank, MGB, and All of Us. HL validated the models in FHS. LCW applied the polygenic score in MGB. S Koyama performed PCA in MGB. JPP and PTE wrote the paper. All other authors contributed to the analysis plan or provided critical revisions.