Abstract
Multiple germline and somatic genomic factors are associated with risk of coronary artery disease (CAD), but there is no single measure of risk that integrates all information from a DNA sample, limiting clinical use of genomic information. To address this gap, we developed an integrated genomic model (IGM), analogous to a clinical risk calculator that combines various clinical risk factors into a unified risk estimate. The IGM comprises six genetic drivers for CAD: four germline factors (familial hypercholesterolemia [FH] variants, CAD polygenic risk score [PRS], proteome PRS, metabolome PRS) and two somatic factors (clonal hematopoiesis of indeterminate potential [CHIP] and leukocyte telomere length [LTL]). We evaluated the IGM on CAD risk prediction in the UK Biobank (N=391,536), and validated it in the Trans-Omics for Precision Medicine (TOPMed) program (N=34,177). The 10-year CAD risk based on the IGM profile ranged from 1.1% to 15.5% in the UK Biobank and from 3.8% to 33.0% in TOPMed, with a more pronounced gradient in males than in females. The IGM captured the cumulative effect of multiple genetic drivers, identifying individuals at high risk for CAD despite lacking obvious high-risk genetic factors, and individuals at low risk for CAD despite having known genetic risk variants such as FH and CHIP. The IGM had the highest performance in younger individuals (C-statistic 0.805 [95% CI, 0.699-0.913] for age ≤ 45 years). In middle age, the IGM augmented the performance of the Pooled Cohort Equations (PCE), a clinical risk calculator for CAD. Adding the IGM to the PCE resulted in a continuous net reclassification index of 33.45% (95% CI, 32.11%-34.76%). We present the first model that integrates all currently available information from a single “DNA biopsy” to translate complex genetic information into a single risk estimate.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Dr. Ellinor is supported by grants from the National Institutes of Health (R01HL092577, 1R01HL157635, 5R01HL139731), from the American Heart Association (18SFRN34110082, 961045) and from the European Union (MAESTRIA 965286). Dr. Natarajan is funded by grants R01HL1427, R01HL148565, R01HL148050, and U01HG011719 from the National Institutes of Health. Dr. Fahed is funded by grants K08HL161448 and R01HL164629 from the National Institutes of Health. Dr. de Vries is funded by R01HL146860 from the National Heart, Lung and Blood Institute (NHLBI). Dr. Wang is supported by the Pioneering Action Grants of the Chinese Academy of Sciences. Molecular data for the Trans-Omics for Precision Medicine (TOPMed) program were supported by the NHLBI. Core support including centralized genomic read mapping and genotype calling, along with variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support including phenotype harmonization, data management, sample-identity QC and general program coordination was provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute, the National Institutes of Health, or the U.S. Department of Health and Human Services. We wish to acknowledge the contributions of the consortium working on the development of the NHLBI BioData Catalyst ecosystem. Support for the Genetic Epidemiology Network of Arteriopathy (GENOA) was provided by the National Heart, Lung and Blood Institute (U01 HL054457, U01 HL054464, U01 HL054481, R01 HL119443, and R01 HL087660) of the National Institutes of Health.
DNA extraction for [NHLBI TOPMed: Genetic Epidemiology Network of Arteriopathy] (phs001345) was performed at the Mayo Clinic Genotyping Core, and WGS was performed at the DNA Sequencing and Gene Analysis Center at the University of Washington (3R01HL055673-18S1) and the Broad Institute (HHSN268201500014C). We would like to thank the GENOA participants. The Jackson Heart Study (JHS) is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), the Mississippi State Department of Health (HHSN268201800015I) and the University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I and HHSN268201800012I) contracts from the NHLBI and the National Institute on Minority Health and Health Disparities (NIMHD). Genome sequencing for [NHLBI TOPMed: The Jackson Heart Study] (phs000964.v1.p1) was performed at the Northwest Genomics Center (HHSN268201100037C). The authors also wish to thank the staffs and participants of the JHS. The Multi-Ethnic Study of Atherosclerosis (MESA) projects are conducted and supported by the NHLBI in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1TR001881, DK063491, R01HL105756, and R01HL146860. Genome sequencing for [NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Multi-Ethnic Study of Atherosclerosis Study (MESA)] (phs001416) was performed at the Broad Institute of MIT and Harvard Genomics Platform (3U54HG003067-13S1).
The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutes can be found at [http://www.mesa-nhlbi.org]. The Women's Health Initiative (WHI) program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts 75N92021D00001, 75N92021D00002, 75N92021D00003, 75N92021D00004, 75N92021D00005. Genome sequencing for [NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Women's Health Initiative Study (WHI)] (phs001237) was performed at the Broad Institute of MIT and Harvard Genomics Platform (HHSN268201500014C). Support for the Diabetes Heart Study (DHS) was provided by R01 HL92301, R01 HL67348, R01 NS058700, R01 AR48797, R01 DK071891, R01 AG058921, the General Clinical Research Center of the Wake Forest University School of Medicine (M01 RR07122, F32 HL085989), the American Diabetes Association, and a pilot grant from the Claude Pepper Older Americans Independence Center of Wake Forest University Health Sciences (P60 AG10484). Genome sequencing for [NHLBI TOPMed: The Diabetes Heart Study] (phs001412) was performed at the Broad Institute of MIT and Harvard Genomics Platform (HHSN268201500014C). The Atherosclerosis Risk in Communities (ARIC) study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (75N92022D00001, 75N92022D00002, 75N92022D00003, 75N92022D00004, 75N92022D00005). The authors thank the staff and participants of the ARIC study for their important contributions. Whole genome sequencing (WGS) for the Trans-Omics for Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI).
WGS for [NHLBI TOPMed: Atherosclerosis Risk in Communities (ARIC)] (phs001211) was performed at the Baylor College of Medicine Human Genome Sequencing Center (HHSN268201500015C and 3U54HG003273-12S2) and the Broad Institute of MIT and Harvard (3R01HL092577-06S1). Centralized read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity QC, and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The Genome Sequencing Program (GSP) was funded by the National Human Genome Research Institute (NHGRI), the National Heart, Lung, and Blood Institute (NHLBI), and the National Eye Institute (NEI). The GSP Coordinating Center (U24 HG008956) contributed to cross-program scientific initiatives and provided logistical and general study coordination. The Centers for Common Disease Genomics (CCDG) program was supported by NHGRI and NHLBI, and whole genome sequencing was performed at the Baylor College of Medicine Human Genome Sequencing Center (UM1 HG008898). The COPDGene study (NCT00608764) is supported by grants from the NHLBI (U01HL089897 and U01HL089856), by NIH contract 75N92023D00011, and by the COPD Foundation through contributions made to an Industry Advisory Committee that has included AstraZeneca, Bayer Pharmaceuticals, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer and Sunovion. A full listing of COPDGene investigators can be found at: [http://www.copdgene.org/directory]. Genome sequencing for [NHLBI TOPMed: Genetic Epidemiology of COPD Study] (phs000951) was performed at Northwest Genomics Center and Broad Genomics (3R01HL089856-08S1, HHSN268201500014C, HHSN268201500014C).
This Cardiovascular Health Study research was supported by NHLBI contracts HHSN268201200036C, HHSN268200800007C, HHSN268201800001C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, 75N92021D00006; and NHLBI grants U01HL080295, R01HL087652, R01HL105756, R01HL103612, R01HL120393, and U01HL130114 with additional contribution from the National Institute of Neurological Disorders and Stroke (NINDS). Additional support was provided through R01AG023629 from the National Institute on Aging (NIA). Genome sequencing for [NHLBI TOPMed: Cardiovascular Health Study] (phs001368.v2.p1) was performed at the Baylor College of Medicine Human Genome Sequencing Center (3U54HG003273-12S2, HHSN268201500015C, HHSN268201600033I). A full list of principal CHS investigators and institutions can be found at CHS-NHLBI.org. The TOPMed component of the Amish Research Program was supported by NIH grants R01 HL121007, U01 HL072515, and R01 AG18728. Genome sequencing for NHLBI TOPMed: Amish (phs000956) was performed at the Broad Institute of MIT and Harvard (3R01HL121007-01S1). The Framingham Heart Study (FHS) acknowledges the support of contracts NO1-HC-25195, HHSN268201500001I and 75N92019D00031 from the National Heart, Lung and Blood Institute and grant supplement R01 HL092577-06S1 for this research. Genome sequencing for [NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study (FHS)] (phs000974) was performed at Broad Institute of MIT and Harvard Genomics Platform (3U54HG003067-12S2). We also acknowledge the dedication of the FHS study participants without whom this research would not be possible. Dr. Vasan is supported in part by the Evans Medical Foundation and the Jay and Louis Coffman Endowment from the Department of Medicine, Boston University School of Medicine. 
GeneSTAR was supported by the National Institutes of Health/National Heart, Lung, and Blood Institute (U01 HL72518, HL087698, HL112064, HL49762, HL59684, HL58625, HL071025), by the National Institutes of Health/National Institute of Nursing Research (NR0224103, NR008153), and by a grant from the National Institutes of Health/National Center for Research Resources (M01-RR000052) to the Johns Hopkins General Clinical Research Center. Genome sequencing for NHLBI TOPMed: GeneSTAR (Genetic Study of Atherosclerosis Risk) (phs001218) was performed at the Broad Institute of MIT and Harvard (HHSN268201500014C), at PsomaGen (formerly Macrogen, HHSN268201500014C), and at Illumina (HL112064). We gratefully acknowledge the studies and participants who provided biological samples and data for UK Biobank.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
All data are made available from the UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access) to researchers from universities and other institutions with genuine research inquiries following institutional review board and UK Biobank approval. This research was conducted using the UK Biobank resource under Application Number 89885 and approved by the Beijing Institute of Genomics review board. The weights of MetPRS and ProPRS are available in the Polygenic Score Catalog (IDs: PGS005093-PGS005094). This paper used the TOPMed whole genome sequencing (WGS) data and cardiovascular disease phenotype data. Genotype and phenotype data are both available in the database of Genotypes and Phenotypes (dbGaP). The TOPMed WGS data were from the following eleven study cohorts: Amish, Atherosclerosis Risk in Communities Study (ARIC), Cardiovascular Health Study (CHS), Genetic Epidemiology of COPD (COPDGene), Diabetes Heart Study (DHS), Framingham Heart Study (FHS), Genetic Study of Atherosclerosis Risk (GeneSTAR), Genetic Epidemiology Network of Arteriopathy (GENOA), Jackson Heart Study (JHS), Multi-Ethnic Study of Atherosclerosis (MESA), and Women's Health Initiative (WHI).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data availability
All data are made available from the UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access) to researchers from universities and other institutions with genuine research inquiries following institutional review board and UK Biobank approval. This research was conducted using the UK Biobank resource under Application Number 89885 and approved by the Beijing Institute of Genomics review board. The weights of MetPRS and ProPRS are available in the Polygenic Score Catalog (IDs: PGS005093-PGS005094). This paper used the TOPMed whole genome sequencing (WGS) data and cardiovascular disease phenotype data. Genotype and phenotype data are both available in the database of Genotypes and Phenotypes (dbGaP). The TOPMed WGS data were from the following eleven study cohorts: Amish, Atherosclerosis Risk in Communities Study (ARIC), Cardiovascular Health Study (CHS), Genetic Epidemiology of COPD (COPDGene), Diabetes Heart Study (DHS), Framingham Heart Study (FHS), Genetic Study of Atherosclerosis Risk (GeneSTAR), Genetic Epidemiology Network of Arteriopathy (GENOA), Jackson Heart Study (JHS), Multi-Ethnic Study of Atherosclerosis (MESA), and Women’s Health Initiative (WHI).