Abstract
Polygenic risk scores (PRSs) depend on genetic ancestry due to differences in allele frequencies between ancestral populations. This leads to implementation challenges in diverse populations. We propose a framework to calibrate PRSs based on ancestral makeup. We define a metric called “expected PRS” (ePRS), the expected value of a PRS based on one’s global or local admixture patterns. We further define the “residual PRS” (rPRS), measuring the deviation of the PRS from the ePRS. Simulation studies confirm that it suffices to adjust for ePRS to obtain nearly unbiased estimates of the PRS-outcome association without further adjusting for genetic principal components (PCs). Using the TOPMed dataset, the estimated effect size of the rPRS adjusting for the ePRS is similar to the estimated effect of the PRS adjusting for genetic PCs. The ePRS framework can protect against population stratification in association analysis and provide an equitable strategy to quantify genetic risk across diverse populations.
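The definitions of ePRS and rPRS above can be illustrated with a minimal sketch for the global-admixture case. The abstract does not state the exact formula, so the code below rests on assumptions: genotypes are diploid effect-allele dosages (0 to 2), the expected dosage of a variant is twice the admixture-weighted ancestry-specific allele frequency, and the PRS is a weighted sum of dosages. All function and argument names are hypothetical and are not taken from the authors' software; the local-ancestry version is not shown.

```python
from typing import Sequence


def prs(weights: Sequence[float], dosages: Sequence[float]) -> float:
    """Observed PRS: weighted sum of effect-allele dosages (assumed 0..2)."""
    return sum(w * g for w, g in zip(weights, dosages))


def expected_prs(
    weights: Sequence[float],
    anc_freqs: Sequence[Sequence[float]],
    admixture: Sequence[float],
) -> float:
    """ePRS under global admixture (illustrative assumption, not the paper's code).

    weights[j]      : PRS weight of variant j
    anc_freqs[j][k] : effect-allele frequency of variant j in ancestry k
    admixture[k]    : the individual's global proportion of ancestry k
    """
    total = 0.0
    for w, freqs in zip(weights, anc_freqs):
        # Admixture-weighted allele frequency for this individual.
        mixed_freq = sum(p * f for p, f in zip(admixture, freqs))
        # Expected diploid dosage is twice the mixed frequency.
        total += w * 2.0 * mixed_freq
    return total


def residual_prs(
    weights: Sequence[float],
    dosages: Sequence[float],
    anc_freqs: Sequence[Sequence[float]],
    admixture: Sequence[float],
) -> float:
    """rPRS: deviation of the observed PRS from the ePRS."""
    return prs(weights, dosages) - expected_prs(weights, anc_freqs, admixture)


# Toy example: two variants, two ancestries, one individual with 50/50 admixture.
weights = [0.5, -0.2]
anc_freqs = [[0.1, 0.5], [0.4, 0.2]]
admixture = [0.5, 0.5]
dosages = [1, 0]

observed = prs(weights, dosages)
expected = expected_prs(weights, anc_freqs, admixture)
residual = residual_prs(weights, dosages, anc_freqs, admixture)
```

In an association model, the rPRS would then be the exposure of interest with the ePRS included as a covariate, mirroring the adjustment described in the abstract.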
Competing Interest Statement
BP serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. LMR and SSR are consultants to the TOPMed Administrative Coordinating Center (through Westat).
Funding Statement
This work is supported by National Heart, Lung, and Blood Institute (NHLBI) grant R01HL161012 and National Institute on Aging grant R01AG080598 to TS, and by National Human Genome Research Institute (NHGRI) grant R56HG013163. SZ is supported by NHGRI grant R01HG011031. NLS is supported by NHLBI grants R01HL139553 and R01HL154385. GMP is supported by NHLBI grant R01HL142711. This analysis was approved by the Beth Israel Deaconess Medical Center Committee on Clinical Investigations, protocol #2023P000279, and by the Mass General Brigham IRB, protocol #2021P001928.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This analysis was approved by the Beth Israel Deaconess Medical Center Committee on Clinical Investigations, protocol #2023P000279, and by the Mass General Brigham IRB, protocol #2021P001928. This manuscript used data from 14 studies. Ethical approval was provided for each study, as follows: Amish: All study protocols were approved by the institutional review board at the University of Maryland Baltimore. Informed consent was obtained from each study participant. ARIC: The ARIC study has been approved by a single Institutional Review Board (sIRB) at Johns Hopkins School of Medicine and Institutional Review Boards (IRB) at all participating institutions: University of North Carolina at Chapel Hill IRB, Johns Hopkins University School of Public Health IRB, University of Minnesota IRB, Wake Forest University Health Sciences IRB, and University of Mississippi Medical Center IRB. Study participants provided written informed consent at all study visits. CARDIA: All CARDIA participants provided informed consent, and the study was approved by the Institutional Review Boards of the University of Alabama at Birmingham and the University of Texas Health Science Center at Houston. CFS: Cleveland Family Study was approved by the Institutional Review Board (IRB) of Case Western Reserve University and Mass General Brigham (formerly Partners HealthCare). Written informed consent was obtained from all participants. CHS: All CHS participants provided informed consent, and the study was approved by the Institutional Review Board of the University of Washington. COPDGene: All COPDGene participants provided written informed consent, and the study was approved by the Institutional Review Boards of the participating clinical centers. FHS: The Framingham Heart Study was approved by the Institutional Review Board of the Boston University Medical Center. All study participants provided written informed consent.
GENOA: Written informed consent was obtained from all subjects and approval was granted by participating institutional review boards (University of Michigan, University of Mississippi Medical Center, and Mayo Clinic). HCHS/SOL: This study was approved by the institutional review boards (IRBs) at each field center, where all participants gave written informed consent, and by the Non-Biomedical IRB at the University of North Carolina at Chapel Hill for the HCHS/SOL Data Coordinating Center. The IRBs approving the study are: Non-Biomedical IRB at the University of North Carolina at Chapel Hill, Chapel Hill, NC; Einstein IRB at the Albert Einstein College of Medicine of Yeshiva University, Bronx, NY; IRB at Office for the Protection of Research Subjects (OPRS), University of Illinois at Chicago, Chicago, IL; Human Subject Research Office, University of Miami, Miami, FL; Institutional Review Board of San Diego State University, San Diego, CA. HVH: Study approval was granted by the human subjects committee at Group Health, and written informed consent was provided by all study participants. JHS: The Institutional Review Boards at Jackson State University, Tougaloo College, and the University of Mississippi Medical Center approved the study, and all participants provided written informed consent. Mayo VTE: All Mayo-VTE participants provided informed consent and the study was approved by the Institutional Review Board of Mayo Clinic, Rochester, MN. MESA: All MESA participants provided written informed consent, and the study was approved by the Institutional Review Boards at The Lundquist Institute (formerly Los Angeles BioMedical Research Institute) at Harbor-UCLA Medical Center, University of Washington, Wake Forest School of Medicine, Northwestern University, University of Minnesota, Columbia University, and Johns Hopkins University.
WHI: All WHI participants provided informed consent and the study was approved by the Institutional Review Board (IRB) of the Fred Hutchinson Cancer Research Center.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
TOPMed freeze 8 WGS data and harmonized BP and lipid phenotypes are available by application to dbGaP according to the study-specific accessions: Amish: phs000956, ARIC: phs001211, CARDIA: phs001612, CFS: phs000954, CHS: phs001368, COPDGene: phs000951, FHS: phs000974, GENOA: phs001345, HCHS/SOL: phs001395, HVH: phs000993, JHS: phs000964, Mayo VTE: phs001402, MESA: phs001211, WHI: phs001237. Summary statistics from MVP GWAS are available from dbGaP by application to study accession phs001672. Summary statistics from GIANT + UKBB GWAS were downloaded from https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files. The data needed to construct the PRSs reported in this study, including variants, alleles, and weights for each PRS, are deposited on figshare and will be deposited in the PGS Catalog. A dataset of ancestry-specific allele frequencies, computed using GAFA on the TOPMed dataset for Europe, Africa, Middle East, East Asia, South Asia, and America ancestries at HapMap3 variants (the variant set recommended for use with the LDpred2 software), is available on the figshare repository: https://figshare.com/articles/dataset/ePRS_project_summary_statistics_and_ancestry-specific_allele_frequency/25336294