RT Journal Article SR Electronic T1 Global biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2021.11.18.21266545 DO 10.1101/2021.11.18.21266545 A1 Wang, Ying A1 Namba, Shinichi A1 Lopera, Esteban A1 Kerminen, Sini A1 Tsuo, Kristin A1 Läll, Kristi A1 Kanai, Masahiro A1 Zhou, Wei A1 Wu, Kuan-Han A1 Favé, Marie-Julie A1 Bhatta, Laxmi A1 Awadalla, Philip A1 Brumpton, Ben A1 Deelen, Patrick A1 Hveem, Kristian A1 Lo Faro, Valeria A1 Mägi, Reedik A1 Murakami, Yoshinori A1 Sanna, Serena A1 Smoller, Jordan W. A1 Uzunovic, Jasmina A1 Wolford, Brooke N. A1 Willer, Cristen A1 Gamazon, Eric R. A1 Cox, Nancy J. A1 Surakka, Ida A1 Okada, Yukinori A1 Martin, Alicia R. A1 Hirbo, Jibril YR 2022 UL http://medrxiv.org/content/early/2022/09/07/2021.11.18.21266545.abstract AB With the increasing availability of biobank-scale datasets that incorporate both genomic data and electronic health records, many associations between genetic variants and phenotypes of interest have been discovered. Polygenic risk scores (PRS), which are being widely explored in precision medicine, use the results of association studies to predict the genetic component of disease risk by accumulating risk alleles weighted by their effect sizes. However, few studies have thoroughly investigated best practices for PRS in global populations across different diseases. In this study, we utilize data from the Global-Biobank Meta-analysis Initiative (GBMI), which consists of individuals from diverse ancestries and across continents, to explore methodological considerations and PRS prediction performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRS using heuristic (pruning and thresholding, P+T) and Bayesian (PRS-CS) methods. We found that the genetic architecture, such as SNP-based heritability and polygenicity, varied greatly among endpoints. 
For both PRS construction methods, using a European ancestry LD reference panel resulted in comparable or higher prediction accuracy compared to several other non-European-based panels; this is largely attributable to European descent populations still comprising the majority of GBMI participants. PRS-CS overall outperformed the classic P+T method, especially for endpoints with higher SNP-based heritability. For example, substantial improvements are observed in East-Asian ancestry (EAS) using PRS-CS compared to P+T for heart failure (HF) and chronic obstructive pulmonary disease (COPD). Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across global populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using the GBMI and highlight the importance of best practices for PRS in the biobank-scale genomics era. Competing Interest Statement: E.R.G. receives an honorarium from the journal Circulation Research of the American Heart Association as a member of the Editorial Board. Funding Statement: A.R.M. is funded by K99/R00MH117229. E.L. is funded by the Colciencias fellowship ed.783. S.N. was supported by Takeda Science Foundation. Y.O. was supported by JSPS KAKENHI (19H01021, 20K21834), AMED (JP21km0405211, JP21ek0109413, JP21ek0410075, JP21gm4010006, and JP21km0405217), JST Moonshot R&D (JPMJMS2021, JPMJMS2024), Takeda Science Foundation, and the Bioinformatics Initiative of Osaka University Graduate School of Medicine, Osaka University. E.R.G. is supported by the National Institutes of Health (NIH) Awards R35HG010718, R01HG011138, R01GM140287, and NIH/NIA AG068026. V.L.F. was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 675033 (EGRET plus). L.B. and B.B. receive support from the K.G. 
Jebsen Center for Genetic Epidemiology funded by Stiftelsen Kristian Gerhard Jebsen; Faculty of Medicine and Health Sciences, NTNU; The Liaison Committee for education, research and innovation in Central Norway; and the Joint Research Committee between St Olavs Hospital and the Faculty of Medicine and Health Sciences, NTNU. K.L. and R.M. were supported by the Estonian Research Council grant PUT (PRG687) and by INTERVENE, which has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101016775. W.Z. was supported by the National Human Genome Research Institute of the National Institutes of Health under award number T32HG010464. The work of the contributing biobanks was supported by numerous grants from governmental and charitable bodies. The biobank-specific acknowledgements and full author list for GBMI are included in the Supplementary Notes. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data produced in the present work are contained in the manuscript. https://www.globalbiobankmeta.org/resources http://results.globalbiobankmeta.org/ ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/data