RT Journal Article
SR Electronic
T1 Leveraging Electronic Medical Records and Knowledge Networks to Predict Disease Onset and Gain Biological Insight Into Alzheimer’s Disease
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.03.14.23287224
DO 10.1101/2023.03.14.23287224
A1 Tang, Alice
A1 Rankin, Katherine P.
A1 Cerono, Gabriel
A1 Miramontes, Silvia
A1 Mills, Hunter
A1 Roger, Jacquelyn
A1 Zeng, Billy
A1 Nelson, Charlotte
A1 Soman, Karthik
A1 Woldemariam, Sarah
A1 Li, Yaqiao
A1 Lee, Albert
A1 Bove, Riley
A1 Glymour, Maria
A1 Oskotsky, Tomiko
A1 Miller, Zachary
A1 Allen, Isabel
A1 Sanders, Stephan J.
A1 Baranzini, Sergio
A1 Sirota, Marina
YR 2023
UL http://medrxiv.org/content/early/2023/03/19/2023.03.14.23287224.abstract
AB Early identification of Alzheimer’s Disease (AD) risk can aid in interventions before disease progression. We demonstrate that electronic health records (EHRs) combined with heterogeneous knowledge networks (e.g., SPOKE) allow for (1) prediction of AD onset and (2) generation of biological hypotheses linking phenotypes with AD. We trained random forest models that predict AD onset with mean AUROC of 0.72 (-7 years) to 0.81 (-1 day). Top identified conditions from matched cohort trained models include phenotypes with importance across time, early in time, or closer to AD onset. SPOKE networks highlight shared genes between top predictors and AD (e.g., APOE, IL6, TNF, and INS). Survival analysis of top predictors (hyperlipidemia and osteoporosis) in external EHRs validates an increased risk of AD. Genetic colocalization confirms the hyperlipidemia and AD association at the APOE locus, and AD and osteoporosis colocalize at a locus close to MS4A6A with a stronger female association. Competing Interest Statement: Dr. Bove has received research support from F Hoffman LaRoche, Novartis and Biogen.
She has received personal support for consulting and/or scientific advisory boards from Alexion, EMD Serono, Horizon, Jansen, and TG Therapeutics. Funding Statement: Primary support was provided by grant numbers NIA R01AG060393 (AT, SM, SW, TTO, MS). Additional support was provided by the Medical Scientist Training Program T32GM007618 and F30 Fellowship 1F30AG079504-01 (AT) and NSF GRFP 2038436 (JR). SEB holds the Heidrich Family and Friends Endowed Chair of Neurology at UCSF. SEB holds the Distinguished Professorship in Neurology I at UCSF. Dr. Bove is the recipient of a National Multiple Sclerosis Society Harry Weaver Award. She is supported by the NIH, NMSS, NSF, DOD, UCSF Weill Institute for Neurosciences, and by various foundations. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The Institutional Review Board of University of California San Francisco gave ethical approval for this work (IRB #20-32422). I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. EHR concepts and identification approaches are described in Methods, and concepts are provided in Supplemental Tables 1 and 2. Phecodes can be downloaded at phewascatalog.org/phecodes_icd10 or phewascatalog.org/phecodes, and mappings between ICD-10 codes and SNOMED can be accessed at www.nlm.nih.gov/healthit/snomedct/us_edition.html. Data for UK Biobank phenotype GWAS can be found at www.nealelab.is/uk-biobank/, and eQTL data can be downloaded from www.eqtlgen.org/. The UCSF EHR database can be accessed to UCSF-affiliated. The SPOKE knowledge network can be accessed at spoke.rbvi.ucsf.edu/; more details about the network can be found in Morris et al., and mappings to EHR concepts can be found in Nelson et al. https://www.nlm.nih.gov/healthit/snomedct/us_edition.html https://www.genetics.opentargets.org/api https://www.phewascatalog.org/phecodes_icd10 https://www.spoke.rbvi.ucsf.edu/ https://www.eqtlgen.org/ https://www.nealelab.is/uk-biobank/