ABSTRACT
Phenotypes extracted from Electronic Health Records (EHRs) are increasingly prevalent in genetic studies. EHRs contain hundreds of distinct clinical laboratory test results, providing a trove of health data beyond diagnoses. Such lab data is complex and lacks a ubiquitous coding scheme, making it more challenging to analyze than diagnosis data. Here we describe the first large-scale cross-health system genome-wide association study (GWAS) of EHR-based quantitative lab measurements. We meta-analyzed 70 labs matched between the BioVU cohort from the Vanderbilt University Health System and the Michigan Genomics Initiative (MGI) cohort from Michigan Medicine. We show high replication of known associations for these labs, validating EHR-based measurements as high-quality phenotypes for genetic analysis. Notably, our analysis provides the first replication for 700 previous GWAS associations across 46 different labs. We discovered 31 novel associations at genome-wide significance for 22 distinct labs, including the first reported associations for two labs. We replicated 22 of these novel associations in an independent tranche of BioVU samples. The summary statistics for all association tests are available through an interactive webtool to benefit other researchers. Finally, we performed mirrored analyses in BioVU and MGI to assess competing analytic practices for lab data. We find that using the mean of all available lab measurements provides a robust summary value, but alternative summarizations can improve power for certain labs. This study provides a proof-of-principle for cross-health system GWAS and a framework for future studies of quantitative traits in EHRs.
Competing Interest Statement
G.R.A. is an employee of Regeneron Pharmaceuticals and owns stock and stock options in the company. L.A.B. and J.C.D. receive royalty payments from Nashville Biosciences, a Vanderbilt University Medical Center-owned entity.
Funding Statement
The Michigan Genomics Initiative was supported by institutional funding.
Author Declarations
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Data cannot be shared publicly due to patient confidentiality. The data underlying the results presented in the study are available from the University of Michigan Medical School Central Biorepository at https://research.medicine.umich.edu/our-units/central-biorepository/get-access for researchers who meet the criteria for access to confidential data. The meta-analysis summary statistics are available at http://pheweb.sph.umich.edu/mgi-biovu-labs.