Abstract
Background Polycystic ovary syndrome (PCOS) is the most common endocrine disorder affecting women of reproductive age. Previous studies have identified genetic variants associated with PCOS as defined by different diagnostic criteria. The Rotterdam criteria are the broadest and identify the most PCOS cases.
Objectives To identify novel associated genetic variants, we extracted PCOS cases and controls from electronic health records (EHR) based on the Rotterdam criteria and performed a genome-wide association study (GWAS).
Study Design We developed a PCOS phenotyping algorithm based on the Rotterdam criteria and applied it to three EHR-linked biobanks to identify cases and controls for genetic study. In the discovery phase, we performed individual GWAS in Geisinger’s MyCode and the eMERGE cohorts, which were then meta-analyzed. We attempted validation of the significantly associated loci (P<1×10−6) in the BioVU cohort. All association analyses used logistic regression, assuming an additive genetic model, and were adjusted for principal components to control for population stratification. An inverse-variance fixed-effect model was adopted for the meta-analyses. Additionally, we examined the top variants to evaluate their associations with each criterion in the phenotyping algorithm. We used STRING to identify protein-protein interaction networks.
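As an illustration of the inverse-variance fixed-effect pooling named above, the following is a minimal sketch, assuming each cohort reports a log odds ratio and its standard error for a variant. The function names and the input values are hypothetical and are not taken from the study; this is not the authors' pipeline.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    betas: per-cohort log odds ratios (hypothetical inputs)
    ses:   per-cohort standard errors of those log odds ratios
    Returns the pooled log odds ratio, its standard error, and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]  # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, p


def odds_ratio_ci(beta, se, z_crit=1.959964):
    """Convert a log odds ratio and its SE to an OR with a 95% interval."""
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )


# Hypothetical example: two cohorts with made-up effect estimates.
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
pooled_or, ci_low, ci_high = odds_ratio_ci(pooled_beta, pooled_se)
```

With equal standard errors the pooled estimate reduces to the simple mean of the cohort estimates, and the pooled standard error shrinks by a factor of the square root of the number of cohorts. A fixed-effect model assumes one shared true effect across cohorts, so it does not account for between-cohort heterogeneity.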
Results We identified 2,995 PCOS cases and 53,599 controls in total (2,742 cases and 51,438 controls in the discovery phase; 253 cases and 2,161 controls in the validation phase). GWAS identified one novel genome-wide significant variant, rs17186366 (OR=1.37 [1.23,1.54], P=2.8×10−8), located near SOD2. Two loci with suggestive association were also identified: rs113168128 (OR=1.72 [1.42,2.10], P=5.2×10−8), an intronic variant of ERBB4 that is independent of the previously published variants, and rs144248326 (OR=2.13 [1.52,2.86], P=8.45×10−7), a novel intronic variant in WWTR1. In further association tests of the top 3 SNPs with each criterion in the PCOS algorithm, we found that rs17186366 was associated with polycystic ovaries and hyperandrogenism, while rs113168128 and rs144248326 were mainly associated with oligomenorrhea or infertility. Besides ERBB4, we also validated the association with DENND1A.
Conclusion Through a discovery-validation GWAS on PCOS cases and controls identified from EHR using an algorithm based on the Rotterdam criteria, we identified and validated a novel association with variants within ERBB4. We also identified novel associations near SOD2 and WWTR1. Together with the previously identified PCOS-associated locus YAP1, the ERBB4-YAP1-WWTR1 network implicates the epidermal growth factor receptor (EGFR) and Hippo pathways in the multifactorial etiology of PCOS.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
MyCode® was funded by Geisinger and Regeneron Genomics Center; eMERGE III was funded by NIH U01HG8679 (Geisinger Clinic). The funding sources were not involved in the interpretation of the results or in the choice of journal for submission.
Author Declarations
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
# Both Kevin Ho (currently employed by Sanofi Genzyme) and Sarah A. Pendergrass (currently employed by Genentech) worked on this study while employed by Geisinger.
Data Availability
Summary data are available through collaboration with Geisinger.