PT - JOURNAL ARTICLE
AU - Actkins, Ky’Era V.
AU - Singh, Kritika
AU - Hucks, Donald
AU - Edwards, Digna R. Velez
AU - Aldrich, Melinda
AU - Cha, Jeeyeon
AU - Wellons, Melissa
AU - Davis, Lea K.
TI - Characterizing the Clinical and Genetic Spectrum of Polycystic Ovary Syndrome in Electronic Health Records
AID - 10.1101/2020.05.08.20095786
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.05.08.20095786
4099 - http://medrxiv.org/content/early/2020/05/17/2020.05.08.20095786.short
4100 - http://medrxiv.org/content/early/2020/05/17/2020.05.08.20095786.full
AB - Context: Polycystic ovary syndrome (PCOS) is one of the leading causes of infertility, yet current diagnostic criteria are ineffective at identifying patients whose symptoms fall outside strict diagnostic criteria. As a result, PCOS is underdiagnosed and its etiology is poorly understood. Objective: We aim to characterize the phenotypic spectrum of PCOS clinical features within and across racial and ethnic groups. Methods: We developed a strictly defined PCOS algorithm (PCOSregex-strict) using International Classification of Diseases, 9th and 10th edition (ICD9/10) codes and regular expressions mined from clinical notes in electronic health record (EHR) data. We then systematically relaxed the inclusion criteria to evaluate the change in epidemiological and genetic associations, resulting in three subsequent algorithms (PCOScoded-broad, PCOScoded-strict, PCOSregex-broad). We evaluated the performance of each phenotyping approach and characterized prominent clinical features observed in racially and ethnically diverse PCOS patients. Results: The best-performing algorithm was our PCOScoded-strict algorithm, with a positive predictive value (PPV) of 98%. Individuals classified as cases by this algorithm had significantly higher body mass index (BMI), insulin levels, free testosterone values, and genetic risk scores for PCOS, compared to controls.
Median BMI was higher in African American women with PCOS compared to White and Hispanic women with PCOS. Conclusions: PCOS symptoms are observed across a severity spectrum that parallels genetic burden. Racial and ethnic group differences exist in PCOS symptomology and metabolic health across different phenotyping strategies.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: KVA and JC are supported by NIH training grants 5T32GM007628-42 and 5T32DK007061. LKD is supported by U54MD010722. This research was done in part using resources from the Advanced Computing Center for Research and Education at Vanderbilt University, Nashville, TN. Datasets for this project were obtained using the Synthetic Derivative at Vanderbilt University Medical Center, which is supported by multiple institutional, private, and federal grant sources. These include the NIH-funded Shared Instrumentation Grant S10RR025141 and CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975.
Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Due to data sharing restrictions related to privacy concerns in the EHR, the datasets generated from our hospital population will not be publicly available; however, all criteria for automated phenotyping are available in the supplementary materials.
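
The abstract describes phenotyping algorithms that combine ICD9/10 codes with regular expressions over clinical notes, and it scores them by positive predictive value. The following is a minimal sketch of that general idea, not the authors' algorithm: their actual criteria are in the supplementary materials and are not reproduced here. The code set (ICD-9-CM 256.4 and ICD-10-CM E28.2), the regular expression, the `Patient` record, and the `require_both` switch are all illustrative assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set

# Assumption: the standard PCOS diagnosis codes (ICD-9-CM 256.4, ICD-10-CM E28.2).
# The study's real code lists may differ.
PCOS_ICD_CODES = {"256.4", "E28.2"}

# Assumption: a simple keyword pattern; the study's mined expressions are not shown here.
PCOS_PATTERN = re.compile(
    r"\b(?:pcos|polycystic\s+ovar(?:y|ian)\s+syndrome)\b", re.IGNORECASE
)


@dataclass
class Patient:
    """Hypothetical, de-identified patient record."""
    patient_id: str
    icd_codes: Set[str] = field(default_factory=set)
    notes: List[str] = field(default_factory=list)


def has_code(patient: Patient) -> bool:
    """True if any billing code is in the PCOS code set."""
    return bool(patient.icd_codes & PCOS_ICD_CODES)


def has_mention(patient: Patient) -> bool:
    """True if any clinical note matches the PCOS pattern."""
    return any(PCOS_PATTERN.search(note) for note in patient.notes)


def classify(patient: Patient, require_both: bool = True) -> bool:
    """Stricter rule: code AND note mention. Relaxed rule: code OR note mention."""
    if require_both:
        return has_code(patient) and has_mention(patient)
    return has_code(patient) or has_mention(patient)


def ppv(predicted_cases: Set[str], confirmed_cases: Set[str]) -> float:
    """PPV = true positives / all predicted positives."""
    if not predicted_cases:
        raise ValueError("no predicted cases; PPV is undefined")
    return len(predicted_cases & confirmed_cases) / len(predicted_cases)


# Toy cohort (fabricated for illustration only, not study data).
cohort = [
    Patient("a", {"256.4"}, ["Patient with PCOS, on metformin."]),
    Patient("b", {"E28.2"}, ["Irregular menses."]),
    Patient("c", set(), ["History of polycystic ovarian syndrome."]),
    Patient("d", {"E11.9"}, ["Type 2 diabetes follow-up."]),
]
strict_cases = {p.patient_id for p in cohort if classify(p, require_both=True)}
broad_cases = {p.patient_id for p in cohort if classify(p, require_both=False)}
```

Relaxing the rule from AND to OR enlarges the case set. Comparing each set against chart-reviewed cases with `ppv` shows how much precision that relaxation costs.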