Abstract
Females with polycystic ovary syndrome (PCOS), the most common endocrine disorder in women, have an increased risk of developing metabolic disorders such as insulin resistance, obesity, and type 2 diabetes (T2D). Furthermore, while only diagnosable in females, males with a family history of PCOS can also exhibit a poor cardiometabolic profile. Therefore, we aimed to elucidate the role of sex in the relationship between PCOS and its comorbidities by conducting bidirectional genetic risk score analyses in both sexes. We conducted a phenome-wide association study (PheWAS) using PCOS polygenic risk scores (PCOSPRS) to understand the pleiotropic effects of PCOS genetic liability across 1,380 medical conditions in females and males recorded in the Vanderbilt University Medical Center electronic health record (EHR) database. After adjusting for age and genetic ancestry, we found that European descent males with higher PCOSPRS were significantly more likely to develop cardiovascular diseases than females at the same level of genetic risk, while females had a higher odds of developing T2D. Based on observed significant associations, we tested the relationship between PRS for comorbid conditions (e.g., T2D, body mass index, hypertension, etc.) and found that only PRS generated for BMI and T2D were associated with a PCOS diagnosis. We then further decomposed the T2DPRS association with PCOS by adjusting the model for measured BMI and BMIresidual (enriched for the environmental contribution to BMI). Results demonstrated that genetically regulated BMI primarily accounted for the relationship between T2DPRS and PCOS. This was further supported in a mediation analysis, which only revealed clinical BMI measurements, but not BMIresidual, as a strong mediator for both sexes. Overall, our findings show that the genetic architecture of PCOS has distinct metabolic sex differences, but these associations are only apparent when PCOSPRS is explicitly modeled. It is possible that these pathways are less explained by the direct genetic risk of metabolic traits than they are by the risk factors shared between them, which can be influenced by biological variables such as sex.
Introduction
Polycystic ovary syndrome (PCOS) is a highly heritable endocrine disorder that affects 5%-21% of females of reproductive age who are typically diagnosed by having two or more of the following features under the Rotterdam criteria: polycystic ovaries, oligo- and anovulation, or hyperandrogenism [1–3]. Although Rotterdam is the most common PCOS criteria, other criteria can be used for diagnosis, including the National Institutes of Health criteria, the Androgen Excess and PCOS Criteria, or the 2018 International Evidence Based PCOS guidelines [2]. Each criterion slightly differs in requirements in an effort to cover the range of PCOS symptoms that are exhibited in patients. However, as a result of the differing diagnostic criteria, the heterogenous presentation of symptoms, and the prevalent comorbidities that reside outside of diagnostic requirements patients often spend years seeking a diagnosis or worse, may be one of the 75% of females estimated to be undiagnosed [4,5].
Many clinicians select criteria based on their perception of the most defining PCOS feature [6]. In some cases, as with the Androgen Excess and PCOS Society criteria [7], this will mean hyperandrogenism, a symptom that typically manifests as acne, hirsutism, or alopecia [8]. Androgen excess is also hypothesized to underlie many of the comorbid metabolic dysfunctions experienced by patients such as insulin resistance, obesity, metabolic syndrome, type 2 diabetes, and cardiovascular diseases (CVD). However, as previous studies have shown, the genetic risk factors present in patients with metabolic manifestations of PCOS could differ from others who have primary presentations of reproductive dysfunction [9,10].
PCOS is multifactorial and twin studies estimate heritability at 70% [11–14]. With an underlying polygenic architecture, multiple variants are hypothesized to be involved in the development of PCOS [15,16]. Furthermore, the variation of clinical features can be partially explained by ancestry informative markers, indicating potential population specific effects [17,18]. Rare variants in genes, such as DENND1A, have also been identified in family studies for PCOS alongside many other genetic variants identified from GWAS [19]. Despite the small effect size of individual common variants, aggregation of common risk variants together as a polygenic (or genetic) risk score (PRS) reflects the overall additive genetic liability to PCOS in individuals. This marker of disease risk is associated with PCOS diagnosis in multiple ancestries and offers many advantages to parsing out the genetic etiology of PCOS that is entangled with its comorbid presentations [11–13]. Furthermore, there is increasing evidence that a spectrum of clinical PCOS manifestations for PCOS is also correlated with the PRS for PCOS [11].
Therefore, in this study, we aimed to determine if the PRS for PCOS demonstrated pleiotropic associations with other health conditions in a hospital biobank population through a phenome-wide association study (PheWAS). By using a genetic model, we were able to determine sex-differentiated effects associated with PCOS genetic risk, revealing the impact of PCOS-associated inherited genetic variation in males, despite the fact that PCOS is only diagnosed in females. We further implemented several analyses to determine the mediating role of body mass index (BMI) on cardiometabolic comorbidities, and further identified the directionality of their effects through causal mediation analyses.
Methods
VUMC EHR-linked Biorepository
Vanderbilt University Medical Center (VUMC) is a tertiary care hospital in Nashville, Tennessee, with several outpatient clinics throughout Tennessee and the surrounding states offering primary and secondary care. Medical records have been electronically documented at VUMC since the early 1990s, resulting in a clinical research database of over 3 million EHRs referred to as the Synthetic Derivative [20]. EHRs include demographic information, health information documented through International Classification of Disease, Ninth Revision (ICD9) and Tenth Revision (ICD10) codes, procedural codes (CPT), clinical notes, medications, and laboratory values. This information is linked with a DNA biorepository known as BioVU. Use of EHRs and genetic data for this study was approved by the Vanderbilt University Institutional Review Board (IRB #160279).
Genetic data
BioVU contains 94,474 individuals genotyped on the MEGAEX platform [21]. The QC pipeline removed SNPs with low genotyping call rate (< 0.98) and individual subjects who were related (pi-hat > 0.2), had low call rates (< 0.98), sex discrepancies, and excessive heterozygosity (Fhet > 0.2). A principal component (PC) analysis (PCA) was performed on remaining individuals to determine genetic ancestry using FlashPCA2 [22]. BioVU genotyped samples were stratified by ancestral origin based on PCs herein referred to as the European (EUR) descent or African (AFR) descent dataset. Extended details for the quality control of these datasets have been described previously [23].
Publicly available summary statistics
Genome-wide association study (GWAS) summary statistics were acquired for PCOS, BMI, diastolic blood pressure, systolic blood pressure, pulse, type 2 diabetes, heart failure, and coronary artery disease (i.e., all of the significant phenotypic associations observed in the PheWAS models) to further establish the strength of associations between PCOS and its comorbidities through subsequent analyses. Each GWAS was selected based on public availability, sample size, and sample diversity.
The Genetic Investigation of ANthropometric Traits (GIANT) consortium body mass index (BMI) summary statistics included 339,224 individuals of European and non-European descent from over 125 studies. These BMI summary statistics were used as a proxy for obesity summary statistics [24]. Blood pressure traits (diastolic, systolic, and pulse) and type 2 diabetes (T2D) summary statistics were obtained from the Million Veterans Program (MVP), a large biobank consortium effort that houses biobank data from various sites in the Department of Veterans Affairs health system [25]. Blood pressure traits were generated from a trans-ethnic sample of over 750,000 individuals from MVP [26]. T2D summary statistics were generated from a meta-analysis using data from 1.4 million participants in various biobanks and consortia groups [27]. Heart failure summary statistics were collected from 47,309 cases and 930,014 controls of European ancestry across nine studies in the Heart Failure Molecular Epidemiology for Therapeutic Targets (HERMES) consortium as a proxy for heart disease [28]. Finally, coronary artery disease (CAD) datasets generated from the Coronary Artery Disease Genome-wide Replication and Meta-analysis plus The Coronary Artery Disease (CARDIoGRAMplusC4D) consortium were used as the genetic measurement for the coronary atherosclerosis phenotype [29]. This meta-analysis assembled 60,801 cases and 123,504 controls of multiple ancestries across forty-eight study sites.
Statistical analysis
Generation of polygenic risk scores (PRS)
PCOSPRS were calculated with PRS-CS software using the weighted sums of the risk allele effects as reported in the summary statistics from the Day et al. GWAS of PCOS and applying a Bayesian continuous shrinkage parameter select SNP features and to model linkage disequilibrium [16,30]. The details of these methods have been previously described elsewhere [11]. We calculated PCOSPRS for both EUR and AFR BioVU genotyped ancestry samples which were previously shown to be associated with a PCOS diagnosis defined by a coded strict PCOS definition in our previously published EHR-based algorithm [11].
Phenome-wide association study (PheWAS)
Next, we were interested in identifying the pleiotropic effects of the genetic susceptibility to PCOS on the medical phenome. Therefore, in a PheWAS framework we analyzed the effects of PCOSPRS across 1,380 medical conditions. This analysis was first performed in a female sample for our EUR and AFR ancestry datasets to validate the PRS and identify potential pleiotropic relationships (Supplementary Figures 1-5). Although males cannot be diagnosed with PCOS, they still harbor genetic risk for PCOS. Thus, we extended the analysis to males to examine sex differences. Finally, we performed a sex-combined PheWAS to increase statistical power. In the sex-combined logistic regression model, covariates included median age of individuals medical record, sex, and the first ten principal components. In the sex-stratified models, covariates included median age and the first ten PCs.
Interaction analysis
For each phenotype with evidence of a significant main effect of PCOSPRS in either sex, we tested for two-way interactions (sex * PCOSPRS) to determine which of the significant sex stratified PheWAS associations were influenced by biological effects of sex. Selected phenotypes of interest that met the phenome-wide false discovery rate (q < 0.05) were tested for interactions (N = 10). For these interaction analyses, sex-combined models included the main effects of PCOSPRS, median age of individuals medical record, sex, and the top ten PCs.
Sensitivity analyses
Several sensitivity analyses were performed to assess the robustness of the significant phenome-wide findings. First, BMI is strongly correlated with both PCOS and its comorbidities, and thus, can influence the strength of the results [31]. Therefore, we adjusted each model for BMI (median measurement across an individual’s EHR) to observe whether the significant phenotypes associated with PCOSPRS were independent of obesity-related effects. In addition to adjusting for BMI, the model was adjusted for median age, sex (only in the sex-combined sample), and the top ten PCs.
Next, we evaluated which phenotypes were dependent on a PCOS diagnosis in females by accounting for PCOS diagnosis [11]. This analysis allowed us to identify true pleiotropic associations and provided insight into which phenotypes were exclusively correlated with genetic risk, even in the absence of PCOS.
Finally, given that BMI has strong contributions from both genetic and environmental sources of variance, we revisited the previous PheWAS analyses to specifically account for the environmental contribution of BMI. To do this, we calculated the residuals of median BMI adjusted for BMIPRS (residuals(medianBMI ∼ BMIPRS)), herein referred to as BMIresidual. BMIresidual were then used in a subsequent sensitivity analysis of the previously described PheWAS.
Genetic correlation
Linkage Disequilibrium Score Regression (LDSC) was used to calculate genetic correlation, an estimate of genetic similarity, between traits [32]. LDSC only utilizes GWAS summary statistics and is not sensitive to sample overlap, which may be present across the publicly available GWAS datasets used in this study. This method utilizes the effect estimate of each SNP and accounts for the effects of SNPs in linkage disequilibrium based on the GWAS reference population. The European reference panel was used for all analyses based on the demographic majority of each of the GWAS samples.
Logistic regression models
Public GWAS datasets were used to generate PRS for phenome-wide significant phenotypes identified in the PheWAS. PRS were then used as the independent variable in a logistic regression model against PCOS diagnosis as the dependent variable. BioVU genotyped datasets were filtered to females and contained 361 PCOS cases and 29,035 controls in the EUR dataset and 189 PCOS cases and 9,229 controls in the AFR dataset. Selection of PCOS cases and controls have been described elsewhere [11]. In brief, cases required PCOS billing codes and no exclusion codes. Controls excluded individuals who had any inclusion or exclusion codes [11]. All models were adjusted for median age of the individuals medical record, and the top ten PCs for each ancestry. Median BMI was included as a covariate in the sensitivity analysis.
Lastly, we adjusted for BMIresidual in the logistic regression models in place of BMI. Next, we performed a sensitivity analysis to test the effect of BMIresidual and BMIPRS separately on PCOS diagnosis. A multiple testing correction of p < 7.35e-04 (0.05/68) was implemented to account for all statistical tests. All statistical analyses were done using R 3.6.0.
Mediation analysis
To test whether BMI or BMIresidual mediates the pleiotropic relationships between PCOS and associated cardiometabolic conditions, a mediation analysis was employed using the mediation R package for both AFR and EUR genotyped samples [33]. For the first set of mediation analyses, we modeled BMIPRS and BMIresidual as exposures, T2D diagnosis as the mediator, and PCOS case status as the outcome. Another reciprocal model was tested with the same exposures, but with PCOS diagnosis as the mediator and T2D as the outcome. This analysis tested the hypothesis that the risk conferred by BMI on one phenotype (e.g., PCOS) in turn increases risk for another other phenotype (e.g., T2D). We restricted this analysis to the female population. All models were further adjusted for median age and the top ten PCs.
In our second set of mediation analyses, we tested whether genetic risk for PCOS or T2D also influenced BMIresidual through the manifestation of either clinical condition. For example, we modeled PCOSPRS as the exposure variable, T2D diagnosis as the mediator and BMIresidual as the outcome. Finally, in a female only analysis, T2DPRS was modeled the exposure variable, PCOS diagnosis was included as the mediator, and BMIresidual again was modeled as the outcome.
Finally, for our last set of mediation analyses, we examined the mediating effect of both measured BMI and BMIresidual on the relationship between PCOSPRS (exposure) and the diagnosis of hypertension and hypertensive heart disease (i.e., outcomes that demonstrated evidence of significant interactions with sex). T2D was also tested because it was significantly associated with PCOSPRS in both sexes. PCOSPRS was modeled as the exposure variable and T2D (phecode = 270.2) hypertension (phecode = 411), and hypertensive heart disease (phecode = 401.21) were modeled as the outcome variables. Models were considered significant when the average direct effect (ADE) and average causal mediation effect (ACME) both passed p < 7.35e-04. All mediation analyses were performed with 1,000 bootstrap simulations to estimate the confidence intervals and determine an empirical p-value.
Results
PCOSPRS PheWAS results
The PCOSPRS PheWAS was applied to 66,903 EUR individuals from BioVU. When restricted to females (N = 37,240), two phenotypes were significantly associated with PCOSPRS (Bonferroni corrected p-value = 4.56e-05) (Figure 1a): T2D (odds ratio [OR] = 1.10, 95% confidence interval [CI] = 1.06-1.14, p = 5.54e-07) and diabetes mellitus (OR = 1.09, 95% CI = 1.05-1.13, p = 2.20e-06). When the analysis was restricted to males (N = 29,663; Figure 1b), hypertensive heart disease was significant (OR = 1.15, 95% CI = 1.08-1.23, p = 2.07e-05) alongside a cluster of nominally (p < 0.05) associated cardiovascular phenotypes with similar effect sizes (OR = 1.06 to 1.08). This included hypertension (OR =1.06, 95% CI = 1.03-1.10, p=1.30e-04), essential hypertension (OR = 1.06, 95% CI = 1.03-1.10, p = 2.49e-04), and coronary atherosclerosis (OR = 1.06, 95% CI =1.02-1.10, p=1.40e-03). T2D (OR=1.07, 95% CI = 1.03-1.11, p=7.04e-04) and several other endocrine phenotypes were also modestly associated with PCOSPRS in males.
In the EUR sex-combined model, four phenotypes were significantly associated with PCOSPRS (Figure 2). This included T2D and diabetes mellitus as the top associations with the same OR of 1.08 (T2D 95% CI = 1.06-1.11, p = 3.15e-09; diabetes mellitus 95% CI = 1.05-1.11, p = 1.03e-08). Following was obesity (OR = 1.07, 95% CI = 1.04-1.11, p = 9.68e-06) and hypertensive heart disease (OR = 1.11, 95% CI = 1.06-1.16, p = 3.57e-05). Three cardiovascular diseases were also nominally associated with PCOSPRS, presumably due to the addition of males. PCOSPRS yielded an OR of 1.21 (95% CI = 1.08-1.36, p = 7.82e-04) for an association with polycystic ovaries, falling just short of phenome-wide significance.
No associations passed Bonferroni correction in the AFR ancestry analysis, which included a total of 12,383 individuals (Supplementary Figure 1).
Interaction analysis
The sex interaction analysis demonstrated that males with a high PCOSPRS were (interaction p = 7.24e-03) more likely to be diagnosed with hypertension (ORMales =1.06, 95% CI = 1.03-1.10, p =1.30e-04; ORFemales = 1.03, 95% CI = 1.00-1.06, p = 0.07), essential hypertension (interaction p=7.71e-03, ORMales = 1.06, 95% CI = 1.03-1.10, p = 2.49e-04; ORFemales = 1.03, 95% CI = 1.00-1.06, p = 0.08), and hypertensive heart disease (interaction p=0.01, ORMales = 1.15, 95% CI = 1.08-1.23, p = 2.07e-05; ORFemales = 1.05, 95% CI = 0.98-1.13, p = 0.16) than females with the same PCOSPRS (Table 1). These sex differences were also observed when calculating the prevalence for each trait by decile of PCOSPRS, which only showed an upwards trend across deciles for traits with a significant sex effect (Figure 3). Together, these results showed that sex is an important modifier of PCOS genetic risk.
Sensitivity analyses adjusting for BMI and PCOS case status
In our first set of PheWAS sensitivity analyses for the EUR population, we adjusted for median BMI in all of the PRS-PheWAS models. There were no surviving associations in the sex-combined model or stratified analyses, suggesting BMI may mediate the pleiotropic effects of PCOSPRS (Supplementary Figures 2a and 3). In a separate sensitivity analysis (female only) we found that after adjusting for PCOS diagnosis, females with a high PCOSPRS still demonstrated a significant positive phenome-wide association with T2D and diabetes mellitus (Supplementary Figure 2b).
Genetic correlation results
We found that BMI (rg = 43.32%, p = 8.77e-19) and T2D (rg = 29.26%, p = 3.25e-09) had the strongest genetic correlation with PCOS (Table 2). Heart failure (rg = 26.51%, p = 0.0025) and pulse pressure (rg = 13.46%, p = 0.011) were also modestly significantly with PCOS. CAD bordered the significance threshold, but systolic and diastolic blood pressure were not significantly genetically correlated with PCOS.
Cardiometabolic PRS Analysis of PCOS diagnosis
Outside of T2D and BMI, none of the PRS built for CAD, heart failure, or blood pressure were significantly associated with a PCOS diagnosis in females of either EUR or AFR descent (Figure 4).
BMIPRS was positively associated with PCOS diagnosis for both EUR and AFR ancestry populations (pEUR = 3.71e-08, pAFR = 0.02), but not independently of clinically measured BMI values. T2DPRS in the EUR dataset yielded similar results with PCOS diagnosis, again losing significance upon addition of a BMI covariate in the model (ORunadjusted = 1.16, 95% CI = 1.04-1.30, p = 0.007; ORadjusted = 1.09, 95% CI = 0.97-1.22, p = 0.16). To determine whether this reduction in effect size was due to the genetic correlation between T2D and BMI, we also tested a model in which T2DPRS was adjusted for BMIresidual (i.e., variance remaining in BMI after removing variance due to BMIPRS) instead of BMI. This model indeed recovered the original association between T2DPRS (OREUR = 1.18, 95% CI = 1.05-1.32, pEUR = 0.005) and PCOS diagnosis for the EUR population, suggesting that genetically predicted BMI mediates the association between T2DPRS and PCOS diagnosis (Figure 5).
Mediation analysis in females with BMI as an exposure variable
To further quantify the degree to which BMI influenced the shared genetic pathways between PCOS and T2D, a mediation analysis was performed with BMIPRS and BMIresidual as the exposure variables, PCOS diagnosis as the outcome variable, and T2D as the mediator. The reciprocal model was then tested with T2D diagnosis as the outcome and PCOS as the mediator (Table 3). We found that both phenotypes acted as significant mediators when BMIPRS and BMIresidual were exposure variables. PCOS diagnosis had a stronger mediating effect on T2D when BMIresidual was the exposure variable compared to BMIPRS (13.6% vs 7.1%). However, the opposite was true for T2D, which mediated more of risk conferred by BMIPRS than BMIresidual on the PCOS outcome (9% vs 2.1%).
PheWAS sensitivity analysis with BMIresidual
We revisited the PheWAS analysis in light of the findings for BMIresidual. By covarying for BMIresidual, (e.g., environmental estimate of BMI), we observed almost no difference from the original results in our female sample (Supplementary Figure 4a). T2D remained significant in females (OR =1.10, 95% CI = 1.06-1.15, p = 1.69e-06), as did diabetes mellitus (OR =1.09, 95% CI = 1.05-1.13, p = 9.91e-06). These associations also appeared in the combined dataset alongside obesity (Supplementary Figure 5). Although none of the phenotypes passed Bonferroni correction for males, many of the associations improved in estimate effects (Supplementary Figure 4b).
Mediation analysis in the sex-combined and sex-stratified samples
Upon finding BMIresidual as a significant risk exposure for T2D and PCOS in the first mediation analysis, but not a strong confounder in the PCOSPRS PheWAS analysis, we tested whether the genetic risk conferred by one condition (e.g., T2DPRS) was mediated by the clinical manifestation of the other (e.g., PCOS diagnosis) when examining BMIresidual as the outcome. Here we observed a modest association that showed PCOS diagnosis mediated 33% of the risk conferred by T2DPRS on BMIresidual whereas T2D diagnosis did not explain any of the variance in BMIresidual due to PCOSPRS (Table 4).
Lastly, the effects of BMI on PCOS comorbidities T2D, hypertension, and hypertensive heart disease were explored (Table 5). We found that BMIresidual was not a significant mediator for any of the tested models, even though clinical BMI was a significant mediator for T2D in females and a significant mediator for hypertensive heart disease in males. This again implicated genetically regulated BMI as the main source of the mediating effect for males since median BMI significantly mediated 14.8% of the variance in hypertensive heart disease (p = <2e-16) and mediated 23.7% of the variance in hypertension (p = 0.002) due to PCOSPRS. The lack of an association with BMIresiudal in females also suggests genetically regulated BMI as the primary mediator for T2D, where it mediated 31.5% of the variance explained by the PCOSPRS.
Discussion
Through a comprehensive analysis of PCOS genetic risk across multiple phenotypes, we identified sex differences in the cardiometabolic traits associated with PCOS genetic risk. Among these, the most notable difference was that males with high PCOSPRS were at greater risk of cardiovascular conditions than females at the same level of genetic risk. Genetically predicted BMI was revealed as a primary mediator of this risk in males, whereas both environmental and genetic BMI variance components mediated risk of T2D conferred by PCOSPRS in females. Furthermore, only T2DPRS and BMIPRS were associated with PCOS diagnosis, indicating that many of the associations observed in the PCOSPRS PheWAS were primarily driven by PCOS genetic risk and not the genetic effects of the identified comorbidities.
There is growing interest in studying PCOS related effects in males whether through their relationship to first-degree family members with PCOS or by an equivalent phenotype [34–36]. Generally, males with mothers or sisters with PCOS tend to exhibit a poorer cardiometabolic profile which can be observed as early as infancy [35]. A previous study suggests that males who exhibit a high genetic risk for PCOS are more likely to present with morbid obesity, T2D, and diabetes mellitus, two findings which were confirmed in this study [12]. Although we also found associations with CVD phenotypes in males, these were largely influenced by genetically predicted BMI which can significantly increase the lifetime risk for CVD and mortality rates of high-risk individuals [37]. Genetic susceptibility for PCOS could be an additional catalyst for these events in males, making individuals already predisposed to adverse metabolic outcomes more vulnerable.
In an effort to determine whether genetic risk for phenotypes comorbid with PCOS could also increase risk for PCOS, we conducted separate multivariable logistic regressions with PRS for BMI, diastolic blood pressure, systolic blood pressure, pulse pressure, T2D, heart failure, and CAD on the PCOS diagnosis outcome and found no significant associations outside of T2DPRS and BMIPRS. This was unsurprising as the relationship between PCOS and CVD are still poorly elucidated and often debated [38]. It is possible that CVD is more prevalent in genetically high-risk males than females. The sex-difference in CVD prevalence may also be reflected in the PRS which may fail to fully capture CVD genetic risk for PCOS when stratified by sex due to fewer females in the discovery sample.
This study provides novel insight into PCOS genetic etiology and continues to underscore the importance of BMI in PCOS risk. As with T2D, we found that genetically predicted BMI is primarily responsible for the phenotypic association between T2DPRS and PCOS. However, PCOSPRS genetic risk led to more comorbid traits in the PheWAS in which both environmental and genetically predicted BMI influenced the metabolic associations. This may mean that PCOS patients with a family history of T2D can have an increased risk for morbid outcomes, as females with PCOS are already more likely to develop T2D at an earlier age [39]. This genetic susceptibility could also explain the high prevalence of insulin resistance in PCOS patients, which can be as high as 70% across all BMI strata [40]. These effects should be investigated further, as the genetic and biological pathways could differ in lean PCOS patients who also experience a high rate of insulin resistance [41].
This study offers many strengths. Firstly, we showed that cardiometabolic associations vary with sex and that the metabolic outcomes related to PCOS genetic architecture can be further understood by studying both males and females. Secondly, we decomposed BMI into genetically predicted (i.e., BMIPRS) and environmentally enriched (i.e., BMIresidual) and evaluated their respective roles in mediating the cardiometabolic profiles associated with PCOSPRS. However, limitations include low power to detect any significant associations in our African descent sample. To date, there is no PCOS GWAS of African descent individuals, limiting all current similar studies to building PRS using European-based genetic variants, which do not perform as well in non-European populations [11,12]. Second, we only examined one environmental risk factor. Although many effects such as lifestyle and diet can be captured through BMI, it is not an exhaustive measurement nor does it accurately account for the full wellness of an individual [42,43]. Although other anthropometric features like hip-to-waist ratio (WHR) may be better indicators of health for some phenotypes [44], this information is not routinely collected in clinical settings or reported in EHRs. Furthermore, evidence does suggest that clinically ascertained BMI may be more informative for PCOS than WHR [45]. Finally, despite using the largest PCOS GWAS to date for this analysis, our PCOSPRS still only explains a small portion of PCOS genetic variance. As these analyses expand, so too will our ability to detect the full genetic spectrum of PCOS and its subphenotypes.
PCOS is a multifaceted disorder with genetic architecture that is reflective of its heterogeneous outcomes. This polygenic structure captures a spectrum of metabolic comorbidities that is even more apparent when compared between sexes. Our findings show that males with high PCOS liability are indeed a high-risk group and those with a family history of PCOS should be closely monitored for hypertension and CVD. This is also true for females with PCOS and a family history of T2D, whose genetic risk could induce more severe comorbid outcomes. As such, management and screening strategies should be updated to reflect advances in PCOS etiology. This call to action is paramount and requires both widespread dissemination of risk factor information to relevant stakeholders and increases in PCOS research priorities and funding. This becomes even more crucial as PCOS comorbidities are often under-recognized in clinical settings and metabolic conditions are underutilized in PCOS screening methods [46–49].
Data Availability
All GWAS summary statistics used in this study are publicly available and can be obtained from the following consortium groups and websites: International PCOS Consortium (https://doi.org/10.17863/CAM.27720), UK Biobank (http://www.nealelab.is/uk-biobank), HERMES (http://www.broadcvdi.org/), and CARDIoGRAMplusC4D (http://www.cardiogramplusc4d.org/). The summary statistics used from the Million Veteran Project can be accessed through dbGaP under accession number phs001672.v6.p1. The electronic health record data that support the findings of this study are available from Vanderbilt University Medical Center, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Vanderbilt University Medical Center.
Footnotes
Funding/Support: KVA is funded by F31HD103397. LKD is funded by U54MD010722. MCA is funded by a grant from NIH/NCI (U01CA253560). Datasets were obtained using the Vanderbilt University Advanced Computing Center for Research and Education. This resource is funded by institutional, private, and federal grants, which include NIH funded Shared Instrumentation Grant S10OD017985 and S10RR025141 and CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975.
Competing interests: The authors declare they have no competing interests.
Data availability statement: All GWAS summary statistics used in this study are publicly available and can be obtained from the following consortium groups and websites: International PCOS Consortium (https://doi.org/10.17863/CAM.27720), UK Biobank (http://www.nealelab.is/uk-biobank), HERMES (http://www.broadcvdi.org/), and CARDIoGRAMplusC4D (http://www.cardiogramplusc4d.org/). The summary statistics used from the Million Veteran Project can be accessed through dbGaP under accession number phs001672.v6.p1. The electronic health record data that support the findings of this study are available from Vanderbilt University Medical Center, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Vanderbilt University Medical Center.