Abstract
Objectives: A polygenic hazard score (PHS1), a weighted sum of 54 single-nucleotide polymorphism genotypes, was previously associated with age at prostate cancer (PCa) diagnosis and improved PCa screening accuracy in Europeans. Performance in more diverse populations is unknown. We evaluated PHS association with PCa in multi-ethnic populations.
Design: PHS1 was adapted for compatibility with genotype data from the OncoArray project (PHS2) and tested for association with age at PCa diagnosis, at aggressive PCa diagnosis, and at PCa death.
Setting: Multiple international institutions.
Participants: Men with available OncoArray data from the PRACTICAL consortium who were not included in PHS1 development/validation.
Main Outcomes and Measures: PHS2 was tested via Cox proportional hazards models for age at PCa diagnosis, age at aggressive PCa diagnosis (any of: Gleason score ≥7, stage T3-T4, PSA ≥10 ng/mL, nodal/distant metastasis), and age at PCa-specific death.
Results: 80,491 men of various self-reported race/ethnicities were included (30,575 controls, 49,916 PCa cases; genetic ancestry groups: 71,856 European, 6,253 African, 2,382 Asian). Median age at last follow-up was 70 years (IQR 63-76); there were 3,983 PCa deaths and 5,806 deaths from other causes, and 70,702 men were still alive. PHS2 had 46 polymorphisms: 24 directly genotyped and 22 acceptable proxies (r² ≥0.94). PHS2 was associated with age at PCa diagnosis in the multi-ethnic dataset (z=54, p<10⁻¹⁶) and in each genetic ancestry group: European (z=56, p<10⁻¹⁶), Asian (z=47, p<10⁻¹⁶), African (z=29, p<10⁻¹⁶). PHS2 was also associated with age at aggressive PCa diagnosis in each genetic ancestry group (p<10⁻¹⁶) and with age of PCa death in the full dataset (p<10⁻¹⁶). Comparing the 80th and 20th percentiles of genetic risk, men with high PHS had hazard ratios of 5.3 [95% CI: 5.0-5.7], 5.9 [5.5-6.3], and 5.7 [4.6-7.0] for PCa, aggressive PCa, and PCa-specific death, respectively. Within European, Asian, and African ancestries, analogous hazard ratios for PCa were 5.5 [5.2-5.9], 4.5 [3.2-6.3], and 2.5 [2.1-3.1], respectively.
Conclusions: PHS2 is strongly associated with age at PCa diagnosis in a multi-ethnic dataset. PHS2 stratifies men of European, Asian, and African ancestry by genetic risk for any, aggressive, and fatal PCa.
What is already known on this topic
Genetic risk stratification can identify men with greater predisposition for developing prostate cancer, but these risk models may worsen health disparities, as most have only been validated for men of European ancestry.
A polygenic hazard score was previously associated with age at prostate cancer diagnosis and improved PCa screening accuracy in Europeans.
Performance of the polygenic hazard score in multi-ethnic populations is unknown.
What this study adds
In a dataset from 80,491 men of various self-reported race/ethnicities, the polygenic hazard score was associated with age at prostate cancer diagnosis, aggressive prostate cancer diagnosis, and prostate cancer death.
PHS stratifies men of European, Asian, and African ancestry by genetic risk for any, aggressive, and fatal prostate cancer.
Introduction
Prostate cancer (PCa) is the second most common cancer diagnosed in men worldwide, causing substantial morbidity and mortality1. PCa screening may reduce morbidity and mortality2–5, but to avoid overdiagnosis and overtreatment of indolent disease6–9, it should be targeted and personalized. PCa age at diagnosis is important for clinical decisions regarding if/when to initiate screening for an individual10,11. Survival is another key cancer endpoint recommended for risk models12.
Genetic risk stratification is promising for identifying individuals with greater predisposition for developing cancer13–16, including PCa17. Polygenic models use common variants—identified in genome-wide association studies—whose combined effects can assess overall risk of disease development18,19. Recently, a polygenic hazard score (PHS) was developed as a weighted sum of 54 single-nucleotide polymorphisms (SNPs) that models a man’s genetic predisposition for developing PCa13. Validation testing was done using ProtecT trial data2 and demonstrated the PHS to be associated with age at PCa diagnosis, including aggressive PCa13. However, the development and validation datasets were limited to men of European ancestry. While genetic risk models might be important clinical tools for prognostication and risk stratification, using them may worsen health disparities20–24 because most models are constructed using European data and may underrepresent genetic variants important in persons of non-European ancestry20–24. Indeed, this is particularly concerning in PCa, as race/ethnicity is an important PCa risk factor; diagnostic, treatment, and outcomes disparities continue to exist between different races/ethnicities25,26.
Here, we assessed PHS performance in a multi-ethnic dataset that includes individuals of European, African, and Asian genetic ancestry. This dataset also includes long-term follow-up information, affording an opportunity to evaluate PHS for association with fatal PCa.
Methods
Participants
We obtained data from the OncoArray project27 that had undergone quality control steps described previously18. This dataset includes 91,480 men with genotype and phenotype data from 64 studies (Supplemental Methods). Individuals whose data were used in the prior development or validation of the original PHS model (PHS1) were excluded (n=10,989)13, leaving 80,491 in the independent dataset used here. Table 1 describes available data. Individuals not meeting the endpoint for each analysis were censored at age of last follow-up.
All contributing studies were approved by the relevant ethics committees; written informed consent was acquired from the study participants28. The present analyses used de-identified data from the PRACTICAL consortium.
Polygenic Hazard Score (PHS)
The original PHS1 was validated for association with age at PCa diagnosis in men of European ancestry, using a survival analysis13. To ensure the score was not simply identifying men at risk of indolent disease, PHS1 was also validated for association with age at aggressive PCa (defined as intermediate-risk disease, or above6) diagnosis13. PHS1 was calculated as the vector product of a patient’s genotype (X_i) for n selected SNPs and the corresponding parameter estimates (β_i) from a Cox proportional hazards regression:
$$\mathrm{PHS} = \sum_{i=1}^{n} X_i \beta_i$$
The 54 SNPs in PHS1 were selected using PRACTICAL consortium data (n=31,747 men) genotyped with a custom array (iCOGS, Illumina, San Diego, CA)13.
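As a minimal sketch of the weighted sum described above, the score for one individual can be computed as follows. The function name, the three-SNP weights, and the coding of genotypes as allele counts (0, 1, or 2) are illustrative assumptions; the actual PHS weights are not reproduced here.

```python
from typing import Sequence


def polygenic_hazard_score(genotypes: Sequence[float], betas: Sequence[float]) -> float:
    """Return sum_i X_i * beta_i for one individual."""
    if len(genotypes) != len(betas):
        raise ValueError("genotype and weight vectors must have the same length")
    return sum(x * b for x, b in zip(genotypes, betas))


# Hypothetical three-SNP example (weights invented for illustration only).
example_betas = [0.10, -0.05, 0.20]
example_genotype = [2, 1, 0]  # assumed allele-count coding
example_score = polygenic_hazard_score(example_genotype, example_betas)
```

A negative weight simply lowers the score for carriers of that allele, so the same function covers protective and risk variants.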
Genetic Ancestry Determination
Self-reported race/ethnicities27,29 included European, East Asian, African American, Hawaiian, Hispanic American, South Asian, Black African, Black Caribbean, and Other. Genetic ancestry (European, African, or Asian) for all individuals was used for the present analyses because it is objective and may be more informative than self-reported race/ethnicities30 (Supplemental Methods).
Adapting the PHS to OncoArray
Genotyping for the present study was performed using a commercially-available, cancer-specific array (OncoArray, Illumina, San Diego, CA)18. Twenty-four of the 54 SNPs in PHS1 were directly genotyped on OncoArray. We identified proxy SNPs for those not directly genotyped and re-calculated the SNP weights in the same dataset used for the original development of PHS113 (Supplemental Methods).
The performance of this new, adapted PHS (PHS2) was compared to that of PHS1 in the ProtecT dataset originally used to validate PHS1 (n=6,411). PHS2 was calculated for all patients in the ProtecT validation set and was tested as the sole predictive variable in a Cox proportional hazards regression model (R v.3.5.1, “survival” package31) for age at aggressive PCa diagnosis, the primary endpoint of that study. Performance was assessed by the metrics reported during the PHS1 development13: z-score and hazard ratio (HR98/50) for aggressive PCa between men in the highest 2% of genetic risk (≥98th percentile) vs. those with average risk (30th-70th percentile). The 95% confidence intervals (CIs) for HRs were determined by bootstrapping 1,000 random samples from the ProtecT dataset32,33, while maintaining the same number of cases and controls. PHS2 percentile thresholds are shown in the Supplement.
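One way to bootstrap while keeping the numbers of cases and controls fixed is to resample each group separately with replacement. The sketch below assumes a percentile interval and uses hypothetical names (`stratified_bootstrap_ci`, `mean_difference`); the interval type and resampling details used in the study are not stated here, so this is an illustration rather than the authors' procedure.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def stratified_bootstrap_ci(
    cases: Sequence[float],
    controls: Sequence[float],
    statistic: Callable[[List[float], List[float]], float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile CI from resampling cases and controls separately."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample within each group so group sizes never change.
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        estimates.append(statistic(boot_cases, boot_controls))
    estimates.sort()
    lo_idx = max(0, round(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return estimates[lo_idx], estimates[hi_idx]


def mean_difference(cases: List[float], controls: List[float]) -> float:
    """Toy statistic standing in for a hazard ratio estimate."""
    return mean(cases) - mean(controls)
```

In practice the `statistic` argument would refit the model and return the hazard ratio of interest for each resample.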
Any PCa
We tested PHS2 for association with age at diagnosis of any PCa in the multi-ethnic dataset (n=80,491, Table 1).
PHS2 was calculated for all patients in the multi-ethnic dataset and used as the sole independent variable in Cox proportional hazards regressions for the endpoint of age at PCa diagnosis. Because Cox proportional hazards results may be biased by a higher proportion of cases in our dataset than in the general population, sample-weight corrections were applied to all Cox models13,34 (Supplemental Methods). Significance was set at α=0.01, and reported p-values were truncated at <10⁻¹⁶, if applicable13.
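To make the single-covariate Cox model concrete, the sketch below fits the partial likelihood by Newton-Raphson and converts the resulting z-score into a truncated p-value. It is an unweighted illustration with Breslow handling of ties and a two-sided normal p-value, all of which are assumptions; the study used the R “survival” package with sample weights, which this sketch does not replicate. All names and the toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_cox_single_covariate(
    times: Sequence[float],
    events: Sequence[bool],
    x: Sequence[float],
    max_iter: int = 50,
    tol: float = 1e-10,
) -> Tuple[float, float]:
    """Return (beta, z) for a one-covariate Cox model (Breslow ties)."""
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue  # censored subjects only contribute to risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            score += x_i - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # info is from the last evaluated beta; at convergence the gap is negligible.
    return beta, beta * math.sqrt(info)


def format_p_value(z: float, floor: float = 1e-16) -> str:
    """Two-sided normal p-value, truncated at the reporting floor."""
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return "<1e-16" if p < floor else f"{p:.2g}"


# Toy data: higher scores tend to have earlier events.
toy_times = [5, 6, 7, 8, 9, 10, 11, 12]
toy_events = [True] * 8
toy_scores = [2, 1, 2, 0, 1, 1, 0, 0]
toy_beta, toy_z = fit_cox_single_covariate(toy_times, toy_events, toy_scores)
```

The quadratic risk-set loop is kept for readability; production code would sort by time and accumulate the sums once.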
These Cox proportional hazards regressions (with PHS2 as the sole independent variable and age at PCa diagnosis as the outcome) were then repeated for subsets of data, stratified by genetic ancestry: European, Asian, and African. Percentiles of genetic risk were calculated as done previously13, using data from the 9,728 men in the original (iCOGS) development set who were less than 70 years old and without PCa. Hazard ratios (HRs) and 95% CIs for each genetic ancestry group were calculated to make the following comparisons: HR98/50, men in the highest 2% of genetic risk vs. those with average risk (30th-70th percentile); HR80/50, men in the highest 20% vs. those with average risk; HR20/50, men in the lowest 20% vs. those with average risk; and HR80/20, men in the highest 20% vs. lowest 20%. CIs were determined by bootstrapping 1,000 random samples from each genetic ancestry group32,33, while maintaining the same number of cases and controls. HRs and CIs were calculated for age at PCa diagnosis separately for each genetic ancestry group.
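Under a proportional hazards model, the hazard ratio between two individuals with scores s1 and s2 is exp(β(s1 − s2)). The sketch below applies this to percentile groups by using each group's mean score, which is an assumption about how group-level HRs could be obtained and not a statement of the exact procedure used here. Percentile thresholds come from a separate reference set, mirroring the use of men under 70 without PCa; the function names and toy numbers are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence, Tuple


def percentile_threshold(reference_scores: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of the reference scores."""
    ordered = sorted(reference_scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def group_mean(scores, reference_scores, lo_pct: float, hi_pct: float) -> float:
    """Mean score of individuals between two reference percentiles."""
    lo = -math.inf if lo_pct <= 0 else percentile_threshold(reference_scores, lo_pct)
    hi = math.inf if hi_pct >= 100 else percentile_threshold(reference_scores, hi_pct)
    return mean(s for s in scores if lo <= s <= hi)


def hazard_ratio(
    scores: Sequence[float],
    reference_scores: Sequence[float],
    beta: float,
    numerator: Tuple[float, float],
    denominator: Tuple[float, float],
) -> float:
    """exp(beta * difference in group mean scores)."""
    diff = group_mean(scores, reference_scores, *numerator) - group_mean(
        scores, reference_scores, *denominator
    )
    return math.exp(beta * diff)


# Toy example: scores 1..100 serve as both the cohort and the reference set.
toy_reference = list(range(1, 101))
hr_80_20 = hazard_ratio(toy_reference, toy_reference, 0.01, (80, 100), (0, 20))
hr_98_50 = hazard_ratio(toy_reference, toy_reference, 0.01, (98, 100), (30, 70))
```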
Given that the overall incidence of PCa in different populations varies, we performed a sensitivity analysis of the population case/control numbers, allowing the population incidence to vary from 25% to 400% of that reported in Sweden (as an example population; Supplemental Methods).
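The sample-weight correction and the sensitivity analysis are specified in the Supplemental Methods, which are not reproduced here. As a loose illustration only, the sketch below assumes a simple scheme in which each case has weight 1 and each control is weighted so that the weighted case fraction equals an assumed population value, then rescales that value from 25% to 400%. The baseline fraction is a hypothetical placeholder, not the Swedish figure, and a case fraction is used in place of an incidence rate for simplicity.

```python
def control_weight(n_cases: int, n_controls: int, population_case_fraction: float) -> float:
    """Weight per control so the weighted case fraction matches the target."""
    p = population_case_fraction
    if not 0.0 < p < 1.0:
        raise ValueError("population_case_fraction must be strictly between 0 and 1")
    return n_cases * (1.0 - p) / (p * n_controls)


def weighted_case_fraction(n_cases: int, n_controls: int, w_control: float) -> float:
    """Case fraction after weighting controls (cases have weight 1)."""
    return n_cases / (n_cases + w_control * n_controls)


N_CASES, N_CONTROLS = 49_916, 30_575  # counts reported in this study
BASELINE_FRACTION = 0.10  # hypothetical placeholder value

# Vary the assumed population value from 25% to 400% of the baseline.
sensitivity_weights = {
    scale: control_weight(N_CASES, N_CONTROLS, BASELINE_FRACTION * scale)
    for scale in (0.25, 0.5, 1.0, 2.0, 4.0)
}
```

Each weight would then be passed to a weighted Cox fit, and the resulting HRs compared across scales.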
Aggressive PCa
Recognizing that not all PCa is clinically significant, we also tested PHS2 for association with age at aggressive PCa diagnosis in the multi-ethnic dataset. For these analyses, we included cases that had known tumor stage, Gleason score, and PSA at diagnosis (n=60,617 cases, Table 1). Aggressive PCa cases were those that met any of the following previously defined criteria for aggressive disease6,13: Gleason score ≥7, PSA ≥10 ng/mL, T3-T4 stage, nodal metastases, or distant metastases (Supplemental Methods). As before, Cox proportional hazards models and sensitivity analysis were used to assess association.
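The "any of" rule for aggressive disease can be written as a single predicate. In this sketch the function name and the encoding of clinical fields (T stage as an integer from 1 to 4, metastases as booleans) are assumptions made for illustration; the thresholds are those listed above.

```python
def is_aggressive(
    gleason: int,
    t_stage: int,
    psa_ng_ml: float,
    nodal_metastasis: bool,
    distant_metastasis: bool,
) -> bool:
    """True if a case meets any listed criterion for aggressive PCa."""
    return (
        gleason >= 7
        or psa_ng_ml >= 10.0
        or t_stage in (3, 4)  # assumed integer coding of T3-T4
        or nodal_metastasis
        or distant_metastasis
    )
```

For example, a case with Gleason score 6, stage T2, and PSA 5 ng/mL without metastasis would not be classified as aggressive, while raising any one field past its threshold would change the result.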
Fatal PCa
Using an even stricter definition of clinical significance, we then evaluated association of PHS2 with age at PCa death in the multi-ethnic dataset. All cases (regardless of staging completeness) and controls were included, and the endpoint was age at death due to PCa. This analysis was not stratified by genetic ancestry due to low numbers of recorded PCa deaths in the non-European datasets. Cause of death was determined by the investigators of each contributing study using cancer registries and/or medical records (Supplemental Methods). At last follow-up, 3,983 men had died from PCa, 5,806 had died from non-PCa causes, and 70,702 were still alive. The median age at last follow-up was 70 years (IQR 63-76). As before, Cox proportional hazards models and sensitivity analysis were used to assess association.
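The survival endpoints rely on censoring: a man who has not met the endpoint contributes his age at last follow-up as a censored observation. The sketch below shows one way to derive the (age, event) pair for the diagnosis and death endpoints; the `Participant` record and its field names are hypothetical and do not reflect the consortium's data schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Participant:
    age_last_followup: float
    age_pca_diagnosis: Optional[float] = None  # None if never diagnosed
    age_pca_death: Optional[float] = None      # None if no PCa-specific death


def survival_record(person: Participant, endpoint: str) -> Tuple[float, bool]:
    """Return (age, event_observed) for the requested endpoint."""
    if endpoint == "diagnosis":
        event_age = person.age_pca_diagnosis
    elif endpoint == "death":
        event_age = person.age_pca_death
    else:
        raise ValueError("endpoint must be 'diagnosis' or 'death'")
    if event_age is None:
        # Endpoint not met: censor at age of last follow-up.
        return person.age_last_followup, False
    return event_age, True
```

Under this scheme a man who died of another cause is censored for the PCa death endpoint at his age at last follow-up.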
PHS and Family History
Family history (presence/absence of a first-degree relative with a PCa diagnosis) was also tested for association with any, aggressive, or fatal PCa. There were 46,030 men with available PCa family history data.
Cox proportional hazards models were used to assess family history for association with any, aggressive, or fatal PCa. To evaluate the relative importance of each, a multivariable model using both family history and PHS was compared to using family history alone (log-likelihood test; α=0.01). HRs were calculated for each variable.
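A common way to compare nested models of this kind is a likelihood ratio test, sketched below under the assumption that adding PHS to a family-history-only model adds exactly one parameter. With one degree of freedom, the chi-square tail probability has the closed form erfc(sqrt(x/2)). The function name and the example log-likelihood values are hypothetical.

```python
import math
from typing import Tuple


def likelihood_ratio_test_df1(loglik_reduced: float, loglik_full: float) -> Tuple[float, float]:
    """Return (statistic, p_value) for nested models differing by one parameter."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the nested reduced model")
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: family history alone vs. family history plus PHS.
lr_stat, lr_p = likelihood_ratio_test_df1(-100.0, -96.0)
significant = lr_p < 0.01  # alpha used in this study
```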
Results
Adaptation of PHS for OncoArray
Of the 30 SNPs from PHS1 not directly genotyped on OncoArray, proxy SNPs were identified for 22 (linkage disequilibrium r² ≥0.94). Therefore, PHS2 included 46 SNPs in total (Supplemental Results). PHS2 association with age at aggressive PCa diagnosis in ProtecT was similar to that previously reported for PHS1 (z=22 for PHS1, z=21 for PHS2, each p<10⁻¹⁶). HR98/50 was 4.7 [95% CI: 3.6-6.1] for PHS2, compared to 4.6 [3.5-6.0] for PHS1.
Any PCa
PHS2 was associated with age at PCa diagnosis in all three genetic ancestry groups (Table 2). Comparing the 80th and 20th percentiles of genetic risk, men with high PHS had a HR of 5.3 [5.0-5.7] for any PCa. Within each genetic ancestry group, the analogous HRs were 5.5 [5.2-5.9], 4.5 [3.2-6.3], and 2.5 [2.1-3.1] for men of European, Asian, and African ancestry, respectively.
Aggressive PCa
PHS2 was associated with age at aggressive PCa diagnosis in all three genetic ancestry groups (Table 3). Comparing the 80th and 20th percentiles of genetic risk, men with high PHS had a HR of 5.9 [5.5-6.3] for aggressive PCa. Within each genetic ancestry group, the analogous HRs were 5.6 [5.2-6.0], 5.2 [4.8-5.6], and 2.4 [2.3-2.6] for men of European, Asian, and African ancestry, respectively.
Fatal PCa
PHS2 was associated with age at PCa death for all men in the multi-ethnic dataset (z=16, p<10⁻¹⁶). Table 4 shows z-scores and corresponding HRs for fatal PCa. Comparing the 80th and 20th percentiles of genetic risk, men with high PHS had a HR of 5.7 [4.6-7.0] for PCa death.
Sensitivity Analyses
Sensitivity analyses demonstrated that large changes in assumed population incidence had minimal effect on the calculated HRs for any, aggressive, or fatal PCa (Supplemental Results).
PHS and Family History
Family history was also associated with any PCa (z=40, p<10⁻¹⁶; Table 5), aggressive PCa (z=32, p<10⁻¹⁶), and fatal PCa (z=16, p<10⁻¹⁶) in the multi-ethnic dataset. Among those with known family history, the combination of family history and PHS performed better than family history alone (log-likelihood p<10⁻¹⁶). This pattern held true when analyses were repeated within each genetic ancestry group. Additional family history analyses are reported in the Supplemental Results.
Discussion
These results confirm the previously reported association of PHS with age at PCa diagnosis in Europeans and show that this finding generalizes to a multi-ethnic dataset, including men of European, Asian, and African genetic ancestry. PHS is also associated with age at aggressive PCa diagnosis and at PCa death. Comparing the highest and lowest quintiles of genetic risk, men with high PHS had HRs of 5.3, 5.9, and 5.7 for any PCa, aggressive PCa, and PCa death, respectively.
We found that PHS is associated with PCa in men of European, Asian, and African genetic ancestry (and a wider range of self-reported race/ethnicities). Current PCa screening guidelines suggest possible initiation at earlier ages for men of African ancestry, given higher incidence rates and worse survival when compared to men of European ancestry26. Using the PHS to risk-stratify men might help with decisions regarding when to initiate PCa screening: perhaps a man with African genetic ancestry in the lowest percentiles of genetic risk by PHS could safely delay or forgo screening to decrease the possible harms associated with overdetection and overtreatment9, while a man in the highest risk percentiles might consider screening at an earlier age. Similar reasoning applies to men of all genetic ancestries. Risk-stratified screening should be prospectively evaluated.
PHS performance was better in those with European and Asian genetic ancestry than in those with African ancestry. For example, comparing the highest and lowest quintiles of genetic risk, men of European and Asian genetic ancestry with high PHS had HRs for any PCa of 5.5 and 4.5, respectively, while the analogous HR for men of African genetic ancestry was 2.5 (similar trends were seen for aggressive PCa). This suggests PHS can differentiate men of higher and lower risk in each ancestral group, but the range of risk levels may be narrower in those of African ancestry. Possible reasons for relatively diminished performance include increased genetic diversity with less linkage disequilibrium in those of African genetic ancestry35–37. Known health disparities may also contribute25, as the availability (and timing) of PSA results may depend on healthcare access. Alarmingly, there has historically been poor representation of African populations in clinical or genomic research studies20,21. This pattern is reflected in the present study, where most men of African genetic ancestry were missing clinical diagnosis information used to determine disease aggressiveness. That such clinical information is less available for men of African ancestry also leaves open the possibility of systematic differences in the diagnostic workup (and therefore age of diagnosis) across different ancestry populations. Notwithstanding these caveats, the present PHS is associated with age at PCa diagnosis in men of African ancestry, possibly paving the way for more personalized screening decisions for men of African descent.
The first PHS validation study used data from ProtecT, a large PCa trial2,13. ProtecT’s screening design yielded biopsy results from both controls and cases with PSA ≥3 ng/mL, making it possible to demonstrate improved accuracy and efficiency of PCa screening with PSA testing. Limitations of the ProtecT analysis, though, include few recorded PCa deaths in the available data, and the exclusion of advanced cancer from that trial2. The present study includes long-term observation, with both early and advanced disease18, allowing for evaluation of PHS association with any, aggressive, and fatal PCa; we found PHS to be associated with all outcomes.
Age is critical in clinical decisions of whether men should be offered PCa screening38–40 and in how to treat men diagnosed with PCa38,39. Age may also inform prognosis39,41. Age at diagnosis or death is therefore of clinical interest in inferring how likely a man is to develop cancer at an age when he may benefit from treatment. One important advantage of the survival analysis used here is that it permits men without cancer at time of last follow-up to be censored, while allowing for the possibility of them developing PCa (including aggressive or fatal PCa) later on. PCa death is a hard endpoint with less uncertainty than clinical diagnosis (which may vary with screening practices and delayed medical attention). PHS may help identify men with high (or low) genetic predisposition to develop lethal PCa and could assist physicians deciding when to initiate screening.
Current guidelines suggest considering a man’s individual cancer risk factors, overall life expectancy, and medical comorbidities when deciding whether to screen6. The most prominent clinical risk factors used in practice are family history and race/ethnicity6,42,43. Combined PHS and family history performed better than either alone in this multi-ethnic dataset. This finding is consistent with a prior report that PHS adds considerable information over family history alone. The prior study did not find an association of family history with age at PCa diagnosis, perhaps because the universal screening approach of the ProtecT trial diluted the influence of family history on who is screened in typical practice13. In the present study, family history and PHS appear complementary in assessing PCa genetic risk. Moreover, the HRs for PHS suggest clinical relevance similar to or greater than that of predictive tools routinely used for cancer screening (e.g., breast cancer) and for other diseases (e.g., diabetes and cardiovascular disease). HRs reported for those tools are around 1-3 for disease development or other adverse outcome44–48; HRs reported here for PHS (for any, aggressive, or fatal PCa) are similar or greater.
This work has limitations. First, the dataset comes from multiple, heterogeneous studies of various populations with variable screening rates. This allowed for a large, multi-ethnic dataset that includes clinical and survival data, but comes with uncertainties avoided in the ProtecT dataset used for original validation. However, the heterogeneity would likely reduce the PHS performance, not systematically inflate the results. Second, we note that no germline SNP tool, including this PHS, has been shown to discriminate men at risk of aggressive PCa from those at risk of only indolent PCa. Third, while the genetic ancestry classifications used here may be more accurate than self-reported race/ethnicity alone30, possible admixed genetic ancestry within individuals was not assessed; future development will consider local ancestry. As noted above, clinical data availability was not uniform across contributing studies and was lower in men of African genetic ancestry. The PHS may not include all SNPs associated with PCa; in fact, more such SNPs have been reported since the development of the original PHS18, some specifically within non-European populations49–51. Further model optimization (possibly by incorporating additional SNPs) may improve PCa risk stratification. Future work could also evaluate the PHS performance in relation to epidemiological risk factors previously associated with PCa risk beyond those currently used in clinical practice (i.e., family history and race/ethnicity). Finally, various circumstances and disease-modifying treatments may have influenced post-diagnosis survival to an unknown degree. Despite this possible source of variability in survival among men with fatal PCa, PHS was still associated with age at death, an objective and meaningful endpoint. Future development and optimization hold promise for improving upon the encouraging risk stratification achieved here in men of different genetic ancestries, particularly African.
Conclusion
In a multi-ethnic dataset comprising men of European, Asian and African ancestry, PHS was associated with age at PCa diagnosis, as well as age at aggressive PCa diagnosis, and at death from PCa. PHS performance was relatively diminished in men of African genetic ancestry, compared to performance in men of European or Asian genetic ancestry. PHS risk-stratifies men of European, Asian and African ancestry and should be prospectively studied as a means to individualize screening strategies seeking to reduce PCa morbidity and mortality.
Data Availability
The data used in this work were obtained from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium, and from the ProtecT study. PRACTICAL consists of a collaborative group of researchers, each of whom retains ownership of their contributed data. Members of the consortium can use pooled data via proposals that are reviewed by the Data Access Committee. Approved proposals are then sent to the principal investigators of the PRACTICAL member studies, each of whom may opt to participate or not in the specific request. Readers interested in participating in the PRACTICAL consortium and gaining access to member data may find information about application at: http://practical.icr.ac.uk. ProtecT (http://www.bristol.ac.uk/population-health-sciences/projects/protect/) study data were provided by the principal investigators for the trial, who retain ownership of these data; access to these data can be requested by contacting those principal investigators and submitting a request form for approval.
Footnotes
↵* Additional members from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium (PRACTICAL, http://practical.icr.ac.uk/) are provided in the Supplemental Material.