Abstract
The role of race/ethnicity in genetic predisposition to early-onset cancers can be estimated by comparing family-based cancer concordance rates among ethnic groups. We used linked California health registries to evaluate the relative cancer risks for first-degree relatives of patients diagnosed between ages 0-26, and the relative risks of developing distinct second primary malignancies (SPMs). From 1989-2015, we identified 29,631 cancer patients and 62,863 healthy family members. Given probands with cancer, there were increased relative risks of any cancer for siblings and mothers [standardized incidence ratio (SIR)=3.32;95% confidence interval (CI):2.54-4.35;P<0.001] and of SPMs (SIR=7.12;95%CI:5.46-9.28;P<0.001). A higher relative risk of any cancer in siblings and mothers (P=0.001) was observed for Latinos (SIR=3.36;95%CI:2.24-5.05) compared to non-Latino White subjects (SIR=2.60;95%CI:1.66-4.06). Latinos had higher relative risks in first-degree relatives and higher SPM risk compared to non-Latino Whites for most cancers, supporting a need for increased attention to the genetics of early-onset cancer predisposition in Latinos.
Key Messages
We identified 29 631 cancer patients and their 62 863 healthy family members in California from 1989 to 2015.
The risk of early-onset cancer in siblings and mothers was elevated by having a proband with cancer in the same family.
The majority of early-onset cancers exhibited higher relative risks among siblings and mothers, and higher relative risks of second primary malignancies, for Latinos compared to non-Latino Whites.
Introduction
Both genetic and environmental factors play a role in the causes of early-onset cancer. Several well-defined genetic syndromes contribute to early-onset cancer risk, along with a wider array of common alleles that influence risk marginally and are detected at the population level. As an example of the former, Li-Fraumeni syndrome, caused by mutations in the tumor suppressor gene TP53, is associated with an increased risk of a spectrum of cancers diagnosed at early ages1. Examples of low-penetrance common genetic variation associated with cancer risk include variants in the IKZF1 and ARID5B genes in pediatric acute lymphoblastic leukemia (ALL)2. Both of these classes of variants may vary in frequency by race/ethnic group and cluster by families3-5; for example, familial germline deletion of ETV6 was reported for ALL6. Examination of cancer predisposition requires investigation in ethnic strata, particularly where cancer incidence rates are known to differ, as they do for many pediatric cancer types, such as leukemia7 and brain cancer8.
In addition to primary cancers, incidence patterns of second independent malignancies may also provide a perspective on underlying genetic predisposition. Among childhood cancer survivors, more second primary malignancy cases are observed among non-Latino Whites (NLW) than Latino subjects9. This is also reflected in adult cancers, where Latino breast cancer survivors had lower risk of second cancers than NLW and non-Latino Black women10.
Germline pathogenic/likely pathogenic variants in cancer predisposition genes are found in approximately 10% of pediatric cancer patients1,11-13, and may be inherited or arise de novo. Highly penetrant inherited variants will contribute to clustering of cancer cases within the family. Shared environments within the family unit may also be considered alongside genetic risk as potential causes for family-based cancer concordance.
Familial concordance of a wide variety of cancers has been assessed using the Swedish Family-Cancer Database, leading to a deep understanding of familial relative risks14-16. The Victorian Paediatric Cancer Family Study in Australia also explored the cancer risks for relatives of children with cancer in a small population17. In the US, the Utah Population database may be the best-known population for studying familial risk18-22. Importantly, these studies largely comprise families of European ancestry, and therefore have not examined potential ethnic-specific familial risks. Here, we utilized linked population registries with over 64,000 individuals to quantify the familial risks (siblings and mothers) and the risks of early-onset second primary malignancies in the highly diverse and large population of California.
Methods
Source of Data
We used linked population-based registries in California to evaluate the relative risks of early-onset cancers (0-26 years age of onset) for siblings and mothers of children, adolescents, and young adults (AYA) aged 0-26 diagnosed with cancer, as well as the relative risks of early-onset second primary malignancies (SPMs) among the proband patients. The dataset was created by linking information from the California Cancer Registry (CCR) and California Birth Statistical Master File, allowing the capture of siblings and parents of cancer probands, along with their cancer incidence. The linked dataset comprehensively encompassed all cancer cases 0-26 years old, as well as their sibling and mother cancers, diagnosed from 1989 to 2015 in California. Our upper age limit of 26 was set based on the available age range covered by this relatively young cohort. Overall, the dataset included a total of 121 571 individuals. The information on healthy siblings and mothers was available during the whole study period, whereas the information on fathers was not available until 2004 in the birth files and therefore is not included in the current analysis.
For the analysis of cancer familial risks, we included all primary incident cancer cases diagnosed from 1989 to 2015 among patients aged 0-26 years, with patient age-at-diagnosis limited by the study time period for which California maintained a statewide SEER gold-standard cancer registry. In addition, we performed a subgroup analysis on pediatric cancers including patients aged 0-15 years diagnosed within the same study period. For the analysis of secondary cancer risks, we included all SPMs diagnosed over the same years and patient age ranges. Although the CCR only records primary malignancies, some misclassification of relapsed or recurrent disease is possible. A physician (E.N.) reviewed diagnosis codes of all the cases diagnosed after the first primary case to prevent the misclassification of relapsed first primary malignancies (FPMs) as SPMs. For both analyses, we classified the cancers into twelve broad groups and subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). The abbreviations for cancer types are included in Table 1. We also grouped the cancers into hematologic or solid categories in the analyses. Hematologic cancers were defined as leukemias and lymphomas. Solid cancers were defined as CNS tumors, neuroblastomas, retinoblastomas, renal tumors, hepatic tumors, bone tumors, sarcomas, germ cell tumors, epithelial neoplasms, and other and unspecified malignant neoplasms.
Statistical Analysis
We quantified the relative risks for siblings and mothers, and the relative risks of SPMs, by calculating the standardized incidence ratios (SIRs) of a given cancer or of SPMs among the healthy siblings, and the SIRs of a given cancer among healthy mothers of probands, using a previously published method14. We defined a proband as a pediatric or AYA patient with a given cancer. Only one child/AYA in each family can be a proband, so that in families with two or more cases, the proband is defined as the patient with the earliest date of diagnosis. Given a proband with cancer, we calculated the SIRs for a sibling or a mother in the same family for all types of cancers. Separately, we calculated the SIRs for a sibling for the same type of cancer as the proband. We also stratified the analyses by self-identified race/ethnicity of the mother in each family. The SIRs in siblings, mothers or of SPMs can be denoted as:
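The expression below is a reconstruction from the symbol definitions that accompany it, assuming the standard ratio of observed to expected cases; the authors' own notation may differ in detail.

```latex
\mathrm{SIR} \;=\;
\frac{\displaystyle\sum_{i=1}^{N}\sum_{j=1}^{n_i} D_{ij}}
     {\displaystyle\sum_{i=1}^{N}\sum_{j=1}^{n_i}\sum_{k=1}^{K_{\max}} \lambda_k\, t_{ijk}}
```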
Where N is the number of families, ni is the number of non-proband individuals of interest (siblings/SPMs/mothers) in family i, and Kmax is the total number of age intervals. The data for each individual j in family i include a disease indicator (Dij) and the number of years “at risk” during the kth age interval (tijk). A given individual is defined to be at risk beginning at their age when the proband in their family is diagnosed and ending either when they become affected themselves or when they are censored at the end of study follow-up. For siblings and mothers, age was stratified into seven groups: 0, 1-4, 5-9, 10-14, 15-19, 20-24 and 25-29 years. For the calculation of SIRs within a given race group, λk is the race-, sex- and age-specific incidence rate of a given cancer. We assumed that all events occurred at the midpoint of each calendar year. We also stratified the analysis by 5-year age groups. The 95% confidence intervals (CIs) and p-values for the SIRs were calculated assuming a Poisson distribution. We compared the SIRs across race/ethnic groups with approximate chi-square tests, which compare the probability of occurrence of events in one group to that in another, based on a binomial distribution. This comparison is independent of the 95% CIs for the SIRs, which are based on a Poisson distribution; the CIs of two groups may therefore overlap even when the groups differ significantly by the approximate chi-square comparison. Statistical analyses were performed using R software (v 3.6.0). Any two-sided p-value less than 0.05 was considered statistically significant. Further information on the statistical tests and computational code is provided in the supplement accompanying this manuscript, “Statistics and coding supplement: Familial_risk_CCR_eLife”.
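As an illustration of the quantities defined above, the following is a minimal Python sketch of how an SIR, an approximate confidence interval, and a between-group comparison might be computed. It is not the authors' R code. The function names, the toy data, and the reference rates are hypothetical; the age bands assume the first two strata are age 0 and ages 1-4; the confidence interval uses a log-scale normal approximation to the Poisson count; and the comparison assumes that, conditional on the total number of observed cases, the count in one group is binomial with probability proportional to its expected count.

```python
import math

# Age bands [lower, upper) in years. Assumption: the first two strata
# are age 0 and ages 1-4, followed by five-year bands up to age 29.
AGE_BANDS = [(0, 1), (1, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]


def person_years(entry_age, exit_age, bands=AGE_BANDS):
    """Split one relative's at-risk time (t_ijk) across the age bands."""
    return [max(0.0, min(exit_age, hi) - max(entry_age, lo)) for lo, hi in bands]


def sir_with_ci(relatives, rates, z=1.96):
    """Return (SIR, lower, upper) for a list of (entry_age, exit_age, affected).

    `rates` holds one reference incidence rate per person-year for each
    age band (lambda_k). The interval is a log-scale normal approximation
    to the Poisson count, an assumption made for this sketch.
    """
    observed = sum(1 for _, _, affected in relatives if affected)
    expected = sum(
        rate * years
        for entry, exit_, _ in relatives
        for rate, years in zip(rates, person_years(entry, exit_))
    )
    sir = observed / expected
    half_width = z / math.sqrt(observed)  # requires observed > 0
    return sir, sir * math.exp(-half_width), sir * math.exp(half_width)


def compare_sirs(obs_a, exp_a, obs_b, exp_b):
    """Approximate chi-square test (1 df) that two groups share one SIR.

    Conditional on the total count, obs_a is treated as binomial with
    success probability exp_a / (exp_a + exp_b) under the null.
    """
    total = obs_a + obs_b
    p_null = exp_a / (exp_a + exp_b)
    chi2 = (obs_a - total * p_null) ** 2 / (total * p_null * (1 - p_null))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Toy data (hypothetical): three relatives, one affected at age 12.
toy_relatives = [(3.0, 12.0, True), (0.0, 10.0, False), (20.0, 29.0, False)]
toy_rates = [2e-4, 2e-4, 1e-4, 1e-4, 2e-4, 3e-4, 4e-4]
print(sir_with_ci(toy_relatives, toy_rates))
print(compare_sirs(30, 9.0, 20, 8.0))
```

Because the comparison conditions on the total count while each interval is computed from a single group's count, the two can disagree: intervals may overlap even when the chi-square comparison yields a small p-value.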
Results
Demographics of the study population
From 1989 to 2015, we identified a total of 29 249 pediatric and AYA patients with a primary malignancy, comprising 29 072 probands, 112 affected siblings (from 110 families) and 65 affected mothers. All siblings were diagnosed after the proband’s diagnosis as defined, and 56 (86.15%) of the 65 mothers were diagnosed after the proband’s diagnosis. We also identified 387 SPMs among all pediatric and AYA probands (Table 2).
Familial relative risks of early-onset cancers
Overall, we found a 3.32-fold (95%CI:2.54-4.35) increased relative risk of any cancer among siblings and mothers who have a proband with cancer in the same family. Specifically, we found a 2.97-fold (95%CI:1.96-4.50) increased relative risk of any cancer given a proband with hematologic cancers and a 4.54-fold (95%CI:3.19-6.45) increased relative risk of any cancer given a proband with solid cancers. When stratified by proband cancer type, siblings and mothers were found to have increased SIRs for any cancer for all types except neuroblastomas, hepatic tumors, bone tumors, and other and unspecified malignant neoplasms (Figure 1A).
For the relative risk of specific cancer types, we found a 2.68-fold (95%CI:1.35-5.32) increased risk of hematologic cancers among siblings and mothers of a proband with hematologic cancer, and a 6.78-fold (95%CI:4.18-10.98) increased relative risk of solid cancers among siblings and mothers of a proband with solid cancer. Furthermore, CNS tumors and epithelial neoplasms exhibited statistically significantly increased relative risk for the same type of cancer as the proband [CNS, SIR=6.19; 95%CI:1.42-26.93; epithelial neoplasms, SIR=40.68; 95%CI: 3.17-521.78] (Supplementary Table 1). We also observed increased relative risks for sarcomas (SIR=7.36;95%CI:1.12-48.23) and epithelial neoplasms (SIR=5.37;95%CI:1.52-18.92) given a proband with leukemia, for epithelial neoplasms given a proband with CNS tumors (SIR=6.37;95%CI:1.33-30.41), and for leukemias given a proband with sarcomas (SIR=7.01;95%CI:1.12-43.92) (Figure 2 & Supplementary Table 1).
When stratified by more finely defined cancer subtypes, increased relative risks of any cancer for siblings and mothers were observed given a proband with lymphoid leukemia, acute myeloid leukemia, astrocytoma, certain gliomas, certain specified intracranial and intraspinal neoplasms, nephroblastoma and other nonepithelial renal tumors, rhabdomyosarcomas, certain specified soft tissue sarcomas, malignant gonadal germ cell tumors and certain unspecified carcinomas (Supplementary Table 4).
When stratified by race/ethnicity, the relative risk of any cancer for siblings and mothers given a proband with cancer was significantly higher among Latino subjects than non-Latino White (NLW) subjects [Latino: SIR=3.36;95%CI:2.24-5.05; NLW: SIR=2.60;95%CI:1.66-4.06; P=0.001]. Non-Latino Asians/Pacific Islanders (API) and Blacks also had higher SIRs than non-Latino Whites (SIR=4.58 and 9.96, respectively, Supplementary Table 2), but small numbers resulted in unstable estimates. Latino subjects also exhibited higher relative risks of any cancer than NLW subjects for solid cancers [Latino: SIR=4.98;95%CI:2.86-8.68; NLW: SIR=3.02;95%CI:1.72-5.28; P=0.001], while NLW subjects exhibited higher relative risks than Latino subjects for hematologic cancers [Latino: SIR=2.48;95%CI:1.37-4.50; NLW: SIR=2.69;95%CI:1.29-5.62; P=0.016] (Figure 1B & Supplementary Table 2). For the relative risk of specific cancers given a proband with the same cancer, Latino subjects showed higher relative risk of solid cancers than NLW subjects [Latino: SIR=7.94;95%CI:3.64-17.34; NLW: SIR=4.41;95%CI:2.10-9.23; P=0.005] (Supplementary Table 3).
In the subgroup analysis of pediatric patients aged 0-15 years, we observed a similar trend where the relative risk of any cancer for siblings and mothers was significantly higher among Latino subjects than NLW subjects [Latino: SIR=1.51;95%CI:1.00-2.26; NLW: SIR=0.93;95%CI:0.60-1.46; P=0.003] (Supplementary Table 5).
Relative risks of second primary malignancies
Overall, we found a 7.12-fold increased risk of SPMs relative to the general population among children/AYAs with an FPM (SIR=7.12;95%CI:5.46-9.28). Most primary cancer types were associated with an elevated relative risk of SPMs (Figure 3A). When stratified by race/ethnicity, slightly higher relative risk of all SPMs given a proband with cancer was observed among Latino subjects than NLW subjects [Latino: SIR=6.85;95%CI:4.56-10.30; NLW: SIR=6.67;95%CI:4.26-10.43; P=0.001] (Figure 3B).
For the relative risks of SPMs of the same cancer types as the FPM, we found elevated risks for both hematologic and solid cancers. When stratified by race/ethnicity, higher relative risk of hematologic cancers was observed among NLW subjects compared to Latino subjects given a proband with hematologic cancers [Latino: SIR=4.23;95%CI:1.68-10.68; NLW: SIR=6.61;95%CI:1.80-24.29; P=0.012] (Supplementary Table 6).
Discussion
To our knowledge, this is the first study to quantify familial clustering risks and risks of SPMs among early-onset cancer patients with an emphasis on racial/ethnic differences. Using linked population registry data in the California population, we found that the risk for a sibling child/AYA or mother to have early-onset cancer was elevated once a proband was identified with an early-onset cancer. Likewise, the relative risks for SPMs were elevated among children/AYAs who developed a first primary cancer. These findings were consistent in direction across race/ethnic groups; however, the magnitude differed. Latinos had higher sibling/maternal relative risks as well as risks for SPMs compared to NLWs for most cancers.
Consistent with our results, a rich literature with a primary focus on European ancestry populations has reported excess familial risks of hematologic malignancies14, lymphomas23-25, brain tumors26,27, neuroblastomas28, retinoblastomas25,28, germ cell tumors29, sarcomas30 and melanomas31. In terms of secondary cancers, studies have reported excess risks of second primary malignancies among survivors of hereditary retinoblastoma32, chronic myeloid leukemia33, chronic lymphocytic leukemia34, Hodgkin’s lymphoma35, non-Hodgkin’s lymphoma36, and neuroblastoma37. The excess familial risks of certain cancers are highly likely to be associated with genetic predisposition. The archetypic examples are germline loss-of-function mutations in RB1, which are found in ∼40% of retinoblastoma cases28, and adrenal cortical cancer, with germline TP53 mutations accounting for most familial cases28. Low-penetrance common genetic variations, for instance in the CEBPE, IKZF1, and ARID5B genes in ALL, are associated with cancer risk and may also contribute to familial concordance, as combinations of low frequency alleles or “polygenic risk scores” have been shown to be as impactful as single strong predisposition mutations in adult cancers38,39; however, their contribution to cancer clustering among children and their families has not yet been studied.
Our data demonstrate a higher degree of family-based clustering among Latinos compared to non-Latino Whites. This familial concordance is likely due to both shared genetic and environmental causes and is accompanied by a clearly higher incidence of some cancer types. Latinos are an admixed population, comprising an ancestral mixture from Native American, European, and African sources. California Latinos, particularly the youth population, are largely from Mexico and harbor a higher risk of certain cancers, particularly pediatric leukemias, the most common cancer in children40; however, this higher risk is partially accounted for by a higher frequency of common risk alleles, which do not address strong familial predisposition loci4. This higher risk identified in relation to the family unit has not been studied, and our results call for an analysis of the comparative sources of genetic and environmental risk that contribute to the higher risk and familial clustering of cancers in Latinos.
Therapy of the first primary cancer is a major factor in the induction of secondary independent malignancies41-43. The purpose of radiation therapy and chemotherapy in cancer patients is to improve survivorship. In the current analysis, however, better survivorship of hematologic cancers for non-Latino Whites compared to Latinos may contribute to the excess risk of second primary malignancies among White subjects44, as some Latino subjects with hematologic cancers may not have survived long enough to develop a secondary cancer. Nevertheless, multiple primary cancer diagnoses are considered a key feature of hereditary cancer predisposition syndromes45. As such secondary cancers are rare, genetics are still likely to play a strong role46, and our overall SPM results here show a similar ethnic-specific patterning to that of cancer clustering in first primary malignancies. Of note, our analysis was not designed to distinguish relative risk contributions from therapy and genetic sources for secondary cancers.
For some tumor types the germline predisposition was readily noted in this cohort; for example, ten of the fourteen affected relatives who had a proband with retinoblastoma were diagnosed with the same cancer, an unsurprising finding given that germline RB1 mutations account for a significant proportion of retinoblastoma cases, are highly penetrant, and give rise to tumors that tend to be diagnosed at young ages. We also observed increased relative risks for sarcomas given a proband with leukemias, suggesting the presence of families with Li-Fraumeni syndrome, which is characterized by a spectrum of childhood and adult-onset cancers including adrenocortical carcinoma, breast cancer, CNS tumors, sarcomas, and leukemia45.
Population-level selection pressures are thought to influence the relative frequencies of alleles. Examples include genetic adaptations that shaped the Native American genome in response to cold and warm environments47, and in immune response following colonization by Europeans48. Our results suggest that some adaptive selection pressures, or simply genetic drift exacerbated by bottlenecking of genetic diversity during Native American population history, may differentially influence familial cancer clustering by age of cancer onset49. If replicated in other study settings, this contrast in genetic risk between childhood- and adult-onset cancers by ethnicity should be studied further for a fuller understanding of familial risks.
Our analyses capitalized on the highly diverse population in California, allowing us to quantify the relative risks across different ethnic groups. Moreover, the use of linked population-based registries in California enabled us to minimize the selection and information biases introduced by a case-control study design or other strategies that only sample portions of the population. Furthermore, although Latinos usually have more people per household than NLWs50, detection bias of cancers among siblings and mothers is less likely, which is reflected by the excess relative risk of lymphomas among NLW compared to Latino subjects in the current analysis, in agreement with previous reports51,52. Our study also has some limitations. Despite the large number of total cancer cases, the numbers of affected siblings and second primaries are very small for some cancer types, thus limiting the power to detect significant relative risks. Also, we are unable to track cancer incidence for affected siblings, maternal cancers, and SPMs that may have been diagnosed outside of California. In addition, the follow-up time of 26 years is not enough for a comprehensive detection of SPMs in the probands, nor for cancers arising in proband mothers at older ages. Insufficient follow-up time and loss to follow-up have limited our ability to quantify the relative risks among mothers with cancer onset at older ages (>40 yrs). Lastly, the lack of records on fathers reduces our ability to quantify the relative risks among other first-degree relatives and may reduce the appreciation of the potential contribution of high-risk cancer predisposition syndromes, which can be inherited from either parent.
Accepting those limitations of the current dataset, our study has several important implications that may open windows to future research. First, the genetic predispositions driving the excess childhood cancer risks among the Latino population, whether from higher frequencies of known cancer predisposition syndromes or mutations in novel genes, or a higher burden of common or rare genetic risk alleles, warrant further investigation. Second, the comparative attributable fraction of familial risk based on environmental risk factors interacting with genetic predispositions warrants further investigation. Lastly, descriptive studies on familial and secondary cancer risks among race/ethnic groups other than Latinos and non-Latino Whites may provide additional insights into cancer incidence variation by race/ethnicity.
Data Availability
We are prohibited by the State of California from sharing individual-level patient data.
Funding
This work was supported by the V Foundation (Grant FP067172). The collection of cancer incidence data used in this study was supported by the California Department of Public Health as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885; the National Cancer Institute’s Surveillance, Epidemiology, and End Results Program under contract HHSN261201000140C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute; and the Centers for Disease Control and Prevention’s National Program of Cancer Registries, under agreement U58DP003862-01 awarded to the California Department of Public Health. The ideas and opinions expressed herein are those of the author(s), and no endorsement by the California Department of Public Health, the National Cancer Institute, or the Centers for Disease Control and Prevention or their contractors and subcontractors is intended or should be inferred.
Declaration of interests
None.