Abstract
Introduction There is a paucity of disaggregated data to monitor cancer health inequalities in Canada. We used data linkage to estimate site-specific cancer relative survival by race, immigration status, household income, and education level in Canada.
Methods We pooled the Canadian Census Health and Environment Cohorts, which are linked datasets of 5.9 million respondents of the 2006 long-form census and 6.5 million respondents of the 2011 National Household Survey. Individual-level respondent data from these surveys were probabilistically linked with the Canadian Cancer Registry up to 2015 and with the Canadian Vital Statistics Database up to 2019. We used propensity score matching and Poisson models to calculate age-standardized relative survival by equity stratifiers for all cancers combined and for 22 individual cancer sites for the period 2006-2019.
Results There were 757,485 primary cancer cases diagnosed over follow-up included in survival analyses; the age-standardized period relative survival was 72.5% at 5 years post-diagnosis. Relative survival was higher in immigrants (74.6%, 95%CI 74.3-74.8) than in Canadian-born persons (70.4%, 95%CI 70.2-70.6), and higher in racial groups with high proportions of immigrants. There was a marked social gradient by household income and education level, with 11-12% lower relative survival in cancer patients in the lowest household income and education levels than in the highest levels. Socioeconomic gradients were observed for most cancer sites, though the magnitude varied.
Conclusions Despite the availability of universal healthcare in Canada, the observed differences in relative survival suggest there remain important inequities in cancer control and care.
Introduction
There is a paucity of data to monitor and address cancer health inequalities in Canada.1 While net cancer survival has markedly improved over the past few decades,2 it is unclear whether all Canadians have benefited equally from these improvements. Data from other countries have documented significant inequalities in cancer survival by social equity stratifiers such as race, ethnicity, education, income, and immigration status.3,4 Some studies have examined survival by these health equity stratifiers in Canada, but they have generally been restricted to analyses from individual provinces (mostly Ontario).5–9 It has proved more challenging to provide pan-Canadian-level statistics because the Canadian Cancer Registry (CCR) does not collect unique health identifiers or sociodemographic data beyond age, sex, and postal code. National-level statistics on cancer survival inequalities have, therefore, mostly focused on location-based inequalities such as survival by province, rurality, and ecological neighborhood-level socioeconomic status based on residential postal code,10,11 rather than individual-level equity stratifiers.
Cancer survival is generally reported by cancer registries using relative survival as the estimate of net survival to cancer.12,13 Relative survival is calculated by dividing the observed survival in cancer cases by the expected survival for the general population, generally estimated using population life tables. Relative survival is interpretable as the expected probability of cancer survival in the absence of other causes of death. Part of the difficulty in estimating cancer survival disaggregated by health equity stratifiers comes from a lack of background mortality data to calculate expected survival. For example, relative survival by race in the US can be estimated due to the availability of life tables by race.14 Canadian life tables are only stratified by age, sex, and province.15 Because the background risk of non-cancer mortality differs by socioeconomic and demographic factors, it is not possible to calculate relative survival with life tables unless sociodemographic data can be linked to both cancer registries and vital statistics databases.12 The validity of relative survival is also predicated on the assumption that life tables provide a good estimate of the expected mortality risk that a person with cancer would have experienced had they not had a cancer diagnosis (the counterfactual).13 This assumption is questionable, as cancer cases likely differ from the general population on a number of social determinants affecting both cancer and non-cancer mortality. Methods to estimate background mortality based on more variables than those typically included in life tables could potentially lead to improved estimates of relative survival, especially when comparing groups with different background mortality rates.
Recent probabilistic data linkages since 2019 of the CCR and vital statistics with survey data by Statistics Canada make it now possible to examine health inequalities by individual-level stratifiers in cancer relative survival at the national level. Our primary objective was to estimate relative cancer survival in Canada stratified by race, income, education, immigration status, and cancer site in a representative sample to assess the presence of inequalities for health equity monitoring at the pan-Canadian level. Our secondary objective was to compare two methods for estimating the expected background mortality in different groups and assess its impact on relative survival estimates: matching of cancer cases to controls versus group-specific life tables.
Methods
Data sources
The CCR collects data on all new primary cancer cases diagnosed among Canadian residents since 1992. It covers all provinces and territories; however, cancer cases from the province of Québec are missing after 2010, as at the time of data linkage the province of Québec had not submitted data to the CCR since 2010. The Canadian Vital Statistics Death (CVSD) database collects information on all deaths in Canada. We used the CCR to identify primary cancer diagnoses and the CVSD to identify their date of death.
The Canadian long-form census is a mandatory survey conducted every five years targeting a representative sample of 20% of the population residing in Canada. It collects information on the demographic, social, and economic characteristics of households to support planning and government activities. In 2011, the long-form was replaced with the voluntary National Household Survey (NHS), which targeted the same population and collected the same information. While the voluntary NHS had a lower response rate (68.6%) than the mandatory 2006 long-form census (93.8%), measures were deployed to offset data quality risks and validate the representativity of results using other data sources.16 We used the self-reported data from these surveys to define health equity stratifiers.
The Canadian Census Health and Environment Cohorts (CanCHECs) 2006 and 2011 are a probabilistic linkage of the respondents of the 2006 long-form census and 2011 NHS with the CCR and CVSD.17 The linkage rates were 90.8% and 96.7% for 2006 and 2011 respondents, corresponding to 5.9 and 6.5 million individuals, respectively. Data were linked to the CCR up to 2015, and to the CVSD up to 2019. We pooled both cohorts to increase sample size and follow-up time.
Health equity stratifiers
Health equity stratifiers were based on respondent data from the long-form and NHS. Race is a social construct based on perceived differences in physical appearance, and was categorized into the following groups based on Canadian Institute for Health Information recommendations:18 Black, East Asian, Indigenous, Latin American, Middle Eastern, South Asian, Southeast Asian, White, and Other. The Other group includes those who do not identify with any of the named groups or who identify as a mix of different groups. Income was measured as the total after-tax income from all household members, based on either linkage to tax files or self-report in the census questionnaire. Household income was adjusted by an equivalency factor which accounts for household size, and categorized into quintiles at the national level. Education level was measured according to a person’s highest completed degree. A Canadian-born person is someone born in Canada and a citizen by birth. An immigrant is a person who has been granted the right to live permanently in Canada by immigration authorities, including both naturalized citizens and permanent residents. A non-permanent resident is a person who has a temporary work/study permit or is a refugee claimant.
Cancer case inclusions
We used a period survival analysis approach,19 including both incident cases diagnosed during follow-up (2006-2019) and prevalent cases who were within 10 years of their diagnosis on census day (mid-May 2006 or 2011). We included primary cancers among individuals aged 15 to 99 years at diagnosis; this age range was selected for consistency with national cancer statistics.20 We excluded cases whose diagnosis was established by autopsy only or death certificate only, cases with a missing diagnosis date, and cases whose death date preceded their diagnosis date. We allowed multiple primary cancers per person to contribute to analyses if they were diagnosed in different years. Cases were classified by cancer site using the grouping definitions of the 2021 Canadian Cancer Statistics (Supplementary Methods & Figures).20
Follow-up started either on the date of cancer diagnosis for incident cases, or on census day for prevalent cases. Cases contributed person-time to interval-specific conditional survival analyses for each year they were alive during follow-up, up to 10 years after their cancer diagnosis or the last date of data linkage on December 31st 2019. We excluded prevalent cancer cases for the analyses stratified by income to prevent reverse causality, as a cancer diagnosis can potentially affect a household’s income.
Statistical analyses
Relative survival based on matched controls
In our main analysis, we estimated relative survival using matching of cancer cases to controls to estimate the expected background mortality risk. Controls were selected among CanCHEC respondents who had no prior record of a site-specific cancer diagnosis in the CCR since 1992, and who were alive during the year of the diagnosis of their matched cancer case. Controls were matched to cancer cases using propensity score matching, using greedy nearest neighbor matching without replacement.21 The propensity score models included age, sex, race, household after-tax income quintile, education level, immigration status, neighborhood income level quintile, province of residence, rurality of residence, and calendar year as predictors of cancer diagnosis. Exact matches were requested for age, sex, race, and calendar year. Separate propensity score models were fit for each cancer site. Up to 3 controls were selected for each cancer case, except for the model for all cancer sites combined, where only 1 control was selected per case to reduce computational burden. The index date of start of follow-up for controls was the date of start of follow-up of their matched cancer case. We excluded controls whose date of death preceded the index date. Estimates were weighted using propensity matching weights.
We calculated relative cancer survival for each year up to 10 years post cancer diagnosis using a Poisson regression model. Dickman et al. (2004) proposed that relative survival could be estimated assuming a Poisson process for the number of deaths by interval since diagnosis, which assumes a piecewise constant mortality hazard function.22 We modified the parameterization of their proposed model to fit a Poisson model estimating the number of deaths from all causes (µi) for observation i: where agei is a categorical variable for age at diagnosis (15-44, 45-54, 55-64, 65-74, 75-99 years), sexi is a categorical variable for sex, intervali is a categorical variable indicating the time since cancer diagnosis in 1-year intervals, canceri indicates whether i is a cancer case or a control, stratifieri is a health equity stratifier of interest, and PYi are person-years at risk during the follow-up interval.
This model estimates the excess mortality rate in cancer cases compared to controls by time interval since diagnosis (parameters β4, β5), and allows different groups to have different excess cancer mortality rates (β7) while controlling for differences in background mortality from other causes (β6). We used a type 3 test of parameters to assess whether the excess cancer mortality rate differs across groups. The model parameters were used to estimate mortality rate ratios using the exponent of model parameters, and the cumulative survival by time t (St,i) in cases and controls using the transformation of the hazard approach, where Xi′β is the vector of model covariates: Cancer relative survival (RSt,i) was then calculated by dividing the expected survival for cancer cases by the expected survival for controls of the same age, sex, and stratifier group. Confidence intervals (95%CI) for relative survival were calculated using the log-log transformation and the delta method with the R msm package.23,24
Relative survival based on life tables
In secondary analyses, we recalculated relative survival using group-specific life tables stratified by race and household income quintile. The objective of this secondary analysis was to obtain relative survival estimates comparable to those reported by the Canadian Cancer Statistics for validation of sample representativity, and to compare results with relative survival estimated with the aforementioned matching methods. We calculated age, sex, income- and race-specific life tables for the entire CanCHEC 2006 & 2011 cohorts. We used the hazard transformation approach to estimate relative cancer survival using the period method.25 Estimates were weighted using CanCHEC survey weights. CIs for relative survival estimates were obtained through bootstrapping, using the 2.5th to 97.5th percentiles of 500 CanCHEC-specific bootstrapping weights. The Supplementary Methods & Figures provide more details on the methods used to generate life tables, relative survival estimates, and background mortality rates.
Age standardization
Relative survival estimates were age-standardized using the Canadian Cancer Survival Standard age weights for individual cancer sites.2,20 For all cancers combined, we used the international single standard of the EUROCARE-2 study.26 For Poisson models, we specified age weights in estimating equations to estimate age-standardized survival. For survival calculated with life tables, we used direct standardization.
Validation of estimates
To assess whether our survival estimates are representative of the Canadian population, we compared the relative survival estimates based on life tables in CanCHECs with official statistics of relative cancer survival based on life tables for all of Canada. Cancer survival has significantly improved in Canada over the past 20 years.2 Because our survival estimates span the period from 2006 to 2019, we compared our results to national relative survival estimates for 2006-2008 and 2015-2017 to define the target range.20,27
Ethics and confidentiality
We obtained ethical approval from the McGill University Institutional Review Board for this analysis of secondary data. To protect respondent confidentiality, analysis outputs were vetted using rules developed by Statistics Canada, which include rounding all counts to base 5 and not disclosing statistics for groups with less than 5 contributing events.
Results
Baseline characteristics of cancer cases and controls
The demographic characteristics of the CanCHECs have previously been described;17,28 we focus here on describing the characteristics of cancer cases contributing to analyses (Table 1). There were 757,485 cancer cases and 754,495 matched controls eligible for the all cancer sites combined period survival analyses. Cancer cases had a mean age at diagnosis of 63.1 years (standard deviation 14.2), and 50% were female. The majority self-identified as White (88%) and Canadian-born (75%). Cases and matched controls were comparable in terms of age, sex, race, immigration status, province of residence, rurality, household income, neighborhood income, and education level.
Relative survival compared to matched controls
Poisson model predictions for relative cancer survival by time since diagnosis for the overall cohort of cancer cases are presented in Table 2. Mortality rates for cancer cases were highest in the year following diagnosis; the mortality rate for all cancers combined was 15.9 times higher (95%CI 15.5-16.3) in cancer cases than in controls the first year after diagnosis, and declined to 1.5 times higher (95%CI 1.4-1.5) in cancer cases than in controls by the 10th year after diagnosis. The predicted age-standardized relative survival for all cancers combined was 84.1% 1 year post-diagnosis, 72.5% 5 years post-diagnosis, and 67.6% 10 years post-diagnosis.
There were significant differences in relative survival by race, immigration status, household income, and education level for all cancers combined (Figure 1, Tables 3 & 4). Specifically, the 5-year relative cancer survival was higher in non-White and non-Indigenous racial groups (72.0-76.9%) than in White (71.6%, 95%CI 71.4-71.8) and Indigenous (58.7%, 95%CI 57.9-59.4) persons, and was higher in immigrants (74.6%, 95%CI 74.3-74.8) than in Canadian-born persons (70.4%, 95%CI 70.2-70.6) (Table 3). There was a strong gradient in relative cancer survival by household income and education level, with 5-year relative cancer survival being lowest in the poorest income quintile (63.7%, 95%CI 63.3-64.1) and highest in the richest income quintile (75.2%, 95%CI 74.9-75.6), and lowest in those with no secondary school diploma (67.7%, 95%CI 67.4-68.0) compared with those with a graduate or medical university degree (79.3%, 95%CI 78.8-79.8) (Table 4). Estimates for all 10 years of follow-up by group can be found in the Supplementary Table 1. Similar social gradients were observed across most cancer sites, though the type 3 test for heterogeneity was not always significant due to low case numbers for some cancer sites, leading to higher uncertainty in estimates.
Relative survival compared to life tables
Relative survivals compared to life tables are presented in Table 2 for the overall population, and in Supplementary Table 2 by race and income. Many estimates could not be reported using life tables due to the number of cancer cases and deaths being below the confidentiality disclosure threshold of 5 contributing cases. The relative survival estimates using matched controls were higher than the relative survival estimates using life tables for all cancer sites, with the single exception of brain and central nervous system cancers, which had higher 5- and 10-year relative survival estimates using life table methods (Table 2). While the point estimates changed using different methods, the social gradients in survival by race and income were similar between methods.
Validation of estimates with historical data
The relative survival estimates using life table methods in CanCHECs 2006 & 2011 were comparable to relative cancer survival statistics based on life table methods for all of Canada during the period of 2006-2017 (Supplementary Methods & Figures). Some cancer sites had slightly lower relative survival estimates in CanCHECs (melanoma, uterine cancer, and leukemias), but the difference was of only 1-2%.
Discussion
We found significant differences in relative cancer survival by race, immigration status, household income, and education level in Canada for many major cancer sites over 2006-2019. Relative cancer survival was worse in persons with lower household income and lower education levels. Conversely, relative cancer survival was higher in immigrants and in non-White and non-Indigenous racial groups with high proportions of immigrants. Our validation exercise found that relative survival estimates in CanCHECs are comparable to those reported for Canada during the same time period using comparable methods, and consequently our estimates are likely representative of the Canadian population. The cancer survival differences we observed between social groups can be contextualized in terms of the historical progress in cancer control. Relative cancer survival improved by 6% over the 20-year period spanning from 1992/1994 to 2012/2014 in Canada.2 The 3-12% survival differences we observed between groups for many cancer sites are therefore analogous to being 10-40 years “behind” in benefiting from improvements in cancer control compared to more advantaged groups.
An important limitation of our analysis is that, despite the combination of two CanCHEC cycles over multiple years of follow-up to increase sample size, there were low event numbers for some cancer sites and demographic groups. This led to issues of power, data confidentiality, and limited ability to examine interactions between the different dimensions of health equity stratifiers affecting survival (intersectionality). While some of the differences in excess cancer mortality rates between groups were not statistically significant, this probably reflects the low number of events in some groups leading to higher uncertainty in estimates. A lack of statistical significance should not be interpreted as indicating an absence of inequalities, but rather a lower power to detect significant differences for some rarer cancers. This issue was particularly important for survival disaggregated by race, as the vast majority of cancer cases were in White persons (88%), due in part to the White population being older and having higher cancer incidence rates than other racial groups in Canada.28 This led to lower power to assess clinically meaningful differences by race. While the effects of sex, immigration, household income, and education are also likely to differ by racial group, we were unable to assess these interactions due to low case numbers.
We found that higher household income and education level were consistently associated with improved relative cancer survival across most sites. The majority of previous studies of disparities in cancer survival in Canada have looked at neighborhood-level rather than individual-level measures of income.7–9 Our results confirm that these socioeconomic gradients apply at the individual level as well. Canada has a universal health care system funded through taxes, with all citizens and permanent residents having insurance covering most primary and hospital care.29 While this partly mitigates inequalities in healthcare access by income, the financial burden of a cancer diagnosis remains unequally distributed as coverage for pharmacy-dispensed prescription drugs and home care is more variable, and cancer patients still face indirect costs such as travel or income loss.30 Inequalities in health persist because individuals with more resources such as higher income and education are better able to avail themselves of preventive and therapeutic care to improve their health.31 Cancer screening rates display a socioeconomic gradient despite being covered by insurance in Canada.32,33 While screening and early diagnosis may contribute to some of the observed differences, previous studies have found that adjusting for stage at diagnosis, however, only partly explain the survival gradient by socioeconomic status.7 Differences in treatments received may also contribute to observed differences in survival by socioeconomic status.8,9
Studies of immigrants in Canada have documented barriers in access to care such as language and cultural norms.34 Immigrants have lower cancer screening participation rates than Canadian-born persons, and are are less likely to have their cancer detected by screening.35–37 Nonetheless, our study as well as others consistently find strong evidence of a healthy immigrant effect, where immigrants have generally better cancer outcomes and mortality rates than those born in Canada.5,28,38,39 The majority of immigrants to Canada are economic immigrants who are selected based on their ability to contribute to Canada’s economy,40 and are required to undergo a medical examination as part of the selection process. This selection process leads to an immigrant population with better average health and performance on a number of health indicators such as fewer chronic conditions as well as lower rates of obesity and smoking than those born in Canada.41 The presence of more underlying comorbidities in Canadian-born persons could influence cancer survival through more advanced stages at diagnosis and longer diagnostic intervals, and potentially also differences in tumour biology.42
We observed large differences in relative survival by race, though these were not always significant. A large proportion of the individuals from non-White and non-Indigenous racial groups are immigrants,28 which may in part explain the higher relative cancer survival across multiple cancer sites compared with White and Indigneous cancer patients, who are predominantly born in Canada. The observed lower cancer survival in Indigenous peoples reflects the enduring legacy of colonialism, which continues to cause barriers in accessing high-quality cancer care for the Indigenous peoples of Canada (the First Nations, Inuit and Métis peoples).43 Several prior studies have examined the cancer survival experiences of different Indigenous peoples using a distinctions-based approach.44–48
The choice of method to calculate net cancer survival is important when comparing social groups with different background mortality rates, as different methods lead to different results.49 In this study, we used relative rather than cause-specific survival to estimate net cancer survival. Cause-specific survival methods use cause-of-death data to estimate net survival from cancer while censoring deaths from other causes (e.g. Kaplan-Meier, Cox models). However, relative survival is generally preferred over cause-specific survival by cancer registries to measure net cancer survival because it avoids errors due to misclassification of the cause of death.12,13 Death certificates require a subjective assessment of the cause of death, which may be difficult to resolve in the presence of multiple underlying health conditions. We opted not to use cause-specific survival in part due to differences in the prevalence of comorbities between different groups,41 which could lead to differential misclassification of death by the certifier. While relative survival avoids potential misclassification biases from cause of death certification, it can still be a biased estimate of net survival if the comparator is not a good estimate of expected survival in the absence of cancer. Our results suggest that there may be important differences in the background mortality risk of cancer cases compared with general population life tables, and that consequently relative survival compared with life tables likely underestimates net cancer survival, even when using group-specific life tables.
In conclusion, we believe our study fills an important data gap by providing representative national-level estimates of net cancer survival stratified by key individual-level health equity stratifiers for monitoring health equities. We provide estimates using two methods of calculating net survival, including estimates that are comparable with national statistics. Despite universal healthcare access, differences in cancer survival exist by race, immigration status, household income, and education level in Canada, suggesting the root causes of cancer survival inequalities go beyond differences in healthcare access.
Funding
This study was funded by a Canadian Institutes of Health Research (CIHR) (grant 179901) to E.L. Franco. The analysis presented in this paper was conducted at the Quebec Interuniversity Centre for Social Statistics (QICSS) which is part of the Canadian Research Data Centre Network (CRDCN); the services and activities provided by the QICSS are made possible by the financial or in-kind support of the Social Sciences and Humanities Research Council (SSHRC), the Canadian Institutes of Health Research (CIHR), the Canada Foundation for Innovation (CFI), Statistics Canada, the Fonds de recherche du Québec and the Québec universities.
Conflicts of interest
SBP and AM have no conflicts of interest to declare. TM is a board member of the International Papillomavirus Society. ELF reports grants to his institution from the Canadian Institutes of Health Research, the National Institutes of Health, and Merck during the conduct of the study; and personal fees from Merck. MZ and ELF hold a patent related to the discovery “DNA methylation markers for early detection of cervical cancer”, registered at the Office of Innovation and Partnerships, McGill University, Montréal, Québec, Canada.
Data availability
Restrictions apply to the availability of these data, which were used under license by Statistics Canada for this study. Eligible researchers can apply for data access through the Statistics Canada Research Data Centre program (https://www.statcan.gc.ca/en/microdata/data-centres/access). Program code and Poisson model outputs used for the current analyses are available at the Borealis repository at https://doi.org/10.5683/SP3/STER2V
Acknowledgements
Adapted from Statistics Canada, Canadian Census Health and Environment Cohorts 2006 & 2011, 2006 long-form census, 2011 National Household Survey, Canadian Vital Statistics Death Database 2006-2019, and Canadian Cancer Registry 2006-2015. This does not constitute an endorsement by Statistics Canada of this product. The Postal CodeOM Conversion File Plus (7D) is based on data licensed from the Canada Post Corporation. The analysis presented in this paper was conducted at the Quebec Interuniversity Centre for Social Statistics (QICSS) which is part of the Canadian Research Data Centre Network (CRDCN). The views expressed in this paper are those of the authors, and not necessarily those of the CRDCN, the QICSS or their partners.