Abstract
BACKGROUND The benefits of mammography screening have been controversial, with conflicting findings from various studies.
METHODS We hypothesize that unmeasured heterogeneity in tumor aggressiveness underlies these conflicting results. Based on published data from the Canadian National Breast Screening Study (CNBSS), we develop and parameterize an individual-based mechanistic model for breast cancer incidence and mortality that tracks five stages of breast cancer progression and incorporates the effects of age on breast cancer incidence and all-cause mortality.
RESULTS The model accurately reproduces the reported outcomes of the CNBSS. By varying parameters, we predict that the benefits of mammography depend on the effectiveness of cancer treatment and tumor.
CONCLUSIONS In particular, patients with the most rapidly growing or potentially largest tumors have the highest benefit and least harm from the screening, with only a relatively small effect of age. However, the model predicts that confining mammography populations with a high risk of acquiring breast cancer increases the screening benefit only slightly compared with the full population.
1. Background
Breast cancer is one of three most commonly diagnosed cancers in women, making up 30% of all cancer cases in women in the United States in 2018 [1]. Screening mammography was introduced to detect small and more treatable tumors before they cause symptoms. Several trials, such as the Health Insurance Trial [2], the Edinburgh randomised trial [3,4], the Canadian National Breast Screening Study (CNBSS) [5] and the Swedish Two-Country Trial [6, 7], have quantified the benefits of screening mammography. The Swedish study and many others reported that breast cancer mortality was significantly reduced due to screening mammography [6, 8], while the CNBSS found no benefits [5, 9]. In addition, Welch et. al. also found no benefit in their analysis of the SEER data [10]. These contradictory conclusions have spurred intense debate over the benefits of screening mammography. The wide implementation of screening mammography has led to an increased rate of small tumor detection and a decreased rate of large tumor detection over the last decades [10]. The primary cost of screening is overdiagnosis of small benign or unaggressive tumors that would have remained asymptomatic during a patient’s lifetime, turning a healthy individual into a patient, and requiring follow-up tests and treatments with deleterious side effects including death [11]. Overdiagnosis also results in unwanted economic and psychological burdens. To address the controversy, the WISDOM study based on a woman’s individual risk was initiated in the United States in 2016 [12].
Studies based on statistical or stochastic models [13–15] have quantified the influence of various factors such as age, screening frequency and adherence behavior on the benefits and harmful effects of mammography screening based on different data sources or trials other than the CNBSS. Most of transition probabilities in these models were held constant or age-dependent, and thus did not include the effects of tumor heterogeneity across patients. Using the CNBSS, several analyses ([16, 17] and references therein) have estimated screening sensitivities, transition probabilities and sojourn time distributions. As far as we know, no study has developed a mathematical model to quantify the benefits and harm of screening that explicitly takes tumor heterogeneity into consideration. In this work, we propose a mechanistic model focusing on differences among individuals that provides a mathematical tool to gain insight into breast cancer progression. This model includes all possible transitions of breast cancer before and after detection from cancer incidence and detection through progression, treatment and mortality. Our central focus is on unmeasured heterogeneity, which we include through variation in the aggressiveness of tumor growth and maximum tumor size. By including unaggressive cancers, we are able to model the role of unmeasured heterogeneity in incidence levels, detected tumor sizes, and long-term outcomes to address the balance between costs and benefits of screening. The benefits are measured as the increase in 25-year survival. The costs are the increase in overdiagnosis quantified in two ways, through the difference in the number of patients diagnosed [18], and through the number who would have died due to other causes if treatment were relatively ineffective.
The proposed model is designed to first reproduce the cancer incidence and mortality in the CNBSS with a minimum of parameter fitting to the data itself [5]. By varying key model parameters, we simulate different scenarios of tumor aggressiveness and cancer treatment effectiveness to quantify their effect on the benefit and harm of mammography screening in a population.
The paper first presents the model framework and describes how parameters were estimated from the literature and the CNBSS. Because of the focus on unmeasured differences in underlying cancers, we term this the Breast Cancer Heterogeneity Aggressiveness Model (BCHAM). Using BCHAM, we experiment with the effectiveness of treatment and the underlying mean and variance of tumor aggressiveness to identify when mammography should provide the greatest benefit and the least harm.
2. Materials and Methods
2.1. The CNBSS
The CNBSS has been described in detail [5, 19], and we here summarize its key features (Figure 1a). The CNBSS was designed to investigate the benefits of mammography screening in women aged 40 − 59. The patients were followed up for up to 25 years (22 years on average). A population of 89, 835 healthy women aged 40 − 59 was randomly assigned to mammography (five annual mammography screens) and control (no mammography). Women in the mammography arm received both annual mammography and physical examination for the first 5 years of follow-up. In the control arm, women aged 40 − 49 received only a single physical examination at enrollment, and women aged 50 − 59 received annual physical examination for the first 5 years of follow-up. Participants were considered eligible if they were in good health, had no mammography in the previous 12 months, and had no history of breast cancer. The number of detected breast cancers, breast cancer mortality and all-cause mortality was recorded during the follow-up period.
2.2. BCHAM: An individual-based stochastic model
We use a five-compartment model to track the number of women at each cancer stage via the probabilities and rates of transition between consecutive stages (Figure 1b). Let a denote the age of a woman at enrollment and t the time since the beginning of follow-up. The rate of cancer incidence, ca(t), is a bell-shaped function based on the report of Canadian Cancer Registry and Health Statistics Division [20]. The rate of non-breast cancer mortality is captured by an exponential function ha(t) obtained from the 1991 Canadian statistics reported in [21].
Several models of tumor growth have been used in the literature [22, 23]. We model tumor diameter at time t with initial diameter dini at initial time tini, d(t, tini, dini, k), with a Gompertz model of human breast cancer growth [22]. The key parameter k is the tumor aggressiveness constant. Detection sensitivities of a tumor are modeled by sigmoid functions [24]. Sm(d), Sp(d) and Ss(d) denote the tumor detection sensitivities of mammography, physical screening and self-examination respectively during the follow-up period. To capture undetected cancers entering the study, we let Sb(d) be the tumor detection sensitivity of self-examination before the beginning of the study to eliminate candidates with noticeable tumors. Let td represent the time when a tumor is detected (the time when a patient moves from Stage 2 to 3). Suppose that breast cancers originate from a single cell of the diameter d0 at time t0, the time when a woman moves from Stage 1 to 2. Let dde =d(td, t0, d0, k) be the size of a tumor at detection time td > t0. The hazard of cancer mortality depends on the size at detection, tumor aggressiveness and time since detection according to h(dde, k, t − td), which follows a two-parameter Weibull distribution based on the probability of cancer mortality [25]. This function includes a parameter α that captures the effectiveness of treatment. All parameter values are presented in Tables 1a and 1b.
At enrollment, the participants can be in either the healthy or the undetected cancer compartment. As time passes, they can transfer between stages (Fig. 1b).
Stage 1: An individual in the healthy compartment may develop undetected breast cancer at a rate ca(t) or die due to other causes apart from breast cancer at a rate ha(t) (solid arrows in Fig. 1b).
Stage 2: During follow-up, an individual with undetected cancer can be detected with a probability of Sj(d), j ∈ {m, p, s}, or die of other causes at a rate of ha(t) (long-dashed arrows in Fig. 1b).
Stage 3: An individual with detected cancer may die of breast cancer at a rate of h(dde, k, t − td) or of other causes at rate ha(t) (short-dashed arrows in Fig. 1b).
We simulate a population of N0 individuals. The total women in the i-th compartment at time t is where , i = 1, …, 5 is an indicator function of the stage i of a individual k at time t.
2.3. Parameter calibration based on the CNBSS
We simulate the model using the CNBSS [5] to calibrate model parameters that cannot be estimated independently from the literature. In the CNBSS, the large number of breast cancers detected during the first year of follow-up (253 and 170 diagnosed cancers in the mammography and control arms respectively (Table 1 and [5]) suggests that at enrollment the participants may have been either in the healthy or undetected cancer compartment. To capture this, we began each simulated patient as healthy 6 years prior to the study initiation, with 6 years as a sufficient time period for previously originating cancers to be diagnosed before the first year of the study. In accordance with the study design, we assume that cancers can be diagnosed only by self physical examination with the detection sensitivity Sb(d) before the study. Patients who get diagnosed, die of breast cancer or of other causes before the beginning of the study are not included in the simulated population. A time step of 1 day is chosen for numerical simulation. Due to the nature of discretization, possibilities of two events occurring during a period of a time step are encompassed in our numerical simulations. In particular, Algorithm I in [29] was used to speed up the simulation of Stage 1 and also eliminate the simultaneous occurrence of undetected cancer and non-breast cancer death events. Because implementation of Algorithm I requires the integration of the cancer incidence rate ca(t), we used a polynomial approximation of ca(t). Moreover, we considered all possible cases including the concurrence of detected cancer and non-breast cancer death events when simulating Stage 2. At any time during the follow-up, if the death event occurs, the simulation is terminated.
Let Xi be the simulated outcomes, the number of detected cancers and the number of cancer deaths, and Yi be the corresponding recorded observations from the study. For the jth realization of our model, Sj is the sum of squared deviations, i.e. Sj =Σi(Xi − Yi)2. The three unknown parameters (bb2, γ and Z) are chosen to minimize the expected value of S. Model calibration is carried out using the data only from the control arm. Then we simulate the model with the estimated parameters over both arms to reproduce the outcomes of the CNBSS.
2.4. Statistics and calculation of overdiagnosis
To quantify the survival and overdiagnosis, we simulated each patient in the BCHAM model 100 times, 50 times in the mammography and 50 in the control arm, with identical parameters and time of onset of cancer. We used Cox proportional hazards (the coxph function in R [30]) to evaluate the effect of treatment arm on survival from the time of acquiring cancer, thus avoiding the effects of lead time bias [18]. To illustrate the effects, we conduct these regressions on data broken up into sextiles of aggressiveness k and maximum tumor diameter dmax, and by the time of cancer acquisition before, during or after the study. We term these 108 groups as study subcohorts. To estimate confidence limits, we bootstrap the simulated patients by sampling with replacement.
We quantify the number of diagnoses by computing the number of patients diagnosed in each arm and comparing in each subcohort with a χ2 test. To test for overdiagnosis itself, we compute the probability of death from other causes before death from cancer using the hazards h and ha from Table 1a and integrating as in [31]. To minimize confounding the benefits of treatment that sufficiently delay cancer-induced mortality to allow death from other causes, we lower the treatment effectiveness parameter in the cancer mortality hazard h to its minimum value α = 2.5.
3. Results
3.1. Fit with CNBSS results
Simulations of BCHAM accurately capture the outcomes of the CNBSS including the number of detected cancers, deaths from cancer, deaths from other causes and the distributions of age at diagnosis in both arms (Figures 2a, 2b and Table 2).
3.2. Benefit and harm of mammography screening
The accurate fit of the model to the data under current conditions motivates testing how various model parameters affect the balance between benefit and harm. We first quantify the benefit of increasing the parameter α that describes the effectiveness of treatment (Figure 3a). Value of screening is estimated as the percent increase in survival after 25 years of follow-up. As can be seen, when cancer treatment is highly effective, mammography screening provides only a minimal benefit.
To compare the benefit and harm as a function of age, tumor aggressiveness k, and maximum tumor diameter dmax, we simulate identical populations in both arms. In the simulation, participants receive an annual mammography and physical examinations in the mammography arm, or only an annual physical examination in the control for the first five years of follow-up. The simulation of each patient was repeated 50 times to reduce variance due to individual variation. Higher values of k and dmax strongly increase survival and decrease overdiagnosis (Figure 3b). The effects of age are much weaker, with slightly improved benefits in the middle age groups (women between the ages of 44 and 56). For patients with unaggressive tumors (small values of k and/or dmax), mammography provides little benefit and the highest harm.
We also illustrate overdiagnosis and survivorship as functions of whether patients first acquired their cancer before, during, or after the study, and of the aggressiveness (k) and the maximum tumor diameter dmax of their tumor. For overdiagnosis (Figures 4a and b, we compare the difference in number of patients per thousand diagnosed in the mammography and control arms (top number) with the difference in the number of patients per thousand who would have died of other causes with less effective treatment (α = 2.5, bottom number). For survivorship (Figures 4c and d, we compare the difference in number of deaths per thousand in the mammography and control arms (top number) with the hazard ratio of death due to inclusion in the mammography arm (bottom number).
For patients who acquired an undetected tumor before the beginning of the study, those with ultimately small tumors are highly overdiagnosed and experience reduced survival due to the effects of the treatment. The greatest survival benefits accrue to patients with largest and most aggressive tumors. Patients who acquire a tumor during the five years of the study show a similar but weaker pattern of overdiagnosis, and no strong survival cost of overtreatment of rapidly-growing but ultimately small tumors. Patients who acquired tumors after the conclusion of the study show no effect of mammography as expected (results not shown). These observations suggest that overdiagnosis mainly occurs at the first screening [32].
To capture a high-risk population, such as women with germline BRCA1 and BRCA2 mutations, family breast cancer history, hormone therapy and smoking history, we assume an increase of breast cancer incidence by a factor of 5 [33] over the baseline case. The main observations remain largely unchanged. In a high risk population, mammography screening is slightly more beneficial for younger women, illustrated by a slight shift in the age effect in Figure 3c compared with Figure 3b.
4. Discussion
We have developed an individual-based mechanistic model of breast cancer incidence and mortality in a population based on the Canadian National Breast Screening Study (CNBSS). All but three of the parameters could be estimated independently from the literature or taken from the CNBSS report, with the remaining ones calibrated to the outcomes of the CNBSS. The model includes two forms of heterogeneity: tumor aggressiveness describing the growth rate and maximum tumor size.
The model accurately matches the cancer incidence and survival in the CNBSS (Section 3). We then use the model to quantify the benefit and harm of mammography screening, with benefit measured as the increase in 25-year survival, and harm as the increase in overdiagnosis. The benefit of screening decreases almost to zero with highly effective treatment. In general, patients with the most rapidly growing or potentially largest tumors have the highest benefit and least harm from mammography screening, with only a relatively small effect of age.
We measured overdiagnosis in two ways, through the difference in the number of patients diagnosed (excess incidence [18]), and through the number who would have died of other causes if treatment were relatively ineffective. The goal of treatment is, of course, to ensure that all patients have the chance to die of something else, and thus comparing the number of deaths with relatively effective treatment confounds true overdiagnosis with successful treatment. An alternative defines overdiagnosis as cancers that would not have presented clinically during the patient’s lifetime [15,31] which is most appropriate for modeling studies that optimize timing and type of testing.
Unlike age or other know risk factors, it is difficult in practice to predict specific tumor characteristics in an individual patient before recommending screening. In addition to improving mammography technology, increasing the net benefit of screening may require pretreatment tests that can identify women at the greatest risk of the highly aggressive cancers.
The CNBSS has been criticized because participants were volunteers [34] and thus possibly at higher risk than the general population. However, because participants were then randomized, this selection of volunteers should only increase the effect size of screening, but not create a change in direction.
Our model has several limitation. The parameters from the literature come from a variety of sources and studies that might not apply across all populations. The remaining three are based on a single study, and future work will test how effectively it can reproduce the outcome of other clinical trials, such as the Swedish trial [6,7], by modifying few or no parameter values. In addition, our modeling of treatment is quite simplified, without taking into account recent improvements or different treatments for different breast cancer types.
Our model brings a new quantitative tool to bear on the controversy over the use of mammography screening. We have found, in line with recent trials, that the benefits are sufficiently small and the harm sufficiently large to make screening of dubious value except in patients destined to have highly aggressive cancers, who of course are difficult if not impossible to identify in advance.
Data Availability
NA
Author contributions
All the authors contributed equally to the manuscript.
Additional information
Ethics approval and consent to participate
Ethical approval and consent were not required since this work was conducted based on published data in the literature.
Competing interests
The authors declare no competing interest.
Funding information
This research was partially supported by NIH/NCI U54 grant CA209978 to Andrea Bild, the Modeling the Dynamics fund at the University of Utah, and the University of Utah Department of Mathematics.
Acknowledgment
We thank Dr. Adam Cohen for useful comments on an earlier draft, and the members of sLaM for discussion.
Footnotes
E-mail address: thuyttle{at}umich.edu, adler{at}math.utah.edu.