Misleading and avoidable: design-induced biases in observational studies evaluating cancer screening—the example of site-specific effectiveness of screening colonoscopy ========================================================================================================================================================================== * Malte Braitmaier * Sarina Schwarz * Vanessa Didelez * Ulrike Haug ## Abstract Observational studies evaluating the effectiveness of cancer screening are often biased due to non-alignment at time zero, which can be avoided by target trial emulation (TTE). We aimed to illustrate this by evaluating site-specific effectiveness of screening colonoscopy regarding colorectal cancer (CRC) incidence. Based on a German health care database, we assessed the effect of screening colonoscopy vs. no screening colonoscopy in preventing CRC in the distal and the proximal colon over 12 years of follow-up in 55–69-year-old persons. We compared four different study designs: cohort and case-control study, each with/without alignment at time zero. In both analyses with time zero-alignment, screening colonoscopy showed a rather similar effectiveness in reducing the incidence of distal and proximal CRC (cohort analysis: 32% (95% CI: 27-37%) vs. 28% (20-35%); case-control analysis: 27% vs. 33%). Both analyses without alignment suggested a difference by site: Incidence reduction regarding distal and proximal CRC, respectively, was 65% (61-68%) vs. 37% (31-43%) in the cohort analysis and 77% (67-84%) vs. 46% (25-61%) in the case-control analysis. Violations of basic design principles can substantially bias the results of observational studies. In our example, it falsely suggested a much stronger preventive effect of colonoscopy in the distal vs. the proximal colon. Our study illustrates that TTE avoids such design-induced biases. Keywords * Cancer screening * colonoscopy * effectiveness * proximal * distal * target trial emulation * observational study * cohort study * case-control study ## Introduction Randomized controlled trials (RCT) are the gold standard for evaluating the effectiveness of cancer screening. However, existing RCTs in this field do not answer all relevant research questions. For screening colonoscopy, for example, an RCT has recently been published (NordICC trial) demonstrating its effectiveness in reducing colorectal cancer (CRC) incidence overall1, but it was not powered to compare the effectiveness in the distal vs. the proximal colon. Complementary evidence from observational studies is therefore needed. Apart from potential confounding, there is a high risk of bias and thus of misleading results if such studies are inadequately designed. Indeed, several observational studies have reported a markedly stronger preventive effect of screening colonoscopy in the distal as compared to the proximal colon2–4, while a cohort study designed following the principle of target trial emulation (TTE) showed a similar effectiveness of screening colonoscopy in the distal and the proximal colon5. We argued that the difference by site in the former studies was due to biases induced by non-alignment at “time zero”, i.e. at baseline. This means that I) the assessment of eligibility, II) the assignment to study arms and III) the start of follow-up were not aligned as they would be in an RCT and as it would be ensured in an observational study designed based on the principle of TTE6. Specifically, previous studies often defined exposure based on pre- or post-baseline information on colonoscopy. As we further argued, this lack of alignment in previous studies led to overestimating the effectiveness of screening colonoscopy. Due to the different age pattern of distal and proximal CRC, this bias affected distal CRC more than proximal CRC, i.e. the difference in effectiveness by site was an artefact. To demonstrate this, we compared different study designs with and without alignment at time zero aiming to investigate the question of site-specific effectiveness of at least one screening colonoscopy in reducing CRC incidence. For the two designs without alignment we used a cohort study design, where the assignment to study arms occurs *before* time zero (pre-baseline), and a nested case control study design, where the assignment to study arms occurs *after* time zero (post-baseline). The current paper is part of a growing literature identifying violations of alignment at time zero as a potential source of major bias in observational studies6–8. ## Methods ### Data source and study population We used the German Pharmacoepidemiological Research Database (GePaRD) which comprises claims data from four statutory health insurance providers in Germany and covers about 20% of the German population9. In GePaRD, information on utilization of screening colonoscopy, offered in Germany to persons aged 55 or older since October 2002 (since April 2019 also offered to men aged 50-54) with a screening interval of 10 years, is distinguishable from diagnostic colonoscopy. Each year, 2-3% of the target population utilize screening colonoscopy, yielding a cumulative participation of 20-30%10. As previously described, the data source enables the valid identification of incident CRCs11. Furthermore, it contains appropriate information to apply in- and exclusion criteria and to adjust for confounding as relevant to the research question on the effectiveness of (at least one) screening colonoscopy in reducing CRC incidence5. For the present study, we used data from 2004 to 2020. Based on this data source, we applied four different study designs to address the research question, specifically a cohort and a case-control study design, each with and without alignment at time zero. The study designs without alignment at time zero were inspired by published examples2,12–14, and were partly complemented by sensitivity analyses. For each of these four studies, persons were selected from the same population. Specifically, the source population was a cohort of persons aged 55–69 at baseline, who were continuously insured for at least three years before baseline. ### Cohort study without alignment at time zero The cohort started in 2009 (baseline). Similar to a previous study2, individuals were assigned to the screening colonoscopy arm if they had a screening colonoscopy any time before baseline, including the baseline quarter. Individuals were assigned to the control arm if they did not undergo screening colonoscopy any time before baseline, including the baseline quarter. In a sensitivity analysis, we considered both screening and diagnostic colonoscopies for the assignment to the study arms, because some of the previous studies did not distinguish between these examinations. Eligibility criteria were checked at baseline (among others, prevalent CRC as exclusion criterion) and the outcome variable (incident CRC) was assessed beginning with baseline (start of follow-up). Persons were followed up until end of study period (end of 2020), end of continuous insurance coverage, death or CRC diagnosis, whichever occurred first. We also conducted sensitivity analyses starting the cohort in 2010 and 2011, respectively. When using such a study design, the assessment of eligibility and the start of follow-up are aligned, but the assignment to the screening and the control arm is based on a period *before* time zero (pre-baseline). Specifically, individuals in the colonoscopy arm had the examination in the past (i.e. they were assigned to the screening arm based on past exposure) rather than *at* time zero. ### Cohort study with alignment at time zero As described previously5, we emulated sequential trials for each calendar quarter from 2007 to 2011. The emulation of sequential target trials makes full use of the information from longitudinal data without violating principles of study design by using pre- or post-baseline information for the assignment to study arms. At the baseline quarter of each trial, eligibility was assessed (among others, individuals with prevalent CRC or previous colonoscopy were excluded). Eligible individuals were then assigned to the screening arm if they underwent a screening colonoscopy *in the baseline quarter* of the respective trial and to the control arm otherwise. Individuals were followed up until end of study period (end of 2020, i.e. follow-up was longer than in our previous analysis), end of continuous insurance coverage, death or CRC diagnosis, whichever occurred first. This study design made sure that assessment of eligibility criteria, assignment to the screening and control arm, and start of follow-up were aligned at time zero as would be the case in an RCT. ### Case-control study without alignment at time zero We applied a case-control design frequently used in the published literature12–17. Essentially, CRC cases are identified (date of diagnosis corresponds to index date) and matched with controls free of CRC at index date. Then screening colonoscopy use *ever before* or within a certain time period *before* the index date is assessed in cases and controls, i.e. colonoscopies leading to CRC diagnosis are not considered as exposure in this type of study. Here, we selected all individuals from the source population entering the cohort in 2009 with a CRC diagnosis in 2018-2020. For each case we matched up to five controls on age (+/− one year) and sex (sampling without replacement). The exposure variable was then defined as any screening colonoscopy between 2009 and the index date, i.e. exposure to colonoscopy use was assessed within 10-12 years *before* the index date. Colonoscopies conducted in the six months before CRC diagnosis were not considered in defining the exposure. As mentioned above, this approach corresponds to published case-control studies which ignore colonoscopies conducted as part of the diagnostic process leading to the current diagnosis12–17. In general, it is a fundamental characteristic of traditional case-control studies to assess exposure before disease onset. In a sensitivity analysis, we considered both screening and diagnostic colonoscopies for the assignment to exposure groups. Again, we also conducted sensitivity analyses using the years 2010 and 2011 for cohort entry, i.e. the source population underlying this nested case-control study. In the case-control design we used here (nested within a cohort), the assessment of eligibility and the start of follow-up were aligned, while the assignment to the screening and the control arm occurred *after* time zero (post-baseline) instead of *at* time zero. Note that in case-control studies not nested in a cohort, there typically are additional misalignments 12,15. Specifically, eligibility is assessed at index date and the start of follow-up is unclear. ### Case-control study with alignment at time zero Following the approach described by Dickerman et al.18, a case-control study was nested within the original cohort of sequential emulated target trials, and colonoscopy use was assessed at baseline of each emulated trial. We included CRC patients with an incident CRC diagnosis at any point during follow-up (until 2020) and then used risk set sampling to match up to five controls to each case. We sampled matched controls with replacement, i.e. the same control could be matched to more than one case. Matching variables were the same as above. The key difference to the case-control study without alignment is that exposure assignment was based on information available at the start of the emulated trial, i.e. at time zero, instead of information occurring after time zero. This approach has been shown to avoid self-inflicted biases in the same way as a prospective study using TTE18. ### Data analysis Time was discretized into units of calendar quarters (i.e. three-month time intervals), since some information in the database (outpatient diagnosis codes) is only available on a quarterly basis. We estimated the effect on site-specific CRC incidence, i.e. the outcome was the first CRC diagnosis during follow-up. A CRC diagnosis at the other site or with unknown location and death from any cause were regarded as competing events. We estimated the site-specific total effects, i.e. without elimination of competing events (see Young et al.19 for details on total and controlled direct effects in competing risk survival settings). As shown in our initial study, treating death as a censoring event did not substantially change the effect estimates given the included age range (55-69 years).5 For the cohort studies, we estimated site-specific cumulative incidence functions (CIF) via pooled logistic regressions, which were adjusted for baseline confounders via inverse probability of treatment weighting (see Supplement 8 for details on the pooled logistic regression). Adjustment variables were: Age, sex, educational attainment, type 2 diabetes, codes indicative of alcohol abuse, codes indicative of nicotine dependence, antiplatelet therapy, use of preventive services, menopausal hormone therapy, coded family history of CRC. Effects were estimated as adjusted relative risks (RR) at the end of follow-up based on these CIFs. As previously shown, adjustment yielded satisfactory covariate balance and a negative control analysis did not indicate any residual confounding5. For the cohort study with alignment at time zero, our effect estimate corresponds to the observational analog of the intention-to-treat effect. Controls (i.e. persons with no screening colonoscopy at baseline) may have undergone screening colonoscopy during follow-up but as reported in our previous study, this had no relevant impact on our results: The proportion of CRCs among persons in the screening arm who were also included as controls in at least one previous trial was low and similar in the distal and the proximal colon (4.2% and 4.8%)5. Confidence intervals were estimated via person-level bootstrap. For the case-control studies, effects were estimated as adjusted odds ratios (ORs) obtained via conditional logistic regression. For the case-control analysis with alignment, no confidence intervals could be obtained due to computational limitations: The emulation of sequential trials with repeated cohort entry would require bootstrapping of the underlying study population, where matching is repeated for every bootstrap sample18, resulting in run times of several months. Bootstrapping the matched cohort would not yield valid confidence intervals20. ## Results ### Cohort study without alignment at time zero We selected a random sample of 200,000 individuals in the control arm and 200,000 individuals in the screening colonoscopy arm. The adjusted relative risk after 12 years of follow-up was 0.35 (95% CI: 0.32-0.39) for distal CRC and 0.63 (95% CI: 0.57-0.69) for proximal CRC (Table 1). The adjusted cumulative incidence curves are given in Fig. 1. As shown in Supplement 1, results were similar when the year 2010 or the year 2011 was used as baseline. In sensitivity analyses considering both screening and diagnostic colonoscopies as exposure, the adjusted 12-year relative risk was 0.40 (95% CI: 0.37-0.44) for distal CRC and 0.66 (95% CI: 0.60-0.72) for proximal CRC (Supplement 2). ![Fig. 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/28/2024.04.29.24306522/F1.medium.gif) [Fig. 1:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/F1) Fig. 1: Adjusted cumulative incidence functions for distal (left column) and proximal (right column) CRC from the cohort study design with alignment at time zero (top row) and the cohort study design without alignment at time zero (bottom row) View this table: [Table 1:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T1) Table 1: Results of cohort study designs without and with alignment at time zero (adjusted for baseline covariates). ### Cohort study with alignment at time zero Overall, 192,054 persons were included in the screening colonoscopy arm. The 5% random sample (restriction due to computational limitations) of controls assigned to the no screening arm included 116,452 persons (1,241,071 non-unique). The adjusted relative risk after 12 years of follow-up was 0.68 (95% CI: 0.63-0.73) for distal CRC and 0.72 (95% CI: 0.65-0.80) for proximal CRC (Table 1). Figure 1 shows the adjusted cumulative incidence curves for distal and proximal CRC. The distribution of screen-detected and post-colonoscopy CRCs (i.e. non-screen-detected CRCs) by site is shown in Supplement 6. ### Case-control study without alignment at time zero Overall, 446 cases with distal CRC matched to 2,230 controls and 302 cases with proximal CRC matched to 1,510 controls were included. The adjusted ORs for distal and proximal CRC were 0.23 (95% CI: 0.16-0.33) and 0.54 (95% CI: 0.39-0.75), respectively (Table 2). When the year 2010 or the year 2011 was used to define the source population, the difference by site was similar (Supplement 1). The sensitivity analysis considering both screening and diagnostic colonoscopy as exposure yielded similar results; the adjusted ORs for distal and proximal CRC were 0.20 (95% CI: 0.15-0.26) for distal CRC and 0.44 (95% CI: 0.33-0.58) for proximal CRC, respectively (Supplement 2). View this table: [Table 2:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T2) Table 2: Results of case-control study designs without and with alignment at time zero. ### Case-control study with alignment at time zero Overall, 19,081 cases with distal CRC matched to 93,650 controls and 9,916 cases with proximal CRC matched to 49,065 controls were included. The adjusted ORs for distal and proximal CRC were 0.70 and 0.72, respectively (Table 2). A table of baseline covariates by study design is given in Supplement 7. ## Discussion To the best of our knowledge, our study is the first to systematically compare different study designs to assess the effectiveness of screening colonoscopy in reducing CRC incidence in the distal vs. the proximal colon. Our cohort and case-control analyses with alignment at time zero showed no relevant difference in the effectiveness by site. Using study designs without alignment at time zero led to an overestimation of the effectiveness of screening colonoscopy overall. The overestimation affected distal CRCs considerably more than proximal CRCs, i.e. purely by design there appeared to be a difference in effectiveness by site. This finding held up in sensitivity analyses varying data years and the type of examinations considered for the exposure definition (only screening or also diagnostic colonoscopy). Our findings demonstrate that the difference in the effectiveness of colonoscopy by site reported by previous observational studies was due to bias introduced by inadequate study design. As illustrated in Supplement 3 using directed acyclic graphs, the bias underlying studies using pre-baseline information on colonoscopy for the assignment to study arms can be expressed as a form of collider stratification bias21,22. To give an intuitive explanation, let us revisit the study by Guo et al.2,5: At baseline, patients were asked about past colonoscopy use and— based on this information—assigned as exposed or unexposed to colonoscopy. Persons reporting a prior CRC diagnosis at baseline were excluded2. Given that colonoscopy is one of the main tools by which CRC is diagnosed, this process removes individuals with previously diagnosed CRC from the exposed group, i.e. it enriches the exposed group with individuals who are known to be free of CRC. No such selection process takes place in the unexposed group. This leads to a lower prevalence of preclinical CRC at baseline in the exposed as compared to the unexposed group. As a consequence, this selection reduces the number of CRCs occurring during follow-up in the exposed group as compared to the unexposed group and thus leads to overestimation of the effect of screening on CRC incidence. This is also evident from Figure 1, showing that the absolute risk in the screened group is much lower without alignment at time zero as compared to the design with alignment. As the vast majority of CRCs diagnosed at an age when persons are typically included into screening studies are in the distal colon23 while proximal CRCs become more common at older age, this bias mainly affects results for distal CRC, i.e. as mentioned above there appeared to be a difference in effectiveness by site purely by design. We note that in addition to the initial exposure assignment, Guo et al. also used an updated exposure variable in a Cox model with time-dependent covariates. However, this does not correct the initial selection issue at the start of follow-up. Further, including time-updated covariate information in the outcome model of a time-dependent exposure may introduce bias due to exposure-covariate feedback24. Moreover, hazard-based effect measures are discouraged in causal analyses due to the built-in selection bias 19,25,26. The above argument applies to studies using pre-baseline information for the assignment to exposure groups. Many other studies used post-baseline information for the assignment to exposure groups, also inducing bias. We illustrated this by the case-control study without alignment at time zero: Whenever after baseline CRC is detected in a person at his or her first colonoscopy, as is the case for most screen-detected CRCs, this person is assigned to the unexposed group as there was no previous colonoscopy and the actual colonoscopy detecting the CRC is not considered as prior exposure. This enriches the unexposed group with CRCs and thus leads to overestimation of the effectiveness of screening. As the majority of screen-detected CRCs are in the distal colon, this bias predominantly affects CRCs in the distal colon and thus leads to an artificial difference in the effectiveness of colonoscopy by site (see also Supplement 4). In our case-control study design embedded in an emulated target trial with alignment at time zero, in which screen-detected CRCs are correctly assigned, no relevant difference in the effectiveness of colonoscopy by site was observed. Of note, misalignment due to post-baseline exposure assignment is typical of but not limited to case-control designs on cancer screening. It can also occur in inadequately designed cohort studies and is not overcome by using a time-varying exposure variable in a hazard model. This is explained in more detail in Supplement 5 based on the example of the study by Nishihara et al.3 In summary and more generally, both study designs without alignment at time zero have in common that there are mechanisms that lead to inappropriate consideration of screen-detected CRCs, i.e. in the screening arm there was no peak in CRC incidence immediately after baseline as it would be the case in an RCT. Of course, this overestimates the impact of screening on CRC incidence, particularly for distal CRC, as illustrated in Figure 1. The flawed approaches ignore the fact that a screening colonoscopy sometimes comes too late to prevent CRC. Following the publication of the NordICC study, there was a discussion whether it is appropriate to include persons with preclinical CRC, causing the peak at baseline, in a prevention trial27,28. However, from a public health perspective, it is important to also take into account CRCs that are not prevented by screening in order to avoid overestimating the effectiveness of CRC screening at the population level. Apart from this, it should be noted that studies without alignment at time zero do not provide a valid answer to the question regarding the size of the preventive effect of colonoscopy in persons free of CRC at baseline. It should be noted that, although we focus our discussion on biases most relevant for site-specific effectiveness of screening colonoscopy, misalignment at time zero should also be avoided for many other reasons. Our examples for using pre- and post-baseline information for the assignment to exposure groups were selected according to their relevance to our research question but there are certainly other types misalignment. Rasouli et al.29 demonstrated that time related issues such as prevalent user bias or time-varying confounding are a threat to case-control designs not embedded in an emulated target trial. Also, Dickerman et al. showed—based on case-control studies investigating the impact of statins on CRC risk— the biases inherent to traditional case-control studies and the potential of avoiding bias and wrong conclusions if the study is designed following the principle of TTE18. Similarly, there are many examples of biases other than those we discussed here that are inherent to cohort studies without alignment at time zero8. Our findings have several implications. First, regarding research on CRC screening, previous studies suggesting a lower effectiveness of colonoscopy in the proximal colon stimulated a search for reasons that may explain the occurrence of post-colonoscopy CRCs specifically in the proximal colon. It was suggested that one main reason relates to sessile serrated lesions as they are more difficult to detect and more often occur in the proximal colon30. While we do not question the important role of these lesions, our findings may encourage a broadening of the discussion of potential reasons leading to post-colonoscopy CRCs. Indeed, in our emulated target trial on screening colonoscopy, the proportion of post-colonoscopy CRCs located in the distal vs. the proximal colon was rather similar (Supplement 6). A one-sided focus on lesions that occur more frequently in the proximal colon therefore seems too narrow regarding the identification of lesions possibly leading to post-colonoscopy CRCs. Our results also have implications beyond the specific research question of our study. Observational data are often used to evaluate the effectiveness of cancer screening. They represent a valuable data source to complement RCT evidence in this field, as RCTs on cancer screening are scarce, were often conducted many years ago and are typically not powered to estimate, for example, subgroup-specific effects or differences by cancer subtype. However, our study illustrates that—in addition to appropriate control of confounding—it is of key importance to design these studies in a way to ensure alignment at time zero. This means that assessment of eligibility, assignment to the screening and control arm and start of follow-up must be aligned. Otherwise, there is a high risk of bias. Weighing the advantages and disadvantages of the two study designs with alignment at time zero, we generally recommend a cohort design for several reasons, some of which are: 1) The computational cost of embedding a case-control study in an emulated target trial (cohort study) is higher than simply conducting the emulated target trial itself. 2) The matching of cases to controls leads to a loss of sample size among controls, possibly lowering statistical efficiency. However, a case-control design with alignment might be useful and cost-efficient if the available data source lacks pivotal information and needs to be substituted by additional data (e.g. biomarkers measured in baseline samples). Specific strengths of our study include the systematic comparison of different study designs as well as the comprehensive sensitivity analyses. Given that all analyses were conducted using the same data source and referred to the same setting, there is no heterogeneity regarding, for example, the study variables or setting-related factors such as the uptake of surveillance colonoscopy or colonoscopy quality. This strengthens our conclusion that differing results of the analyses with and without alignment at time zero are exclusively due to the study design. It should be noted that our findings apply to the population aged 55-69, covering the typical screening age range of CRC screening. Whether screening colonoscopy is equally effective in the distal and proximal colon in older age groups cannot be answered by our study, nor did we address the endpoint CRC mortality. These research questions were beyond our study’s scope, as our primary objective was to illustrate the relevance of design-induced biases and the possibility to avoid them using TTE, exemplified by investigating site-specific effectiveness of screening colonoscopy in reducing CRC incidence. In conclusion, our study demonstrates that violation of alignment at time zero can substantially bias the results of observational studies on cancer screening. In our example, it falsely suggested an almost doubled preventive effect of colonoscopy in the distal vs. the proximal colon. The difference disappeared when the same data were analyzed using a TTE approach, which is known to avoid design-induced biases. ## Statements & Declarations ### Conflicts of interest MB, SS, VD and UH are working at an independent, non-profit research institute, the Leibniz Institute for Prevention Research and Epidemiology – BIPS. Unrelated to this study, BIPS occasionally conducts studies financed by the pharmaceutical industry. These are post-authorization safety studies (PASS) requested by health authorities. The design and conduct of these studies as well as the interpretation and publication are not influenced by the pharmaceutical industry. The study presented was not funded by the pharmaceutical industry and was performed in line with the ENCePP Code of Conduct. ### Funding BIPS intramural funding ### Data availability As we are not the owners of the data we are not legally entitled to grant access to the data of the German Pharmacoepidemiological Research Database. In accordance with German data protection regulations, access to the data is granted only to employees of the Leibniz Institute for Prevention Research and Epidemiology – BIPS on the BIPS premises and in the context of approved research projects. Third parties may only access the data in cooperation with BIPS and after signing an agreement for guest researchers at BIPS. ### Author contributions MB: Conceptualization, data curation, formal analysis, investigation, methodology, software, visualization, writing – original draft, writing – review and editing; SS: Conceptualization, investigation, writing – review and editing; VD: Conceptualization, investigation, methodology, supervision, writing – review and editing; UH: Conceptualization, funding acquisition, investigation, methodology, project administration, resources, supervision, writing – original draft, writing – review and editing. ### Research ethics and informed consent In Germany, the utilisation of health insurance data for scientific research is regulated by the Code of Social Law. All involved health insurance providers as well as the German Federal Office for Social Security and the Senator for Health, Women and Consumer Protection in Bremen as their responsible authorities approved the use of GePaRD data for this study. Informed consent for studies based on claims data is required by law unless obtaining consent appears unacceptable and would bias results, which was the case in this study. According to the Ethics Committee of the University of Bremen studies based on GePaRD are exempt from institutional review board review. ## Supplement ### Supplement 1: Cohort and case-control study without alignment at time zero for different baseline years As mentioned in the methods section, for the study designs without alignment at time zero, we selected individuals from the source population entering the cohort in 2009. In sensitivity analyses, we varied the baseline year, i.e. individuals entering the cohort in 2010 and 2011, respectively. The respective results are shown in Table S1 and Figure S1 for the cohort study and in Table S2 for the case-control study. For comparison, also the results of the base case analysis (baseline year 2009) are shown. View this table: [Table S1:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T3) Table S1: Results of cohort study designs without alignment at time zero for different baseline years. ![Figure S1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/28/2024.04.29.24306522/F2.medium.gif) [Figure S1:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/F2) Figure S1: Adjusted cumulative incidence functions for distal and proximal CRC from the cohort study design without alignment at time zero for different baseline years. View this table: [Table S2:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T4) Table S2: Results of case-control designs without alignment at time zero for different baseline years. ### Supplement 2: Cohort and case-control study without alignment at time zero: considering both screening and diagnostic colonoscopy for the assignment to exposure groups As mentioned in the methods section regarding the study designs without alignment at time zero, only screening colonoscopies were considered for the assignment to exposure groups in the base case analysis. In a sensitivity analysis, we considered both screening and diagnostic colonoscopies for the exposure assignment. The respective results are shown in Table S4 and Figure S2 for the cohort study design and in Table S5 for the case-control study. View this table: [Table S4:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T5) Table S4: Results of cohort study designs without alignment at time zero: sensitivity analyses considering both screening and diagnostic colonoscopy for the assignment to exposure groups. For comparison, also the results of the base case analysis are shown. ![Figure S2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/28/2024.04.29.24306522/F3.medium.gif) [Figure S2:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/F3) Figure S2: Adjusted cumulative incidence functions for distal and proximal CRC from the cohort study design without alignment at time zero: sensitivity analysis considering both screening and diagnostic colonoscopy for the assignment to exposure groups. For comparison, also the cumulative incidence functions of the base case analysis are shown. View this table: [Table S5:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T6) Table S5: Results of case-control designs without alignment at time zero: sensitivity analyses considering both screening and diagnostic colonoscopy for the assignment to exposure groups. For comparison, also the results of the base case analysis are shown. ### Supplement 3: Structural explanation of the bias inherent to study designs using pre-baseline information for the assignment to exposure groups For simplicity, we divide time into three periods *t* ∈ {−1, 0, 1} with *t* = −1 being the pre-baseline period, *t* = 0 the baseline and *t* = 1 the post-baseline or follow-up period. Let *Et* ∈ {0, 1} described a person’s exposure to screening colonoscopy at time *t*. Let *Pt* ∈ {0, 1} indicate the presence of colorectal precursors at time *t* and *Ct* ∈ {0, 1} the onset of preclinical CRC by time *t*. Let *Yt* ∈ {0, 1} indicate a diagnosis of colorectal cancer by time *t*. The box around *Y*−1 indicates that inclusion in the analysis population is based on previous CRC diagnoses, i.e. individuals are only included in the study population if *Y*−1 = 0. ![Figure S3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/28/2024.04.29.24306522/F4.medium.gif) [Figure S3:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/F4) Figure S3: DAG of bias resulting from violation of alignment at time zero in the form of exposure assessment based on pre-baseline information. As shown in Figure S3, at time point *t* the causal mechanism that leads to a diagnosis of CRC is as follows: Precursors *Pt* lead to the development of CRC *Ct*, which in turn progress to the outcome of interest, CRC diagnosis *Yt*. At the same time, exposure to screening colonoscopy *Et* leads to CRC diagnosis *Yt* at the same time point, if the disease is present. Furthermore, exposure at time *t* prevents disease onset at the later time *t* + 1 by removing precursor stages present at time *t*. Importantly, the variable *Yt* is a collider variable on the path *Pt → Ct → Yt* ← *Et*. When cohort selection is based on this collider, a non-causal association is introduced between *Et* and *Ct*. If the cohort selection process excludes individuals with CRC diagnosis before baseline (*Y*−1) while including individuals with past exposure *E*−1 in the exposed group of the analysis dataset, the unexposed group appears to have a higher CRC incidence. Individuals who were screened in the past and had prevalent CRC received a diagnosis and were filtered out of the study cohort. Individuals who were screened in the past and did not have prevalent CRC are included in the exposed group. No such selection takes place in the unexposed group, where individuals must not have had any screening colonoscopy before baseline. Therefore, there is a non-causal association between exposure before baseline and prevalent, undiagnosed CRC before baseline. This non-causal association means that there are now open non-causal paths from exposure before baseline to the study outcome at later time points. The resulting bias, therefore, can be expressed as a form of collider stratification bias. The analysis with alignment at time zero would also exclude individuals with previous exposure *E*−1 = 1 and would aim to estimate the effect of exposure at time zero (*E*) on subsequent outcomes. The analysis without alignment at time zero on the other hand does not exclude individuals with previous exposure and instead aims to estimate the effect of the composite exposure *E*misaligned = {*E*−1 = 1 or *E* = 1}, which does not correspond to a causal question since past exposure cannot be intervened upon. The strength of the bias will depend on the prevalence of *C*−1. If, conceptually, the prevalence of CRC before baseline were to approach zero, no such selection would take place. In the age group under study here, the prevalence of proximal CRC before baseline will be much lower than the prevalence of distal CRC before baseline, which means that this bias will impact the effect estimate for distal CRC more severely. ### Supplement 4: Illustration of the mechanism underlying the misallocation of screen-detected CRCs in case-control studies without alignment at time zero Figure S4 illustrates the mechanism that underlies the misallocation of screen-detected CRCs in case-control studies without alignment at time zero, resulting in an overestimate of the effectiveness of screening colonoscopy. First, let us imagine a hypothetical RCT investigating the effectiveness of screening colonoscopy on CRC incidence. At baseline, screening-naïve persons are randomly assigned to either the screening or the control arm. Analysing this data as a case-control study without alignment at time zero would mean that for CRCs occurring in both arms, it is assessed whether they had a colonoscopy *before* CRC diagnosis. Given that screen-detected CRCs did not have a colonoscopy *before* CRC diagnosis, they are assigned (post-baseline, i.e. after randomization) to the control arm and are thereby classified as unexposed. This overestimates the effectiveness of screening given that CRCs accumulate in the control group (unexposed group). Given that screen-detected CRCs are more frequent in the distal colorectum, the resulting bias affects distal CRC more severely than proximal CRC. ![Figure S4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/28/2024.04.29.24306522/F5.medium.gif) [Figure S4:](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/F5) Figure S4: Illustration of the mechanism of misallocation of screen-detected CRCs in case-control studies without alignment at time zero Of note, in published case-control studies investigating the effectiveness of screening colonoscopy based on primary data, selection bias in the control arm (higher prevalence of screening colonoscopy as compared to the general population) can—as an additional mechanism—also contribute to overestimating the effectiveness of screening colonoscopy, but it is not expected that this bias leads to a difference in the effectiveness by site. In our case-control study without alignment at time zero, there was a second mechanism leading to overestimating the effectiveness of screening colonoscopy due a compromise we had to make because of the left truncation of our data. Specifically, we had to select CRC cases diagnosed in 2018-2020 from those entering the cohort in 2009 (see methods section) in order to be able to assess exposure in the 10 years prior to CRC diagnosis. CRCs diagnosed between 2009 and 2017 in the context of screening, which are more often in the distal than in the proximal colon, were not included in the final set of cases, i.e. distal CRCs exposed to screening colonoscopy were underrepresented in the final set of cases. We conducted additional analyses to disentangle the effect of both mechanism (data not shown), which did not change our conclusion, i.e. that the mechanism described in Figure S4 (also) leads to an artificial difference in the effectiveness of colonoscopy by site. Further sources of bias due to misalignment in case-control designs may exist, e.g. when eligibility and covariates are assessed at time zero but exposure assessment uses information from after time zero. For example, exposure after time zero may be influenced by time-dependent confounders that changed since baseline. ### Supplement 5: Bias due to post-baseline information for exposure assignment in a cohort study In the cohort study by Nishihara et al. the assessment of eligibility criteria (e.g. no prior cancer except for nonmelanoma skin cancer, no prior endoscopy) as well as the start of follow-up was in 1988 (baseline)1. As part of a questionnaire administered every 2 years, participants were then asked whether they had undergone either sigmoidoscopy or colonoscopy and, if so, the reason for the investigation and whether there was a diagnosis of colorectal polyps. This means that the assignment to exposure groups used information after the assessment of eligibility and the start of follow-up, and it was updated every two years, i.e. post-baseline information was used to determine exposure. The outcome was the incidence of colorectal cancer, which was compared between participants without a lower endoscopy (control group), participants with a polypectomy, participants with a negative sigmoidoscopy and participants with a negative colonoscopy. The mechanism described for the case-control study without alignment at time zero also applies to this design. In each two-year time interval CRCs detected in persons who had their first colonoscopy during this two-year time interval are—per definition—assigned to the unexposed group as they had no colonoscopy prior to CRC diagnosis. This overestimates the effectiveness of screening because CRCs are filtered to the unexposed group. As the majority of screen-detected CRCs are in the distal colon, this bias predominantly affects CRCs in the distal colon and thus leads to an artificial difference in the effectiveness of colonoscopy by site. ### Supplement 6: Post-colonoscopy CRC diagnoses For the cohort analysis with alignment at time zero, we quantified the occurrence of post-colonoscopy CRC diagnoses occurring in the screening arm and assessed their site distribution. CRC diagnoses with a screening colonoscopy in the same calendar quarter or in the 180 days before CRC diagnosis were considered screen-detected and were not counted as post-colonoscopy CRC. The frequencies and percentages are given in the below Table: View this table: [Table7](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T7) ### Supplement 7: Table of baseline covariates by study design and screening group View this table: [Table8](http://medrxiv.org/content/early/2024/11/28/2024.04.29.24306522/T8) ### Supplement 8: Pooled logistic regression In our prospective study designs, we used pooled logistic regression to estimate the cumulative incidence of colorectal cancer (see Robins et al. for details on this approach 2). We chose this analytical approach because it allowed us to compare the risk of CRC throughout follow-up and avoided (unjustified) proportional hazards assumptions. First, we estimated inverse probability of treatment weights (IPTW) to adjust for baseline confounding. For this, we fitted a logistic model with the exposure group as dependent variable and baseline covariates as explanatory variables. This model yielded an estimated probability of being assigned to the exposed group ![Graphic][1] which was used to estimate IPTW weights as ![Graphic][2] for the exposed group and ![Graphic][3] for the unexposed group, with *E* ∈ {0,1} being the exposure groups. Next, we fitted a pooled logistic regression model on the long-format data (i.e. one data row per (possibly non-unique) person per time point), which took the form ![Formula][4] with *t* being follow-up time and *e* being the observed exposure value. The above model was fitted on the weighted data and predicted probabilities were extracted per exposure group and time point to estimate cumulative incidence functions. ## Acknowledgements We thank Anja Gabbert and Inga Schaffer for the programming of analysis datasets. ## Footnotes * This revised version of the manuscript incorporates reviewer comments from the first round of blinded peer-review. * Received April 29, 2024. * Revision received November 27, 2024. * Accepted November 28, 2024. * © 2024, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. 1.Bretthauer M, Loberg M, Wieszczy P, et al. Effect of colonoscopy screening on risks of colorectal cancer and related death. N Engl J Med. Oct 27 2022;387(17):1547–1556. doi:10.1056/NEJMoa2208375 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa2208375&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=36214590&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 2. 2.Guo F, Chen C, Holleczek B, Schottker B, Hoffmeister M, Brenner H. Strong reduction of colorectal cancer incidence and mortality after screening colonoscopy: prospective cohort study from Germany. Am J Gastroenterol. May 1 2021;116(5):967–975. doi:10.14309/ajg.0000000000001146 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.14309/ajg.0000000000001146&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33929378&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 3. 3.Nishihara R, Wu K, Lochhead P, et al. Long-term colorectal-cancer incidence and mortality after lower endoscopy. N Engl J Med. Sep 19 2013;369(12):1095–105. doi:10.1056/NEJMoa1301969 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1301969&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24047059&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000324554800007&link_type=ISI) 4. 4.Brenner H, Stock C, Hoffmeister M. Effect of screening sigmoidoscopy and screening colonoscopy on colorectal cancer incidence and mortality: systematic review and meta-analysis of randomised controlled trials and observational studies. BMJ. Apr 9 2014;348:g2467. doi:10.1136/bmj.g2467 [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiYm1qIjtzOjU6InJlc2lkIjtzOjE3OiIzNDgvYXByMDlfMS9nMjQ2NyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDI0LzExLzI4LzIwMjQuMDQuMjkuMjQzMDY1MjIuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 5. 5.Braitmaier M, Schwarz S, Kollhorst B, Senore C, Didelez V, Haug U. Screening colonoscopy similarly prevented distal and proximal colorectal cancer: a prospective study among 55-69-year-olds. J Clin Epidemiol. Jun 6 2022;149:118–126. doi:10.1016/j.jclinepi.2022.05.024 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jclinepi.2022.05.024&link_type=DOI) 6. 6.Garcia-Albeniz X, Hsu J, Hernan MA. The value of explicitly emulating a target trial when using real world evidence: an application to colorectal cancer screening. Eur J Epidemiol. Jun 2017;32(6):495–500. doi:10.1007/s10654-017-0287-2 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=doi:10.1007/s10654-017-0287-2&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28748498&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 7. 7.Wakabayashi R, Hirano T, Laurent T, Kuwatsuru Y, Kuwatsuru R. Impact of “time zero” of Follow-Up Settings in a Comparative Effectiveness Study Using Real-World Data with a Non-user Comparator: Comparison of Six Different Settings. Drugs Real World Outcomes. Mar 2023;10(1):107–117. doi:10.1007/s40801-022-00343-1 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s40801-022-00343-1&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=36441486&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 8. 8.Hernan MA, Sauer BC, Hernandez-Diaz S, Platt R, Shrier I. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol. Nov 2016;79:70–75. doi:10.1016/j.jclinepi.2016.04.014 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=doi:10.1016/j.jclinepi.2016.04.014&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27237061&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 9. 9.1. Sturkenboom M, 2. Schink T Haug U, Schink T. German Pharmacoepidemiological Research Database (GePaRD). In: Sturkenboom M, Schink T, eds. Databases for pharmacoepidemiolpogical research. Springer; 2021:119–124. 10. 10.Altenhofen L. *Projekt Wissenschaftliche Begleitung von Früherkennungs-Koloskopien in Deutschland, Berichtszeitraum 2014-12. Jahresbericht, Version 2*. Zentralinstitut für kassenärztliche Versorgung in Deutschland; 2016. 11. 11.Schwarz S, Hornschuch M, Pox C, Haug U. Colorectal cancer after screening colonoscopy: 10-year incidence by site and detection rate at first repeat colonoscopy. Clin Transl Gastroenterol. Jan 1 2023;14(1):e00535. doi:10.14309/ctg.0000000000000535 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.14309/ctg.0000000000000535&link_type=DOI) 12. 12.Baxter NN, Goldwasser MA, Paszat LF, Saskin R, Urbach DR, Rabeneck L. Association of colonoscopy and death from colorectal cancer. Ann Intern Med. Jan 6 2009;150(1):1–8. doi:10.7326/0003-4819-150-1-200901060-00306 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/0003-4819-150-1-200901060-00306&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19075198&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000262655100001&link_type=ISI) 13. 13.Mulder SA, van Soest EM, Dieleman JP, et al. Exposure to colorectal examinations before a colorectal cancer diagnosis: a case-control study. Eur J Gastroenterol Hepatol. Apr 2010;22(4):437–43. doi:10.1097/MEG.0b013e328333fc6a [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MEG.0b013e328333fc6a&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19952765&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 14. 14.Doubeni CA, Weinmann S, Adams K, et al. Screening colonoscopy and risk for incident late-stage colorectal cancer diagnosis in average-risk adults: a nested case-control study. Ann Intern Med. Mar 5 2013;158(5 Pt 1):312–20. doi:10.7326/0003-4819-158-5-201303050-00003 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/0003-4819-158-5-201303050-00003&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23460054&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000316057700002&link_type=ISI) 15. 15.Brenner H, Chang-Claude J, Seiler CM, Rickert A, Hoffmeister M. Protection from colorectal cancer after colonoscopy: a population-based, case-control study. Ann Intern Med. Jan 4 2011;154(1):22–30. doi:10.7326/0003-4819-154-1-201101040-00004 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/0003-4819-154-1-201101040-00004&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21200035&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000285830900003&link_type=ISI) 16. 16.Kahi CJ, Pohl H, Myers LJ, Mobarek D, Robertson DJ, Imperiale TF. Colonoscopy and colorectal cancer mortality in the veterans affairs health care system: a case-control study. Ann Intern Med. Apr 3 2018;168(7):481–488. doi:10.7326/M17-0723 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/M17-0723&link_type=DOI) 17. 17.Baxter NN, Warren JL, Barrett MJ, Stukel TA, Doria-Rose VP. Association between colonoscopy and colorectal cancer mortality in a US cohort according to site of cancer and colonoscopist specialty. J Clin Oncol. Jul 20 2012;30(21):2664–9. doi:10.1200/JCO.2011.40.4772 [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamNvIjtzOjU6InJlc2lkIjtzOjEwOiIzMC8yMS8yNjY0IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMTEvMjgvMjAyNC4wNC4yOS4yNDMwNjUyMi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 18. 18.Dickerman BA, Garcia-Albeniz X, Logan RW, Denaxas S, Hernan MA. Emulating a target trial in case-control designs: an application to statins and colorectal cancer. Int J Epidemiol. Oct 1 2020;49(5):1637–1646. doi:10.1093/ije/dyaa144 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyaa144&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32989456&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 19. 19.Young JG, Stensrud MJ, Tchetgen Tchetgen EJ, Hernan MA. A causal framework for classical statistical estimands in failure-time settings with competing events. Stat Med. Apr 15 2020;39(8):1199–1236. doi:10.1002/sim.8471 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.8471&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 20. 20.Abadie A, Imbens G. On the failure of the bootstrap for matching estimators. Econometrica. 2008;76(6):1537–1557. doi:10.3982/ECTA6474 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3982/ECTA6474&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000261055400010&link_type=ISI) 21. 21.Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. May 2003;14(3):300–6. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/00001648-200305000-00009&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=12859030&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000183393100009&link_type=ISI) 22. 22.Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology. Sep 2004;15(5):615–25. doi:10.1097/01.ede.0000135174.63482.43 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/01.ede.0000135174.63482.43&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15308962&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000223382600020&link_type=ISI) 23. 23.Meza R, Jeon J, Renehan AG, Luebeck EG. Colorectal cancer incidence trends in the United States and United kingdom: evidence of right- to left-sided biological gradients with implications for screening. Cancer Res. Jul 1 2010;70(13):5419–29. doi:10.1158/0008-5472.CAN-09-4417 [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiY2FucmVzIjtzOjU6InJlc2lkIjtzOjEwOiI3MC8xMy81NDE5IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMTEvMjgvMjAyNC4wNC4yOS4yNDMwNjUyMi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 24. 24.Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. Sep 2000;11(5):550–60. doi:10.1097/00001648-200009000-00011 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/00001648-200009000-00011&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10955408&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000088854500011&link_type=ISI) 25. 25.Hernan MA. The hazards of hazard ratios. Epidemiology. Jan 2010;21(1):13–5. doi:10.1097/EDE.0b013e3181c1ea43 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/EDE.0b013e3181c1ea43&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20010207&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000272872900005&link_type=ISI) 26. 26.Martinussen T. Causality and the Cox regression model. Annual Review of Statistics and Its Application. 2022;9(1):249–259. 27. 27.Song M, Bretthauer M. Interpreting epidemiologic studies of colonoscopy screening for colorectal cancer prevention: understanding the mechanisms of action is key. Eur J Epidemiol. Sep 2023;38(9):929–931. doi:10.1007/s10654-023-01043-y [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s10654-023-01043-y&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=37667139&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 28. 28.Brenner H, Heisser T, Cardoso R, Hoffmeister M. When gold standards are not so golden: prevalence bias in randomized trials on endoscopic colorectal cancer screening. Eur J Epidemiol. Sep 2023;38(9):933–937. doi:10.1007/s10654-023-01031-2 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s10654-023-01031-2&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=37530938&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) 29. 29.Rasouli B, Chubak J, Floyd JS, et al. Combining high quality data with rigorous methods: emulation of a target trial using electronic health records and a nested case-control design. BMJ. Dec 28 2023;383:e072346. doi:10.1136/bmj-2022-072346 [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiYm1qIjtzOjU6InJlc2lkIjtzOjE5OiIzODMvZGVjMjhfMi9lMDcyMzQ2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMTEvMjgvMjAyNC4wNC4yOS4yNDMwNjUyMi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 30. 30.Crockett SD, Nagtegaal ID. Terminology, molecular features, epidemiology, and management of serrated colorectal neoplasia. Gastroenterology. Oct 2019;157(4):949–966 e4. doi:10.1053/j.gastro.2019.06.041 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1053/j.gastro.2019.06.041&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=31323292&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) ## References 1. 1.Nishihara R, Wu K, Lochhead P, et al. Long-term colorectal-cancer incidence and mortality after lower endoscopy. N Engl J Med. Sep 19 2013;369(12):1095–105. doi:10.1056/NEJMoa1301969 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1301969&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24047059&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000324554800007&link_type=ISI) 2. 2.Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. Sep 2000;11(5):550–60. doi:10.1097/00001648-200009000-00011 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/00001648-200009000-00011&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10955408&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F11%2F28%2F2024.04.29.24306522.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000088854500011&link_type=ISI) [1]: /embed/inline-graphic-1.gif [2]: /embed/inline-graphic-2.gif [3]: /embed/inline-graphic-3.gif [4]: /embed/graphic-15.gif