Potential Biases in Test-Negative Design Studies of COVID-19 Vaccine Effectiveness Arising from the Inclusion of Asymptomatic Individuals
=========================================================================================================================================

* Edgar Ortiz-Brizuela
* Mabel Carabali
* Cong Jiang
* Joanna Merckx
* Denis Talbot
* Mireille E. Schnitzer

## Abstract

The test-negative design (TND) is a popular method for evaluating vaccine effectiveness (VE). A “classical” TND study includes symptomatic individuals tested for the disease targeted by the vaccine to estimate VE against symptomatic infection. However, recent applications of the TND have attempted to estimate VE against infection by including all tested individuals, regardless of their symptoms. In this article, we use directed acyclic graphs and simulations to investigate potential biases in TND studies of COVID-19 VE arising from the use of this “alternative” approach, particularly when applied during periods of widespread testing. We show that the inclusion of asymptomatic individuals can potentially lead to collider stratification bias, uncontrolled confounding by health and healthcare-seeking behaviors (HSBs), and differential outcome misclassification. While our focus is on the COVID-19 setting, the issues discussed here may also be relevant in the context of other infectious diseases. This may be particularly true in scenarios where there is either a high baseline prevalence of infection, a strong correlation between HSBs and vaccination, different testing practices for vaccinated and unvaccinated individuals, or settings where both the vaccine under study attenuates symptoms of infection and diagnostic accuracy is modified by the presence of symptoms.

Key words * Test-negative design * COVID-19 * vaccine * effectiveness * asymptomatic * SARS-CoV-2 * simulation study * collider bias * confounding * misclassification ## Introduction The post-licensure evaluation of COVID-19 vaccine effectiveness (VE), defined as the effect of vaccination on the risk of an infection-related outcome,(1) has provided crucial insights into questions not addressed through randomized clinical trials.(2–4) Among the study designs used to estimate VE, the test-negative design (TND) has gained popularity, in part because of its rapid and seemingly straightforward implementation.(2,5,6) In a “classical” TND study, symptomatic individuals tested for the disease targeted by the vaccine are prospectively selected, for example, from surveillance centers or hospitals.(5–8) Then, VE against symptomatic infection is estimated by comparing the odds of vaccination between patients with positive and negative test results using logistic regression.(5,7,8) However, recent literature(9–20) has also applied the term TND to studies aiming to estimate VE against infection by including all tested individuals, regardless of their symptom status (hereafter referred to as “alternative” TND) — for a clearer distinction between classical and alternative TND, see **Table 1**. Nonetheless, this approach may introduce additional threats to validity and warrants a more comprehensive evaluation.(21) View this table: [Table 1.](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T1) Table 1. Relevant definitions used in this article. In this article, we investigate potential biases in TND studies of COVID-19 VE arising from the use of this “alternative” approach, particularly when applied during periods of widespread testing. 
We begin by discussing the identifiability of two causal target parameters: 1) the risk ratio (RR) for medically attended and laboratory-confirmed symptomatic COVID-19 relative to vaccination status, referred to as *RRCOVID* (the target parameter for the classical TND), and 2) the RR for SARS-CoV-2 infection relative to vaccination status, referred to as *RRinfect* (the target parameter for the alternative TND) — note that throughout this manuscript we use the expression “symptomatic COVID-19” to emphasize the difference between symptomatic and asymptomatic infections. We conclude with a simulation study that aims to estimate and compare the magnitude of the bias in the odds ratio (OR) estimates for medically attended and laboratory-confirmed symptomatic COVID-19 (*ORCOVID*) and the OR for SARS-CoV-2 infection (*ORinfect*), relative to their respective target parameters (*RRCOVID* and *RRinfect*), under scenarios where a classical TND is considered valid. ## Identifiability of target parameters In this section, we use causal directed acyclic graphs (DAGs) to examine the identifiability of the target parameters (*RRCOVID* and *RRinfect*).(22) By “identifiability,” we refer to the ability to express a counterfactual quantity as a function of the distribution of observed data, implying the absence of systematic biases.(23,24) However, before delving into the specifics of target parameter identifiability, and to facilitate a better understanding, we introduce the reader to a DAG based on previous work(7,8,21,25,26) that illustrates the assumed causal relationships between relevant variables in TND studies of COVID-19 VE, whether classical or alternative (**Table 2**). View this table: [Table 2.](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T2) Table 2. Directed acyclic graph for test-negative design studies of COVID-19 vaccine effectiveness (either classical or alternative). 

### The case of the classical TND

In this context, if the test-negative state (having some illness that presents with COVID-19-like symptoms, such as an infection not targeted by the vaccine (*I* = 1), and a negative test) and the vaccination status are independent conditional on covariates, the case-status OR (*ORCOVID*) relative to vaccination status, derived from a correctly specified multivariable logistic regression model, provides an unbiased estimate of its target parameter: the conditional RR for medically attended and laboratory-confirmed symptomatic COVID-19 (*I* = 2, *S* = 1, *T* = 1) relative to vaccination status (*RRCOVID*).(8,35) Mathematical definitions for *ORCOVID* and *RRCOVID* are provided in **Table 3**. In hospital-based TND studies, *RRCOVID* represents the risk ratio for medically attended infections leading to hospitalization (where *T* stands for “hospitalized and tested due to symptoms”).(8) On the other hand, in outpatient-based TND studies, *RRCOVID* represents the risk ratio for medically attended infections in outpatient settings (where *T* stands for “accessed care and tested”).

View this table: [Table 3.](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T3) Table 3. Mathematical definitions for *RRCOVID*, *ORCOVID*, *RRinfect*, and *ORinfect*.

Importantly, unlike some traditional case-control studies, the TND does not require the rare disease assumption for the OR to approximate the RR.(8) It is also noteworthy that the *ORCOVID* estimate is often re-expressed as VE using the estimator $VE = (1 - OR) \times 100\%$.(5) However, to avoid ambiguity, throughout this manuscript and unless otherwise specified, the term VE will continue to refer to any causal effect of vaccination on the risk of an infection-related outcome, regardless of the outcome assessed.(1)

In addition, the estimands mentioned above can be given a causal interpretation if the identifiability assumptions are also met (**Appendix S2**).(8,36,37) One such assumption, known as conditional exchangeability, implies that all common causes of vaccination status (*V*) and outcome (medically attended and laboratory-confirmed symptomatic SARS-CoV-2 infection [*I* = 2, *S* = 1, *T* = 1]) have been measured and appropriately accounted for in the analysis.(8,36–38) However, among these common causes, an individual’s health and healthcare-seeking behaviors (HSBs) are inherently unmeasured.(21) In a best-case scenario, if HSBs were deterministic (meaning that individuals who exhibit these behaviors would always seek medical care when ill, while individuals who do not exhibit HSBs would not), a classical TND study could mitigate bias related to this variable by restricting the study population to individuals who have sought medical attention and are therefore assumed to exhibit HSBs (*H* = 1).(5,7) Although this scenario seems unlikely in real-world settings, where these behaviors may only modify the probability of seeking medical care, classical TND studies could still offer better control for confounding by HSBs than other observational study designs such as cohorts or traditional case-control studies.(7,8)

To illustrate this point, **Figure 1(a)** shows a DAG that represents an ideal scenario for a classical TND study evaluating VE against medically attended and laboratory-confirmed symptomatic COVID-19. By “ideal”, we mean that the study is conducted in a population where HSBs are deterministic and only perfect tests are used (with 100% sensitivity and specificity).
In this study, the effect of interest is represented by the path ![Graphic][2]. The square shapes of the nodes *S*, *T*, and *H* indicate that the study sample is restricted to symptomatic and tested individuals who sought medical care and therefore exhibit HSBs (*S* = 1, *T* = 1, *H* = 1). Conditioning on HSBs is critical because it not only allows control of confounding through the path ![Graphic][3] but also blocks other biasing paths (e.g., ![Graphic][4]) that were opened by conditioning on colliders such as testing (*T*). In other words, restricting the study sample in a classical TND study to symptomatic and tested individuals is essential to prevent (or minimize) confounding and collider-stratification bias by HSBs,(7) and thus, to identify the *RRCOVID*. Figure 1: Causal-directed acyclic graphs for test-negative-design (TND) studies of COVID-19 vaccine effectiveness: a) the classical TND setting, b) the alternative TND setting, c) potential for differential misclassification of outcome status in alternative TND studies. **Abbreviations**: *C*: Confounders; *H*: health and healthcare-seeking behaviors; *I*: Infection status; *Ix*: Measured infection status; *S*: COVID-19-like symptoms; *T*: SARS-CoV-2 diagnostic tests; *V*: COVID-19 vaccination status. **Notes**: The square nodes indicate that the variable is controlled by either the study design or the analysis, while unconditioned nodes are represented as circles. ### The case of the alternative TND #### Potential for collider stratification bias and uncontrolled confounding by HSBs Having studied the classical TND, we now turn to the challenges presented by the alternative approach. In this scenario, since the study sample includes all tested individuals, regardless of symptoms, the target parameter is no longer the *RRCOVID*. Instead, it is the conditional RR for SARS-CoV-2 infection relative to vaccination status (*RRinfect*), represented by the path ![Graphic][5]. 
Mathematical definitions for *ORinfect* and *RRinfect* are given in **Table 3**. To assess the identifiability of this target parameter, it is crucial to identify the specific drivers of testing leading to selection. Importantly, these drivers may have varied across settings, depending on the stage of the pandemic and the prevailing indications for testing.(39,40) For example, in a prospective, nonprobability-based, cross-sectional online survey of testing practices conducted between August 23, 2021, and March 12, 2022, among 418,279 U.S. adults aged ≥18 years, the most common reasons for testing, other than having symptoms, were exposure to COVID-19 (23.1%), a prerequisite for travel (20.6%), and requirement for work or school (16.1%).(39) Given that some of these reasons were mandatory at various points in time, we can no longer assume that the study sample is limited to healthcare seekers,(21) and that the TND effectively controls for bias related to HSBs. This may also be relevant to some hospital-based TND studies, as some institutions implemented policies for universal screening on admission.(41,42) We illustrate this point in the DAG shown in **Figure 1(b)**, which represents an alternative TND study of VE against SARS-CoV-2 infection. In this DAG, the circular shapes of the nodes HSBs (H) and COVID-19-like symptoms (S) indicate that we are no longer conditioning on these variables. As a result, several biasing paths are opened, leading to uncontrolled confounding by HSBs (![Graphic][6]; path 1 in **Figure S1**) and collider stratification bias (![Graphic][7] (path 2); ![Graphic][8] (path 3); ![Graphic][9] (path 4); ![Graphic][10] (path 5); and ![Graphic][11] (path 6)). In other words, the *RRinfect* is not identifiable in the alternative TND setting. 
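
To make the role of conditioning on *H* concrete, the following is a minimal sketch that uses exact cell probabilities rather than simulated draws. It is a deliberately reduced toy with only *H*, *V*, *I*, and *T*, and it is not the data-generating model used in this article: every probability, constant, and function name below is an illustrative assumption. Because testing here depends only on *H*, the sketch isolates the confounding path through HSBs and does not attempt to reproduce the collider paths; the direction and size of the bias it shows follow from the made-up numbers.

```python
from itertools import product

# All values are made-up illustration inputs, not parameters from this article.
P_H = {1: 0.5, 0: 0.5}            # P(H = h): prevalence of HSBs
P_V_GIVEN_H = {1: 0.8, 0: 0.3}    # P(V = 1 | H = h): HSBs raise vaccination
P_T_GIVEN_H = {1: 0.6, 0: 0.2}    # P(T = 1 | H = h): HSBs raise testing
BASE_RISK = 0.2                   # infection risk when V = 0 and H = 0
RR_V = 0.5                        # assumed true risk ratio for vaccination
RR_H = 0.5                        # assumed protective effect of HSBs


def infection_risk(v, h):
    """Multiplicative risk model, so the true RR for V is RR_V in every stratum."""
    return BASE_RISK * (RR_V ** v) * (RR_H ** h)


def odds_ratio_among_tested(h_values):
    """Vaccination-infection OR among tested people with H in h_values."""
    # cell[(v, i)] accumulates P(V = v, I = i, T = 1, H in h_values)
    cell = {(v, i): 0.0 for v, i in product((0, 1), repeat=2)}
    for h in h_values:
        for v in (0, 1):
            p_v = P_V_GIVEN_H[h] if v == 1 else 1 - P_V_GIVEN_H[h]
            weight = P_H[h] * p_v * P_T_GIVEN_H[h]
            risk = infection_risk(v, h)
            cell[(v, 1)] += weight * risk
            cell[(v, 0)] += weight * (1 - risk)
    return (cell[(1, 1)] / cell[(1, 0)]) / (cell[(0, 1)] / cell[(0, 0)])


# All tested individuals, H left uncontrolled (analogous to the alternative TND).
crude_or = odds_ratio_among_tested((0, 1))
# Only individuals with H = 1 (analogous to restricting to healthcare seekers).
restricted_or = odds_ratio_among_tested((1,))

print(f"true RR: {RR_V:.3f}")
print(f"OR, all tested: {crude_or:.3f}")
print(f"OR, restricted to H = 1: {restricted_or:.3f}")
```

With these inputs the OR among all tested individuals is about 0.32, whereas restricting to *H* = 1 gives about 0.47, much closer to the assumed RR of 0.5. The remaining gap in the restricted analysis reflects the OR not equaling the RR when infection is common in this toy, not residual confounding.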

#### Potential for differential outcome misclassification

Another perceived benefit of the TND is its potential to reduce outcome misclassification compared to traditional cohort or case-control studies.(7,8) This is primarily attributed to the restriction of the study sample to individuals with confirmed infection status.(7,8) However, the inclusion of asymptomatic individuals in alternative TND studies could potentially compromise this apparent benefit and instead lead to differential misclassification of the outcome relative to the exposure status. The main reason for this is that while SARS-CoV-2 diagnostic tests have excellent specificity for diagnosing acute infection, their sensitivity is likely modified by the presence of symptoms.(40) For example, two studies comparing the diagnostic accuracy of SARS-CoV-2 tests reported antigen test sensitivities of 72.0% (95% confidence interval [95% CI], 63.7% to 79.0%) in symptomatic individuals and 58.1% (95% CI, 40.2% to 74.1%) in asymptomatic individuals,(43) and nucleic acid amplification test (NAAT) sensitivities of 97.1% (95% CI, 96.7% to 97.3%) in symptomatic individuals and 87.6% (95% CI, 85.2% to 89.6%) in those without symptoms.(44)

This concept is illustrated in the DAG shown in **Figure 1(c)**. This DAG introduces a new node for the measured infection status (*Ix*), which may differ from its true value (*I*). There is differential misclassification of the outcome (*I*) relative to the exposure status (*V*) because, as described in **Table 2**, COVID-19 vaccines (*V*) may affect the presence of symptoms (*S*),(31–33) which in turn modify the sensitivity of SARS-CoV-2 tests (represented by the path ![Graphic][12]).
This situation may be further complicated when considering that the selection of a specific test and its respective diagnostic performance, may depend on a number of additional factors.(40) These include the intended use of the test (i.e., screening versus confirmation, with molecular tests preferred for confirming diagnoses of acute infection in symptomatic individuals(30,40)), the time of infection onset,(45) the stage of the pandemic(40), and the target population (e.g., travelers vs. frontline workers(40)). As a result, it is difficult to accurately predict the magnitude and direction of this bias without a comprehensive understanding of the causal structure underlying a given study.(46) ## Simulation study ### Methods #### Overview We conducted a simulation study to estimate and compare the magnitude of the bias in *ORCOVID* and *ORinfect* estimates — relative to their respective target parameters (*RRCOVID* and *RRinfect*) — under scenarios where a classical TND is considered valid. For simplicity, all simulations assume causal consistency,(38) no interference,(38) positivity,(38) and no unmeasured confounding other than that related to HSBs. To assess the bias pathways, we divided the simulation study into two parts. First, we evaluate the potential for collider bias and uncontrolled confounding by HSBs, assuming perfect diagnostic tests. Second, we incorporate the potential for differential outcome misclassification by simulating an extreme scenario where symptomatic individuals are tested exclusively with NAATs, and asymptomatic individuals are tested exclusively with antigen tests. Variables other than SARS-CoV-2 infection are assumed to be perfectly measured in all simulations. 
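
The extreme testing scenario described above can be sketched with simple arithmetic. The example below takes the two sensitivity point estimates quoted earlier (NAATs in symptomatic individuals, antigen tests in asymptomatic individuals) and assumes perfect specificity; the symptom probabilities, infection risks, and all names are hypothetical inputs chosen only to show how a vaccine effect on symptoms translates into different effective sensitivities by vaccination status. It is not the simulation code used in this study.

```python
# Point estimates quoted in the text: NAATs in symptomatic individuals (44)
# and antigen tests in asymptomatic individuals (43).
SENS_NAAT_SYMPTOMATIC = 0.971
SENS_ANTIGEN_ASYMPTOMATIC = 0.581

# Hypothetical inputs (assumptions, not values from this article):
# the vaccine is assumed to lower the chance that an infection is symptomatic.
P_SYMPTOMS_IF_INFECTED = {"vaccinated": 0.4, "unvaccinated": 0.7}
TRUE_INFECTION_RISK = {"vaccinated": 0.10, "unvaccinated": 0.20}


def effective_sensitivity(p_symptomatic):
    """Average sensitivity when symptomatic cases get NAATs and the rest antigen tests."""
    return (p_symptomatic * SENS_NAAT_SYMPTOMATIC
            + (1 - p_symptomatic) * SENS_ANTIGEN_ASYMPTOMATIC)


def odds(p):
    return p / (1 - p)


def measured_positive_fraction(group):
    """Fraction testing positive; perfect specificity means no false positives."""
    sens = effective_sensitivity(P_SYMPTOMS_IF_INFECTED[group])
    return TRUE_INFECTION_RISK[group] * sens


true_or = (odds(TRUE_INFECTION_RISK["vaccinated"])
           / odds(TRUE_INFECTION_RISK["unvaccinated"]))
measured_or = (odds(measured_positive_fraction("vaccinated"))
               / odds(measured_positive_fraction("unvaccinated")))

for group in ("vaccinated", "unvaccinated"):
    sens = effective_sensitivity(P_SYMPTOMS_IF_INFECTED[group])
    print(f"{group}: effective sensitivity = {sens:.3f}")
print(f"OR with true infection status: {true_or:.3f}")
print(f"OR with measured infection status: {measured_or:.3f}")
```

Under these assumed inputs, infections in vaccinated individuals are detected less often than those in unvaccinated individuals (effective sensitivity of about 0.74 versus 0.85), so the measured OR moves away from the OR based on true infection status. With other inputs, the shift could be smaller or point in a different direction.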
#### Data generation process To generate realistic data consistent with the DAG shown in **Table 2**, we generated 1,000 datasets (each with 1,000,000 observations) that simulated a cumulative prevalence of SARS-CoV-2 infection of ∼12.2%,(47) including an asymptomatic infection prevalence of ∼35%,(34) and a cumulative prevalence of fully vaccinated individuals of ∼54.6%.(48,49) In addition, we simulated the proportion of individuals who reported using any SARS-CoV-2 diagnostic test (either NAATs or antigen tests) within 30 days in the aforementioned survey (∼27.7%).(39) Each node in the DAG was generated conditionally on its parent nodes using the models shown in **Table 4**, with parameters for these models selected based on the available literature whenever possible (**Table S1**). Importantly, although the simulations were designed to represent an outpatient setting, similar results can be expected for hospital-based TND studies, as the data generating structure may be similar for both scenarios (**Table 2**). View this table: [Table 4.](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T4) Table 4. Models used in the simulation study to generate data consistent with the directed acyclic graph shown in Table 2. #### Data analysis After generating the data, we first created classical and alternative TND samples from each simulated dataset by selecting either symptomatic and tested individuals or all tested individuals, respectively. Second, we estimated the *ORCOVID* (in the classical samples) and the *ORinfect* (in the alternative samples) using logistic regression models conditional on the set of measured confounders (c). For these models, we used the true SARS-CoV-2 infection status as the outcome variable when evaluating the potential for collider bias and uncontrolled confounding by HSBs, and the measured SARS-CoV-2 infection status, when incorporating the potential for differential outcome misclassification. 
Third, to estimate the true value of the target parameters (*RRCOVID* and *RRinfect*), we generated two counterfactual populations, each consisting of 100,000,000 individuals, representing scenarios in which everyone was either vaccinated or unvaccinated. Finally, we estimated the bias by subtracting the true values of the target parameters from the exponentiated mean of the log(OR) estimates obtained from the 1,000 simulated datasets (denoted as $\widehat{OR}_{COVID}$ or $\widehat{OR}_{infect}$). Additionally, we report the Monte Carlo standard error (MCSE, the standard deviation of the estimated log(ORs)) and the average standard error (aSE, the average of the standard error estimates of the log(ORs)).

To investigate the potential for larger biases, we iterated over the processes described above, each time varying the assumed strength of the relationships between the simulated variables. Specifically, we varied the magnitude of the association between *H* and *V* (parameter *β*2 in **Table 4**), between *H* and both *I* = 1 and *I* = 2 (parameters *γ*1,2 and *γ*2,2, respectively), as well as the relationships of *I* = 1 and *I* = 2 with *S* (*δ*4 and *δ*5, respectively). We also varied the coefficient for the interaction term of *I* = 2 and *V* with *S* (*δ*6), and the strengths of the relationships between *H* and *T* (*θ*2), between *V* and *T* (*θ*3), and between *S* and *T* (*θ*4). We adjusted these parameters according to the direction of the association between the independent and dependent variables. Specifically, for parameters representing a positive association (i.e., OR > 1), the strength was increased by units of 1 from 1.5 to 10.5. Conversely, for parameters representing a negative association (i.e., OR < 1), the strength was decreased from 0.95 to 0.05 by units of 0.1. We also varied the intercepts for *T* (*θ*0) and *I* = 2 (*γ*2,0), aiming for a baseline prevalence of testing from 0.25 to 0.7 and for a baseline prevalence of acute SARS-CoV-2 infection from 0.05 to 0.5.
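
The bias summaries defined above (exponentiated mean log(OR) minus the true RR, MCSE, and aSE) can be written compactly. The sketch below is one possible reading of those definitions, not the authors' R code: the function name, the three toy estimates, and the true RR of 0.15 are hypothetical, and the use of the sample (n − 1) standard deviation for the MCSE is an assumption.

```python
import math
import statistics


def summarize_bias(log_ors, std_errors, true_rr):
    """Summarize per-dataset log(OR) estimates against a known true RR."""
    pooled_or = math.exp(statistics.fmean(log_ors))  # exponentiated mean log(OR)
    return {
        "pooled_or": pooled_or,
        "bias": pooled_or - true_rr,            # estimate minus target parameter
        "mcse": statistics.stdev(log_ors),      # SD of log(OR)s (n - 1 denominator assumed)
        "ase": statistics.fmean(std_errors),    # average of the model-based SEs
    }


# Hypothetical estimates from three simulated datasets (illustration only).
toy_log_ors = [math.log(0.12), math.log(0.125), math.log(0.13)]
toy_std_errors = [0.014, 0.015, 0.013]

summary = summarize_bias(toy_log_ors, toy_std_errors, true_rr=0.15)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Averaging on the log scale and then exponentiating yields the geometric mean of the OR estimates, which is why the bias is computed from `pooled_or` and not from the arithmetic mean of the ORs.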
Finally, we simultaneously varied the strength of two, and then three, of the most influential parameters (i.e., *β*2, *β*3, and the intercept for *I* = 2). It should be noted that any adjustment to these parameters could lead to changes in the proposed prevalences for all simulated variables. All analyses were performed using R (version 4.2.2, R Foundation for Statistical Computing, Vienna, Austria, 2022).(50)

### Results

When we assessed the potential for collider bias and uncontrolled confounding by HSBs, we found a “true value” for *RRCOVID* of 0.149 and an estimated $\widehat{OR}_{COVID}$ of 0.141 (MCSE and aSE = 0.019; bias = –0.008). On the other hand, the “true value” for *RRinfect* was 0.142, with an estimated $\widehat{OR}_{infect}$ of 0.125 (MCSE and aSE = 0.014; bias = –0.016). The results after varying the strength of selected parameters for this scenario, either individually or simultaneously, are shown in **Figures 2**-**3** and **Tables S2**-**S3**. In every case, the classical TND outperformed the alternative in terms of bias, with the biases for $\widehat{OR}_{COVID}$ and $\widehat{OR}_{infect}$ ranging from –0.034 to –0.001 and from –0.211 to 0.728, respectively.

When we incorporated imperfect tests, we found estimates for $\widehat{OR}_{COVID}$ of 0.154 (MCSE = 0.019; aSE = 0.018; bias = 0.005) and $\widehat{OR}_{infect}$ of 0.146 (MCSE and aSE = 0.014; bias = 0.004), respectively. However, after varying the strength of selected parameters for this scenario, either individually or simultaneously, we found larger bias for $\widehat{OR}_{infect}$ (ranging from –0.188 to 0.766) than for $\widehat{OR}_{COVID}$ (ranging from –0.005 to 0.018) (**Figures 4-5** and **Tables S4-S5**).

Figure 2: Bias in odds ratio (OR) estimates for symptomatic COVID-19 (**ORCOVID**) and SARS-CoV-2 infection (**ORinfect**) relative to their respective target parameters (**RRCOVID** and **RRinfect**) after varying the strength of selected parameters (assuming perfect SARS-CoV-2 diagnostic tests).
**Abbreviations**: *C*: Confounders; *H*: health and healthcare-seeking behaviors; *I*: Infection status; *I* = 2: SARS-CoV-2 infection status; OR, odds ratio; RR, risk ratio; *S*: COVID-19-like symptoms; *T*: SARS-CoV-2 diagnostic tests; TND, test-negative design; *V*: COVID-19 vaccination status. **Notes**: Bias of the OR = exp(mean(measured log[OR])) – (true RR).

Figure 3: Bias in odds ratio (OR) estimates for symptomatic COVID-19 (**ORCOVID**) and SARS-CoV-2 infection (**ORinfect**) relative to their respective target parameters (**RRCOVID** and **RRinfect**) after simultaneously varying the strength of two selected parameters (assuming perfect SARS-CoV-2 diagnostic tests). **Abbreviations**: *H*: health and healthcare-seeking behaviors; *I* = 2: SARS-CoV-2 infection status; NAATs, Nucleic Acid Amplification Tests; OR, odds ratio; RR, risk ratio; *T*: SARS-CoV-2 diagnostic tests; TND, test-negative design; *V*: COVID-19 vaccination status. **Notes**: Bias of the OR = exp(mean(measured log[OR])) – (true RR).

Figure 4: Bias in odds ratio (OR) estimates for symptomatic COVID-19 (**ORCOVID**) and SARS-CoV-2 infection (**ORinfect**) relative to their respective target parameters (**RRCOVID** and **RRinfect**) after varying the strength of selected parameters (assuming symptomatic individuals are tested exclusively with NAATs and asymptomatic individuals are tested exclusively with antigen tests). **Abbreviations**: *C*: Confounders; *H*: health and healthcare-seeking behaviors; *I*: Infection status; *I* = 2: SARS-CoV-2 infection status; NAATs, Nucleic Acid Amplification Tests; OR, odds ratio; RR, risk ratio; *S*: COVID-19-like symptoms; *T*: SARS-CoV-2 diagnostic tests; TND, test-negative design; *V*: COVID-19 vaccination status. **Notes**: Bias of the OR = exp(mean(measured log[OR])) – (true RR).

Figure 5: Bias in odds ratio (OR) estimates for symptomatic COVID-19 (**ORCOVID**) and SARS-CoV-2 infection (**ORinfect**) relative to their respective target parameters (**RRCOVID** and **RRinfect**) after simultaneously varying the strength of two selected parameters (assuming symptomatic individuals are tested exclusively with NAATs and asymptomatic individuals are tested exclusively with antigen tests). **Abbreviations**: *H*: health and healthcare-seeking behaviors; *I* = 2: SARS-CoV-2 infection status; NAATs, Nucleic Acid Amplification Tests; OR, odds ratio; RR, risk ratio; *T*: SARS-CoV-2 diagnostic tests; TND, test-negative design; *V*: COVID-19 vaccination status. **Notes**: Bias of the OR = exp(mean(measured log[OR])) – (true RR).

## Discussion

The TND has become a preferred method for assessing VE against pathogens such as influenza, largely due to its seamless integration with existing laboratory-based surveillance systems.(6) More recently, this observational study design has been widely used to estimate VE against COVID-19.(21) This widespread application was made possible primarily by the extensive use of SARS-CoV-2 testing during the pandemic, which generated large amounts of potential data for TND studies.(21) However, the use of these data for VE estimation poses unique validity challenges.(21)

Originally, the TND was proposed as an efficient way to identify a control group (test-negative controls) for individuals who contracted the vaccine-targeted infection (test-positive cases) while also accounting for confounding by HSBs.(5,7,8) However, to effectively control for such bias in TND studies, researchers must assume that individuals seeking care and undergoing testing have comparable HSBs.(5,7,8) In the pre-COVID-19 era, testing for diseases such as influenza was primarily driven by clinical indications and focused on those who actively sought care for symptoms.(51) This arguably made it easier for researchers to assume that individuals enrolled
in TND studies had comparable HSBs, provided the same clinical definition was used for cases and controls. However, this paradigm shifted with COVID-19, when testing criteria were expanded to identify asymptomatic infections, often on a “mandatory” basis, in order to break transmission chains.(40) In this article, we have shown that TND applications that ignore this fact and include individuals regardless of their symptoms (i.e., using the alternative TND) can potentially introduce collider stratification bias, uncontrolled confounding by HSBs, and differential misclassification of the outcome relative to the exposure status.

Our simulation study, based on educated guesses for parameter values and representing a scenario in which the classical TND would yield unbiased estimates for its target parameter, showed only minor bias in the alternative TND setting. However, we also showed that the inclusion of asymptomatic individuals can lead to substantial bias in some scenarios. For example, our simulations showed that as the effect of HSBs on vaccination increases (OR > 1), the estimated ![Graphic][23] can become progressively closer to the null relative to its target parameter, *RRinfect*, leading to further underestimation of VE when expressed as (1 – OR) × 100. Similarly, in settings where vaccination status is more negatively correlated with testing (OR < 1), the ![Graphic][24] would also be progressively biased towards the null. In addition, we observed that a higher baseline prevalence of SARS-CoV-2 infection would lead to a greater bias away from the null of ![Graphic][25], resulting in an overestimation of VE. We also found that these biases may be exacerbated by the introduction of differential outcome misclassification when different diagnostic approaches are used for symptomatic and asymptomatic individuals.
Of note, the TND is well known to be particularly susceptible to misclassification bias compared to cohort and traditional case-control studies.(52) It is also important to mention that if one were to condition on the reason for testing or restrict the study sample to asymptomatic individuals, we would expect to observe the same bias described here, as all biasing pathways would remain open. It could be argued that the biases discussed here may be only relevant to retrospective TND studies that rely on routinely collected health data (i.e., data collected without predetermined research questions(53)). For example, some TND studies using administrative data sources may have included asymptomatic individuals simply because of a lack of information on reasons for testing, as acknowledged by some authors.(14,15) However, it should be noted that some retrospective TND studies of COVID-19 VE intentionally included asymptomatic individuals to estimate VE against infection, even when data on symptoms were available.(10–12,16,17) In fact, some prospective TND studies (i.e., those collecting primary research data) conducted during the COVID-19 pandemic also included individuals regardless of symptoms.(18–20) That said, the potential for inclusion of asymptomatic individuals may be influenced not only by the data source, but also by the testing practices in the specific context or time frame in which the study is conducted. The changing landscape of COVID-19 testing practices throughout the pandemic further highlights the need for proper attention to context in TND studies. 
At the onset of the COVID-19 pandemic, testing was primarily NAAT-based and focused on symptomatic individuals.(40) However, as the pandemic progressed, testing guidelines were modified to support mass testing of asymptomatic individuals as a tool for pandemic containment.(40) Any TND study conducted under these circumstances, whether prospective or retrospective, would be at risk of the bias described here if a clinical definition was not included in the study’s eligibility criteria. Thus, we believe that the issues discussed here are relevant not only to TND studies using routinely collected health and administrative data, but also to all studies conducted in settings where the rationale for testing extends beyond purely clinical reasons. **Table 5** provides a list of suggested strategies to minimize bias in TND studies conducted in such scenarios. View this table: [Table 5.](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T5) Table 5. Suggested strategies for minimizing bias in TND studies conducted during infectious disease outbreaks. It is worth noting that there may be some specific circumstances in which the magnitude of the biases discussed here could be attenuated by other features of the study design, particularly those that allow for the selection of a population with comparable HSBs. For example, TND studies focusing on booster effectiveness may be less susceptible to HSBs-related bias because subjects in these studies typically have completed a primary immunization schedule and are therefore expected to have more similar HSBs.(21) Likewise, some might argue that hospital-based TND studies are likely to include individuals with more homogeneous HSBs because only those evaluated in clinical settings are eligible. However, as discussed elsewhere,(7) it may be unrealistic to assume that all hospitalized individuals have the same levels of HSBs. 
In addition, it should be noted that not all SARS-CoV-2 tests performed in hospitals under pandemic conditions were triggered by COVID-19-like symptoms; for example, some institutions mandated universal screening on admission.(41,42) Consequently, hospital-based TND studies that include all individuals tested regardless of the reason for testing,(13,55) would be expected to have biases similar to those discussed here. This study has numerous strengths but also carries limitations. Among the strengths, our simulation study was illustrated by DAGs, which helps to clarify the potential sources of bias. In addition, while this article focuses on the COVID-19 setting, the issues discussed here are also relevant to TND studies using the alternative approach in the context of other infectious diseases. This may be particularly true in scenarios of widespread testing, such as during epidemics or pandemics, where testing protocols and preferences are influenced by clinical and public health considerations, or in situations where self-testing for various infections is available. In terms of limitations, the strength of the relationships depicted in the DAG may vary. For example, the path between vaccination and testing (![Graphic][26]) may be weak in some contexts. Nevertheless, collider bias could still occur through these nodes because unobservable factors, including HSBs, could confound the association between *V* and *T*. Another limitation lies in the values assigned to the parameters in our simulations. Although we relied on existing literature, these values are still estimates. However, varying these parameter values supported our theory-based hypothesis that relevant biases may occur in some settings. Finally, our study focused on specific issues related to the inclusion of asymptomatic individuals in TND studies. 
However, we did not assess additional complexities, such as the possibility of differential measurement error in exposure, outcome, or covariate status based on testing rationale, or the nuances in data quality and the challenges specific to each data source (e.g., clinical versus other settings).(21)

## Conclusions

In conclusion, the inclusion of asymptomatic individuals in TND studies of COVID-19 VE may lead to collider stratification bias, uncontrolled confounding by HSBs, and differential misclassification of the outcome relative to exposure status. Researchers designing or applying the TND for VE estimation need to be aware of these potential biases. Further research is needed to identify additional design or analytic strategies to control for biases related to HSBs that may improve the validity and utility of the TND for estimating VE against infection.

## Funding

This work was supported by the Canadian Institutes of Health Research Project Grant ECP-184178 (awarded to MES, DT, JM, CJ). MES holds the Canada Research Chair in Causal Inference and Machine Learning in Health Science. MC holds a Chercheur Boursier Junior 1 Award from the Fonds de recherche du Québec-Santé (FRQS). DT holds a Chercheur Boursier Junior 2 Award from FRQS. EO-B holds a Doctoral Training Award from FRQS.

## Conflict of Interest

None.

## Disclaimer

The views expressed in this article are those of the authors and do not necessarily reflect the official position of any agency of the authors’ institutions or funders.

## Data Availability Statement

The code and a sample dataset can be found at the following link: [https://github.com/ortizbrizuela/](https://github.com/ortizbrizuela/)

## Supplementary Material

## Supplementary Text

## Supplementary Tables

[Table S1:](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T6) Data generation process for the DAG-guided simulation study.
[Table S2:](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T7) Bias in odds ratio (OR) estimates for symptomatic COVID-19 (*ORCOVID*) and SARS-CoV-2 infection (*ORinfect*) relative to their target parameters (*RRCOVID* and *RRinfect*, respectively) after individually varying the strength of one selected parameter (assuming perfect SARS-CoV-2 diagnostic tests).

[Table S3:](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T8) Bias in odds ratio (OR) estimates for symptomatic COVID-19 (*ORCOVID*) and SARS-CoV-2 infection (*ORinfect*) relative to their target parameters (*RRCOVID* and *RRinfect*, respectively) after simultaneously varying the strength of two or three selected parameters (assuming perfect SARS-CoV-2 diagnostic tests).

[Table S4:](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T9) Bias in odds ratio (OR) estimates for **measured** symptomatic COVID-19 (*ORCOVID*) and **measured** SARS-CoV-2 infection (*ORinfect*) relative to their target parameters (*RRCOVID* and *RRinfect*, respectively) after individually varying the strength of selected parameters (assuming symptomatic individuals are tested exclusively with NAATs and asymptomatic individuals are tested exclusively with antigen tests).

[Table S5:](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/T10) Bias in odds ratio (OR) estimates for **measured** symptomatic COVID-19 (*ORCOVID*) and **measured** SARS-CoV-2 infection (*ORinfect*) relative to their target parameters (*RRCOVID* and *RRinfect*, respectively) after simultaneously varying the strength of two or three selected parameters (assuming symptomatic individuals are tested exclusively with NAATs and asymptomatic individuals are tested exclusively with antigen tests).
## Supplementary Figures

![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2023.11.16.23298633/F7/graphic-25.medium.gif) [](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/F7/graphic-25)

![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2023.11.16.23298633/F7/graphic-26.medium.gif) [](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/F7/graphic-26)

Figure S1: Causal directed acyclic graphs for “alternative” test-negative design studies of COVID-19 vaccine effectiveness, highlighting the remaining open paths between vaccination status and the outcome infection (*I*) after restricting the study sample to tested individuals (*T* = 1).

## Acknowledgments

A part of this work was presented at the Society for Epidemiologic Research (SER) conference held in Portland in June 2023, specifically during the oral abstract session entitled “Study Design – It’s All About the People”. A preprint version of this manuscript was published on medRxiv on November 16, 2023, and is available at [https://doi.org/10.1101/2023.11.16.23298633](https://doi.org/10.1101/2023.11.16.23298633).

## Appendix S1: A brief overview of causal directed acyclic graph theory and terminology

A causal Directed Acyclic Graph (DAG) is a collection of “nodes” representing random variables, with their causal (temporal) relationships illustrated by unidirectional (directed) arrows, also known as “arcs.”(1–3) In a DAG, no variable can be caused by itself, either directly or indirectly through other variables, making it “acyclic.” When an arc connects two nodes, the variable from which it originates is called the “parent,” and the variable to which it points is referred to as its “child.” Moreover, a “path” is any sequence of arcs connecting two nodes, regardless of their direction. A “directed path” is a path where all arcs point in the same direction, representing a causal path.
In contrast, non-directed paths between two variables that lead to their association (as explained below) are called “biasing paths”. For example, the path ***A ← B → C*** is a non-directed (and biasing) path between *A* and *C*, also known as a “backdoor path” because it starts with an arc pointing to *A* and ends with an arc pointing to *C*. In a directed path, variables in proximal positions are called “ancestors” of those in distal positions, and those in distal positions are “descendants” of those in proximal positions. Furthermore, nodes in any path can be classified as “colliders” or “non-colliders.” A “collider” in a path is a node with its preceding and subsequent arcs pointing at it (e.g., ***B*** is a collider in the path ***A → B ← C***). In contrast, a “non-collider” can be classified as a “mediator” if the variable occupies an intermediate position in the directed path, or as a “fork” if the preceding and subsequent arcs emerge from it. For example, ***B*** is a mediator in the path ***A → B → C***, while ***B*** is a fork in the path ***A ← B → C***.

One of the main benefits of DAGs is that they allow users to evaluate potential sources of association between variables and, consequently, to identify possible sources of bias. In general, according to DAG theory, two nodes are expected to be statistically associated if any path between them is “open” (in this case, the variables are said to be “d-connected”; otherwise they are “d-separated”).(1,3) A path between two nodes is “closed” if 1) it contains a collider on which the analysis has not been conditioned (nor on any of its descendants), or 2) the analysis is conditioned on any non-collider within the path.(1) “Confounding” (i.e., the presence of open backdoor paths) and “collider bias” (i.e., the opening of a non-causal path by conditioning on a collider or its descendants) are examples of sources of non-causal (biased) associations between two variables that can be detected using DAGs.
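The effect of conditioning on a collider can be checked with a small exact calculation. The sketch below is illustrative only: it assumes two independent binary variables *A* and *C* with a common effect *B* (the path ***A → B ← C***), and all probabilities are hypothetical values chosen for the example, not parameters from our simulations.

```python
from itertools import product


def odds_ratio(p11, p10, p01, p00):
    """Odds ratio from the four cell masses of a 2x2 table."""
    return (p11 * p00) / (p10 * p01)


# Hypothetical parameters, chosen only for illustration.
P_A = 0.5  # P(A = 1)
P_C = 0.5  # P(C = 1); A and C are generated independently


def p_b_given(a, c):
    """P(B = 1 | A = a, C = c): both A and C raise the chance of B (A -> B <- C)."""
    return 0.1 + 0.4 * a + 0.4 * c


def cell_masses(condition_on_b):
    """Joint mass of each (A, C) cell, optionally restricted to B = 1."""
    cells = {}
    for a, c in product((0, 1), repeat=2):
        mass = (P_A if a else 1 - P_A) * (P_C if c else 1 - P_C)
        if condition_on_b:
            mass *= p_b_given(a, c)  # keep only the part of the cell with B = 1
        cells[(a, c)] = mass
    return cells


def a_c_odds_ratio(condition_on_b):
    """Odds ratio between A and C, with or without restriction to B = 1."""
    cells = cell_masses(condition_on_b)
    return odds_ratio(cells[(1, 1)], cells[(1, 0)], cells[(0, 1)], cells[(0, 0)])


or_marginal = a_c_odds_ratio(condition_on_b=False)
or_given_b = a_c_odds_ratio(condition_on_b=True)

if __name__ == "__main__":
    print(f"OR(A, C), no conditioning: {or_marginal:.2f}")
    print(f"OR(A, C) within B = 1:     {or_given_b:.2f}")
```

Under these assumed values, the odds ratio between *A* and *C* is 1 in the full population (the path is closed at the collider) but 0.36 among those with *B* = 1, even though neither variable affects the other.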
Importantly, in order to attribute any association found between two variables in the DAG solely to open paths, we must assume that the DAG omits no common cause of two or more of its variables, that it omits no variable that affects selection or is used for stratification and that may act as a collider, and that no measurement or random error could explain the association.(3) We refer readers to other sources for a more comprehensive yet gentle introduction to DAG theory.(1–3)

## Appendix S2: Identifiability of causal effects under outcome-dependent sample selection

In studies where sampling depends on outcome status, such as traditional case-control studies or the test-negative design (TND), the risk of the outcome, the risk ratio, and the risk difference are subject to bias.(4,5) Didelez et al.(6) outlined two conditions, both of which can be evaluated graphically using directed acyclic graphs (DAGs), that are necessary for identifying and generalizing the conditional causal odds ratio (OR) in such circumstances:

1. The first condition, needed to identify the conditional causal OR within the study sample, requires that no backdoor path remains open between exposure and outcome (see **Appendix S1** for further details on DAG theory). This means that in the classical TND, after conditioning on health and healthcare-seeking behaviors (HSBs, *H*) and the set of measured confounders (*C*), no backdoor path must remain open between the node vaccination status (*V*) and all components of the outcome medically attended and laboratory-confirmed symptomatic COVID-19 (*I* · *S* · *T*). This condition can be achieved within the classical TND by restricting the study sample to healthcare seekers (*H* = 1) and conditioning on the set of measured confounders (*C*) during the analysis.
On the other hand, for this condition to be met in the alternative TND, no backdoor path must remain open between the vaccination status (*V*) and infection (*I*) after conditioning on HSBs (*H*) and the set of measured confounders (*C*). However, because the study sample is not limited to healthcare seekers, it may not be possible to achieve this condition in the alternative TND. Therefore, while the conditional causal OR for medically attended and laboratory-confirmed symptomatic infection can be identified in the classical TND, the conditional causal OR for infection may remain unidentified in the alternative TND (our manuscript discusses numerous instances where conditional exchangeability on exposure is violated in the alternative TND).

2. The second condition, needed to generalize the conditional causal OR from the study sample to the entire population, requires the node for exposure status to be independent of the node selection, after conditioning on the outcome and a set of measured variables. In the classical TND, this means that the node representing an individual’s vaccination status (*V*) must be independent of the node representing selection into the study (*Sel*, see DAG below), after conditioning on the outcome medically attended and laboratory-confirmed symptomatic COVID-19 (*I* · *S* · *T*), HSBs (*H*), and the set of measured confounders (*C*). Using notation: $V \perp \mathit{Sel} \mid (I \cdot S \cdot T), H, C$. In the alternative TND, this condition translates into the node representing an individual’s vaccination status (*V*) being independent of the node selection into the study (*Sel*), after conditioning on infection (*I*), HSBs (*H*), and the set of measured confounders (*C*). Using notation: $V \perp \mathit{Sel} \mid I, H, C$.

In other words, based on our assumed DAG, only the conditional causal OR for medically attended and laboratory-confirmed symptomatic infection can be identified and generalized in the classical TND. Conversely, in the alternative TND, the conditional causal OR for infection may not be identifiable or generalizable to the broader population.
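The role of the second condition can be illustrated with a small exact calculation. The sketch below is a deliberately simplified, unconditional version (it omits *H* and *C*) and uses hypothetical values throughout: a binary exposure *V*, a binary outcome *Y*, and selection probabilities that either depend on the outcome alone or also on exposure. It is not the data-generating process of our simulation study.

```python
def odds_ratio(cells):
    """OR for exposure V and outcome Y from cell masses keyed by (v, y)."""
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


def risk_ratio(cells):
    """Risk ratio P(Y=1 | V=1) / P(Y=1 | V=0) from the same cell masses."""
    risk_v1 = cells[(1, 1)] / (cells[(1, 1)] + cells[(1, 0)])
    risk_v0 = cells[(0, 1)] / (cells[(0, 1)] + cells[(0, 0)])
    return risk_v1 / risk_v0


# Hypothetical source population (illustrative values only).
P_V = 0.6                  # P(V = 1)
RISK = {1: 0.04, 0: 0.10}  # P(Y = 1 | V = v)

population = {}
for v in (0, 1):
    p_v = P_V if v else 1 - P_V
    population[(v, 1)] = p_v * RISK[v]
    population[(v, 0)] = p_v * (1 - RISK[v])


def select(cells, p_sel):
    """Apply selection probabilities p_sel[(v, y)] to each cell."""
    return {key: mass * p_sel[key] for key, mass in cells.items()}


# Selection depends on the outcome only: V is independent of Sel given Y.
outcome_only = {(1, 1): 1.0, (0, 1): 1.0, (1, 0): 0.05, (0, 0): 0.05}
# Selection also depends on V among non-cases: the condition is violated.
exposure_too = {(1, 1): 1.0, (0, 1): 1.0, (1, 0): 0.10, (0, 0): 0.05}

or_population = odds_ratio(population)
rr_population = risk_ratio(population)
or_outcome_only = odds_ratio(select(population, outcome_only))
rr_outcome_only = risk_ratio(select(population, outcome_only))
or_exposure_too = odds_ratio(select(population, exposure_too))

if __name__ == "__main__":
    print(f"Population:         OR = {or_population:.4f}, RR = {rr_population:.4f}")
    print(f"Outcome-only Sel:   OR = {or_outcome_only:.4f}, RR = {rr_outcome_only:.4f}")
    print(f"Sel depends on V:   OR = {or_exposure_too:.4f}")
```

With these assumed values, the population OR (0.375) is recovered in the selected sample when selection depends on the outcome alone, whereas the risk ratio is not (0.40 in the population versus about 0.66 in the sample). Once selection also depends on exposure, the sample OR (0.1875) no longer matches the population OR.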
![Figure6](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2023.11.16.23298633/F6.medium.gif)

[Figure6](http://medrxiv.org/content/early/2024/07/18/2023.11.16.23298633/F6)

**Abbreviations**: *C*: Confounders; *H*: health and healthcare-seeking behaviors; *I*: Infection status; *Ix*: Measured infection status; *S*: COVID-19-like symptoms; *Sel*: selection into the study sample; *V*: COVID-19 vaccination status; *T*: SARS-CoV-2 diagnostic tests.

## Footnotes

* This manuscript has undergone and passed scientific peer review and is under technical review by the American Journal of Epidemiology and awaiting final acceptance.
* Received November 16, 2023.
* Revision received July 17, 2024.
* Accepted July 18, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory. The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. Cowling BJ, Sullivan SG. A concern over terminology in vaccine effectiveness studies. Euro Surveill 2018;23(10).
2. Patel MM, Jackson ML, Ferdinands J. Postlicensure Evaluation of COVID-19 Vaccines. JAMA 2020;324(19):1939–40.
3. Prugger C, Spelsberg A, Keil U, et al. Evaluating covid-19 vaccine efficacy and safety in the post-authorisation phase. BMJ 2021;375:e067570.
4. Barouch DH. Covid-19 Vaccines – Immunity, Variants, Boosters. N Engl J Med 2022;387(11):1011–20.
5. Jackson ML, Nelson JC. The test-negative design for estimating influenza vaccine effectiveness. Vaccine 2013;31(17):2165–8.
6. Chua H, Feng S, Lewnard JA, et al. The Use of Test-negative Controls to Monitor Vaccine Effectiveness: A Systematic Review of Methodology. Epidemiology 2020;31(1):43–64.
7. Sullivan SG, Tchetgen Tchetgen EJ, Cowling BJ. Theoretical Basis of the Test-Negative Study Design for Assessment of Influenza Vaccine Effectiveness. Am J Epidemiol 2016;184(5):345–53.
8. Schnitzer ME. Estimands and Estimation of COVID-19 Vaccine Effectiveness Under the Test-Negative Design: Connections to Causal Inference. Epidemiology 2022;33(3):325–33.
9. Song S, Madewell ZJ, Liu M, et al.
Effectiveness of SARS-CoV-2 Vaccines against Omicron Infection and Severe Events: A Systematic Review and Meta-Analysis of Test-Negative Design Studies. medRxiv 2023:2023.02.16.23286041.
10. Chemaitelly H, Tang P, Hasan MR, et al. Waning of BNT162b2 Vaccine Protection against SARS-CoV-2 Infection in Qatar. N Engl J Med 2021;385(24):e83.
11. Chemaitelly H, Yassine HM, Benslimane FM, et al. mRNA-1273 COVID-19 vaccine effectiveness against the B.1.1.7 and B.1.351 variants and severe COVID-19 disease in Qatar. Nat Med 2021;27(9):1614–21.
12. Tang P, Hasan MR, Chemaitelly H, et al. BNT162b2 and mRNA-1273 COVID-19 vaccine effectiveness against the SARS-CoV-2 Delta variant in Qatar. Nat Med 2021;27(12):2136–43.
13. Bruxvoort KJ, Sy LS, Qian L, et al. Effectiveness of mRNA-1273 against delta, mu, and other emerging variants of SARS-CoV-2: test negative case-control study. BMJ 2021;375:e068848.
14. Skowronski DM, Setayeshgar S, Zou M, et al.
Comparative Single-Dose mRNA and ChAdOx1 Vaccine Effectiveness Against Severe Acute Respiratory Syndrome Coronavirus 2, Including Variants of Concern: Test-Negative Design, British Columbia, Canada. J Infect Dis 2022;226(1):485–96.
15. Ionescu IG, Skowronski DM, Sauvageau C, et al. BNT162b2 Effectiveness Against Delta and Omicron Variants of Severe Acute Respiratory Syndrome Coronavirus 2 in Adolescents Aged 12-17 Years, by Dosing Interval and Duration. J Infect Dis 2023;227(9):1073–83.
16. Andrejko KL, Pry J, Myers JF, et al. Prevention of Coronavirus Disease 2019 (COVID-19) by mRNA-Based Vaccines Within the General Population of California. Clin Infect Dis 2022;74(8):1382–9.
17. Hitchings MDT, Ranzani OT, Torres MSS, et al. Effectiveness of CoronaVac among healthcare workers in the setting of high SARS-CoV-2 Gamma variant transmission in Manaus, Brazil: A test-negative case-control study. Lancet Reg Health Am 2021;1:100025.
18. Li XN, Huang Y, Wang W, et al. Effectiveness of inactivated SARS-CoV-2 vaccines against the Delta variant infection in Guangzhou: a test-negative case-control real-world study. Emerg Microbes Infect 2021;10(1):1751–9.
19. Singh C, Naik BN, Pandey S, et al. Effectiveness of COVID-19 vaccine in preventing infection and disease severity: a case-control study from an Eastern State of India. Epidemiol Infect 2021;149:e224.
20. Thiruvengadam R, Awasthi A, Medigeshi G, et al. Effectiveness of ChAdOx1 nCoV-19 vaccine against SARS-CoV-2 infection during the delta (B.1.617.2) variant surge in India: a test-negative, case-control study and a mechanistic study of post-vaccination immune responses. Lancet Infect Dis 2022;22(4):473–82.
21. Shi X, Li KQ, Mukherjee B. Current Challenges With the Use of Test-Negative Designs for Modeling COVID-19 Vaccination and Outcomes. Am J Epidemiol 2023;192(3):328–33.
22. Pearl J. Causal diagrams for empirical research. Biometrika 1995;82(4):669–88.
23. Hernán MA, Robins JM. Chapter 2. Randomized experiments. Causal Inference: What If: Chapman & Hall/CRC, 2020:13–24.
24. Hernán MA, Robins JM. Chapter 6. Graphical representation of causal effects. Causal Inference: What If: Chapman & Hall/CRC, 2020:69–82.
25. Vandenbroucke JP, Pearce N. Test-Negative Designs: Differences and Commonalities with Other Case-Control Studies with “Other Patient” Controls. Epidemiology 2019;30(6):838–44.
26. Ciocanea-Teodorescu I, Nason M, Sjolander A, et al. Adjustment for Disease Severity in the Test-Negative Study Design. Am J Epidemiol 2021;190(9):1882–9.
27. Wu F, Yuan Y, Li Y, et al. The acceptance of SARS-CoV-2 rapid antigen self-testing: A cross-sectional study in China. J Med Virol 2023;95(1):e28227.
28. Centers for Disease Control and Prevention. CDC updates travel guidance for fully vaccinated people. [https://www.cdc.gov/media/releases/2021/p0402-travel-guidance-vaccinated-people.html](https://www.cdc.gov/media/releases/2021/p0402-travel-guidance-vaccinated-people.html). Created April 2, 2021. Updated May 19, 2024. Accessed April 13, 2023.
29. Occupational Safety and Health Administration. Employer Rights and Responsibilities Following a Federal OSHA Inspection. U.S. Department of Labor.
[https://www.osha.gov/sites/default/files/publications/OSHA4159.pdf](https://www.osha.gov/sites/default/files/publications/OSHA4159.pdf). Created November 4, 2021. Updated February 27, 2024. Accessed May 10, 2023.
30. Government of Canada. Testing of vaccinated populations. Public Health Agency of Canada. [https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19/testing-screening-contact-tracing/testing-vaccinated-populations.html](https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19/testing-screening-contact-tracing/testing-vaccinated-populations.html). Published August 16, 2021. Updated July 22, 2023. Accessed May 10, 2023.
31. Antonelli M, Penfold RS, Merino J, et al. Risk factors and disease profile of post-vaccination SARS-CoV-2 infection in UK users of the COVID Symptom Study app: a prospective, community-based, nested, case-control study. Lancet Infect Dis 2022;22(1):43–55.
32. Antonelli M, Penfold RS, Canas LDS, et al. SARS-CoV-2 infection following booster vaccination: illness and symptom profile in a prospective, observational community-based case-control study. J Infect 2023.
33. Grana C, Ghosn L, Evrenoglou T, et al. Efficacy and safety of COVID-19 vaccines. Cochrane Database Syst Rev 2022;12(12):CD015477.
34. Sah P, Fitzpatrick MC, Zimmer CF, et al. Asymptomatic SARS-CoV-2 infection: A systematic review and meta-analysis. Proc Natl Acad Sci U S A 2021;118(34).
35. Dean NE, Hogan JW, Schnitzer ME. Covid-19 Vaccine Effectiveness and the Test-Negative Design. N Engl J Med 2021;385(15):1431–3.
36. Infante-Rivard C, Cusson A. Reflection on modern methods: selection bias-a review of recent developments. Int J Epidemiol 2018;47(5):1714–22.
37. Didelez V, Kreiner S, Keiding N. Graphical Models for Inference Under Outcome-Dependent Sampling. Statistical Science 2010;25(3):368–87, 20.
38. Hernán MA, Robins JM. Chapter 3. Observational studies. Causal Inference: What If: Chapman & Hall/CRC, 2020:25–40.
39. Rader B, Gertz A, Iuliano AD, et al. Use of At-Home COVID-19 Tests – United States, August 23, 2021-March 12, 2022. MMWR Morb Mortal Wkly Rep 2022;71(13):489–94.
40. Peeling RW, Heymann DL, Teo YY, et al. Diagnostics for COVID-19: moving from pandemic response to control. Lancet 2022;399(10326):757–68.
41. Stadler RN, Maurer L, Aguilar-Bultet L, et al. Systematic screening on admission for SARS-CoV-2 to detect asymptomatic infections. Antimicrob Resist Infect Control 2021;10(1):44.
42. Talbot TR, Hayden MK, Yokoe DS, et al. Asymptomatic screening for severe acute respiratory coronavirus virus 2 (SARS-CoV-2) as an infection prevention measure in healthcare facilities: Challenges and considerations.
Infect Control Hosp Epidemiol 2023;44(1):2–7.
43. Dinnes J, Deeks JJ, Adriano A, et al. Rapid, point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection. Cochrane Database Syst Rev 2020;8(8):CD013705.
44. Hohl CM, Hau JP, Vaillancourt S, et al. Sensitivity and Diagnostic Yield of the First SARS-CoV-2 Nucleic Acid Amplification Test Performed for Patients Presenting to the Hospital. JAMA Netw Open 2022;5(10):e2236288.
45. Kucirka LM, Lauer SA, Laeyendecker O, et al. Variation in False-Negative Rate of Reverse Transcriptase Polymerase Chain Reaction-Based SARS-CoV-2 Tests by Time Since Exposure. Ann Intern Med 2020;173(4):262–7.
46. Keogh RH, Shaw PA, Gustafson P, et al. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 1-Basic theory and simple methods of adjustment. Stat Med 2020;39(16):2197–231.
47. World Health Organization. United States of America: WHO Coronavirus (COVID-19) Dashboard. [https://covid19.who.int/region/amro/country/us](https://covid19.who.int/region/amro/country/us). Published September 1, 2021. Updated May 18, 2024. Accessed April 10, 2023.
48. U.S. Census Bureau Population and Housing Unit Estimates for the United States.
[https://www.census.gov/programs-surveys/popest.html](https://www.census.gov/programs-surveys/popest.html). Published September 1, 2021. Updated May 6, 2024. Accessed April 10, 2023.
49. Mathieu E, Ritchie H, Ortiz-Ospina E, et al. A global database of COVID-19 vaccinations. Nat Hum Behav 2021;5(7):947–53.
50. R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL [https://www.R-project.org/](https://www.R-project.org/).
51. Uyeki TM, Bernstein HH, Bradley JS, et al. Clinical Practice Guidelines by the Infectious Diseases Society of America: 2018 Update on Diagnosis, Treatment, Chemoprophylaxis, and Institutional Outbreak Management of Seasonal Influenza. Clin Infect Dis 2019;68(6):e1–e47.
52. De Smedt T, Merrall E, Macina D, et al. Bias due to differential and non-differential disease– and exposure misclassification in studies of vaccine effectiveness. PLoS One 2018;13(6):e0199180.
53. Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 2015;12(10):e1001885.
54. World Health Organization. Evaluation of COVID-19 vaccine effectiveness: interim guidance, 17 March 2021. Geneva: World Health Organization, 2021.
55. Tan CY, Chiew CJ, Pang D, et al. Vaccine effectiveness against Delta, Omicron BA.1, and BA.2 in a highly vaccinated Asian setting: a test-negative design study.
Clin Microbiol Infect 2023;29(1):101–6.

## Appendix References

1. Hernán MA, Robins JM. Chapter 6. Graphical representation of causal effects. Causal Inference: What If: Chapman & Hall/CRC; 2020: 69–82.
2. Westreich D. Epidemiology by Design: A Causal Approach to the Health Sciences: Oxford University Press; 2019.
3. Lash TL, VanderWeele TJ, Haneuse S, Rothman KJ. Modern Epidemiology. 4th ed: Lippincott Williams & Wilkins; 2021. p. 1–1174.
4. Infante-Rivard C, Cusson A. Reflection on modern methods: selection bias-a review of recent developments. Int J Epidemiol 2018; 47(5): 1714–22.
5. Ciocanea-Teodorescu I, Nason M, Sjolander A, Gabriel EE. Adjustment for Disease Severity in the Test-Negative Study Design. Am J Epidemiol 2021; 190(9): 1882–9.
6. Didelez V, Kreiner S, Keiding N. Graphical Models for Inference Under Outcome-Dependent Sampling. Statistical Science 2010; 25(3): 368–87, 20.
7. Yang J, Gong H, Chen X, et al. Health-seeking behaviors of patients with acute respiratory infections during the outbreak of novel coronavirus disease 2019 in Wuhan, China. Influenza Other Respir Viruses 2021; 15(2): 188–94.
8. Chilot D, Shitu K, Gela YY, et al. Factors associated with healthcare-seeking behavior for symptomatic acute respiratory infection among children in East Africa: a cross-sectional study. BMC Pediatr 2022; 22(1): 662.
9. U.S. Census Bureau Population and Housing Unit Estimates for the United States. [https://www.census.gov/programs-surveys/popest.html](https://www.census.gov/programs-surveys/popest.html). Published September 1, 2021. Updated May 6, 2024. Accessed April 10, 2023.
10. Mathieu E, Ritchie H, Ortiz-Ospina E, et al. A global database of COVID-19 vaccinations.
Nat Hum Behav 2021; 5(7): 947–53.
11. Burch AE, Lee E, Shackelford P, Schmidt P, Bolin P. Willingness to Vaccinate Against COVID-19: Predictors of Vaccine Uptake Among Adults in the US. J Prev (2022) 2022; 43(1): 83–93.
12. World Health Organization. United States of America: WHO Coronavirus (COVID-19) Dashboard. [https://covid19.who.int/region/amro/country/us](https://covid19.who.int/region/amro/country/us). Published September 1, 2021. Updated May 18, 2024. Accessed April 10, 2023.
13. Wong VW, Cowling BJ, Aiello AE. Hand hygiene and risk of influenza virus infections in the community: a systematic review and meta-analysis. Epidemiol Infect 2014; 142(5): 922–32.
14. Dai CL, Kornilov SA, Roper RT, et al. Characteristics and Factors Associated With Coronavirus Disease 2019 Infection, Hospitalization, and Mortality Across Race and Ethnicity. Clin Infect Dis 2021; 73(12): 2193–204.
15. Schnitzer ME. Estimands and Estimation of COVID-19 Vaccine Effectiveness Under the Test-Negative Design: Connections to Causal Inference. Epidemiology 2022; 33(3): 325–33.
16. Hohl CM, Hau JP, Vaillancourt S, et al. Sensitivity and Diagnostic Yield of the First SARS-CoV-2 Nucleic Acid Amplification Test Performed for Patients Presenting to the Hospital. JAMA Netw Open 2022; 5(10): e2236288.
17. Dinnes J, Deeks JJ, Adriano A, et al. Rapid, point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection. Cochrane Database Syst Rev 2020; 8(8): CD013705.
18. Sah P, Fitzpatrick MC, Zimmer CF, et al. Asymptomatic SARS-CoV-2 infection: A systematic review and meta-analysis. Proc Natl Acad Sci U S A 2021; 118(34).
19. Antonelli M, Penfold RS, Merino J, et al. Risk factors and disease profile of post-vaccination SARS-CoV-2 infection in UK users of the COVID Symptom Study app: a prospective, community-based, nested, case-control study. Lancet Infect Dis 2022; 22(1): 43–55.
20. Antonelli M, Penfold RS, Canas LDS, et al. SARS-CoV-2 infection following booster vaccination: illness and symptom profile in a prospective, observational community-based case-control study. J Infect 2023.
21. Grana C, Ghosn L, Evrenoglou T, et al. Efficacy and safety of COVID-19 vaccines. Cochrane Database Syst Rev 2022; 12(12): CD015477.
22. Rader B, Gertz A, Iuliano AD, et al. Use of At-Home COVID-19 Tests – United States, August 23, 2021–March 12, 2022. MMWR Morb Mortal Wkly Rep 2022; 71(13): 489–94.
23. Siegler AJ, Hall E, Luisi N, et al. Willingness to Seek Diagnostic Testing for SARS-CoV-2 With Home, Drive-through, and Clinic-Based Specimen Collection Locations. Open Forum Infect Dis 2020; 7(7): ofaa269.
24. Jackson ML, Nelson JC. The test-negative design for estimating influenza vaccine effectiveness. Vaccine 2013; 31(17): 2165–8.
25. Sullivan SG, Tchetgen Tchetgen EJ, Cowling BJ. Theoretical Basis of the Test-Negative Study Design for Assessment of Influenza Vaccine Effectiveness. Am J Epidemiol 2016; 184(5): 345–53.
26. Centers for Disease Control and Prevention. CDC updates travel guidance for fully vaccinated people. [https://www.cdc.gov/media/releases/2021/p0402-travel-guidance-vaccinated-people.html](https://www.cdc.gov/media/releases/2021/p0402-travel-guidance-vaccinated-people.html). Published April 2, 2021. Accessed April 13, 2023.
27. Occupational Safety and Health Administration. Employer Rights and Responsibilities Following a Federal OSHA Inspection. U.S. Department of Labor. [https://www.osha.gov/sites/default/files/publications/OSHA4159.pdf](https://www.osha.gov/sites/default/files/publications/OSHA4159.pdf). Accessed May 10, 2023.
28. Government of Canada. Testing of vaccinated populations. Public Health Agency of Canada.
[https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19/testing-screening-contact-tracing/testing-vaccinated-populations.html](https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19/testing-screening-contact-tracing/testing-vaccinated-populations.html). Published August 16, 2021. Updated July 22, 2023. Accessed May 10, 2023.

[1]: /embed/inline-graphic-1.gif
[2]: /embed/inline-graphic-2.gif
[3]: /embed/inline-graphic-3.gif
[4]: /embed/inline-graphic-4.gif
[5]: /embed/inline-graphic-5.gif
[6]: /embed/inline-graphic-6.gif
[7]: /embed/inline-graphic-7.gif
[8]: /embed/inline-graphic-8.gif
[9]: /embed/inline-graphic-9.gif
[10]: /embed/inline-graphic-10.gif
[11]: /embed/inline-graphic-11.gif
[12]: /embed/inline-graphic-12.gif
[13]: /embed/inline-graphic-13.gif
[14]: /embed/inline-graphic-14.gif
[15]: /embed/inline-graphic-15.gif
[16]: /embed/inline-graphic-16.gif
[17]: /embed/inline-graphic-17.gif
[18]: /embed/inline-graphic-18.gif
[19]: /embed/inline-graphic-19.gif
[20]: /embed/inline-graphic-20.gif
[21]: /embed/inline-graphic-21.gif
[22]: /embed/inline-graphic-22.gif
[23]: /embed/inline-graphic-23.gif
[24]: /embed/inline-graphic-24.gif
[25]: /embed/inline-graphic-25.gif
[26]: /embed/inline-graphic-26.gif