Identifying and alleviating bias due to differential depletion of susceptible people in post-marketing evaluations of COVID-19 vaccines
=======================================================================================================================================

* Rebecca Kahn
* Stephanie J. Schrag
* Jennifer R. Verani
* Marc Lipsitch

## Abstract

Recent studies have provided key information about SARS-CoV-2 vaccines’ efficacy and effectiveness (VE). One important question that remains is whether the protection conferred by vaccines wanes over time. However, estimates over time are subject to bias from differential depletion of susceptibles between vaccinated and unvaccinated groups. Here we examine the extent to which biases occur under different scenarios and assess whether serologic testing has the potential to correct this bias. By identifying non-vaccine antibodies, these tests could identify individuals with prior infection. We find in scenarios with high baseline VE, differential depletion of susceptibles creates minimal bias in VE estimates, suggesting that any observed declines are likely not due to spurious waning alone. However, if baseline VE is lower, the bias for leaky vaccines (that reduce individual probability of infection given contact) is larger and should be corrected by excluding individuals with past infection if the mechanism is known to be leaky. Conducting analyses both unadjusted and adjusted for past infection could give lower and upper bounds for the true VE. Studies of VE should therefore enroll individuals regardless of prior infection history but also collect information, ideally through serologic testing, on this critical variable.

Vaccines are a critical tool for combatting the COVID-19 pandemic. Clinical trials and observational studies have provided key information about the vaccines’ efficacy and effectiveness (VE).
One important question that remains to be answered is whether or not the protection conferred by vaccines wanes over time. However, estimates of effectiveness over time are subject to bias from differential depletion of susceptibles between vaccinated and unvaccinated groups. This bias occurs when individuals who are no longer at risk of infection due to protection from past infection are included in the analysis; assuming the VE is greater than zero, these individuals with prior infection are more likely to be unvaccinated than vaccinated. Therefore, over time, more uninfected and unvaccinated individuals who are not at risk of infection are included in the analysis, biasing VE estimates downward. This bias grows as infection spreads and makes the VE incorrectly appear to wane over time (i.e. spurious waning) (1–4). Although some studies attempt to restrict analysis to those without prior infection, often many past infections will go undetected or unreported, particularly for pathogens with a large proportion of asymptomatic or mild infections. Additionally, in a population with individuals who have heterogeneous risk of infection (for example due to occupational exposure or choice to wear a face covering), the riskiest individuals will be depleted preferentially among the unvaccinated group when the vaccine is effective, leading to the same bias downwards in VE, growing over time and thus seemingly showing waning of VE (1). Serologic testing for SARS-CoV-2 antibodies has the potential to help correct the first bias. By identifying non-vaccine antibodies (e.g. N-protein), these tests could be used to identify individuals with prior infection and exclude them from studies of VE over time. Likewise, adjustment for individual-level risk of infection (in practice, for proxies such as occupation or behavior) can help address the second bias. 
While each of these issues can in principle affect VE estimates and induce a spurious impression of waning VE, the magnitude of this bias under various assumptions about baseline VE is not clear, nor has it been shown before to our knowledge how adjustments can solve the problems. Here we examine the extent to which these biases occur under different scenarios and assess approaches to alleviate bias under various assumptions.

## Methods

### Network and epidemic

We first create a network model of 20,000 individuals, similar to models described previously (2,5). The probability of connections between individuals in the network is calibrated in combination with the parameter for the probability of infection given contact to result in a reproduction number (R) of 1.25 or 1.50 (see Table 1 for a full list of parameters) (6). We seed an epidemic of a SARS-CoV-2-like pathogen with ten exposed individuals. Each day, each susceptible individual has a daily probability of infection from their infected connections in the network. A random half of the population is high risk, and the other half is low risk. High risk individuals have a daily probability of infection three times that of low risk individuals. This binary risk status is a simplified proxy for multiple factors that could affect individuals’ risks for infection, such as occupation, demographics, geography, or behavioral patterns (7–9).

[Table 1.](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/T1) Parameters and associated values used in network model simulations

We assume that half of those who are infected become symptomatic and that people are infectious for seven days. We assume that symptomatic, pre-symptomatic, and asymptomatic infected individuals have the same level of infectiousness.
After individuals recover, we assume that complete protection from natural immunity lasts for 90 days (10), after which individuals can be reinfected; we then assume recovered individuals’ susceptibility is 95% lower than those without prior infection, resulting in low numbers of reinfection during the study period examined in the simulations (11). It is unknown exactly how VE differs for recovered individuals, although there is evidence that vaccination further reduces previously infected individuals’ risk (12). For simplicity we assume vaccinated recovered individuals’ susceptibility is further decreased by the same amount as for vaccinated susceptible individuals.

### Scenarios

We simulate random vaccination (to prevent unmeasured confounding) of 2500 individuals, or 12.5% of the population, on the first day of the simulation. Another 2500 unvaccinated individuals are also randomly selected for potential follow-up over the course of the simulations. We compare four primary scenarios (Table 2). In the first scenario, vaccine efficacy against susceptibility to infection (VES) is 0.90, and vaccine efficacy against progression to symptoms (VEP) is 0.5. These measures combine to give a vaccine efficacy against symptomatic disease (VESP), the primary outcome of most SARS-CoV-2 vaccine trials (13–16), of 0.95, under the formula $VE_{SP} = 1 - (1 - VE_{S})(1 - VE_{P})$ (17). These values are similar to those that have been observed in the trials (13,15) and initial observational studies (18–20) of the mRNA vaccines. In the second scenario, we assume VES and VESP are 0.7, similar to the findings from the Janssen vaccine trial (14).

[Table 2.](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/T2) Scenarios the network model evaluated to assess the potential for spurious waning

In the first two scenarios, we assume the vaccine is “leaky”, meaning it reduces the probability of infection given contact to an equal degree, but not perfectly, in all vaccinated individuals (3). However, in the third scenario, to assess the impact of the vaccine mechanism, we model an all-or-nothing vaccine, meaning it protects a certain proportion of vaccinated individuals completely and provides no protection to the rest. In this scenario, VES and VESP are both 0.9. In supplementary scenarios, we also examine an all-or-nothing vaccine with lower VES and VESP, as well as a leaky vaccine with VESP = 0.95, similar to Scenario 1, but with lower VES and higher VEP (21).

Finally, in scenario 4, we examine a setting with a leaky vaccine with VESP = 0.95 in which some of the population has already been infected and recovered before the simulations and vaccination begin. We explore a range from 0-30% of individuals with prior infection under a higher R than in the other scenarios (R=2.0) to prevent herd immunity from prior infections from substantially slowing the epidemics before spurious waning can be observed. In these simulations, 20 individuals are exposed on the first day and 100 individuals are infectious (except in the simulations with 0 individuals previously infected).

### Analyses

#### Test-negative design

We then simulate sampling of cases, or individuals with COVID-19 (symptoms and positive virologic test), on a given day and a random 1:4 sample of controls (i.e. individuals without COVID-19), similar to a test-negative design (TND). We repeat this sampling for seven different time periods, every 25 days from day 75 to day 225, treating each day independently. Given the faster epidemics in scenario 4 with the higher R, we examine every 25 days from day 25 to day 150.
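As a rough illustration of the scenario definitions in the Scenarios section, the sketch below combines VES and VEP into VESP with the formula given there and contrasts how a leaky and an all-or-nothing vaccine would modify a per-contact infection probability. It is a minimal sketch under stated assumptions, not the simulation code used for the paper: the function names, the `base_prob` parameter, and the example probability of 0.02 are hypothetical.

```python
import random


def combined_ve(ve_s: float, ve_p: float) -> float:
    """VE against symptomatic disease: VE_SP = 1 - (1 - VE_S)(1 - VE_P)."""
    return 1.0 - (1.0 - ve_s) * (1.0 - ve_p)


def draw_all_or_nothing_protection(n_vaccinated: int, ve_s: float, rng: random.Random):
    """All-or-nothing: each vaccinee is fully protected with probability ve_s."""
    return [rng.random() < ve_s for _ in range(n_vaccinated)]


def infection_probability(base_prob: float, ve_s: float, mechanism: str,
                          fully_protected: bool = False) -> float:
    """Per-contact infection probability for one vaccinated individual.

    base_prob is a hypothetical per-contact probability for an unvaccinated,
    never-infected person; it is not a parameter value from the paper.
    """
    if mechanism == "leaky":
        # Leaky: every vaccinee has the same, partially reduced probability.
        return base_prob * (1.0 - ve_s)
    if mechanism == "all_or_nothing":
        # All-or-nothing: protected vaccinees cannot be infected, the rest are unchanged.
        return 0.0 if fully_protected else base_prob
    raise ValueError(f"unknown mechanism: {mechanism}")


# Scenario 1 values from the text: VE_S = 0.90 and VE_P = 0.5.
scenario_1_ve_sp = combined_ve(0.90, 0.5)

# Scenario 3 style draw: 2500 vaccinees, all-or-nothing with VE_S = 0.9.
rng = random.Random(0)
protected = draw_all_or_nothing_protection(2500, 0.9, rng)
share_protected = sum(protected) / len(protected)
```

Under this formula, a scenario in which VES and VESP are equal (as in scenarios 2 and 3) corresponds to VEP = 0.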
We then estimate VESP using four analyses. VESP is the estimand that is in practice estimated in a standard TND, although the progression-to-symptoms component is not always acknowledged (22). We focus on VESP as it was the primary outcome in vaccine trials and due to potential biases that can arise in TNDs when estimating VE against all infection when vaccines affect disease severity (23). In the first analysis (baseline), we estimate VESP by calculating the odds ratio (OR), using data from all individuals sampled:

$$VE_{SP} = 1 - OR = 1 - \frac{P(V=1 \mid D=1)\,/\,P(V=0 \mid D=1)}{P(V=1 \mid D=0)\,/\,P(V=0 \mid D=0)},$$

where D is disease (symptoms and a positive virologic test) and V is vaccine. In the second analysis, we estimate the OR using logistic regression, controlling for risk (i.e. the binary measure described above for increased or decreased susceptibility to infection). In the third analysis, we simulate serologic testing for non-vaccine antibodies (i.e. evidence of past infection) and then restrict the analysis to individuals who had not previously been infected. In the fourth analysis, we both restrict to those without evidence of previous infection and also control for risk.

In the primary analyses, we assume perfect sensitivity and specificity of the serologic test for prior infection, but we relax these assumptions in sensitivity analyses. We examine lower sensitivity for cases and controls and lower specificity for cases only, as antibodies detected could reflect either current or prior infection.

#### Cohort / randomized controlled trial design

As a comparison to the TND, we repeat the same four analyses to estimate VE using a cohort design, where the time of symptomatic infection is known for the 5000 people under follow-up. We again examine different lengths of follow-up for this study design. We assume no unmeasured confounding: that is, no common causes of vaccination and infection, as would be true with adequate control for confounders.
In practice, this study could be done using an electronic health records database, with stratification, matching, or modeling, for example, to control for confounding factors such as occupation, age, insurance, and other factors affecting both vaccination and the likelihood of infection given vaccination. Because there is no unmeasured confounding and vaccination is random in these simulations, this design is comparable to a randomized controlled trial (RCT) in which all symptomatic cases are identified.

#### Sensitivity analyses

We vary key parameters of interest to examine their impact on the results. First, we vary the reduction in susceptibility conferred by past infection, using a value of 70% (24) reduction compared to the baseline parameter of 95% reduction. Next, we vary the proportion of the population that is high risk, examining a scenario in which only 10% of the population is high risk, with five times higher risk than lower risk individuals. Finally, we vary the proportion of infections that are symptomatic, using a higher value of 0.8 (23) compared to the baseline of 0.5.

Ethics: This activity was reviewed by CDC and was conducted consistent with applicable federal law and CDC policy.

## Results

In scenario 1 with high VES and VESP, we find that for most time points, all four TND analyses return estimates of VESP close to the true value of 0.95 (Figure 1). However, in the simulations with R=1.5, the first two analyses that do not exclude prior infection result in downward biases for days further from vaccination (with estimates as low as 0.89 at 200 days from vaccination). This bias occurs at later dates and in the higher transmission scenarios (i.e. when more cases have occurred) due to differential depletion of susceptibles between vaccinated and unvaccinated individuals over time; the bias is alleviated by excluding those with prior infection from the analysis.
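The difference between the baseline analysis and the analysis restricted to individuals without prior infection can be sketched with a small odds-ratio calculation. This is a minimal illustration, assuming each sampled person is represented as a `(is_case, vaccinated, prior_infection)` tuple; the function name, the record layout, and the counts in the example are hypothetical and are not output from the simulations. The risk-adjusted analyses, which use logistic regression, are not covered here.

```python
from collections import Counter


def ve_from_odds_ratio(records, exclude_prior_infection=False):
    """Estimate VE as 1 - OR from (is_case, vaccinated, prior_infection) tuples.

    With exclude_prior_infection=True, anyone flagged as previously infected
    (e.g. by a serologic test for non-vaccine antibodies) is dropped first.
    """
    counts = Counter()
    for is_case, vaccinated, prior_infection in records:
        if exclude_prior_infection and prior_infection:
            continue
        counts[(is_case, vaccinated)] += 1

    vacc_cases = counts[(True, True)]
    unvacc_cases = counts[(True, False)]
    vacc_controls = counts[(False, True)]
    unvacc_controls = counts[(False, False)]
    if unvacc_cases == 0 or vacc_controls == 0 or unvacc_controls == 0:
        raise ValueError("odds ratio undefined: an empty cell in the 2x2 table")

    # Odds of vaccination among cases divided by odds of vaccination among controls.
    odds_ratio = (vacc_cases / unvacc_cases) / (vacc_controls / unvacc_controls)
    return 1.0 - odds_ratio


# Hypothetical sample: half of the unvaccinated controls were previously infected.
records = (
    [(True, True, False)] * 5        # vaccinated cases
    + [(True, False, False)] * 50    # unvaccinated cases
    + [(False, True, False)] * 100   # vaccinated controls, never infected
    + [(False, False, False)] * 50   # unvaccinated controls, never infected
    + [(False, False, True)] * 50    # unvaccinated controls, previously infected
)

ve_all = ve_from_odds_ratio(records)
ve_restricted = ve_from_odds_ratio(records, exclude_prior_infection=True)
```

In this constructed example the previously infected controls are all unvaccinated, so including them inflates the odds ratio and lowers the VE estimate relative to the restricted analysis.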
Similar results are found in an additional analysis with the same VESP but different VES and VEP (Figure S1). Note, in these and other simulations with high VE, when the number of cases is very low at either the beginning or end of the epidemic, imprecision can result in VE estimates of 1 (if all cases are by chance unvaccinated). ![Figure S1](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F8.medium.gif) [Figure S1](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F8) Figure S1 Vaccine efficacy against symptomatic disease for scenario 5 with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F1) Figure 1. Vaccine efficacy against symptomatic disease for scenario 1 with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). In scenario 2, with a lower value of VES and VESP of 0.7, the first and second analyses that do not exclude prior infection are biased further downwards than in scenario 1 (i.e. as low as 0.36 225 days from vaccination when R = 1.5); this bias also occurs earlier than in scenario 1 and for both R values (Figure 2) because the epidemic is larger due to lower VE. 
In addition, in this scenario, the third analysis that excludes those with prior infection but does not control for risk has a more pronounced bias (lowest value of 0.64 compared to true VESP of 0.7 on day 225 when R=1.5) than in scenario 1 (lowest value of 0.94 compared to true VESP of 0.95 on day 200 when R=1.5). ![Figure 2](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F2.medium.gif) [Figure 2](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F2) Figure 2 Vaccine efficacy against symptomatic disease for scenario 2 with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). In scenario 3, which models an all-or-nothing vaccine mechanism instead of a leaky mechanism, excluding prior infections results in a bias upward away from the true VESP of 0.9 (Figure 3), with some values approaching 1. This bias is more pronounced in the higher R simulations, on later days, and for lower values of VESP; for example, on day 200 in the R=1.5 simulations, the VESP from the analysis excluding prior infection and adjusting for risk is 0.84, compared to the true value of 0.7 (Figure S2) ![Figure S2](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F9.medium.gif) [Figure S2](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F9) Figure S2 Vaccine efficacy against symptomatic disease for scenario 6 with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. 
Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure 3](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F3.medium.gif) [Figure 3](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F3) Figure 3 Vaccine efficacy against symptomatic disease for scenario 3 with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). In scenario 4, we see that the degree of spurious waning bias increases with the number of cumulative cases *since vaccination* (Figure 4). This trend occurs because the bias is driven by differential depletion of susceptibles between vaccinated and unvaccinated individuals. In the simulations with 0% prior infection at the time of vaccination, the epidemic and vaccination begin simultaneously; thus, when evaluating VE at later dates, because the vaccine reduces risk of infection, those who have been infected prior to the date of interest are more likely to be unvaccinated, causing bias. Assuming prior infection doesn’t affect the decision to be vaccinated, in the simulations where the epidemic begins prior to vaccination, the distribution of vaccination status among those infected becomes more balanced because the infections prior to vaccination are expected to be evenly split between vaccinated and unvaccinated individuals. This is why the bias increases with cumulative cases since vaccination began, rather than with overall cumulative cases (i.e. before and after vaccination). 
The cumulative cases since vaccination is a function of many variables, including timing of vaccination relative to the epidemic, force of infection, and VE values. ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F4/graphic-6.medium.gif) [](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F4/graphic-6) ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F4/graphic-7.medium.gif) [](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F4/graphic-7) Figure 4 A) Vaccine efficacy against symptomatic disease for scenario 4 with a test-negative design. Columns are days since vaccination, and rows are the proportion infected before vaccination. Median and IQR of 100 simulations shown. Number of cases refers to the number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the total number of cases of COVID-19 by that day since vaccination (denominator 5000). B) Median cumulative incidence proportion (symptomatic infection) over time since vaccination for the four values of the proportion recovered on vaccination day (black vertical line). Note the probability of symptoms for unvaccinated individuals is 50%. In the cohort study analysis, which under our assumptions is equivalent to an RCT, we find similar trends to those observed in the TND (Figures 5-7); however, the cohort studies that do not exclude those with prior infection are less biased than equivalent TND studies for each scenario. In scenario 1, with the highest VESP, the bias is negligible. The bias is smaller because in cohort studies and RCTs, those with past symptomatic infection are censored at the time of infection, meaning fewer people are incorrectly treated as still at risk in the analysis. 
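The direction of the biases reported in this section can be reproduced qualitatively with a closed-form sketch. It is a deliberately simplified illustration, not the network model: it assumes a constant hazard of infection for unvaccinated susceptibles, equal-sized vaccinated and unvaccinated groups, complete immunity after infection, all infections detected, and controls drawn from everyone who is not a case on the sampling day (with daily incidence small enough that this is approximately the whole group). The function and parameter names are hypothetical.

```python
import math


def ve_estimates(ve, hazard, t, mechanism):
    """Return (VE ignoring prior infection, VE excluding prior infection) at day t.

    ve        true vaccine efficacy against infection
    hazard    constant daily hazard for unvaccinated susceptibles (assumed)
    mechanism "leaky" or "all_or_nothing"
    """
    # Unvaccinated people who have not yet been infected by day t.
    never_infected_u = math.exp(-hazard * t)
    case_rate_u = hazard * never_infected_u

    if mechanism == "leaky":
        # Every vaccinee faces the reduced hazard, so depletion is slower.
        never_infected_v = math.exp(-hazard * (1.0 - ve) * t)
        case_rate_v = hazard * (1.0 - ve) * never_infected_v
    elif mechanism == "all_or_nothing":
        # A fraction ve is never infected; the rest deplete like the unvaccinated.
        at_risk_v = (1.0 - ve) * math.exp(-hazard * t)
        never_infected_v = ve + at_risk_v
        case_rate_v = hazard * at_risk_v
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    # Controls are (approximately) the whole group when prior infection is ignored.
    or_ignoring_prior = case_rate_v / case_rate_u
    # Controls are the never-infected when prior infection is excluded.
    or_excluding_prior = (case_rate_v / never_infected_v) / (case_rate_u / never_infected_u)
    return 1.0 - or_ignoring_prior, 1.0 - or_excluding_prior


leaky_ignoring, leaky_excluding = ve_estimates(0.7, 0.002, 200, "leaky")
aon_ignoring, aon_excluding = ve_estimates(0.7, 0.002, 200, "all_or_nothing")
```

Under these assumptions the leaky estimate that ignores prior infection is $1 - (1 - VE)e^{\lambda \cdot VE \cdot t}$, which falls over time, while excluding prior infection recovers the true VE. For the all-or-nothing mechanism the pattern reverses: ignoring prior infection recovers the true VE, and excluding it gives $VE / (VE + (1 - VE)e^{-\lambda t})$, which rises toward 1.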
![Figure 5](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F5.medium.gif) [Figure 5](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F5) Figure 5 Vaccine efficacy against symptomatic disease for scenario 1 with a cohort/RCT study design. Columns are days since vaccination, and rows are values of R0. Median and IQR of 100 simulations shown. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure 6](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F6.medium.gif) [Figure 6](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F6) Figure 6 Vaccine efficacy against symptomatic disease for scenario 2 with a cohort/RCT study design. Columns are days since vaccination, and rows are values of R0. Median and IQR of 100 simulations shown. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure 7](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F7.medium.gif) [Figure 7](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F7) Figure 7 Vaccine efficacy against symptomatic disease for scenario 3 with a cohort study design. Columns are days since vaccination, and rows are values of R0. Median and IQR of 100 simulations shown. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). In sensitivity analyses, we relax assumptions of perfect tests for prior infection and examine lower sensitivity for both cases and controls and lower specificity for cases. We find that while lower sensitivity results in slight downward biases of the estimates, which are more pronounced in scenario 2 than scenario 1 (Figures S3-S4), lower specificity for cases does not induce a bias (Figure S5). 
This is because imperfect specificity only reduces the sample size, but we assume it does not do so differentially by vaccination status. A study of breakthrough infections in Israel found antibody levels on day of diagnosis were not greatly impacted by the current infection, suggesting imperfect specificity may not be a large concern (25). ![Figure S3](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F10.medium.gif) [Figure S3](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F10) Figure S3 Vaccine efficacy against symptomatic disease for scenario 1 with imperfect sensitivity for the test for prior infection with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure S4](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F11.medium.gif) [Figure S4](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F11) Figure S4 Vaccine efficacy against symptomatic disease for scenario 2 with imperfect sensitivity for the test for prior infection with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). 
![Figure S5](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F12.medium.gif) [Figure S5](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F12) Figure S5 Vaccine efficacy against symptomatic disease for scenario 2 with imperfect specificity for cases for the test for prior infection with a test-negative design. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). Finally, in analyses that vary the parameters for reduction in susceptibility following infection and the proportion of the population at high risk, we find similar results to the baseline scenario (Figures S6-7). In analyses with a higher proportion symptomatic, we find less bias as expected given that a smaller proportion of cases will go undetected (Figure S8); we focus here on the cohort/RCT designs in which all symptomatic cases are identified and therefore the proportion symptomatic is a key parameter of interest. ![Figure S6](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F13.medium.gif) [Figure S6](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F13) Figure S6 Vaccine efficacy against symptomatic disease for scenarios 1-3 with a test-negative design, with lower relative reduction in susceptibility following infection compared to baseline parameters. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). 
![Figure S7](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F14.medium.gif) [Figure S7](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F14) Figure S7 Vaccine efficacy against symptomatic disease for scenarios 1-3 with a test-negative design, with lower proportion of the population at high risk compared to baseline parameters. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ![Figure S8](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/07/2021.07.15.21260595/F15.medium.gif) [Figure S8](http://medrxiv.org/content/early/2021/09/07/2021.07.15.21260595/F15) Figure S8 Vaccine efficacy against symptomatic disease for scenarios 1-3 with a cohort/RCT design, with higher proportion symptomatic compared to baseline parameters. Columns are days since vaccination, and rows are values of R. Median and IQR of 100 simulations shown. Number of cases refers to the median number of people with COVID-19 included in that day’s analysis. Cumulative number refers to the median total number of cases of COVID-19 by that day since vaccination (denominator 5000). ## Discussion We find that in scenarios with high baseline VE, differential depletion of susceptibles creates minimal bias in VE estimates and in the time trend of these estimates; therefore, there is little suggestion of spurious waning from comparing later to earlier VE estimates. While it is important to control for known predictors of risk, estimates that do not account for prior infection status will likely not be far off from the truth. In fact, without knowledge of the vaccine mechanism (i.e. 
leaky or all-or-nothing), it may be better to not condition on prior infection status: if the vaccine is leaky, the baseline estimates may be slightly underestimated, but if the vaccine is all-or-nothing, the adjusted estimates will overestimate the true VE. This upward bias occurs because excluding people with past infection with an all-or-nothing vaccine removes people for whom the vaccine did not work at all and focuses the analysis on those for whom the vaccine may be effective; with leaky vaccines, the individuals who are removed in the adjusted analysis are random (after accounting for risk factors). Evaluating how the estimates from different analyses change over time could give potential insight into the type of vaccine mechanism. Because the bias from failing to exclude prior infection in the analysis of a leaky vaccine with high initial VE is expected to be small under the null hypothesis of no waning VE, if the VE appears to wane substantially, this finding is likely not entirely due to bias. If true waning occurs, spurious waning bias may become a more relevant consideration (as in scenarios with a lower baseline VE); that is, estimates may reflect a combination of real and spurious waning. Six month efficacy results from the mRNA vaccine trials show mixed findings regarding waning, with Moderna showing consistent efficacy over time (26) and Pfizer’s estimates slightly declining (27). It is challenging to disentangle if this decline is due to lower effectiveness against variants, true waning, spurious waning, or some combination of these factors; given the minimal bias found in our RCT-like analysis of vaccines with high VE (Figure 5), our findings suggest the decline is likely not due to spurious waning alone. 
Similarly, spurious waning is likely not the only cause of the declines in effectiveness observed in Israel, given the high effectiveness estimated when vaccines were first given (19,20), the magnitude of the declines, and the fact that they occurred following a period of low incidence (28).

If baseline VE is lower, the bias over time for leaky vaccines is larger and ideally should be corrected if the mechanism is known to be leaky. However, leaky and all-or-nothing mechanisms are two extremes; in reality, vaccines will fail to take in some individuals due to improper handling or injection, so most vaccines likely fall between the two, acting as “leaky or nothing”. By examining both mechanisms, our analyses show the range of possible biases. In the absence of other sources of bias, conducting analyses both unadjusted and adjusted for past infection could give lower and upper bounds for the true VE. Studies of VE over time should therefore enroll individuals regardless of prior infection history but also collect information on this critical variable for use in the analysis; when possible, prior infection status should be assessed using serology as even an imperfect serologic test will improve sensitivity over self-report alone.

This study has several limitations. First, we make many simplifying assumptions in the model. For example, we assume all individuals are grouped into one large community and do not examine the potential impact of geographic heterogeneity. Other studies have shown epidemic dynamics due to differences in geography are important to control for in vaccine (29) and serologic (5) studies. We also assume perfect sensitivity and specificity of virologic tests, as implications of these parameters have been explored in detail previously (30,31). There are many potential biases in studies of vaccine effectiveness, which are described in detail in World Health Organization guidance (30); here we focus specifically on spurious waning bias from differential depletion of susceptibles.
While we incorporate heterogeneity in risk of acquiring infection, we do not model differences in risk of transmitting infection (e.g. due to host factors). Second, using serologic tests to identify prior infection is subject to error from imperfect test characteristics and waning of antibodies over time. However, we find only small biases in VE estimates from imperfect sensitivity, and information on past infection can also be obtained through self-report or medical records. Third, as described above, we assume random vaccination and no unmeasured confounding; the strategies discussed here alone do not address most other sources of potential confounding, which are important to account for in analyses, particularly given that vaccine rollout in some cases prioritized those at highest risk to receive vaccines first. Fourth, we simulated epidemics with higher R values than much of the United States experienced during most of the pandemic to uncover scenarios where spurious waning might be of concern (32). These values should not affect the conclusions from the simulations, as we find that the main determinant of the extent and magnitude of bias is the proportion of the population that has been infected since vaccination, which is influenced by a combination of several factors, including R values, the point in the epidemic trajectory when the vaccine was introduced, and prevalence of high vs low risk individuals in the population. Finally, we assume no true waning or other reasons for decreased effectiveness, such as new variants; future research should explore methods for disentangling these potential explanations for observed declines in effectiveness over time. Assessing duration of protection from COVID-19 vaccines is important for anticipating future dynamics of this pandemic. Here we have outlined circumstances under which bias can arise in these estimates and identified approaches to alleviate these biases. ## Data Availability Code is available on github. 
[https://github.com/rek160/spurious-waning](https://github.com/rek160/spurious-waning)

## Funding

This work was supported by the U.S. National Cancer Institute Seronet cooperative agreement U01CA261277. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

## Conflicts of interest

Dr. Lipsitch reports consulting/honoraria from Bristol Myers Squibb, Sanofi Pasteur, and Merck, as well as a grant through his institution, unrelated to COVID-19, from Pfizer. He has served as an unpaid advisor related to COVID-19 to Pfizer, One Day Sooner, Astra-Zeneca, Janssen, and COVAX (United Biomedical). Dr. Kahn discloses consulting fees from Partners In Health.

## Acknowledgments

We thank Rachel Slayton and Matt Biggerstaff for helpful comments and discussion.

## Footnotes

* *Disclaimer: The findings and conclusions in this report are those of the author(s) and do not necessarily represent the official position of the Centers for Disease Control and Prevention.*
* Received July 15, 2021.
* Revision received September 6, 2021.
* Accepted September 7, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory

This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

## References

1. Lipsitch M, Goldstein E, Ray GT, et al. Depletion-of-susceptibles bias in influenza vaccine waning studies: how to ensure robust results. Epidemiol. Infect. 2019;147:e306.
2. Kahn R, Hitchings M, Wang R, et al. Analyzing Vaccine Trials in Epidemics With Mild and Asymptomatic Infection. Am. J. Epidemiol. 2019;188(2):467–474.
3. Smith PG, Rodrigues LC, Fine PEM. Assessment of the Protective Efficacy of Vaccines against Common Diseases Using Case-Control and Cohort Studies. International Journal of Epidemiology. 1984;13(1):87–93. ([http://dx.doi.org/10.1093/ije/13.1.87](http://dx.doi.org/10.1093/ije/13.1.87))
4. Lewnard JA, Tedijanto C, Cowling BJ, et al. Measurement of Vaccine Direct Effects Under the Test-Negative Design. Am. J. Epidemiol. [electronic article]. 2018;187(12). ([https://pubmed.ncbi.nlm.nih.gov/30099505/](https://pubmed.ncbi.nlm.nih.gov/30099505/)). (Accessed August 5, 2021)
5. Kahn R, Kennedy-Shaffer L, Grad YH, et al. Potential Biases Arising from Epidemic Dynamics in Observational Seroprotection Studies. Am. J. Epidemiol. [electronic article]. 2020;([http://dx.doi.org/10.1093/aje/kwaa188](http://dx.doi.org/10.1093/aje/kwaa188))
6. Meyers LA, Pourbohloul B, Newman MEJ, et al. Network theory and SARS: predicting outbreak diversity. J. Theor. Biol. 2005;232(1):71–81.
7. Chin T, Kahn R, Li R, et al. US-county level variation in intersecting individual, household and community characteristics relevant to COVID-19 and planning an equitable response: a cross-sectional analysis. BMJ Open. 2020;10(9):e039886.
8. CDC. Science Brief: Community Use of Cloth Masks to Control the Spread of SARS-CoV-2. 2021;([https://www.cdc.gov/coronavirus/2019-ncov/science/science-briefs/masking-science-sars-cov2.html](https://www.cdc.gov/coronavirus/2019-ncov/science/science-briefs/masking-science-sars-cov2.html)). (Accessed May 31, 2021)
9. Kissler SM, Kishore N, Prabhu M, et al. Reductions in commuting mobility correlate with geographic differences in SARS-CoV-2 prevalence in New York City. Nat. Commun. 2020;11(1):1–6.
10. CDC. Interim Guidance on Ending Isolation and Precautions for Adults with COVID-19. 2021;([https://www.cdc.gov/coronavirus/2019-ncov/hcp/duration-isolation.html](https://www.cdc.gov/coronavirus/2019-ncov/hcp/duration-isolation.html)). (Accessed May 20, 2021)
11. Abu-Raddad LJ, Chemaitelly H, Coyle P, et al. SARS-CoV-2 antibody-positivity protects against reinfection for at least seven months with 95% efficacy. EClinicalMedicine. 2021;35:100861.
12. Cavanaugh AM. Reduced Risk of Reinfection with SARS-CoV-2 After COVID-19 Vaccination — Kentucky, May–June 2021. MMWR Morb. Mortal. Wkly. Rep. [electronic article]. 2021;70. ([http://www.cdc.gov/mmwr/volumes/70/wr/mm7032e1.htm](http://www.cdc.gov/mmwr/volumes/70/wr/mm7032e1.htm)). (Accessed August 12, 2021)
13. Baden LR, El Sahly HM, Essink B, et al. Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine. N. Engl. J. Med. 2021;384(5):403–416.
14. Sadoff J, Gray G, Vandebosch A, et al. Safety and Efficacy of Single-Dose Ad26.COV2.S Vaccine against Covid-19. N. Engl. J. Med. [electronic article]. 2021;([http://dx.doi.org/10.1056/NEJMoa2101544](http://dx.doi.org/10.1056/NEJMoa2101544))
15. Polack FP, Thomas SJ, Kitchin N, et al. Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine. N. Engl. J. Med. 2020;383(27):2603–2615.
16. Voysey M, Costa Clemens SA, Madhi SA, et al. Single-dose administration and the influence of the timing of the booster dose on immunogenicity and efficacy of ChAdOx1 nCoV-19 (AZD1222) vaccine: a pooled analysis of four randomised trials. Lancet. 2021;397(10277):881–891.
17. Halloran ME, Longini IM, et al. Design and Analysis of Vaccine Studies. New York, NY: Springer; 2010.
18. Thompson MG, Burgess JL, Naleway AL, et al. Interim Estimates of Vaccine Effectiveness of BNT162b2 and mRNA-1273 COVID-19 Vaccines in Preventing SARS-CoV-2 Infection Among Health Care Personnel, First Responders, and Other Essential and Frontline Workers — Eight U.S. Locations, December 2020–March 2021. MMWR. Morbidity and Mortality Weekly Report. 2021;70(13):495–500. ([http://dx.doi.org/10.15585/mmwr.mm7013e3](http://dx.doi.org/10.15585/mmwr.mm7013e3))
19. Dagan N, Barda N, Kepten E, et al. BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting. N. Engl. J. Med. 2021;384(15):1412–1423.
20. Haas EJ, Angulo FJ, McLaughlin JM, et al. Impact and effectiveness of mRNA BNT162b2 vaccine against SARS-CoV-2 infections and COVID-19 cases, hospitalisations, and deaths following a nationwide vaccination campaign in Israel: an observational study using national surveillance data. Lancet. 2021;397(10287):1819–1829.
21. Regev-Yochay G, Amit S, Bergwerk M, et al. Decreased Infectivity Following BNT162b2 Vaccination. 2021;([https://papers.ssrn.com/abstract=3815668](https://papers.ssrn.com/abstract=3815668)). (Accessed May 20, 2021)
22. Sullivan SG, Tchetgen Tchetgen EJ, Cowling BJ. Theoretical Basis of the Test-Negative Study Design for Assessment of Influenza Vaccine Effectiveness. Am. J. Epidemiol. [electronic article]. 2016;184(5). ([https://pubmed.ncbi.nlm.nih.gov/27587721/](https://pubmed.ncbi.nlm.nih.gov/27587721/)). (Accessed May 24, 2021)
23. Foppa IM, Haber M, Ferdinands JM, et al. The case test-negative design for studies of the effectiveness of influenza vaccine. Vaccine [electronic article]. 2013;31(30). ([https://pubmed.ncbi.nlm.nih.gov/23624093/](https://pubmed.ncbi.nlm.nih.gov/23624093/)). (Accessed May 20, 2021)
24. Hansen CH, Michlmayr D, Gubbels SM, et al. Assessment of protection against reinfection with SARS-CoV-2 among 4 million PCR-tested individuals in Denmark in 2020: a population-level observational study. Lancet. 2021;397(10280):1204–1212.
25. Bergwerk M, Gonen T, Lustig Y, et al. Covid-19 Breakthrough Infections in Vaccinated Health Care Workers. N. Engl. J. Med. [electronic article]. 2021;([http://dx.doi.org/10.1056/NEJMoa2109072](http://dx.doi.org/10.1056/NEJMoa2109072))
26. Moderna Reports Second Quarter Fiscal Year 2021 Financial Results and Provides Business Updates. ([https://investors.modernatx.com/news-releases/news-release-details/moderna-reports-second-quarter-fiscal-year-2021-financial](https://investors.modernatx.com/news-releases/news-release-details/moderna-reports-second-quarter-fiscal-year-2021-financial)). (Accessed August 5, 2021)
27. Thomas SJ, Moreira ED, Kitchin N, et al. Six Month Safety and Efficacy of the BNT162b2 mRNA COVID-19 Vaccine. medRxiv. 2021;2021.07.28.21261159.
28. Goldberg Y, Mandel M, Bar-On YM, et al. Waning immunity of the BNT162b2 vaccine: A nationwide study from Israel. medRxiv. 2021;2021.08.24.21262423.
29. Kahn R, Hitchings M, Bellan S, et al. Impact of stochastically generated heterogeneity in hazard rates on individually randomized vaccine efficacy trials. Clin. Trials. 2018;15(2):207–211.
30. Larremore DB, Wilder B, Lester E, et al. Test sensitivity is secondary to frequency and turnaround time for COVID-19 screening. Science Advances. 2021;7(1):eabd5393.
31. Borremans B, Gamble A, Prager KC, et al. Quantifying antibody kinetics and RNA detection during early-phase SARS-CoV-2 infection by time since symptom onset. 2020;([https://elifesciences.org/articles/60122](https://elifesciences.org/articles/60122)). (Accessed May 20, 2021)
32. covidestim: COVID-19 nowcasting. ([https://covidestim.org/](https://covidestim.org/)). (Accessed May 26, 2021)
33. Buitrago-Garcia D, Egli-Gany D, Counotte MJ, et al. Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis. PLoS Med. 2020;17(9):e1003346.
34. McAloon C, Collins Á, Hunt K, et al. Incubation period of COVID-19: a rapid systematic review and meta-analysis of observational research. BMJ Open. 2020;10(8):e039652.

[1]: /embed/inline-graphic-1.gif