Abstract
Researchers frequently employ difference-in-differences (DiD) to study the impact of public health interventions on infectious disease outcomes. DiD assumes that treatment and non-experimental comparison groups would have moved in parallel in expectation, absent the intervention (the “parallel trends assumption”). However, the plausibility of the parallel trends assumption in the context of infectious disease transmission is not well understood. Our work bridges this gap by formalizing the epidemiological assumptions required for common DiD specifications, positing an underlying Susceptible-Infectious-Recovered (SIR) data-generating process. We demonstrate that popular specifications can encode strict epidemiological assumptions. For example, DiD modeling incident case numbers or rates as outcomes will produce biased treatment effect estimates unless untreated potential outcomes for treatment and comparison groups come from a data-generating process with the same initial infection and equal transmission rates at each time step. Applying a log transformation or modeling log growth allows for different initial infection rates under an “infinite susceptible population” assumption, but invokes conditions on transmission parameters. We then propose alternative DiD specifications based on epidemiological parameters – the effective reproduction number and the effective contact rate – that are both more robust to differences between treatment and comparison groups and can be extended to complex transmission dynamics. Because the power difference between incidence and log incidence models is minimal, we recommend the more robust log specification as a default. Our alternative specifications have lower power than incidence or log incidence models, but higher power than log growth models. We illustrate the implications of our work by re-analyzing published studies of COVID-19 mask policies.
Significance Statement
Difference-in-differences is a popular observational study design for policy evaluation. However, it may not perform well when modeling infectious disease outcomes. Although many COVID-19 DiD studies in the medical literature have used incident case numbers or rates as the outcome variable, we demonstrate that this and other common model specifications may encode strict epidemiological assumptions as a result of non-linear infectious disease transmission. We unpack the assumptions embedded in popular DiD specifications assuming a Susceptible-Infected-Recovered data-generating process and propose more robust alternatives, modeling the effective reproduction number and effective contact rate.
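The contrast between incidence and log incidence specifications described in the abstract can be illustrated with a minimal sketch. It assumes a deterministic discrete-time SIR model, two groups that share the same transmission and recovery rates but differ in initial infections, and no intervention at all, so the true treatment effect is zero. The function names (`simulate_sir_incidence`, `did_estimate`) and all parameter values are hypothetical choices made for illustration and are not taken from the paper; the large population size stands in for the “infinite susceptible population” assumption.

```python
import math


def simulate_sir_incidence(beta, gamma, population, initial_infected, n_steps):
    """Deterministic discrete-time SIR; returns new infections per time step."""
    susceptible = population - initial_infected
    infected = initial_infected
    incidence = []
    for _ in range(n_steps):
        new_infections = beta * susceptible * infected / population
        recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - recoveries
        incidence.append(new_infections)
    return incidence


def did_estimate(treated, comparison, pre, post, transform=lambda y: y):
    """Two-period, two-group DiD contrast on a (possibly transformed) outcome."""
    treated_change = transform(treated[post]) - transform(treated[pre])
    comparison_change = transform(comparison[post]) - transform(comparison[pre])
    return treated_change - comparison_change


# Illustrative (assumed) parameters: identical transmission dynamics,
# different initial infections, and no intervention in either group.
BETA, GAMMA, POPULATION, N_STEPS = 0.3, 0.2, 1_000_000, 20
treated = simulate_sir_incidence(BETA, GAMMA, POPULATION, 10, N_STEPS)
comparison = simulate_sir_incidence(BETA, GAMMA, POPULATION, 100, N_STEPS)

PRE, POST = 5, 15  # arbitrary pre- and post-period time steps
did_incidence = did_estimate(treated, comparison, PRE, POST)
did_log_incidence = did_estimate(treated, comparison, PRE, POST, transform=math.log)

print(f"DiD on incidence:     {did_incidence:.3f}")
print(f"DiD on log incidence: {did_log_incidence:.5f}")
```

In this setup the incidence contrast is far from zero even though nothing was done to either group, because the group with more initial infections grows faster in absolute terms. The log incidence contrast stays close to zero as long as susceptible depletion is negligible, which is the role the large population plays here.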
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by the Centers for Disease Control and Prevention through the Council of State and Territorial Epidemiologists (NU38OT000297-02).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
S.F. and A.M.B. designed and performed research; S.F. analyzed data; and S.F. and A.M.B. wrote the manuscript.
‖ Note that in the special case of constant exponential growth, , this condition reduces to .
Data Availability
All data produced are available online and from previously published work.