Abstract
Growing evidence on higher transmissibility of novel variants of the SARS-CoV-2 coronavirus is raising alarm in many countries. We provide an early assessment of population-level effects from confirmed cases of SARS-CoV-2 variants of concern (VOC) on 7-day incidence rates in 401 German cities and regions. Estimates reveal that the 204 cities and regions with at least one confirmed VOC case by February 4, 2021 have, on average, a 15% higher 7-day incidence rate after VOC emergence compared to cities and regions without confirmed cases. Effects are considerably larger (40% and more) for sub-sample estimates of regions with high VOC counts. Considering time heterogeneity in the estimations further shows that VOC effects on incidence rates grow over time.
Newly emerging variants of concern (VOC) of the SARS-CoV-2 coronavirus, including the British (B.1.1.7), South African (B.1.351) and Brazilian (P.1) mutations, are a new threat for public health systems around the globe. This is mainly because first epidemiological evidence points to an increased transmissibility of VOC compared to previously circulating strains of SARS-CoV-2 (Volz et al., 2021; Tang et al., 2021).
Regions and countries first affected by these VOC experienced a significant increase in infection cases in late 2020 and early 2021 – even with high levels of social distancing in place. Quantitative estimates point to a 50-70% higher human-to-human infection probability for the British variant (Volz et al., 2021; Davies et al., 2020). There is also early evidence for higher death rates in the UK (Iacobucci, 2021). Seen from a pandemic perspective, the rapid local spread of novel virus strains enhances the risk of international disease transmission (Du et al., 2021). By mid-January 2021, all three SARS-CoV-2 VOC were confirmed in Germany, mainly imported through travelers returning from countries with significant VOC spread.
We provide estimates of the epidemiological effects of these novel strains on the 7-day incidence rate, i.e. the number of newly reported SARS-CoV-2 infections in the last seven days per 100,000 population, in German cities and regions (NUTS-3 level). The 7-day incidence rate is the main indicator used for disease surveillance and public health decisions in Germany. While we employ a similar identification strategy as in (Volz et al., 2021), our empirical setup differs from previous analyses mainly by the fact that we observe the within-country epidemic VOC spread at a very initial stage. While there are several challenges on the estimation approach, which we cannot fully rule out at this stage, our analysis provides important novel insights for countries in a similar situation to Germany, where VOC are not yet dominant but spreading (Robert Koch Institute, 2021a). Our findings may help to evaluate the epidemiological risks from VOC.
Available data on VOC spread in Germany are still very limited, and prior virological analyses have mainly been based on ad-hoc sampling procedures from selective laboratories (Robert Koch Institute, 2021a). To assess the role of VOC for local infection dynamics in German NUTS-3 regions, we use an event database that collects information on confirmed cases of the three most concerning VOC (B.1.1.7, B.1.351, P.1) together with their reporting dates in a public crowd-sourcing project (Römer et al., 2021). Case documentation is based on newspaper and public health reports. We have cross-checked data consistency by retracing individual cases and their timing from the documented sources and have further validated the data by comparing them with aggregate reports from the Robert Koch Institute (RKI), which is in charge of disease control in Germany.
By February 4, in 204 out of 401 NUTS-3 regions at least one case of a SARS-CoV-2 VOC infection had been confirmed and the number of affected regions had been growing exponentially by the end of January (Figure 1A). Figure 1C displays the spatial distribution in form of cumulative absolute confirmed VOC cases by February 4, 2021. The map reveals that Flensburg and a cluster of three cities/regions (Cologne, Leverkusen and Düren) in North-Rhine Westphalia (NRW) are among the 5% most affected regions in Germany.
A first inspection of incidence rates in German regions with confirmed VOC cases raises concerns that virus mutations already drive local infection dynamics. One prime example is the 90,000-inhabitant city of Flensburg (Schleswig-Holstein). While local incidence rates were constantly below the German average in 2020, the 7-day incidence rate increased drastically in January 2021 (see Figure 1B). According to local health authorities, new infections mainly occurred at New Year’s Eve parties that were illegal under the prevailing lockdown rules. On January 24, 2021, the British B.1.1.7 VOC was confirmed for several SARS-CoV-2 infections related to these parties, as well as for unrelated cases. By February 4, the number of VOC cases in Flensburg had grown to 146.
We merge event data on confirmed VOC with daily SARS-CoV-2 infections (by the onset of symptoms) for each of the 401 NUTS-3 regions provided by Robert Koch Institute (2021b) and further regional characteristics as documented in Mitze et al. (2020). We estimate the infection effects of VOC for two case studies (Flensburg and a cluster of NRW cities) by means of the synthetic control method. We also compare all regions with at least one confirmed VOC case by means of difference-in-difference regressions and a complementary panel event study. We provide an extended method section in the SI Appendix including a description of identification challenges and how we approach them.
Our analysis needs to be interpreted carefully for the following reasons: First, VOC data are limited as we only know the day of reporting but not the de facto arrival of virus mutations in a region. Second, in international comparison, VOC testing rates are still relatively low in Germany and we thus do not know the latent degree of VOC diffusion at the regional level. Third, we estimate the link between confirmed VOC cases and the 7-day incidence rates, which allows us to answer whether VOC are linked to an increase in regional incidence rates or not; but this only provides an indirect measure of whether VOC are more infectious than previously existing virus strains. Finally, we cannot control for overall SARS-CoV-2 test intensities at the regional level. These may be higher in regions with a confirmed VOC and may thus upwardly bias the development of incidence rates relative to comparison regions if more tests reveal more infections.
Results
Synthetic control method (SCM)
Figure 2 shows SCM-based treatment effects for single (Flensburg) and multiple treated units (cluster of NRW cities/regions) together with 90% confidence intervals. In both cases, we set the start of the treatment period to January 5, 2021, which is at least one week before the confirmation of the first VOC cases in all four treated regions. Because we use infection data recorded by symptom onset, we argue that starting the treatment period at least one week before the first confirmed VOC case is sufficient to account for incubation times (Lauer et al., 2020). Evaluated against the counterfactual infection development, the observed 7-day incidence rate for Flensburg becomes significantly larger relative to the synthetic control group right after treatment start. This timing is in line with our prior expectations, knowing that early infections occurred at illegal parties on December 31, 2020.
We measure effect size as percentage difference in outcomes scaled to 100 in the last pre-treatment period. VOC spreading through illegal parties in Flensburg hence points to a tripling in the incidence rate (20 days after treatment) compared to the counterfactual situation. The obtained results are found to be robust to alterations in the donor pool for Flensburg’s synthetic control group. In all cases we have ensured that only regions without a confirmed VOC case enter the donor pool. The SCM results for the NRW cluster in Figure 2B indicate, however, that Flensburg may be a very exceptional case since illegal parties may have served as super-spreading events for VOC infections. Treatment effect estimates for the NRW cluster report an increase in the 7-day incidence rate by approx. 40% (after 20 days). These lower VOC effects come closer to results from earlier studies (Volz et al., 2021; Davies et al., 2020).
Difference-in-difference estimation (DiD)
To comprehensively assess the VOC effects across German regions, Table 1 reports results from DiD estimations, which consider all regions with a confirmed VOC case as part of the treatment group. The Notes to Table 1 contain a short description and the SI Appendix gives details of our estimation approach including our controls. DiD allows us to identify average effects over regional units with VOC cases over the treatment period. Given uncertainty about the exact timing of treatment start, we report treatment effects for different time windows: in the baseline specification, we set the start to the date of the first VOC reporting in the region; we then extend the treatment period to 7, 14, and 21 days before this date to capture latent transmission from an imported VOC case.
In all reported DiD specifications in Table 1, we find a significant epidemiological effect of VOC on the development of the 7-day incidence rate. For the overall sample covering all regions with at least one confirmed VOC case, the estimates in Panel A point to an average increase in the incidence rate by 15 cases per 100,000 population. Evaluated against the average 7-day incidence rate of comparison regions during the treatment period (109), this amounts to an almost equal percentage increase (14%). Effects become larger for sub-sample estimates limiting treated regions to those with a VOC case confirmed before January 22 in Panel B (i.e. those regions for which we observe at least 14 days of treatment) and to regions with at least 9 confirmed VOC cases in Panel C (i.e. regions belonging to the top 10 percentile of absolute VOC counts). In the latter case, we find an increase in the incidence rate of 37% on average (40 additional cases per 100,000 population relative to 109 in comparison regions). The extension of treatment periods by 7, 14 and 21 days results in lower effect estimates and thus points to the fact that infection dynamics mainly start around the date of VOC confirmation.
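As a quick check of the percentages quoted above, dividing the estimated level effects (15 and 40 additional cases per 100,000 population) by the average 7-day incidence rate of 109 in comparison regions gives:
$$\frac{15}{109} \approx 0.138 \approx 14\%, \qquad \frac{40}{109} \approx 0.367 \approx 37\%.$$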
Panel event study (PES)
To further explore the extent of dynamically evolving infection effects we translate the DiD framework into a PES setup. Estimated daily treatment effects in Figure 3 show that 20 days after treatment start the 7-day incidence rate in treated regions has almost doubled compared to their last pre-treatment observation. In the replication files (will be available with published version), we show that: first, effects are largest for the sub-sample of regions with at least 9 confirmed VOC cases, which is in line with the DiD estimates; second, effects are smaller for the sub-sample of regions with a VOC before January 22. These results may indicate that tracing and containment strategies by local health authorities worked for the first imported VOC cases but became less effective over time. Importantly, the daily treatment effects visualized in Figure 3 also provide evidence for the absence of early anticipation effects from latent confounding events in treated regions and thus validate our panel regression approach.
Discussion
This early assessment of the epidemiological effects of SARS-CoV-2 variants of concern (VOC) in Germany has served two purposes: First, it has provided novel insights into the initial dynamics of VOC spread within a country after international importation. Second, it has contributed to the literature estimating the infection effects of newly emerging virus strains as a means of assessing the relative transmissibility of VOC compared to previously existing virus strains. On average, we find that VOC emergence is associated with an approx. 15-40% increase in the 7-day incidence rate. However, considering that effects dynamically build up over time, we also find that incidence rates may double or even triple at the local level.
As we have stressed, our results are limited in several dimensions. When assessing the country-wide epidemiological effects of VOC, we also have to take into account that the share of VOC in total SARS-CoV-2 infections is considered to be low in Germany (around 6% between January 22 and January 29, 2021; Robert Koch Institute, 2021a). However, we argue that our estimation results are even more alarming regarding the expected future infection effects from VOC: firstly, because the share of VOC in total SARS-CoV-2 infections may still be low in many regions with at least one confirmed VOC case in Germany and, secondly, because the existence of latent VOC cases in comparison regions may lead to an underestimation of effects. Obviously, a major limitation of our early assessment is that we do not have full information on VOC spread. Until such information is available, we hope that our early assessment can fill the eminent knowledge gap regarding the epidemiological effects of VOC and inform health policy authorities about the need for swift actions to control local transmission (Grubaugh et al., 2021).
Data Availability
Data are public. Replication files will be available with the published version.
Acknowledgments
We greatly appreciate helpful comments from Falk Laser. Johannes Rode acknowledges the support of the Chair of International Economics at Technische Universität Darmstadt.
SI-Appendix
Extended Methods
Synthetic control method (SCM)
We use SCM to analyse two case studies for a single treated unit (Flensburg) and multiple treated units (cluster of 3 cities/regions [Cologne, Leverkusen and Düren] in North Rhine-Westphalia). In all four case study regions a SARS-CoV-2 variant of concern (VOC) has been confirmed by genome sequencing, which is used to identify treatment status. VOC are defined as SARS-CoV-2 mutations from the British B.1.1.7, the South African B.1.351 and the Brazilian P.1 variants. The objective of SCM is to compare the development of the 7-day incidence rate (SARS-CoV-2 infections per 100,000 population over the last seven days) after treatment start in the two sets of treated region(s) vis-à-vis a synthetic control group selected from a donor pool of 197 comparison regions without any confirmed VOC case during the entire sample period December 15, 2020 to February 4, 2021. We compute daily treatment effects for a maximum of 31 days and express them as percentage difference to the last pre-treatment observation of the scaled outcome variable (to 100), see Figure 2 in the main text.
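The scaling of effects can be illustrated with a minimal sketch. It assumes one possible reading of the description above: both the treated series and the synthetic control series are rescaled to 100 in the last pre-treatment period, and the effect is their difference in percentage points. The function name `scaled_effect` and the toy numbers are hypothetical and not taken from the replication files.

```python
def scaled_effect(treated, synthetic, last_pre_idx):
    """Difference between treated and synthetic outcomes after both
    series are rescaled to 100 at the last pre-treatment period.

    Assumption: this is one plausible reading of the scaling described
    in the text, not the authors' exact implementation.
    """
    treated_base = treated[last_pre_idx]
    synthetic_base = synthetic[last_pre_idx]
    return [
        100.0 * y / treated_base - 100.0 * s / synthetic_base
        for y, s in zip(treated, synthetic)
    ]


# Hypothetical 7-day incidence rates; index 0 is the last pre-treatment day.
treated_rates = [50.0, 60.0, 150.0]
synthetic_rates = [50.0, 55.0, 50.0]
effects = scaled_effect(treated_rates, synthetic_rates, last_pre_idx=0)
```

In this toy series the treated region ends at three times its pre-treatment level while the synthetic control returns to its own, so the final effect is 200 percentage points.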
We motivate the use of SCM as one element of our empirical identification strategy because the estimation approach has been shown to be a flexible and robust estimation tool that has previously been applied to COVID-related research, for instance, to study the effect of face masks on SARS-CoV-2 infection numbers in Germany (Mitze et al., 2020) and lockdown effectiveness for a counterfactual of Sweden (Cho, 2020) and the USA (Friedson et al., 2021). The key identification approach of SCM is to establish a counterfactual that mimics a situation in which the treatment in treated regions (here: the emergence of VOC cases) would not have taken place. This is implemented by means of creating a synthetic control group consisting of the donor pool of comparison regions and by comparing the outcomes of treated units and the synthetic control after the start of the treatment. The match between treated regions and the synthetic control group is done through a minimum distance approach for a set of predictor variables evaluated along their pre-treatment values for treated regions and those in the donor pool. This ensures that pre-treatment differences in trends of the outcome variable are leveled. A formal description of the estimation approach and the underlying assumptions for effect identification are given in Abadie & Gardeazabal (2003); Abadie et al. (2010); Cavallo et al. (2013).
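The minimum distance idea can be sketched in a few lines. The example below is an illustration only: it finds non-negative donor weights that sum to one and minimize the squared distance between the treated unit's predictors and the weighted donor predictors, using a simple Frank-Wolfe iteration. This is a swapped-in solver chosen for brevity; the SYNTH package used in the paper additionally optimizes predictor importance weights, which is omitted here. All names and numbers are hypothetical.

```python
def squared_distance(treated, donors, weights):
    """Squared distance between treated predictors and the weighted donor mix."""
    fitted = [
        sum(w * donor[k] for w, donor in zip(weights, donors))
        for k in range(len(treated))
    ]
    return sum((f - x) ** 2 for f, x in zip(fitted, treated))


def synth_weights(treated, donors, iterations=2000):
    """Donor weights on the unit simplex via Frank-Wolfe with exact line search.

    Assumption: all predictors are weighted equally (no V matrix).
    """
    n_donors, n_pred = len(donors), len(treated)
    weights = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(iterations):
        fitted = [
            sum(weights[j] * donors[j][k] for j in range(n_donors))
            for k in range(n_pred)
        ]
        resid = [fitted[k] - treated[k] for k in range(n_pred)]
        # Gradient of 0.5 * ||fitted - treated||^2 with respect to each weight.
        grad = [
            sum(resid[k] * donors[j][k] for k in range(n_pred))
            for j in range(n_donors)
        ]
        best = min(range(n_donors), key=grad.__getitem__)
        # Moving weight towards donor `best` shifts the fit in this direction.
        direction = [donors[best][k] - fitted[k] for k in range(n_pred)]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break
        step = min(1.0, -sum(r * d for r, d in zip(resid, direction)) / denom)
        if step <= 0.0:
            break  # no further improvement possible
        weights = [(1.0 - step) * w for w in weights]
        weights[best] += step
    return weights


# Hypothetical predictors: three donor regions, two predictors each.
donor_predictors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
treated_predictors = [0.5, 0.5]
w = synth_weights(treated_predictors, donor_predictors)
```

Because the weights are constrained to be non-negative and to sum to one, the synthetic control is a convex combination of donor regions and cannot extrapolate beyond them.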
For our purpose of estimating the epidemiological effects of emerging VOC in German regions, we adopt and extend the data and estimation setup applied in Mitze et al. (2020). For both SCM applications, we set the start of the treatment period to January 5, 2021 and identify treatment effects of VOC throughout January and early February. In all four cases, the reporting of the first confirmed VOC case took place at least one week after the start of the treatment period (Cologne: January 12, 2021, Leverkusen: January 18, 2021, Düren: January 23, 2021, Flensburg: January 24, 2021). This time lag between the start of the treatment and the reporting of the first VOC case should ensure that latent transmission effects are captured in the estimation. For instance, in the case of Flensburg, VOC infections could be traced back to illegal parties on December 31, 2020 (Ove, 2021). Considering a median incubation time of 5 days for SARS-CoV-2 infections (Lauer et al., 2020), we can thus expect that first VOC effects become visible in the data from January 5, 2021 onwards. This is before the first VOC case was confirmed through genome sequencing on January 24, 2021 for Flensburg.
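The timing argument can be checked with simple date arithmetic. The sketch below uses only the dates stated above; the variable names are illustrative.

```python
from datetime import date, timedelta

# Dates as stated in the text.
exposure_date = date(2020, 12, 31)          # illegal parties in Flensburg
median_incubation = timedelta(days=5)       # Lauer et al. (2020)
treatment_start = exposure_date + median_incubation

first_voc_report = {
    "Cologne": date(2021, 1, 12),
    "Leverkusen": date(2021, 1, 18),
    "Düren": date(2021, 1, 23),
    "Flensburg": date(2021, 1, 24),
}

# Days between treatment start and the first confirmed VOC case per region.
report_lag_days = {
    region: (reported - treatment_start).days
    for region, reported in first_voc_report.items()
}
```

The resulting lags are 7, 13, 18 and 19 days, so every first report falls at least one week after the treatment start.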
Data on reported SARS-CoV-2 infections are taken from the Robert Koch Institute (2021b). For our empirical analysis we use aggregate case numbers for each NUTS-3 region and day tracked on the basis of symptom onset for individual cases rather than the reporting date by local health authorities. This allows us to estimate the transmission timing of SARS-CoV-2 at the regional population level more precisely. We aggregate the data across age groups. Data on confirmed cases of the three novel VOC (B.1.1.7, B.1.351, P.1) together with their reporting dates are gathered from a public crowd-sourcing project (Römer et al., 2021), which bases case documentation on newspaper and public health reports. We have cross-checked data consistency by retracing individual cases and their timing from the documented source information and have conducted additional online searches for selected cases.
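For illustration, the 7-day incidence rate (new infections over the last seven days per 100,000 population) can be computed from a daily case series as in the sketch below. The function name and the case counts are hypothetical; only the population figure of roughly 90,000 for Flensburg comes from the main text.

```python
def seven_day_incidence(daily_cases, population):
    """7-day incidence per 100,000 population for each day.

    Days without a full 7-day window are returned as None
    (an illustrative choice, not taken from the source).
    """
    rates = []
    for t in range(len(daily_cases)):
        if t < 6:
            rates.append(None)
            continue
        window_sum = sum(daily_cases[t - 6 : t + 1])
        rates.append(100_000 * window_sum / population)
    return rates


# Hypothetical daily case counts for a city of 90,000 inhabitants.
example_cases = [9] * 10
example_rates = seven_day_incidence(example_cases, population=90_000)
```

With nine new cases per day, the weekly total of 63 cases corresponds to an incidence of 70 per 100,000 population.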
In the specification of SCM estimation, we account for the autoregressive dynamics of infections by including the 7-day incidence rate and the absolute number of cumulative SARS-CoV-2 infections during the last 3 weeks before treatment start as time-varying predictors. Other time-varying predictors are the average daily temperature for each region during the last 2 weeks and changes in average daily mobility during the last 2 weeks before treatment start. Changes in average daily mobility per region are measured relative to a 2019 (pre-COVID) benchmark period. We use data on daily temperatures from Deutscher Wetterdienst (2021) and data on mobility changes from Statistisches Bundesamt (2021). We further include time-constant cross-sectional predictors characterizing regional demographic structures and the regional health care system as in Mitze et al. (2020) based on data from the INKAR online database of the Federal Institute for Research on Building, Urban Affairs and Spatial Development (INKAR, 2021). We use the latest year available in the database, which is 2017. Employed cross-sectional predictor variables include population density (population per km²), regional settlement structure (categorical dummy), the share of highly educated population (in %), the share of females in the population (in %), the average age of female and male population (in years), old- and young-age dependency ratios (in %), the number of physicians per 10,000 of population and pharmacies per 100,000 of population.
We conduct all SCM estimations in STATA using the SYNTH (Abadie et al., 2020) and SYNTH_RUNNER (Galiani & Quistorff, 2017) packages. Confidence intervals (CIs) are calculated from one-sided pseudo p-values obtained on the basis of comprehensive placebo-in-space tests. The latter tests calculate pseudo-treatment effects for all regions in the donor pool, treating each of the regions as if it had received the treatment of a confirmed VOC case by or after January 5, 2021. One-sided p-values are then calculated as the share of placebo-treatment effects that are larger than the observed treatment effects for treated regions and thus indicate the probability that the increase in the number of SARS-CoV-2 infections was observed by chance given the distribution of pseudo-treatment effects in the donor pool. To account for differences in pre-treatment match quality of the pseudo-treatment effects, only donors with a good fit in the pre-treatment period are considered for inference. Specifically, we do not include placebo effects in the pool for inference if the match quality of the control region, measured in terms of the pre-treatment root mean squared prediction error (RMSPE), is greater than 10 times the match quality of the treated unit (Cavallo et al., 2013).
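The inference logic can be sketched as follows. This is a simplified illustration for a single post-treatment day, not the SYNTH_RUNNER implementation: placebo regions whose pre-treatment RMSPE exceeds 10 times that of the treated unit are dropped, and the one-sided pseudo p-value is the share of remaining placebo effects that are larger than the observed effect. The function name and all numbers are hypothetical.

```python
def pseudo_p_value(treated_effect, treated_pre_rmspe, placebos, max_ratio=10.0):
    """One-sided pseudo p-value from placebo-in-space effects.

    `placebos` is a list of (effect, pre_rmspe) pairs, one per donor region.
    Placebos with a pre-treatment RMSPE above max_ratio times the treated
    unit's RMSPE are excluded from inference.
    """
    kept_effects = [
        effect
        for effect, pre_rmspe in placebos
        if pre_rmspe <= max_ratio * treated_pre_rmspe
    ]
    if not kept_effects:
        raise ValueError("no placebo region passes the match-quality filter")
    larger = sum(1 for effect in kept_effects if effect > treated_effect)
    return larger / len(kept_effects)


# Hypothetical placebo effects (in % of the scaled outcome) and pre-RMSPEs.
placebo_runs = [(10.0, 1.0), (50.0, 3.0), (5.0, 2.0), (60.0, 30.0), (20.0, 4.0)]
p_value = pseudo_p_value(treated_effect=40.0, treated_pre_rmspe=2.0, placebos=placebo_runs)
```

Here the placebo with a pre-treatment RMSPE of 30 is excluded (30 > 10 × 2), and one of the four remaining placebo effects exceeds 40, giving a pseudo p-value of 0.25.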
Robustness.– We mainly perform robustness tests by changing the composition of the donor pool. First, we exclude donor regions that were selected in the baseline SCM estimation. The idea behind this analysis is to preclude unintended selection effects resulting from latent VOC transmissions captured in the overall infection dynamics of donor regions in the pre-treatment period. Second, we reduce the pool of donor regions to those NUTS-3 regions which are located in the same federal state as the treated regions (Schleswig-Holstein for Flensburg and North Rhine-Westphalia for Cologne, Leverkusen and Düren). This approach should minimize differences in public health measures, which are mainly decided under the authority of individual federal states in Germany. While mentioned but not reported in the manuscript in detail, we include all robustness tests in the replication files (will be available with published version).
Difference-in-difference estimation (DiD)
To investigate average and dynamic treatment effects for the entire group of treated regions with at least one confirmed VOC case, we additionally run a series of panel regressions in a DiD and Panel event study (PES) framework. The sample period for the panel regressions includes the time period between November 1, 2020 and February 4, 2021. This ensures that we cover all confirmed VOC cases (in the currently best possible way) together with a sufficient pre-treatment period for each region of at least 3 weeks. As for the case of SCM, we use the 7-day incidence rate as key outcome variable, where the timing of infection is measured in terms of symptom onset rather than reporting by local health authorities. We also use the same set of time-varying predictor variables as described above; cross-sectional predictors for the set of NUTS-3 regions are not included as we account for NUTS-3 region fixed effects in the panel regressions.
Specifically, we run DiD regressions as a two-way fixed effects model of the following general form:
$$IR_{i,t} = \delta \, VOC_{i,t} + \mathbf{X}_{i,t-j}' \Gamma + \sum_{r=1}^{R} \left( \gamma_r \, RegionType_{r(i)} \times t + \psi_r \, RegionType_{r(i)} \times t^2 \right) + \tau_t + \mu_i + \epsilon_{i,t} \tag{1}$$
In equation 1, $IR_{i,t}$ is the 7-day incidence rate observed for NUTS-3 region $i$ and day $t$; the variable $VOC_{i,t}$ is our main treatment indicator, which takes values of 1 from the day onwards for which the first VOC case was confirmed in the region. The coefficient $\delta$ measures the effect of the presence of a VOC on the overall SARS-CoV-2 incidence rate. For the case of a higher transmissibility of VOC compared to previously existing strains of the virus, we expect that the 7-day incidence rate increases in treated regions relative to comparison regions. We additionally test for the presence of a time window of latent transmission effects prior to the first reporting of a VOC (given that genome sequencing to identify mutations may take up to 2 weeks). This is done by moving forward the date when the treatment dummy $VOC_{i,t}$ takes values of 1 for treated regions by 7, by 14 and by 21 days, respectively. Importantly, these extended treatment specifications do not test for early anticipation effects caused by latent confounding factors (this is done in the Panel Event Study), but average estimated effects over a longer treatment period to capture potential latent transmission effects prior to the first VOC confirmation as, for instance, identified for Flensburg in the SCM estimations.
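The construction of the treatment indicator and its earlier-start variants can be sketched as follows. The function name `voc_indicator` and the `shift_days` parameter are hypothetical; the logic follows the description above, with never-treated regions coded as 0 throughout.

```python
from datetime import date, timedelta


def voc_indicator(day, first_report, shift_days=0):
    """Treatment dummy: 1 from the (possibly shifted) treatment start onwards.

    `first_report` is the date of the first confirmed VOC case in the
    region, or None for comparison regions without a confirmed case.
    `shift_days` moves the treatment start earlier (0, 7, 14 or 21 days).
    """
    if first_report is None:
        return 0
    treatment_start = first_report - timedelta(days=shift_days)
    return int(day >= treatment_start)


# Flensburg: first VOC case reported on January 24, 2021.
flensburg_report = date(2021, 1, 24)
baseline = voc_indicator(date(2021, 1, 20), flensburg_report)
shifted_7 = voc_indicator(date(2021, 1, 20), flensburg_report, shift_days=7)
```

On January 20 the baseline dummy is still 0, whereas the specification with a 7-day earlier start already codes the region as treated.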
It is important to control for factors that potentially confound the link between VOC and the overall SARS-CoV-2 incidence rate at the regional level. The set of confounding factors, which we can directly account for, is included in the variable vector $X_{i,t-j}$. Specifically, similar to the SCM application, we control for the number of SARS-CoV-2 cases in region $i$ during the last, the second last and the third last week. We also include a spatially lagged variable covering the number of SARS-CoV-2 cases in region $i$’s (direct) spatial neighbors during the last, the second last and the third last week. A spatial lag is important because infections can easily spread from one region to another region nearby, e.g., due to commuting or general mobility (Kosfeld et al., 2020). Spatial association between regions is measured through first-order contiguity, i.e. whether regions share a common border or not. We row-standardize the resulting spatial weights matrix.
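A minimal sketch of the spatial lag under first-order contiguity with row-standardized weights is given below. With row standardization, the spatial lag of a region equals the average value over its bordering regions. The region labels, neighbor lists and case counts are hypothetical, and regions without neighbors are assigned a spatial lag of zero as an illustrative choice.

```python
def row_standardized_weights(neighbors):
    """Spatial weights from a first-order contiguity list.

    `neighbors` maps each region to the regions sharing a border with it.
    Each row of weights sums to one; regions without neighbors get an
    empty row.
    """
    return {
        region: {nb: 1.0 / len(nbs) for nb in nbs} if nbs else {}
        for region, nbs in neighbors.items()
    }


def spatial_lag(values, weights):
    """Weighted average of neighbors' values for each region."""
    return {
        region: sum(w * values[nb] for nb, w in row.items())
        for region, row in weights.items()
    }


# Hypothetical example: region A borders B and C; B and C only border A.
contiguity = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
weekly_cases = {"A": 10.0, "B": 20.0, "C": 40.0}
weights = row_standardized_weights(contiguity)
lagged_cases = spatial_lag(weekly_cases, weights)
```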
Further, we control for the average temperature and the relative change in average daily mobility at $t-1$, at $t-7$ and at $t-14$ in region $i$. Controlling for mobility is important because lower mobility can be disease mitigating (Xiong et al., 2020; Schlosser et al., 2020). Including mobility effectively controls for lockdown measures implemented during the sample period and how people follow the rules. In addition, we include linear and quadratic time trends for four different region types ($RegionType_{r(i)}$) classified on the basis of the region’s settlement structure including region type 1 (large district-free cities, kreisfreie Städte), type 2 (urban regions, Landkreise), type 3 (rural regions, Landkreise) and type 4 (sparsely populated regions, Landkreise), i.e. $R = 4$. The classification of region types follows the definition of the Federal Institute for Research on Building, Urban Affairs and Spatial Development (INKAR, 2021). $\tau_t$ are time fixed effects for each day in the sample, which for instance control for daily changes in infection levels similar across regions. $\mu_i$ controls for time-constant region fixed effects, which could, e.g., be caused by region-specific SARS-CoV-2 testing intensities. $\epsilon_{i,t}$ is the model’s error term. We cluster standard errors at the NUTS-2 level (each of the 401 NUTS-3 regions belongs to one of the 38 NUTS-2 regions in Germany). We estimate $\delta$ and $\Gamma$ using the REGHDFE package (Correia, 2019) in STATA, which allows us to control for $\tau_t$, $\mu_i$, $\gamma_r$ and $\psi_r$.
Robustness.– Besides the full sample covering all treated regions, we also conduct estimations with sub-samples. First, we focus on those experiencing treatment early on (before January 22, 2021) to observe at least 14 days of treatment after the first confirmed VOC case for each treated region. Second, we study regions with a relatively high number of VOC reported cases. Here, we restrict the treated regions to the top-10 percentile of VOC counts, which corresponds to at least 9 VOC cases per region. The idea of this sub-sample is to test for differences in treatment effects associated with confirmed VOC counts rather than the presence of at least one VOC case.
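To illustrate how region and day fixed effects are absorbed, the sketch below estimates the treatment coefficient in a stripped-down version of the model that keeps only the treatment dummy and the two sets of fixed effects. It assumes a balanced panel, for which subtracting region means and day means (and adding back the grand mean) is equivalent to including both sets of dummies. Covariates, region-type trends and clustered standard errors are omitted, and the data are synthetic; this is not the REGHDFE routine.

```python
def double_demean(panel):
    """Remove region and day means from a balanced panel[region][day]."""
    n_regions, n_days = len(panel), len(panel[0])
    region_means = [sum(row) / n_days for row in panel]
    day_means = [
        sum(panel[i][t] for i in range(n_regions)) / n_regions
        for t in range(n_days)
    ]
    grand_mean = sum(region_means) / n_regions
    return [
        [panel[i][t] - region_means[i] - day_means[t] + grand_mean
         for t in range(n_days)]
        for i in range(n_regions)
    ]


def twfe_did(outcome, treatment):
    """Two-way fixed effects estimate of the treatment coefficient."""
    y = double_demean(outcome)
    d = double_demean(treatment)
    numerator = sum(yv * dv for yr, dr in zip(y, d) for yv, dv in zip(yr, dr))
    denominator = sum(dv * dv for dr in d for dv in dr)
    return numerator / denominator


# Synthetic balanced panel: 3 regions, 4 days, true effect of 5.
region_effects = [10.0, 20.0, 30.0]
day_effects = [0.0, 1.0, 2.0, 3.0]
treatment_dummy = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
incidence = [
    [region_effects[i] + day_effects[t] + 5.0 * treatment_dummy[i][t]
     for t in range(4)]
    for i in range(3)
]
delta_hat = twfe_did(incidence, treatment_dummy)
```

Because the synthetic outcome contains no noise and a homogeneous effect, the estimator recovers the true coefficient of 5 up to floating-point error.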
Panel event study (PES)
The estimation of the PES differs from the two-way fixed effects DiD specification mainly in the way that it accounts for the staggered emergence of a VOC in treated regions throughout the sample period. This allows us to identify dynamic treatment effects over time. Dynamic treatment effects may arise for different reasons: First, they could reflect early anticipation effects prior to the treatment start due to latent VOC transmissions or, second, they could result from other unobserved confounding factors systematically affecting incidence rates in treated regions around the treatment start. Thus, it is important to test for such early anticipation effects. The absence of statistically significant estimates for the latter but significant treatment effects could, accordingly, be interpreted in favor of our empirical identification strategy.
Moreover, we may expect that infection effects do not immediately occur after the reporting of the first VOC case but potentially build up over time at the regional population level. This may particularly be the case if public health authorities can only imperfectly trace and mitigate VOC-related disease spread. In this case, the estimation of average treatment effects on treated (ATTs) as shown in equation 1 may potentially underestimate the dynamics of SARS-CoV-2 infections subject to confirmed VOC cases. By including sufficient lag and lead terms in the estimation framework for the timing of treatment start, we can identify dynamic treatment effects.
The staggered nature of treatment start in different treated regions can be incorporated into the panel regression approach by translating the model from a specification in absolute time t (as shown in equation 1) to a specification that measures time for each region relative to treatment start. Together with the recognition of potential heterogeneity in the strength of treatment effects over time, the PES setup allows us to precisely estimate the impact of the passage of a treatment (here: VOC emergence in a region) that occurs at different times in different spatial units. A more formal presentation of the PES approach together with estimation challenges is given in Athey & Imbens (2018); Goodman-Bacon (2018) among others. Prior COVID-related PES applications have dealt, for instance, with the infection effects from school re-opening in Germany (Isphording et al., 2020), university students traveling during the U.S. spring break (Mangrum & Niekamp, 2020) or mass protests from the Black Lives Matter movement (Dave et al., 2020).
In the implementation of the PES approach, we include the same set of covariates and fixed effects as in the case of the two-way fixed effects DiD estimation. We also cluster standard errors at the NUTS-2 level. We set the maximum number of pre-treatment leads to 10 days and the maximum number of lags after the treatment start to 20 days. Further effects from leads/lags before (after) this range are accumulated to a single coefficient. To allow for an easy comparison of the SCM and PES results, we express all reported effects as percentage change relative to the observed 7-day incidence rate in the last pre-treatment period.
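The translation from calendar time to event time can be sketched as below. The example assumes one possible reading of the accumulation rule: days more than 10 days before or more than 20 days after treatment start are assigned to the outermost lead or lag bin, so that each endpoint carries a single accumulated coefficient. The function name is hypothetical, and the exact binning used by the EVENTDD package may differ.

```python
from datetime import date


def event_time_bin(day, treatment_start, max_lead=10, max_lag=20):
    """Days relative to treatment start, clamped to [-max_lead, max_lag].

    Assumption: observations outside the window are pooled into the
    endpoint bins rather than dropped.
    """
    relative_day = (day - treatment_start).days
    return max(-max_lead, min(max_lag, relative_day))


# Example with a treatment start on January 24, 2021.
start = date(2021, 1, 24)
bins = [
    event_time_bin(d, start)
    for d in (date(2021, 1, 4), date(2021, 1, 14), date(2021, 1, 24), date(2021, 2, 4))
]
```

January 4 lies 20 days before treatment start and is therefore pooled into the -10 bin, while February 4 lies 11 days after it and keeps its own bin.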
Robustness.– Besides the full sample covering all treated regions, we estimate the effects for sub-samples of regions. First, we focus on those experiencing treatment early on (before January 22, 2021 to observe at least 14 days of treatment after the first confirmed VOC case). Second, we study regions with a relatively high number of VOC reported cases. We restrict the treated regions to the top-10 percentile of VOC counts, which corresponds to at least 9 VOC cases per region. Third, we combine the first and the second approach and focus on regions with early treatment and at least 9 VOC cases.
We conduct the PES estimations in STATA. The analysis builds on the EVENTDD package (Clarke & Schythe, 2021). We document the full analyses in the replication files (will be available with published version), particularly the mentioned robustness tests.
Footnotes
E-mail: rode{at}vwl.tu-darmstadt.de