Abstract
Objective The late 2019 Covid-19 disease outbreak has put the health systems of many countries to the limit of their capacity. The most affected European countries are, so far, Italy and Spain. In both countries (and others), the authorities decreed a lockdown, with local specificities. The objective of this work is to evaluate the impact of the measures undertaken in Spain to deal with the pandemic.
Method We estimated the number of cases and the impact of lockdown on the reproduction number based on the hospitalization reports.
Results The estimated number of cases shows a sharp increase until the lockdown, followed by a slowing down and then a decrease after full quarantine was implemented. Differences in the basic reproduction ratio are also very significant, dropping from 5.89 (95% CI: 5.46-7.09) before the lockdown to 0.48 (95% CI: 0.15-1.17) afterwards.
Conclusions Handling a pandemic like Covid-19 is very complex and requires quick decision making. The large differences found in the speed of propagation of the disease show that being able to implement interventions at the earliest stage is crucial to minimise the impact of a potential infectious threat. Our work also stresses the importance of reliable, up-to-date epidemiological data in order to accurately assess the impact of Public Health policies on viral outbreaks.
1. Introduction
By late 2019 an outbreak of Covid-19 disease, caused by the SARS-CoV-2 virus, started in the region of Hubei (China), more specifically in the city of Wuhan. Since then, the disease has spread all over the world, and it was declared a pandemic by the World Health Organization (WHO) on March 11th, 2020. The rapid propagation of the virus around the world has stressed the health systems of many countries to their limit. In Europe, the most affected countries in terms of number of detected cases, hospitalizations and deaths are, to date, Italy and Spain. This unprecedented situation for the public health systems of these countries forced decision makers to act very quickly in order to minimize the impact of the disease and to avoid collapse. In Spain, the state of emergency (estado de alarma) was declared on March 14th, 2020 and was tightened on March 30th with mandatory home confinement except for workers in vital sectors (including health professionals, food supply, etc.). In this situation, it is urgent to assess the efficiency of social distancing and other non-pharmaceutical interventions undertaken to control the Covid-19 pandemic (Niud and Xu 2020; Anderson et al. 2020). One of the main challenges in evaluating the impact of these actions is that data are only partially available, as many cases are asymptomatic or present mild symptoms (Huang et al. 2020), and the shortage of testing kits prevents testing all patients with possible Covid-19 symptoms. Therefore, the number of cases might be severely underestimated. This issue is common in epidemiology and several methods have recently been proposed to address it under specific circumstances (see Fernández-Fontelo et al. 2016, 2019). Another issue is that, when no massive viral testing of the population is performed, new cases are only detected once symptoms appear, at a variable delay after infection. This complicates the assessment of the impact of interventions to reduce or slow down the spread of a virus.
Here, we aimed at evaluating the impact of the Covid-19 related non-pharmaceutical interventions undertaken in Spain on the basic reproduction ratio R0, considering three periods of time in 2020: no intervention (until March 13th), state of emergency (March 16-30th and April 13th-15th) and mandatory confinement (March 31st to April 12th). We inferred retrospectively the number of infections in each Spanish region (Comunidad Autónoma, or CCAA) from the patterns of hospitalizations, deaths and detected cases. The mandatory confinement was softened on April 13th, 2020 (April 14th in some regions) by allowing non-essential sectors to return to work, so a projection of the situation in the next few weeks was also included. Additionally, the results are compared to a scenario where no measures are undertaken.
2. Methods
We modelled the dynamics of infection with a discrete-time SIR model with time-varying R0 defined for each region r and day d. We also assumed that the stock of susceptible population S(r, d) is almost constant through time, S(r, d) ≈ Nd (i.e. only a small percentage of the population has been infected), which is probably valid up to now. The dynamics of infected people I(r, d) in region r at day d thus varies as:
R0 is a stochastic variable whose expected value is defined by the ongoing social distancing measures (no measure, state of emergency or mandatory confinement). Formally, the prior over R0 is a Gaussian process defined independently for each region. The prior mean is defined by the distancing measures, while the prior covariance is a squared exponential covariance K with variance σ2 = 0.12 and length l = 1 day (Bishop 2006). Such a prior captures differences in the spread between regions as well as temporary changes of R0 within a region (the Gaussian process enforces that these fluctuations are smooth in time). The variance term σ2 was set so that it is unlikely that the infected population goes down in a single day by a proportion larger than the recovery rate γ. Based on a mean contagious period of 20 days before recovery (duration of viral shedding (Zhou et al. 2020)), we set γ = 0.05 day−1.
We assumed that the number of infected people at the beginning of the period studied (20th of February 2020) was drawn from a log-normal distribution with mean xi(r) and variance 1, where xi is a parameter specific to each CCAA. This variability allowed us to capture initial variations in the spread of the epidemic at the beginning of the period of study.
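As a rough illustration of the infection dynamics described above, the sketch below iterates a discrete-time update under the constant-susceptible approximation. The update rule I(d) = (1 + γ(R0(d) − 1)) I(d − 1) is an assumption (a standard linearised SIR step with transmission rate γ·R0) and may differ from the exact form used in this work. The function name `simulate_infected`, the initial value and the fixed R0 schedule are hypothetical, and the Gaussian process fluctuations of R0 are omitted.

```python
import numpy as np

GAMMA = 0.05  # recovery rate per day (value given in the text)


def simulate_infected(i0, r0_per_day, gamma=GAMMA):
    """Iterate I(d) = (1 + gamma * (R0(d) - 1)) * I(d - 1).

    Assumed linearised SIR step with a constant susceptible pool.
    """
    infected = np.empty(len(r0_per_day) + 1)
    infected[0] = i0
    for d, r0 in enumerate(r0_per_day, start=1):
        infected[d] = (1.0 + gamma * (r0 - 1.0)) * infected[d - 1]
    return infected


# Hypothetical schedule: 10 days with R0 above 1, then 10 days with R0 below 1.
# The two R0 values are the point estimates reported in the Results;
# the durations and the initial value are illustrative only.
r0_schedule = np.array([5.89] * 10 + [0.48] * 10)
trajectory = simulate_infected(100.0, r0_schedule)
```

In this sketch the infected population grows while R0 > 1 and shrinks once R0 < 1, which is the qualitative pattern the model is meant to capture.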
The number of new cases per day is N(r, d) = I(r, d) − I(r, d − 1) + γI(r, d − 1) = I(r, d) − (1 − γ)I(r, d − 1). This true number of cases, however, cannot be observed directly. Here we use a latent-state approach: we estimated the evolution of the number of true cases in each CCAA based on the recorded cumulative number of detected cases, hospitalizations and deaths provided by Instituto de Salud Carlos III (https://portalcne.isciii.es/covid19/). We used estimates for the proportion of cases pC that are detected by the Spanish Health system, the proportion of cases pH that are hospitalized, and the lethality rate pD, as well as the distribution of latency between infection and these three types of events li(d) (figure 1). We used the following estimates: 15% of infected persons are hospitalized; 1% die from the disease (Lauer et al. 2020; Zhou et al. 2020); 30% of cases get detected (this latter number was defined arbitrarily, since the number of detected cases is roughly twice the number of hospitalizations). Note that these percentages affect the estimated number of true cases by a scaling factor, but do not affect the estimation of reproduction parameters. We used estimates of the distribution of infection-to-detection and infection-to-hospitalization durations based on published literature (Lauer et al. 2020; Zhou et al. 2020). Lauer and colleagues describe that the incubation period can be captured by a log-normal distribution with median 5.1 days. To capture the extra time from the onset of symptoms to case detection or hospitalization, we increased the μ parameter (to yield an increase in median time of 50% for case detection and 100% for hospitalization, that is μ = 2.03 and μ = 2.314, respectively) while keeping the same dispersion parameter σ. For infection-to-death, we used the same distribution as in (Flaxman et al. 2020), i.e. the sum of two gamma distributions for infection-to-detection and detection-to-death with overall mean 23.9 days.
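The shift of the μ parameter can be illustrated with a short calculation. Since the median of a log-normal distribution is exp(μ), multiplying the median by a factor f amounts to adding log(f) to μ. The sketch below assumes a base value μ = 1.621 for the incubation period; this value is not stated in the text and is used here only as an assumption, and the helper name `shifted_mu` is hypothetical.

```python
import math

# Assumed base log-normal location parameter for the incubation period.
# This value is NOT given in the text; it is an illustrative assumption.
MU_INCUBATION = 1.621


def shifted_mu(mu, median_factor):
    """Return mu of a log-normal whose median is scaled by median_factor.

    The dispersion parameter sigma is left unchanged by this shift.
    """
    return mu + math.log(median_factor)


mu_detection = shifted_mu(MU_INCUBATION, 1.5)        # median increased by 50%
mu_hospitalization = shifted_mu(MU_INCUBATION, 2.0)  # median increased by 100%

print(round(math.exp(MU_INCUBATION), 1))  # median incubation period in days
print(round(mu_detection, 2), round(mu_hospitalization, 3))
```

Under this assumed base value, the median incubation period rounds to 5.1 days and the shifted parameters round to 2.03 and 2.314.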
Thus the expected number of events yi(r, d) in a given CCAA is obtained by convolving the pattern of new cases with the distribution of infection-to-event latency: E[yi(r, d)] = Σ_k w_k I(r, d − k), with w_k = pi(li(k) − (1 − γ)li(k − 1)). We assumed that the actual number of events recorded on that day was drawn from a negative binomial distribution with mean E[yi(r, d)] as defined above and parameter r = 2 (Flaxman et al. 2020). The reports provide cumulative time series, so in principle new events correspond to the difference between two successive days. However, new events are sometimes reported later than the day of occurrence (especially during weekends). We estimated that 30% of events were reported on the following day and 10% two days later. Assigning such proportions of events to one or two days before they are reported allowed us to smooth the event time series. Finally, for two CCAA (Madrid and Castilla y La Mancha, until April 12th for the latter), the reported number of hospitalizations was not cumulative but corresponded to the current number of hospitalized patients related to Covid-19. For Castilla y La Mancha, we found that we could recover the reported cumulative number of hospitalizations on April 12th by assuming that the duration of hospitalization is distributed uniformly between 5 and 15 days. We used this rule to estimate the cumulative number of hospitalizations for this CCAA, and applied it similarly for Madrid.
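The convolution step can be checked numerically. The sketch below computes the expected number of events in two ways: by convolving new cases N(d) = I(d) − (1 − γ)I(d − 1) with the latency distribution, and by convolving the infected series directly with the weights wd = pi(li(d) − (1 − γ)li(d − 1)). The infected trajectory and the latency distribution are toy values, the function names are hypothetical, and the series is assumed to start from I = 0 before day 0.

```python
import numpy as np

GAMMA = 0.05    # recovery rate per day (from the text)
P_EVENT = 0.15  # proportion of cases hospitalized (from the text)


def events_from_new_cases(infected, delay_pmf, p=P_EVENT, gamma=GAMMA):
    """Convolve new cases N(d) = I(d) - (1 - gamma) I(d - 1) with the delay pmf."""
    previous = np.concatenate([[0.0], infected[:-1]])  # I(d - 1), with I(-1) = 0
    new_cases = infected - (1.0 - gamma) * previous
    return p * np.convolve(new_cases, delay_pmf)[: len(infected)]


def events_from_infected(infected, delay_pmf, p=P_EVENT, gamma=GAMMA):
    """Convolve I(d) directly with w_d = p (l(d) - (1 - gamma) l(d - 1))."""
    current = np.concatenate([delay_pmf, [0.0]])  # l(d)
    lagged = np.concatenate([[0.0], delay_pmf])   # l(d - 1), with l(-1) = 0
    weights = p * (current - (1.0 - gamma) * lagged)
    return np.convolve(infected, weights)[: len(infected)]


# Toy inputs (illustrative only, not taken from the data).
infected = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 33.0, 31.0])
delay_pmf = np.array([0.0, 0.1, 0.3, 0.4, 0.2])  # P(delay = 0, ..., 4 days)

via_new_cases = events_from_new_cases(infected, delay_pmf)
via_weights = events_from_infected(infected, delay_pmf)
```

Both routes give the same expected event series, which is why the weights can be applied directly to the latent infected population.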
There is a debate about the reliability of these data. It is believed that hospitalizations are the most reliable of these indicators, as many cases go undetected (some patients are asymptomatic or suffer mild symptoms; saturation of health systems has led to testing only more severely affected patients in some CCAA), and some deaths are not included in the official count because of the lack of viral testing. There is indeed a large variability across CCAA in the fraction of hospitalizations per reported case: it is 31% in Galicia but 82% in Comunidad de Madrid. This difference is more likely to be due to differences in detecting and reporting cases than to differences in the true proportion of infected people requiring hospitalization. It is also more difficult to assess reliably the number of cases from death reports, as the latency from infection to death is long and can be very variable across individuals.
Parameters of the model, including the value of R0 in the different conditions, were estimated from the data using an Expectation-Maximization algorithm (see Appendix A for details).
3. Results
We present the number of Covid-19 cases in all 19 CCAA inferred from the pattern of hospitalizations in Figure 2. The estimated cases display a sharp increase until the lockdown followed by a plateau, and then a decrease. We estimated R0 before the state of emergency, during the state of emergency and during enforced lockdown. R0 was found to drop from 5.89 (95% CI: 5.46-7.09) to 1.86 (95% CI: 1.10-2.63) after the state of emergency, and down to 0.48 (95% CI: 0.15-1.17) after full lockdown (Figure 3). We estimate a total of 0.871 million Covid-19 cases in Spain by April 15th, 2020, including 0.294 million active, 0.559 million recovered and 0.018 million deceased (see Figure 4 for breakdown by CCAA).
Estimates were similar when we used detected cases rather than hospitalizations. Using detected case reports, we estimated R0 to be 6.91 (95% CI: 6.75-7.39) before the state of emergency, 2.22 (95% CI: 1.92-2.74) during the state of emergency and 0.85 (95% CI: 0.5-1.05) during the full lockdown (Figure 5). The estimate for the cumulative number of cases was 0.823 million overall in Spain (0.351 million active).
Because deaths occur after a long and variable interval following infection, we could not reliably estimate the value of R0 separately for the state of emergency and full lockdown measures, so we simply estimated R0 before and after the state of emergency. We estimated R0 to be 6.48 before the lockdown (95% CI: 5.5-7.51), and 0.49 afterwards (95% CI: 0.16-1.57) (Figure 6). The estimate for the cumulative number of cases was 2.82 million overall in Spain (0.72 million active).
Finally, we used the reproduction parameters estimated from hospitalizations to evaluate the death toll in different scenarios (Appendix B). First, we estimated retrospectively how many deaths have been avoided by the state of emergency (and temporary mandatory confinement). We estimate that, without the state of emergency, the new coronavirus would have caused 0.180 million deaths by April 15th (Figure 7, left panels), i.e. 10 times the number of deaths actually reported. In that scenario, Spain would have been approaching herd immunity by mid-April.
We also performed two prospective analyses for the period April 16th-May 15th, assuming either that the state of emergency is maintained as currently in place, or that the stricter mandatory confinement with closure of all non-essential activities is implemented again (Figure 7, right panels). The analysis predicts a total of 23,900 deaths (95% CI: 23,560-24,239) in the former case, and 21,423 (95% CI: 21,298-21,548) in the latter. It should be noted, however, that only the latter predicts a gradual extinction of cases, while our analyses predict that the state of emergency would lead to a new surge of cases.
4. Discussion and limitations
We found similar estimates of the reproduction number and of the proportion of the Spanish population infected by the new coronavirus, whether they were estimated from hospitalization numbers or detected cases. Estimates from both case reports and hospitalizations suggest that only mandatory quarantine achieved an R0 lower than 1, while R0 during the state of emergency, before non-essential services were shut down, was estimated to be well above 1. This predicts that the opening of the non-essential services on April 13th may lead to a new surge of cases. Based on this empirical study, mandatory confinement is the only measure that effectively reduces the number of infections.
Estimates based on death reports differed considerably from those based on either hospitalizations or detected cases. This suggests that some of the assumptions and data our modeling is based on may not be accurate (although commonly used in previous studies), and shows that this can induce very large biases in the estimation of the propagation of the new coronavirus in Spain. Since hospitalization reports are believed to be more reliable than either reported cases or death reports, we further comment on the results obtained with these reports.
Our approach is very similar to a study by an Imperial College team published last week inferring the impact of non-pharmaceutical measures (including lockdown) on the propagation of the new coronavirus in 11 European countries (Flaxman et al. 2020). Both studies rely on fitting a model of infection dynamics to observed data (here hospitalizations). This contrasts with other approaches based on fitting a curve to the observed time series (e.g. for patterns of fatalities in the United States (Team and Murray 2020) or patterns of cases in China (Zhao et al. 2020)), or on model simulation studies that capture how the pattern of contacts in different scenarios (with or without social distancing measures) affects virus propagation (Lin et al. 2020; Hellewell et al. 2020; Ferguson et al. 2020; Lai et al. 2020). Other studies have also estimated the number of cases from the reported deaths, assuming a fixed duration from infection to death.
Our modeling approach includes stochasticity in the reproduction number in each area. This notably allows us to capture distinct trajectories of infection in different areas. We noted a negative correlation across CCAA between the proportion of infected people at the time of lockdown and the subsequent increase in infection (Figure 8, Pearson coefficient: r = −0.46, p = 0.048). This could be due to a series of factors. First, the communities with the largest proportion of cases could start developing herd immunity, hence limiting the propagation of the infection (which seems unlikely given that, in the most affected CCAA, only a few percent of the population have been infected according to our estimates). It could also be the result of local policies taken before the national lockdown in the most affected CCAA, or of better compliance with lockdown and social distancing in the most affected areas. Another factor could be migration from the most affected regions (especially Madrid) to less affected regions before lockdown was implemented, or some distortions in the reporting (under-reporting) of cases in saturated health systems.
Our study has several important limitations. First, as noted previously, it is not clear how reliable the data the modeling is based on are (both the epidemiological reports and the estimates of the probability and latency of symptoms, case detection, hospitalization and death). It should be stressed that the unreliability of epidemiological data (with changes of inclusion criteria over time and between CCAA) induces important biases which impede an accurate estimation of the impact of political measures on the propagation of the new coronavirus. A faster and more reliable tracking of the epidemic could be performed if cases were reported systematically and dated by the onset of symptoms rather than by detection, as the incubation period has been well characterized.
Second, the model captures the number of new cases each day as a proportion of the pool of infected people in the same area. It does not take into account how the age distribution in each area affects the probability that an infection leads to hospitalization or death, nor how the probability of infecting others depends on the number of days since infection, as in (Flaxman et al. 2020). Nor does it take into account mobility between regions, whose impact is believed to be more important at the start of the epidemic. It is also worth noting that in a locked-down environment where most contacts are compartmentalized in households, it is possible that at the beginning the virus continues to spread rapidly within households, but less so between households. As immunity was not taken into account here, R0 may decrease significantly more without further policies, once this first wave of within-household infection is over. Finally, we only modeled the impact of lockdown, not of other measures (banning public events, closing schools, etc.), which were taken too close together in time, and simultaneously in most regions, so it is not possible here to disentangle their effects precisely.
In conclusion, the greatest interest should be focused on the trends in R0 found in this work, while concrete numerical predictions should be taken with extreme caution, as there is a lack of reliable epidemiological data we can base the models on.
Data Availability
The study used publicly available data.
Acknowledgements
The authors thank J. Barbosa and P. Puig for useful comments on the manuscript.
Appendix A: Parameter estimation procedure
Parameters of the model θ include the prior mean of R0 for the three different lockdown conditions (β0, β1, β2), as well as the expected value of the log-infected population at the initial point xi(r). Parameters were fitted from the data by Maximum Likelihood estimation, using an Expectation-Maximization procedure (Bishop 2006). The procedure also allowed us to recover the posterior distribution of true cases p(I|y; θ) for each day and CCAA. We convert the infected population to the log-scale, defining x(r, d) = log I(r, d):
This can be turned into:
In vector form, we have x = T[x(0, r), R0(r,:)], where T is an upper triangular matrix of ones that implements the summing operation. Since both x(0, r) and R0(r,:) have multivariate normal prior distributions, the prior over x is itself normal, with mean μr = T[xi(r), Φβ] and covariance TΣTᵀ, where Φ is a D-by-3 indicator matrix indicating the lockdown state for each day, and Σ is block diagonal with submatrices 1 (the prior variance of the initial log-infected population) and K. In the Expectation step, we estimate the posterior distribution over the log-infected population using a Laplace approximation p(x|y, θ) ≈ N(m, V). We first identified the maximum-a-posteriori value m through gradient search, and then computed V as the inverse of the negative Hessian of the joint log-probability evaluated at m.
Parameters were updated in the M-step by maximizing the objective function analytically. Confidence intervals for parameters were estimated by parametric bootstrapping with 20 bootstrap samples. All analyses were implemented in Matlab with custom code, which will be uploaded to a public repository upon publication of the manuscript.
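The Laplace approximation used in the Expectation step can be illustrated on a one-dimensional toy problem. The sketch below is not the procedure used in this work: it assumes a Poisson likelihood and Newton updates in place of the negative binomial likelihood and gradient search, and the function name and numerical values are hypothetical.

```python
import math


def laplace_approximation(y, prior_mean, prior_var, n_iter=50):
    """Laplace approximation of p(x | y) for a toy model.

    Assumed toy model: y ~ Poisson(exp(x)), x ~ N(prior_mean, prior_var).
    Returns the posterior mode m and the approximate variance V, where V is
    the inverse of the negative Hessian of the joint log-probability at m.
    """
    x = prior_mean
    for _ in range(n_iter):
        # Gradient and Hessian of log p(y, x) = y*x - exp(x) - (x - mu)^2 / (2 s^2)
        grad = y - math.exp(x) - (x - prior_mean) / prior_var
        hess = -math.exp(x) - 1.0 / prior_var
        x -= grad / hess  # Newton step (the log-joint is concave in x)
    hess = -math.exp(x) - 1.0 / prior_var
    return x, -1.0 / hess


# Hypothetical numbers: 20 observed events, prior centred on log(10).
m, V = laplace_approximation(y=20, prior_mean=math.log(10.0), prior_var=1.0)
```

The mode m lies between the prior mean and log(y), and V shrinks as either the data or the prior become more informative.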
Appendix B: Projection analyses
We performed three projection analyses: two prospective analyses for the period April 16th-May 15th and one retrospective analysis for the period March 14th-April 15th. For the prospective analyses, we simulated the model with an R0 whose prior mean was set to the estimated value either for the state of emergency or for mandatory confinement. We found that we could better predict the death patterns up to April 15th by sampling the infection-to-death duration from a log-normal distribution with median 15 days and dispersion parameter 0.5 rather than from the one described previously, so we used this distribution to predict the patterns of deaths from the number of simulated cases. We ran 100 different simulations with the same parameters and averaged the resulting deaths over simulations.
We applied a similar approach to predict retrospectively the number of deaths had the state of emergency not been declared, by applying a mean R0 for the whole period until April 15th equal to our estimate for the period before the state of emergency. However, we simulated a full SIR model, since in this case we could no longer approximate the Susceptible population as being nearly constant in time.
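The difference with the constant-susceptible approximation can be seen in a minimal sketch of a full discrete-time SIR simulation, in which new infections are scaled by the remaining susceptible fraction. The update rule, the function name `simulate_full_sir`, the population size and the initial number of infected are assumptions for illustration; the stochastic fluctuations of R0 and the death model are omitted.

```python
def simulate_full_sir(population, i0, r0, gamma=0.05, n_days=55):
    """Deterministic discrete-time SIR with a depleting susceptible pool.

    Assumed update: new infections = gamma * r0 * I * S / N per day,
    recoveries = gamma * I per day.
    """
    s, i, r = population - i0, float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(n_days):
        new_infections = min(gamma * r0 * i * s / population, s)
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((s, i, r))
    return history


# Hypothetical inputs: a population of 47 million and 1000 initial infections,
# with R0 fixed at the pre-intervention point estimate reported in the Results.
history = simulate_full_sir(population=47e6, i0=1000, r0=5.89)
susceptible = [state[0] for state in history]
```

As the susceptible pool shrinks, the daily growth of the infected population falls below the constant-susceptible rate 1 + γ(R0 − 1), which is why the approximation no longer holds in the no-intervention scenario.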