Abstract
Background As governments across Europe have issued non-pharmaceutical interventions (NPIs) such as social distancing and school closing, the mobility patterns in these countries have changed. It is likely different countries and populations respond differently to the same NPIs and that these differences are reflected in the epidemic development.
Methods We build a Bayesian model that estimates the number of deaths on a given day dependent on changes in the basic reproductive number, R0, due to changes in mobility patterns. We utilize mobility data from Google mobility reports using five different categories: retail and recreation, grocery and pharmacy, transit stations, workplace and residential. The importance of each mobility category for predicting changes in R0 is estimated through the model.
Findings The changes in mobility have a large overlap with the introduction of governmental NPIs, highlighting the importance of government action for population behavioural change. The grocery and pharmacy sector is estimated to account for 97 % of the reduction in R0 (95% confidence interval [0·79,0·99]).
Interpretation Our model predicts three-week epidemic forecasts, using real-time observations of changes in mobility patterns, which can provide governments with direct feedback on the effects of their NPIs. The model predicts the changes in a majority of the countries accurately but overestimates the impact of NPIs in Sweden and Denmark and underestimates them in France and Belgium.
Funding Financial support: Swedish Research Council for Natural Science, grant No. VR-2016-06301 and Swedish E-science Research Center. Computational resources: Swedish National Infrastructure for Computing, grant No. SNIC-2019/3-319.
Introduction
In December 2019 a new coronavirus (COVID-19) emerged in Wuhan, China. China implemented a quick strategy of suppression by locking the Wuhan province down on January 231, and implementing social distancing procedures nationwide, with a successful outcome 2. Still, the virus rapidly spread across the world through our increasingly interconnected flight network, and shortly arrived in Europe. In February 2020 the number of cases started to increase rapidly in some European countries. To limit the spread of the virus, European countries introduced non-pharmaceutical interventions (NPIs) similar to China’s. These NPIs include social distancing, school closures, limiting international travel and lockdown3. All of these NPIs result in behavioural changes, which can be traced through mobility data from tracking the location of mobile phones.
Google recently released a time-limited sharing of mobility data4 from across the world as represented by summary statistics to combat COVID-19. The mobility data is measured in 6 different sectors: retail and recreation, grocery and pharmacy, parks, transit stations, workplace and residential. The effects of the government-issued NPIs are manifested through changes in these patterns, which are utilized in our model.
It is likely that different countries and populations respond differently to the same NPIs, why it is important to consider the effect of NPIs countrywise. By using real-life mobility data to model changes in the basic reproductive number, R0, the effects to NPIs across different countries can be modelled more accurately. The mobility data utilised here have some uncertainties and lack details but are the best openly available data source for tracking a population’s movement in all eleven studied countries. Governments can, in collaboration with telephone companies, obtain much more fine-grained data, enabling them to evaluate the effect of the NPIs in more detail.
After an initial rapid spread in China, control measures proved very successful to stop the spread both in China5 and in other parts of the world6,7. However, there is still a risk for subsequent spread upon lifting of these restrictions7,8. There is therefore an urgent need both for understanding and tracking the effects of governmental interventions and their removals. Large scale testing could provide valuable information about the effects of interventions, however, these are expensive, sometimes inaccurate and might violate privacy rights. In contrast, the use of large scale data from anonymous tracking of mobile phones is inexpensive and easily available.
Recently, a group from Imperial College released a report7 that estimates the effects of NPIs on R0. Their model is the basis for the model presented here. Their report had a large impact on how the UK government changed its intervention strategy9. A limitation of their model is the assumption that each intervention has the same impact in all countries, ignoring cultural and sociological differences. In contrast, by utilizing country specific mobility data in a Bayesian framework 10, we estimate the impact of each change in mobility pattern on R0. The resulting information provides an easy, straightforward way for governments to analyze if NPIs are working and to what extent. We show that in a three-week forecast our method provides a smaller mean error than the model from Imperial College.
Methods
The model is trained on data from 30 days before the day after each country has observed 10 deaths in total up to (and including) 29 March, and then used to simulate a three week forecast from 30 March to 19 April.
Model basis
Our model is based on the model used in the recent report7 from Imperial College London (ICL). The ICL report tries to estimate the impact of NPIs on the basic reproductive number (R0) in the same 11 countries modelled here. The main difference between the ICL model and the current one is the modelling of the impact on R0. The ICL team estimated the basic reproductive number at day t in country m (Rt,m) as a function of the NPI indicators Ik,t,m in place at day t in country m as: where I=1 when intervention k is implemented at day t in country m and α the impact of each intervention.
Here, we estimate Rt,m to be a function of the relative change in mobility pattern for each country: where I1-5,t,m is the relative mobility in retail and recreation, grocery and pharmacy, transit stations, workplace and residential sectors respectively at day t in country m. The residential mobility parameter has a negative sign as an increase there is assumed to lower R0. We assume that the impact of each relative mobility change has the same relative impact across all countries and across time. Alpha is set to be gamma distributed with mean 0 5 and standard deviation 1. We did not include the data for the mobility category “Parks” as this data displayed much noise and cyclic peaks, as would be expected with varying weather4. The prior for R0 is set to:
The value of 2.79 is chosen from the median value of a recent analysis of 12 modelling studies 11, and the normal distribution from 2.
The relative mobility is modelled as the relative value change compared to a mobility baseline estimated by Google4. The baseline is the median value, for the corresponding day of the week, during the 5-week period of 2020-01-03 to 2020-02-06. For the days for which no mobility data is available, the values were set to 0. The mobility data for the forecast (and days beyond the date for the last available mobility data) was set to the same values as the last observed days. The dates for the interventions were taken from the ICL report7, whose initial efforts were crowdsourced.
Infection model
As the number of deaths in each country is likely to be the most accurate COVID-19 related data, we use this as the core of the model, being the posterior in the Bayesian simulations. The number of deaths in country m at day t is modelled as a negative binomial distribution with mean and variance accordingly:
The expected number of deaths, dt,m, at day t in country m is given by: where πm is the infection to death distribution in the country m given by a combination of the infection to onset distribution (Gamma(5.1,0·86)) and onset to death distribution (Gamma(18·8,0·45)) (combined with mean 23.9 days and standard deviation 0·45 days) times the infection fatality rate (ifr) 7,12: πm is discretized in steps of 1 day accordingly:
The ifrs are taken from previous estimates of the population at risk is about 1 %13 and adjusted for the predicted attack rate in the age group 50-59 years of age, assuming a uniform attack rate7,8,12, chosen due to having the least predicted underreporting in analyses of data from the Chinese epidemic 12. The number of deaths today is thus dependent on the cumulative number of cases from the previous days, weighted by the country-specific infection to death distribution.
The number of cases acquired at day τ in country m, cτ,m is modelled with a discrete renewal process 14,15:
, where gτ−t ∼ Gamma(6·5, 0·62) (mean 6·5 days, standard deviation 0·62) is the serial interval distribution used to model the number of cases.
gs is discretized in steps of 1 day accordingly:
The number of cases today is thus dependent on the cumulative number of cases from the previous days, weighted by the serial interval distribution, times R0 at day t.
Just as in the ICL report7, we assume the starting point for the infection was 30 days before the day after each country has observed 10 deaths in total. From this assumed starting point, we initialize our model with 6 days 2 of cases drawn from an Exponential(0·03) distribution, which are inferred in the Bayesian posterior distribution (Dt,m).
The implications on R0 due to relative mobility variations were estimated simultaneously for all countries in a hierarchical Bayesian framework using Markov-Chain Monte-Carlo (MCMC)10 simulations in Stan16. The death data17 used in the form of the number of deaths per day is from ECDC (European Centre of Disease Control), available and updated daily. We ran the model with eight chains, using 4000 iterations (2000 warm-up), as in the earlier work7,16. The parameter specifics of the simulation are available in the code (see below).
MCMC Convergence
MCMC simulations are considered to converge when the Rhat statistics (a metric for comparing the variance between pooled and within-chain inferences) reach one18. A histogram of Rhat statistics for the modelled parameters in all simulation runs were constructed and analyzed. We also made sure no divergent transitions were observed by setting the adapt delta in the sampler (see code).
Leave One Country Out Analysis
Since all countries are in different stages of their epidemics, different amounts of data are available for each country. To analyze how the model is influenced by different countries, we fit models using data from all countries except one using all 11 combinations19. We then estimate the importance of each mobility parameter in the leave-one-country-out analysis.
The relative difference in each mobility parameter provides an estimate of how each country affects R0 and thus the number of cases and deaths as well. Furthermore, the Pearson correlation coefficients for the mean R0 across all time points are calculated for each country in the different runs when all other 10 have been left out (see Figure S5).
Forecast validation
To ensure the forecasts are reliable, we leave out three weeks of data (30 March - 19 April) and fit a model using data from the beginning of the epidemic up to the date for the beginning of the left-out data. We then evaluate the model with one week intervals from the 30th of March to the 19th of April. We evaluate by the average error and the average percent error (average error÷Σ observed deaths) during each of the three weeks, comparing with simulations obtained from the ICL model. We should note here that the ICL model does not converge for three-week predictions using 4000 iterations (see Figure S2).
Code
All code is freely available at https://github.com/patrickbryant1/COVID19.github.io/ under the GPLv3 license.
Results
Estimating the cumulative number of cases, the number of deaths per day and changes in the basic reproductive number, R0
In Figure 1, for Italy and Sweden, and Figure S1, for all eleven modeled countries, our estimates of cumulative cases, daily deaths and the basic reproductive number R0 are shown. We simulate a three week forecast from 30 March to 19 April using data up to 29 March from the European Centre of Disease Control (ECDC) in the form of number of deaths per day, and relative mobility data estimated by Google4. According to the model, most countries appear to have their epidemic under control (April 19) (Table 1). The most successful country in terms of reducing R0 is Italy (R0≈0·19) and the least is Sweden (R0≈2·02).
From Figure S1, it can be seen that in all countries the interventions have some positive effect, decreasing the estimated R0 between the epidemic start and March 29. It can be noted that during the development of the epidemic, R0 displays a wide range of values. In some countries, the mean of the estimated R0 displays a rapid increase to values as high as 15, coupled with an increase in mobility (primarily) to grocery and pharmacies exactly when the interventions are put into force (see Figures 1 and S1).
The estimated number of deaths for up to three weeks after the model is trained, have a good correspondence with the observed number (Figures 1, S1 and Table 2). Compared with the Imperial College London (ICL) model7, our model displays both lower errors and less uncertainty (see Figures 2, S2 and Table S1). The average absolute errors over the 11 countries in the number of deaths are lower across all three weeks (week 1: 68 vs 158, week 2: 119 vs 488, and week 3: 113 vs 1497 for ours and the ICL-teams respectively).
Comparing mobility data across countries
When overlaying the implementation dates of the NPIs with the mobility data, it is clear that governmental decisions have a very large impact on the populations in the 11 modelled countries (see Figure S1). Most countries display very similar relative changes in their mobility patterns, with mobility in retail and recreation, grocery and pharmacy, transit stations and workplace decreasing and mobility in the residential category increasing.
Most countries have similar relative changes across the sectors (Figure S1). The ones that display smaller relative changes (Denmark, Norway and Sweden) also display smaller reductions in R0, which is a natural consequence of our model, as it assumes that changes in R0 are directly related to changes in mobility. The mobility patterns in Sweden display barely half of the relative changes compared with France, Spain, and Italy, and the reduction in R0 is therefore smaller in Sweden.
The importance of mobility sectors for modelling changes in R0
Analyzing the importance of each mobility parameter for predicting the reduction in R0 shows that the grocery and pharmacy sector appears to be the clearest indicator for R0 change (see Figure 3). The grocery and pharmacy parameter is estimated to account for 97 % of the reduction in R0 with a narrow confidence interval (CI). The residential parameter seems important as well, which would be expected, but the confidence interval displays a large uncertainty.
Model validation
The means and CIs for the mobility parameters (see Figure S4) are almost identical in the leave-one-country-out analysis (LOO) analysis. A very wide CI is observed for Italy in the grocery and pharmacy sector though, emphasizing the importance of the Italian data. The variable R0 values in the LOO analysis show high Pearson correlations, with Italy and especially the United Kingdom displaying lower correlations (see Figure S5). Italy and the United Kingdom correlate quite badly with each other. One of 4000 iterations ended with a divergence (0·025 %) when France and Spain were excluded. A histogram of Rhat statistics for the modelled parameters in all simulations for the main analysis is displayed in Figure S6.
Discussion
Our model makes it clear that the non-pharmaceutical interventions (NPIs) introduced by governments across Europe have had substantial effects on both mobility patterns and in preventing the spread of COVID-19. By tracking the relative change in mobility in the grocery and pharmacy sector it is possible to account for 97 % of the reduction in the basic reproductive number, R0, in our model. This information provides an easy, straightforward way for governments to analyze if NPIs are working and to what extent.
Why the grocery and pharmacy sector has the biggest impact on estimated changes in Ro is not clear. It is possible that this sector enables contacts between different communities, but this requires further analysis to be fully understood. Since R0 is strongly dependent on the changes in mobility, rapid changes in mobility leads to rapid changes in R0. This has drastic consequences to the estimated development of the epidemic in a country.
However, changes in R0 will not manifest in the number of deaths per day until about three weeks later (the mean value in the gamma distribution for infection to death is 23·9 days, see methods section). Therefore, we provide a three-week forecast. The estimates have a good correspondence with the observed numbers in most countries (see Figures 2 and Table 2), and compared with the ICL-model, our model displays both lower errors and less uncertainty (Figures 2, S2 and Tables 2,S1). It can also be noted that the ICL model overpredicts the number of deaths in all countries at the end of the estimate.
The estimated number of cases has great uncertainty across all countries. It should be noted here that one limitation of our model is that it does not take herd-immunity effects into account, which should be reached when around 60-80 % of the population is infected 20, but it is unlikely that sufficiently high infection has been reached yet for this to have a significant effect. Another limitation of the model is the assumption that the impact of each relative mobility change has the same relative impact across all countries and across time. Likely both more detailed mobility data and intermixing patterns need to be considered, metrics that are not available.
The number of cases are also highly dependent on having the correct infection-fatality-rate (ifr). This quantity is only modelled for the age group 50-59 years and does thereby not take into account the attack rates for the whole of each country’s population (see methods section). If a country managed to avoid the elderly being infected, that would lower the ifr 21, which could explain prediction differences to some extent.
The model validation, both by a leave-one-country-out analysis and by predicting a three week forecast, ensures the model’s robustness. The countries where the errors stand out are Denmark and Sweden, with over-predicted estimates, and Belgium and France, under-predicted. We note that these two pairs of countries are close both geographically and culturally22,23, possibly explaining the systematic differences. The differences may also be caused by differences in reporting between the countries24,25. For instance, on April 5 more than 2000 deaths were reported in France, due to sudden inclusion of potential COVID-19 attributed deaths in nursing homes occuring at earlier dates 26.
Here, we present a model to estimate the effects of public interventions on the spread of COVID-19 that does not assume that interventions have identical effects in different geographical and cultural settings. In contrast, our model uses observational data of mobility patterns in five environments to estimate changes in the transmission rate. Our model creates the possibility to track rapid changes in spread, right now and predict their consequences three weeks ahead in time. This enables governments to use anonymous real-time data to adjust their policies. We do foresee that such models will become incrementally more powerful as more detailed mobility data becomes available in the future.
Data Availability
All code is freely available at https://github.com/patrickbryant1/COVID19.github.io/ under the GPLv3 license. Data and future predictions will be made available at https://covid19.bioinfo.se/
Contributors
PB designed and implemented the study. PB generated all figures and wrote the initial draft of the manuscript, which was further edited, reviewed, revised and approved by both authors. AE provided the initial extraction of the mobility data (before it was made available in .csv format).
Declaration of interests
We declare no competing interests.
Availability
All code is freely available at https://github.com/patrickbryant1/COVID19.github.io/ under the GPLv3 license.
Data and future predictions will be made available at https://covid19.bioinfo.se/
Supplementary Material
Figures
Tables
Acknowledgements
We acknowledge Claudio Bassot’s contribution by sharing both the Imperial College London report and the Google mobility data. Without this information, this study would not be possible. We are also grateful to various colleagues and friends that contributed to the discussion. Finally we thank the authors of the Imperial College London report to make their data and model freely available.