Abstract
Assessing the impact of mobility on epidemic spreading is of crucial importance for understanding the effect of policies like mass quarantines and selective re-openings. High mobility between areas contribute to the importation of cases, affecting the spread of the disease. While many factors influence local incidence and making it more or less homogeneous with respect to other areas, the importance of multi-seeding has often been overlooked. Multi-seeding occurs when several independent (non-clustered) infected individuals arrive at a susceptible population. This can give rise to autonomous outbreaks that impact separate areas of the contact (social) network. Such mechanism has the potential to boost local incidence and size, making control and tracing measures less effective. In Spain, the high heterogeneity in incidence between similar areas despite the uniform mobility control measures taken suggests that multi-seeding could have played an important role in shaping the spreading of the disease. In this work, we focus on the spreading of SARS-CoV-2 among the 52 Spanish provinces, showing that local peaks of incidence and mortality strongly correlate with mobility occurred in the early-stage weeks from and to Madrid, the main mobility hub and where the initial local outbreak unfolded. These results clarify the higher order effects that mobility can have on the evolution of an epidemic and highlight the relevance of its control.
INTRODUCTION
The COVID-19 epidemic, firstly detected in the Hubei province in China, reached the status of pandemic on March 11, 2020 [1] and currently involves at least 187 countries around the entire world with almost 3.5 million cases and more than 240, 000 deaths as of the beginning of May 2020 [2].
In Europe, SARS-CoV-2 severely hit Italy, the first country to report local transmission, in mid-February [3] and, by the end of the month, other European countries like Spain were already experiencing sustained local outbreaks [3]. Madrid was the first region to confirm large numbers of local contagions and, on March 6, 2020, the city already reached 100 confirmed cases. This led to the closure of public buildings and different activities, like, for instance, universities, which brought the lateral effect of getting people who reside in the city for work or study to leave and go back to their province of origin. As most major capitals in the world, Madrid attracts workers and students from all the country, a large fraction of daily commuters from neighboring provinces and weekly travelers from all provinces. Among these for example Soria, one of the most affected provinces by COVID-19 per capita, which counts with a scarce population and density but a large number of weekly commuters with the capital. On March 14, given the fast spread of the virus in the country, the Spanish Government declared the Alarm State (Estado de Alarma) [4]. Such State allows the Government to impose restrictions on economic activities and citizens mobility. The restrictions included a reduction of public transport services like buses, metros and trains around the 50% of the usual capacity, the closure of buildings and activities related to the public sector and the limitation of individual mobility in the whole country. People could leave home only for essential needs as grocery shopping, visiting the doctor and pharmacies, or going to work. Mobility between provinces was limited to working reasons. This compressed the closing of many economic activities, especially those related to the service sector, letting only some exceptions like logistics and transport, supermarkets, pharmacies, health-related centers, construction, agriculture and industry to keep operating. On March 29, given the growth in the number of new cases, the Government declared the closing of all non-essential activities. After April 13, 2020, the restriction measures have started to be progressively lifted but we will center on the initial stages of the spreading.
In this work, combining detailed human mobility data with epidemiological reports, we study the relationship between inter-city mobility flows and COVID-19 incidence. Our working hypothesis is that large mobility flows lead to multiple seeding events resulting in several local outbreaks and, finally, to a higher incidence. Although the role of human mobility in shaping epidemic dynamics has been extensively considered [5-14], even recently for COVID-19 [15-21], the effect of multiple seeding due to mobility has received sensibly less attention; with only few theoretical [22, 23] and applied exceptions [24-27]. To test this assumption, we focus on the change in mobility among the province of Madrid - where the first sustained outbreak was recorded- and the other provinces in Spain the week before the onset of the local outbreaks. The same analysis is then performed on the mortality instead of the incidence, attesting the robustness of our approach. Our results confirm that trips between Madrid and other provinces occurred one week before the onset explain the 67% of the peak incidence for each area. Moreover, we find that a model that takes into account trips, population size and onset time accounts for the 77% of the SARS-CoV-2 incidence and mortality during the peak in each area of the country.
DATA AND METHODS
We take as basic geographical units Spanish provinces (see Fig. 1), which are 50 territorial divisions and 2 autonomous cities in the Northern coast of Africa (Ceuta and Melilla). They correspond to the NUTS 3 statistical areas according to the European nomenclature.
The characterization of inter-province mobility has been inferred from anonymized records of telecommunication network activity by Kido Dynamics SA (www.kidodynamics.com). The dataset contains approximately 13 million devices with unique daily mobility patterns, each being composed of a series of trips and visits. A trip is defined as a sequence of records per device that are only compatible with displacements of at least 5 km/h in the past 1 hour window. Trips are aggregated by origin (destination) province according to the location of the antennas where the first (last) records occurred. Complementary to trips, a stay is defined as any sequence of consecutive records that do not constitute trips. Stays are aggregated by province according to the location of the antennas where all their records occurred. Each device is assigned a residence province according to the most common location of its stays outside office hours. All the stays outside a device’s residence area are then labeled as “visits”. Figure 1 shows, for instance, the number of residents from the different provinces with stays in Madrid. Mobility patterns in Spain display significantly different behaviors during the weekend and weekdays. Thus, we classify the visits of a given device to a given province as “weekender” if more than half of the device’s visits to that province during the period of study occurred during the weekend.
Sequence of labeled trips and visits are then aggregated daily into a total number of trips per province of residence and province of origin/destination, and total number of visits per province and province of residence, both for weekenders and for non-weekenders. Finally, inspired by differential privacy methods [28] and to avoid any potential disclosure of personally-identifiable mobility patterns, a small unbiased Laplacian noise with scale parameter of 5 (i.e. variance of 50) is added to all aggregated values. After adding the noise, any values below 10 are discarded from the sample as additional preventive measure. Such aggregated trip and stays information are the basis for the mobility analysis presented in this work.
Regarding the number of cases, the incidence and the mortality at the peak, the data has been collected from official health services releases where available. The list of links is at Ref. [29].
RESULTS
We define a group of key features of the spreading directly from the epidemic curves of incidence and mortality per day from the beginning of the epidemic to April 13, 2020, which is when testing policies changed, rising sensibly the coverage of registered cases. A first word of warning is required: the data of both curves are noisy due to changes in the testing rhythms. It is easy, for instance, to detect the effect of weekends with a clear slowing down of testing and reporting. A process of smoothing is, hence, applied to obtain more reliable estimates on the trends of the epidemic. To do this, we take a running average of three days assigning the value to the central point.
Once the curves have been smoothed-out (see Fig. 7 below), we record the magnitude of the peak, the times between the local onsets (when the incidence is larger than 7/100, 000) and the respective peaks in every province (∆τ), and the time between the local onset and March 6 (∆tM), which is when Madrid reached 100 confirmed cases. Four provinces (Almeriia, Cadiz, Huelva, Las Palmas and Sevilla) have few detected cases and do not reach the established onset, thus we have excluded them from the first analysis on times. The list of onset and peak times for the incidence can be found in Table I.
On the mobility side, we define the total trips divided by the province population (intended as the sum of inflow and outflow) between Madrid and each province one week before the local onset. For example, if the onset in province A occurred on March 15, 2020, the trips counted are those registered between March 1 and 7. We checked the correlations between the local epidemic metrics and the mobility in different time ranges but one week before the local onset produces the best results. For the sake of completeness, we have also considered the mobility of a single day seven days before the local onset.
We start our analyses by confirming some correlations of the incidence that one could a priori expect. As can be seen in Fig. 2a, the height of the incidence peak is connected to the time between the local onset and peak in every province. This is natural, but it testifies that the outbreaks behave similarly across provinces once local contagion ignites. An analogous, even clearer relation, can be observed between the maximum incidence and the delay of the local onset from the arrival of Madrid at 100 cases (Fig. 2b). As can be observed in Figs. 2c and d, the time of the local onset is highly related to the total trips from/to Madrid the week before, showing that mobility was the main driver of the epidemic expansion. The main outliers here are La Rioja and Alava, which started the onset right after Madrid without having such high level of trips per capita with the capital. This is due to a super-spreading event registered in a wedding that strongly boosted local contagion in both neighboring provinces almost at the same time that the epidemic started to ramp up in Madrid [30]. Finally and more importantly for the hypothesis of the relevance of multiseeding, Figs. 2e and f show the connection between the height of the incidence peak and the trips from/to Madrid weighted by the local population of the province with a high level of correlation. There are some outliers as the provinces neighboring Madrid (Avila, Cuenca, Guadalajara and Toledo), which fall within the functional metropolitan area of the capital and thus behave similarly to Madrid itself. This explains the flattening of the trends on the right side of the plots.
The picture is maintained if one focuses on the mortality instead of on the incidence. The local onset for the mortality is established when every province reaches 1/1000 000 (see Fig. 8 and Table I below). Differently from before, the height of the peak of mortality is not so strongly connected to the time between the local mortality onset and the peak (see Fig. 3a). The peak of mortality correlates better with the time difference between the local onset and March 6 (Fig. 3b). The mobility for the analysis is the same as that taken for the incidence, one week before or 7 days before the onset of the incidence in every province. The results of plotting the time difference between the local mortality onset and March 6 are shown in Fig. 3c (for the one week mobility) and in Fig. 3d for a single day mobility. In general, the higher the mobility between the province and Madrid, the faster casualties are observed. This relation is valid for all provinces except for those with a strong overlap with the metropolitan area of Madrid, which once again are outliers in these plots. As shown in Figs. 3e and f, the mobility the week before the local incidence onset from/to Madrid in trips per capita of the province correlates with the peak in mortality. This further contributes to the hypothesis that the arrival of multiple-seeds can increase the local size of the epidemic and, consequently, the mortality.
We have analyzed so far total trips from/to Madrid but one person can be responsible for more than one of these trips. The status of each single individual against the pandemic is an intrinsic variable and, therefore, to improve the results it is necessary to consider the stays instead of the trips. This allows us to analyze the number of stays of residents in Madrid in the different provinces and, the other way around, the number of residents of those provinces that visited Madrid and came back home. In all the cases and as above, we will take as basis the mobility one week before the onsets in incidence in the different provinces. Out of different types of mobility that occur between provinces and the capital, we separate it into two classes: the visits of weekenders to the provinces and visits of weekday travelers. In Fig. 4, one can observe the correlation between the height of the local incidence and mortality and the stays per capita of weekenders and weekday travelers in the different provinces one week before the corresponding onsets. Neighboring provinces overlapping with Madrid metro area have higher commuting levels (weekday visits) to the capital with respect to others. This is understandable given that for them the trips are internal mobility. Such effect brings the saturation in the curves of Fig. 4b and c. On the other side, weekender stays from/at Madrid are not affected by distance as much as for weekday visitors, hence the saturation for neighboring provinces disappears. By this analysis, we can understand that those provinces where a higher relative peak of contagions and deaths occurred exhibit higher visits to/from the capital per capita the week before the onsets and, therefore, a higher inflow of seeds. Note that we do not find differences in considering residents of Madrid visiting the provinces or province residents traveling to Madrid, the effect is the same.
Besides the number of stays per capita, we can analyze as well the duration of the visits to check whether longer stays correlate better with the magnitude of the peaks. However, changes in the correlation are so small that the information given by the duration of stays is negligible. It is thus not so relevant to separate the visitors either by origin or stay duration.
To further understand the relevant variables contributing to the heights of the local peaks of incidence and mortality, we perform a multivariate analysis including five variables: logarithm of total stays per capita one week before the incidence onset, density of population, distance from Madrid, population of every province and onset times. Note that the onset time is an intrinsic epidemic variable and as such it can depend on the others (e.g., mobility). For those provinces that did not reach the established onset, here we consider the peak day as a proxy for the onset. As shown in Figure 5, mobility per capita alone explains the 67% (incidence) and 68% (mortality) of the peak heights variance. By adding all the variables for each province, the explained variances increase to and , respectively.
By using the multivariate fitted parameters, we can reconstruct the estimated incidence and mortality peaks for each province and compare the statistical model predictions and the actual values. As illustrated in Figure 6, both statistical models (for incidence and mortality) give good accuracy in the estimation of the real peaks for each province.
Finally, the Spanish Ministry of Health has just released the first results of a study on the prevalence of antibody response in the population (https://www.mscbs.gob.es/gabinetePrensa/notaPrensa/pdf/13.05130520204528614.pdf). The analysis tested 60,983 individuals from different provinces, age groups and socio-economic contexts in a sample obtained with the same mechanisms as the census. The results are thus expected to be representative of the extension of the pandemia in the population. We have repeated the same analysis presented so far with this new data (see Supplementary Material, Figs. 9, 10, and 11 below). The mobility in terms of stays per capita explains 64% of the variance of the antibody prevalence (), while the full multivariate model accounts for 68% and the comparison between the statistical model and the data yields . Despite the data of the prevalence analysis has been obtained much later than the peaks, these results strongly confirm the scenario we have found of dependence between the peaks (mortality and incidence) and the mobility.
CONCLUSIONS
Understanding the impact of mobility on the early stages of spreading of an epidemic is key to design efficient public health responses. In this work, we address the question of how relevant is multi-seeding for the epidemic indicators in a population. The relation between a larger inflow of seeds and the epidemic size, heights of the incidence and mortality peaks is easy to hypothesize from a theoretical perspective. Is it, however, a reality? To test it, we perform a simple correlation analysis in the context of the present COVID-19 pandemic in Spain. The first local outbreak was declared in Madrid, attaining 100 confirmed cases on March 6, 2020. From then on, cases were detected in all the 52 provinces and autonomous cities of the country. The confinement measures were taken in a uniform way across the country with a first lockdown on Friday March 14, 2020, a stricter lockdown on March 29. The incidence peaks for most of the provinces occurred after the first lockdown and before the second. The fact that the lockdowns are uniform allows us to perform the analysis on multi-seeding in equal conditions. We find a clear relation between the mobility from/to Madrid and the heights of the peaks of incidence and mortality. Furthermore, the kind of mobility that better correlates is the one exhibited by weekenders one week before the local onset and, typically, before the confinement measures. The results are further confirmed by obtaining similar correlations between the prevalence of antibodies in the population measured in an independent study and the mobility. This supports the policies intended to target non-essential trips first and, only later, commuting flows (weekday travelers).
It is not surprising that a larger mobility to and from the area with the first outbreak accelerates the local onsets. But if the flow of trips per capita are high and frequent, the probability of importing more cases increases. Multi-seeds impact in different zones of the local social/contact network. The protection that bottlenecks in such contact network structure could bring is, thus, deactivated, which leads in turn to an enhancement of the contagions and a larger epidemic size with respect to a single or few seeds scenario. More cases imply larger mortality. These are theoretical arguments of the effect of multi-seeding. The main contribution of our work is to offer empirical evidence of such connections between multi-seeding and the severity of the epidemic at the population level. We hope that this study will highlight the importance of mobility in an epidemic situation, which goes beyond a first direct relation between arrival times and inflow of trips, and help stakeholders and decisionmakers to design more efficient responses.
Data Availability
The information on mobility are available upon request with the permission of Kido Dynamics SA. The data on incidence and mortality can be downloaded from [29] and the one on the antibody prevalence from https://www.mscbs.gob.es/gabinetePrensa/notaPrensa/pdf/13.05130520204528614.pdf.
DATA AVAILABILITY STATEMENT
The information on mobility are available upon request with the permission of Kido Dynamics SA. The data on incidence and mortality can be downloaded from [29] and the one on the antibody prevalence from https://www.mscbs.gob.es/gabinetePrensa/notaPrensa/pdf/13.0513
ACKNOWLEDGEMENTS
M.M. is funded by the Conselleria d’Innovacio, Recerca i Turisme of the Government of the Balearic Islands and the European Social Fund with grant code FPI/2090/2018. M.M., S.M. and J.J.R. also acknowledge funding from the project Distancia-COVID (CSIC-COVID-19) of the CSIC funded by a contribution of AENA, from the Spanish Ministry of Science, Innovation and Universities, the AEI and FEDER (EU) under the grant PACSS (RTI2018-093732-B-C22) and the Maria de Maeztu program for Units of Excellence in R&D (MDM-2017-0711).