Abstract
Environmental factors, including seasonal climatic variability, can strongly affect the spatio-temporal patterns of infectious disease outbreaks, but relationships between Covid-19 dynamics and climate remain controversial. We assessed the impact of temperature and humidity on the global patterns of early Covid-19 outbreak dynamics during January-March 2020. Here we show that Covid-19 growth rates peaked in temperate regions of the Northern Hemisphere with a mean temperature of ∼5°C and a specific humidity of 4-6 g/m3 during the outbreak period, while they were lower in both warmer/wetter and colder/drier regions. Relationships between Covid-19 and climate were robust to the potential confounding effects of air pollution and socio-economic variables, including population size, density and health expenditure. The strong relationship between local climate and Covid-19 growth rates suggests the possibility of seasonal variation in the spatial pattern of outbreaks, with temperate regions of the Southern Hemisphere being at particular risk of severe outbreaks during the austral autumn-winter.
One Sentence Summary Temperature and humidity strongly impact the variation of the growth rate of Covid-19 cases across the globe.
Host-pathogen interaction dynamics can be significantly affected by environmental conditions, either directly, e.g. via improved pathogen transmission rates, or indirectly, by affecting host susceptibility to pathogen attacks 1. In the case of directly transmitted diseases, such as human influenza, multiple environmental parameters such as local temperature and humidity affect virus survival and transmission, with significant consequences for the seasonal and geographic patterns of outbreaks 2-6. The recently discovered coronavirus SARS-CoV-2 is the aetiological agent of Covid-19, a pandemic zoonosis causing severe pneumonia outbreaks at a global scale 7. Up to April 2020, Covid-19 cases were reported in more than 180 countries and regions worldwide 8, though the global patterns and the early dynamics of Covid-19 outbreaks appeared highly variable. Some countries experienced limited growth and spread of Covid-19 cases, while others suffered widespread community transmission and fast, nearly exponential growth of infections 8. Understanding the environmental drivers of early growth rates is pivotal to predicting the potential severity of disease outbreaks (i.e. the disease impact in the absence of containment measures) 9,10. Given the importance of environmental conditions for the transmission of many pathogens, we tested the hypothesis that the severity of Covid-19 outbreaks across the globe was affected by spatial variation of key environmental factors, such as temperature, humidity 5,11-16, and air pollution [fine particulate matter 17; see Supplementary Methods], controlling for major socio-economic characteristics of affected countries 18. We then evaluated the potential seasonal variation in the risk of severe Covid-19 outbreaks at a global scale.
Relying on a publicly available dataset 8, we computed the daily growth rates r of cumulative growth curves of confirmed Covid-19 cases (Covid-19 growth rate hereafter) for 79 countries/regions (hereafter regions). We calculated the mean daily growth rate during the exponential phase of the growth curve 19 (Fig. S1) for all those regions for which a minimum of 25 cases were reported, and for which at least 10 days had elapsed after reaching this minimum threshold (see Methods). Variation in these early epidemic growth rates should best reflect the impact of local environmental conditions on disease spread, considering that in most regions local authorities adopted unprecedented containment measures immediately after the detection of outbreaks to mitigate pathogen spread and community transmission 20,21. Indeed, the exponential growth phase lasted on average 7.48 d and was usually followed by a deceleration of growth, likely as a progressive effect of active containment actions (Fig. S1). To better highlight the possible effects of climate before containment actions could be effective, we restricted the analyses to data reported up to March 21, 2020. This was the mean date (SD = 6 days) when 61 governments established countrywide lockdowns 22, often adopting strict containment measures even in the absence of large numbers of reported cases 23. Among the socio-economic factors potentially affecting SARS-CoV-2 transmission dynamics during outbreaks, we considered human population size, population density, per capita health expenditure and age structure (see Methods).
Covid-19 growth rates showed high variability at the global scale (Fig. 1A). The observed daily growth rate during the exponential phase was on average 0.28 (SD = 0.13), and ranged from 0.06 (West Bank and Gaza, Singapore) to 0.72 (Denmark). The highest growth rates were observed in temperate regions of the Northern Hemisphere, although fast growth also occurred in some warm climates, notably in Brazil, Indonesia and the Philippines, suggesting that no area of the world is exempt from SARS-CoV-2 infection risk.
A) Global patterns of mean daily growth rate r; the size of dots is proportional to the observed r-values. The background shows the spatial projections of growth rates according to mean March temperatures for the period 2015-2019. Predictions are based on the best-fitting model relating r to mean temperature of the outbreak period (Table S2a), keeping the other model covariates at their mean value. B-D: Spatial projections of growth rates according to mean June, September, and December temperatures. See Fig. S4 for estimates of uncertainties of projections. Shaded areas have conditions outside the calibration range of models, thus uncertainty of projections is particularly high 52.
Covid-19 growth rate was strongly related to a combination of climatic and socio-economic variables (Table S1). The best-fitting mixed model suggested that growth rate is non-linearly related to spatial variation in mean temperature of the outbreak period (Fig. 2A, Tables S2-S3; R2 = 0.44). Growth rate peaked in regions with mean temperature of ∼5°C, and decreased both in warmer and colder climates (F1,72 = 18.9, P < 0.001; Fig. 2A, Table S2). Furthermore, growth rate was faster in regions with large human population size (F1,74 = 27.5, P < 0.001) and high health expenditure (F1,52 = 8.9, P = 0.004; Fig. S2). Models not including temperature and population size showed very limited support (Table S1), suggesting that these variables were major drivers of spatial variation of Covid-19 growth rate. Human population density and air pollution were never included in models with high support (Table S1), suggesting that they play a relatively minor role in determining Covid-19 growth rates, at least at the coarse spatial scale of this study. Results were robust to different approaches for the calculation of growth rate, to the inclusion of countries experiencing outbreaks in late March (up to March 31, 2020), to the exclusion of specific countries that could introduce biases, and to the use of alternative socio-economic variables (Supplementary results; Tables S1-S2).
Graphs show the partial regression plots 53 from best-fitting models of Covid-19 mean daily growth rates (at the country/region level; data up to March 21, 2020) in relation to A) mean temperature and B) mean specific humidity of the outbreak period (Tables S3-S5). Shaded areas are 95% confidence bands.
Temperature and specific humidity of the outbreak period showed a strong, positive relationship across regions (Fig. S3), thus they could not be included as predictors in the same model. When we repeated the analyses including humidity instead of temperature, Covid-19 growth rate varied significantly and non-linearly with humidity, peaking at ∼4-6 g/m3 (Fig. 2B, Tables S3-S4; R2 = 0.41). This model also confirmed the faster growth in regions with largest populations and highest health expenditure, but showed a slightly poorer fit to the data compared to the model including temperature. Again, models were robust to the use of different approaches (Supplementary results; Tables S3-S4).
The decrease of Covid-19 growth rate in warm and humid climates can be explained by two non-exclusive processes. First, coronavirus persistence outside the host decreases at high temperature, at medium-high humidity, and under sunlight, even though these viruses can survive for several hours at temperatures >30°C 5,24, implying that high ambient temperatures alone are not enough to quickly inactivate the virus. Second, host susceptibility can be higher in cold and dry environments, for instance because of slower mucociliary clearance, or decreased host immune function under harsher conditions 6,12. SARS-CoV-2 is frequently transmitted indoors 25, where the microclimate is generally different from outdoor conditions. Nevertheless, climatic variation likely affects immune response and susceptibility 12, and indoor absolute humidity generally matches outdoor humidity 2. Thus, even if outdoor climate does not correspond to the conditions under which viruses are mostly transmitted, it allows accurate predictions of outbreaks of other respiratory illnesses 3,12, supporting the relevance of our results. Laboratory experiments on other pathogens showed a linear decrease in virus transmission and survival at temperatures increasing from 5 to 30°C 2,6,24. Yet, non-linear relationships, with a high probability of Covid-19 occurrence in regions experiencing intermediate temperatures, have been suggested by independent correlative analyses 14. The non-linear relationship between Covid-19 growth rates and climate (Fig. 2) was mostly driven by some regions experiencing extremely cold conditions (Supplementary results), and may be explained by complex interplays between climate-related changes in human host social behavior, changes in host susceptibility to the virus, or changes in virus survival and transmission patterns.
The clear relationship between Covid-19 growth rate and climate suggests that seasonal climatic variation may affect the spatial spread and severity of Covid-19 outbreaks 14, as observed for other virus-caused diseases 3,6,12. To display potential seasonal changes in Covid-19 growth rates, we projected our best model under the average temperature conditions of representative months (Fig. 1A-D). The projected Covid-19 growth rates based on March temperatures showed very favorable conditions for disease spread in most temperate regions of the Northern Hemisphere, and matched well with the observed spatial distribution of Covid-19 growth rates during the January-March global outbreak (Fig. 1A). The expected seasonal variation in temperatures during the next months could result in slightly less suitable conditions for Covid-19 spread in these areas, while disease spread could accelerate in large areas of the Southern Hemisphere, including South America, South Africa, eastern Australia and New Zealand, and at the high latitudes of the Northern Hemisphere 14 (Fig. 1). Nevertheless, uncertainty in projections was substantial (Fig. S4) and, in the absence of severe containment actions, projected growth rates remained consistently high (daily projected r ≥ 0.15) in most areas of the world, including many tropical countries (Fig. 1).
The impact of climate on Covid-19 spatial patterns and growth rates is controversial, with some studies suggesting that this disease has a reduced effect in warm climates 11,13,14,16, and others detecting a much stronger impact of socio-economic factors 18,26,27. Considering locations at the sub-national scale 11,14 can improve the probability of detecting effects of climate, as it avoids treating very large countries, with heterogeneous climate and disease dynamics, as a single point. For instance, in March 2020 the average temperature across the whole of Canada was around -15°C, but most Covid-19 cases occurred in the southern provinces, where temperatures are ∼10°C higher. Furthermore, we used an objective approach to identify the exponential phase of outbreaks 19. This allows us to focus on the early phases of outbreaks 13,16,21, and maximizes the chance of detecting their drivers before containment actions become effective and blur possible effects of climate on disease spread. Analyses of relationships between Covid-19 and climate have also been criticized because SARS-CoV-2 shows a substantial rate of undocumented infections 28, and a high frequency of undocumented cases in some regions (e.g. in Africa) could affect conclusions 9,29. In many countries, reported positives largely refer to tested individuals showing Covid-19 symptoms that require hospitalization. Therefore, even though our models cannot capture the (unknown) dynamics of undocumented infections, they provide key information on the geographical variation in the risk of occurrence of symptomatic SARS-CoV-2 infections. Furthermore, our analyses were not based just on the number of cases, but rather on the growth rate within each region 26, and the pattern (Fig. 1) was mostly based on variation across areas of China, the Middle East, Europe and North America.
Results were robust to the removal of countries with low health expenditure, or those that adopted very early containment actions (supplementary results).
Containing Covid-19 outbreaks is undoubtedly one of the biggest challenges governments will have to face in the near future. Our spatially-explicit analysis suggests that, at least in some parts of the world, containment efforts could benefit from the interplay between pathogen spread and local climate. Nevertheless, the huge variation of Covid-19 growth rates among regions with similar climate highlights that diverse and complex social and demographic factors, as well as stochasticity, may strongly contribute to the severity of Covid-19 outbreaks. The potential socio-economic drivers of Covid-19 outbreaks are many 18,26. Even though we did not try to model the spatial spread of the disease across regions, we included several parameters to account for socio-economic factors. The positive relationship with human population size might be explained by multiple, non-exclusive processes, including an easier control of early outbreaks in regions with small populations, or the occurrence of more trade and exchange of people in the most populated regions, resulting in multiple infection routes and faster spread 18,26,27. Furthermore, growth rate was faster in regions with higher health expenditure, possibly because of more efficient early reporting and/or faster diagnosis of Covid-19 cases. However, the different socio-economic factors were strongly correlated (Table S6). For instance, areas with high health expenditure are also inhabited by more people older than 65 years, and a linear combination of human population and health expenditure predicts international trade of goods and services very well (Methods). Assessing the specific impact of these factors is challenging and was beyond the aim of this study.
Nevertheless, the role of climate remained consistent even when controlling for different combinations of socio-economic factors, suggesting that unaccounted processes should not bias our results and implying that climate can contribute to explaining variability in global patterns of Covid-19 growth rates. It is also possible that future analyses based on more recent, expanded datasets will not reveal major climatic effects on Covid-19 growth rates, because the worldwide enforcement of severe containment actions strongly limits the natural spread potential of the disease, thus weakening associations between climate and disease dynamics 14. If this is the case, we might instead expect an increasing impact of socio-economic variables, which are related to the management of Covid-19 spread, compared to climatic effects on Covid-19 outbreak dynamics. This prediction is supported by the comparison between datasets representing different time intervals (supplementary results).
Provided that SARS-CoV-2 continues to circulate among human hosts, the observed effects of climate variables on Covid-19 outbreak dynamics suggest the potential for this disease to become seasonal, with the temperate regions of both hemispheres being most at risk of severe infections during the autumn-winter months. Nevertheless, in the absence of containment actions growth rates can be substantial even in warm climates (Fig. 1). Stringent containment measures thus remain pivotal to mitigate the impacts of SARS-CoV-2 infections worldwide 20,21.
Materials and methods
Covid-19 dataset
We downloaded the time series of confirmed Covid-19 cases (cumulative growth curves) from the Johns Hopkins University Center For Systems Science and Engineering (JHU-CSSE) GitHub repository (https://github.com/CSSEGISandData/Covid-19/) 8. JHU-CSSE reports, for each day since January 22, 2020, all confirmed Covid-19 cases at the country level or at the level of significant geographical units belonging to the same country, which we defined here as ‘regions’ (e.g. US states, or China and Canada provinces), whenever separate Covid-19 cases data for these regions were available. The cumulative growth curves were carefully checked and obvious reporting errors (a few occurrences of temporary decreases in the cumulative number of cases) were corrected. Data from the Hubei region of China were excluded as the JHU-CSSE dataset does not report cases before January 22, and by that time the epidemic was already largely spreading in Wuhan and nearby municipalities (with 444 confirmed cases for the entire Hubei region), implying that the early epidemic growth curve was missing. Results of analyses performed including the Hubei data were highly consistent with the ones presented in the main text. For some countries with large extent and federal organization (Brazil, India, Russian Federation), the JHU-CSSE dataset did not report information at the sub-national level. For these countries, we relied on different sources to retrieve data on temporal variation of Covid-19 confirmed cases at the sub-national level (Table S6). For Quebec, we limited our analysis to Southern Quebec data (i.e. health regions south of 49°N), since in the northernmost health regions (extending to sub-polar areas up to 62° N) nearly no Covid-19 cases were detected before March 21 (see Table S6 for data sources). In all the other cases we maintained the original country/region information adopted by the JHU-CSSE. The datafile included confirmed Covid-19 cases up to March 31, 2020. 
From this dataset, we selected data for all those countries/regions in which local outbreaks were detected up to March 21 and up to March 31, 2020.
The onset of a local Covid-19 outbreak event was defined as the day when at least 25 confirmed cases were reported in a given country/region. Visual inspection of growth curves showed that, in most cases, below this threshold the reporting of cases was irregular or growth was extremely slow for prolonged periods. We then calculated the daily growth rate r of confirmed Covid-19 cases for each country/region after reaching the 25 confirmed cases threshold following the approach proposed by Hall et al. 19. The method iteratively fits growth curves on successive intervals of a minimum of 5 data points to identify the exponential phase of a cumulative growth curve, and returns the lag phase, and the onset and end of the exponential growth phase. The lag phase, characterized by very slow growth, is followed by the exponential phase (Fig. S1). Typically, cumulative growth curves of Covid-19 cases begin with exponential growth in the early phases, which begins to decelerate within ca. 10 days of its beginning (e.g. Fig. S6; see also ref. 21). This pattern is similar to what has been documented for earlier phases of other major infectious disease outbreaks 30. We thus restricted the analyses to those countries/regions for which at least 10 days of data after the outbreak onset were available.
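The onset and eligibility rules described above can be illustrated with a minimal sketch. The 25-case threshold and the 10-day requirement come from the text; the function names and the convention that "days after onset" excludes the onset day itself are assumptions made for illustration, and the sketch does not reproduce the exponential-phase detection of Hall et al. 19.

```python
from typing import Optional, Sequence

ONSET_THRESHOLD = 25        # confirmed cases defining outbreak onset (from the text)
MIN_DAYS_AFTER_ONSET = 10   # days of data required after onset (from the text)


def outbreak_onset(cumulative_cases: Sequence[int],
                   threshold: int = ONSET_THRESHOLD) -> Optional[int]:
    """Return the index of the first day with at least `threshold` cases, or None."""
    for day, n_cases in enumerate(cumulative_cases):
        if n_cases >= threshold:
            return day
    return None


def is_eligible(cumulative_cases: Sequence[int],
                threshold: int = ONSET_THRESHOLD,
                min_days: int = MIN_DAYS_AFTER_ONSET) -> bool:
    """True if the series has at least `min_days` days of data after onset.

    Assumption: the onset day itself is not counted among the days after onset.
    """
    onset = outbreak_onset(cumulative_cases, threshold)
    if onset is None:
        return False
    days_after_onset = len(cumulative_cases) - 1 - onset
    return days_after_onset >= min_days
```

A region whose curve never reaches the threshold, or reaches it fewer than 10 days before the end of the series, would be dropped under this reading.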
We computed the mean daily growth rate during the exponential phase as r = [ln(n_end) - ln(n_start)] / (t_end - t_start), where n_start and n_end are the cumulative numbers of cases on the first and last day of the exponential phase, and t_start and t_end are the corresponding days. We also computed the maximum daily growth rate rmax during the exponential phase according to ref. 19. Lag and exponential phase duration, and rmax were computed through the R package ‘growthrates’ 19. Mean and maximum daily growth rates were strongly positively correlated (Pearson’s correlation coefficient, r = 0.96, n = 79 countries/regions). Similarly, growth rates estimated up to March 21 and up to March 31 were strongly positively correlated (mean growth rate r: r = 0.99; maximum growth rate rmax: r = 0.99; n = 79 countries/regions), indicating that our growth rate estimates for a given country/region were highly consistent irrespective of the method and/or the selected temporal interval used for calculations.
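As a worked illustration of the mean daily growth rate, the sketch below divides the difference in log cumulative cases between the end and the start of the exponential phase by the number of days elapsed. The function and argument names are hypothetical, and the start and end of the exponential phase are assumed to be already known (in the study they were identified with the ‘growthrates’ package).

```python
import math


def mean_daily_growth_rate(n_start: float, n_end: float,
                           day_start: int, day_end: int) -> float:
    """Mean daily growth rate over the exponential phase.

    n_start, n_end: cumulative cases on the first and last day of the phase.
    day_start, day_end: the corresponding day indices.
    """
    if n_start <= 0 or n_end <= 0:
        raise ValueError("case counts must be positive to take logarithms")
    if day_end <= day_start:
        raise ValueError("the exponential phase must span at least one day")
    return (math.log(n_end) - math.log(n_start)) / (day_end - day_start)


# Illustrative values only (not taken from the dataset):
# a curve rising from 25 to 400 cases over 10 days.
r_example = mean_daily_growth_rate(25, 400, 0, 10)
```

With these illustrative values r equals ln(16)/10, i.e. about 0.28 per day.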
The JHU-CSSE dataset does not report information on timing of containment measures, and these may be highly heterogeneous among countries/regions23. The method proposed by Hall et al. 19 should partially overcome issues related to the effects of containment measures on natural disease growth rates because these effects would show up as a deceleration of exponential growth, while the method focuses on the exponential growth phase only. The Hall et al. method works on a fixed time window of a minimum of 5 days, so theoretically if deceleration occurs before 5 days after reaching the 25 cases threshold, we could obtain less reliable estimates of growth rates. However, given the estimated time spanning between virus infection and the onset of symptoms 31, infected people might spread the virus for 5 days undetected in absence of preventive control measures, and visual inspection of growth curves (e.g. Fig. S6) did not show evident decelerations before 5 days after outbreak onset. We therefore assumed that estimated growth rates achieved during the exponential phase represent proxies of Covid-19 spread potential in a completely susceptible host population, before stringent containment measures became effective21. To confirm the robustness of this assumption, we repeated analyses by excluding countries with specific containment measures32,33, and obtained highly consistent results (Supplementary results). We highlight that our main analyses were limited to cases recorded before March 21, and containment measures established in this period were generally defined after a country experienced an exponential growth phase.
To reduce heterogeneity across countries/regions, we excluded from analyses those countries/regions with fewer than 100,000 inhabitants, or with total surface <1000 km2 and fewer than 1 million inhabitants (as of March 21, only San Marino was excluded; as of March 31, further excluded countries/regions were Andorra, Denmark/Faroe Islands, Liechtenstein, Malta, United Kingdom/Channel Islands, US/District of Columbia). Overall, our final dataset included information on 79 countries/regions up to March 21, 2020, and on 189 countries/regions as of March 31, 2020 (Table S5).
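The exclusion rule can be written as a single predicate. The sketch below assumes the rule reads as "population below 100,000, OR (surface below 1000 km2 AND population below 1 million)"; the function name and the example values are hypothetical.

```python
def is_excluded(population: float, area_km2: float) -> bool:
    """Apply the size-based exclusion rule under the assumed reading:
    very small populations are always excluded; small territories are
    excluded only if they also have fewer than 1 million inhabitants.
    """
    too_few_inhabitants = population < 100_000
    small_and_sparsely_populated = area_km2 < 1000 and population < 1_000_000
    return too_few_inhabitants or small_and_sparsely_populated


# Hypothetical regions, for illustration only:
tiny_state = is_excluded(population=35_000, area_km2=60)        # excluded
small_island = is_excluded(population=500_000, area_km2=300)    # excluded
city_state = is_excluded(population=5_000_000, area_km2=700)    # retained
```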
Environmental and socio-economic variables
We considered two climatic variables that are known to affect the spread of viruses: mean air temperature and specific humidity (water vapor pressure), which is a measure of absolute humidity. Previous studies showed that, for coronaviruses and influenza viruses, survival is generally higher at low temperature and low values of absolute humidity 2,5,6,12,24. We downloaded hourly values of temperature and 2-m dewpoint temperature at 0.25° spatial resolution from the ERA5 hourly database (https://doi.org/10.24381/cds.adbb2d47); we then calculated specific humidity using the ‘humidity’ package in R 34. For each country/region, we calculated the mean values of temperature (°C) and specific humidity (g/m3) over a 30-day interval, including the day of the end of the exponential phase and the preceding 29 days (Fig. S1).
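To make the humidity calculation concrete, the following sketch converts air temperature and dewpoint temperature into water vapour density in g/m3, the unit reported in the text. It is not the code of the ‘humidity’ R package: the Magnus-type approximation with Bolton's constants and the ideal-gas conversion factor are assumptions chosen for illustration, and the package may use different constants.

```python
import math


def vapour_pressure_hpa(dewpoint_c: float) -> float:
    """Actual vapour pressure (hPa) from dewpoint temperature (degrees C),
    using a Magnus-type approximation (Bolton constants, an assumption)."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))


def absolute_humidity_g_m3(temp_c: float, dewpoint_c: float) -> float:
    """Water vapour density (g/m3) from air temperature and dewpoint.

    Uses the ideal gas law for water vapour: rho = e / (R_v * T), with
    R_v = 461.5 J/(kg K); the factor 216.7 converts hPa/K to g/m3.
    """
    e_hpa = vapour_pressure_hpa(dewpoint_c)
    temp_k = temp_c + 273.15
    return 216.7 * e_hpa / temp_k


# Illustrative conditions only: air at 5 degrees C with a dewpoint of 0 degrees C.
ah_example = absolute_humidity_g_m3(5.0, 0.0)
```

Under these illustrative conditions the result is about 4.8 g/m3. Hourly values computed this way would then be averaged over the 30-day interval.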
Besides climate, it has been proposed that other environmental parameters may affect variation of Covid-19 outbreak severity. Air pollution, especially fine atmospheric particulate, may enhance the persistence, transmission and effects of coronaviruses 17,35,36. We therefore extracted values of annual concentration (µg/m3) of ground-level fine particulate matter (PM2.5) for 2016 from the NASA Socioeconomic Data and Applications Center 37, and calculated the mean abundance of PM2.5 for each country/region.
Among socio-economic predictors, we considered mean human population density 38 (population density hereafter, expressed in inhabitants/km2), total population size 38, and per capita government health expenditure (health expenditure hereafter) (in US$; average of 2015-2017 values downloaded from the World Health Organization database at https://apps.who.int/nha/database and from http://documents.worldbank.org/ for West Bank and Gaza). Health expenditure was available at country-level only, hence regions within countries were assigned the same health expenditure value.
Furthermore, since growing evidence indicates that elderly people are more susceptible to developing severe Covid-19 symptoms 39, we also obtained for each country/region included in the dataset an estimate of the proportion of the population aged 65 or older (population 65+). Estimates of population 65+ were based mainly on the most recent available data retrieved from the United Nations website, but were integrated with sub-national level data for Australia, Brazil, Canada, China, India, and the US (Table S6). International trade is an additional process that has been proposed to affect Covid-19 dynamics 18. Following ref. 18, we downloaded data on the 2017 total imports of goods and services (current US$) from the World Bank Database (https://data.worldbank.org/indicator). However, total import of goods and services was strongly related to a linear combination of total population size and health expenditure (linear model on log-transformed values: R2 = 0.86). Furthermore, per-capita import of goods and services was strongly related to health expenditure (Pearson’s correlation, r = 0.88). The strong correlation among these variables prevents their inclusion in the same regression model 40, and makes it difficult to identify the role of a specific factor. The aim of this study was not to tease apart the effects of distinct socio-economic processes, but to assess the potential role of climate after taking them into account. Our results were robust to the inclusion of different socio-economic variables, as climatic variables remained in the top-ranking model independently of the inclusion of alternative variables. We performed all spatial analyses using the raster package in R 41.
Statistical analyses
We used linear mixed models (LMMs) to relate the global variation of r and rmax to the environmental and socio-economic predictors (temperature and humidity of outbreak month; population size; population density; health expenditure; PM2.5; population 65+). Country was included as a random factor to take into account potential non-independence of growth rates from regions belonging to the same country. Non-linear relationships between climatic factors and ecological variables are frequent, and have also been proposed for relationships between SARS-CoV-2 occurrence and climate 14,42, and in exploratory plots we detected a clear non-linear relationship between r-values and climate. Therefore, for climatic variables, we included in models both linear and quadratic terms. Humidity, population density, population size, health expenditure and PM2.5 were log10-transformed to reduce skewness and improve normality of residuals.
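Including linear and quadratic terms means that the fitted relationship between growth rate and a climatic variable x is a parabola, r = b0 + b1·x + b2·x², whose maximum (when b2 < 0) lies at x = -b1 / (2·b2). The sketch below only illustrates this arithmetic; the coefficients are made up and are not the estimates reported in the supplementary tables. For a log10-transformed predictor such as humidity, the peak would be obtained on the log scale and would need back-transforming.

```python
def predict_quadratic(b0: float, b1: float, b2: float, x: float) -> float:
    """Fixed-effect prediction from a model with linear and quadratic terms."""
    return b0 + b1 * x + b2 * x * x


def quadratic_peak(b1: float, b2: float) -> float:
    """Value of x at which the fitted parabola reaches its maximum.

    Requires a negative quadratic coefficient (hump-shaped relationship).
    """
    if b2 >= 0:
        raise ValueError("no interior maximum: quadratic coefficient must be negative")
    return -b1 / (2.0 * b2)


# Made-up coefficients, for illustration only.
b0_demo, b1_demo, b2_demo = 0.1, 2.0, -0.5
peak_demo = quadratic_peak(b1_demo, b2_demo)
```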
We adopted a model selection approach to identify the variables most likely to affect the global variation of Covid-19 growth rate 43. We built models representing the different combinations of independent variables, and ranked them on the basis of Akaike’s Information Criterion (AIC). AIC trades off explanatory power against the number of predictors; parsimonious models explaining more variation have the lowest AIC values and are considered to be the “best models” 43. AIC can select overly complex models, thus we considered a complex model only if its AIC was lower than the AIC of its simpler, nested models 43,44. For each candidate model, we calculated the Akaike weight ωi, representing the probability of the model given the data 45.
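The two steps of the model selection procedure can be sketched as follows: Akaike weights are computed from AIC differences relative to the best model, and a complex model is retained only if its AIC is lower than that of every simpler model nested within it. The representation of a model as a set of predictor names, the function names, and the AIC values are illustrative assumptions.

```python
import math
from typing import Dict, FrozenSet, List


def akaike_weights(aic_values: List[float]) -> List[float]:
    """Akaike weights: exp(-0.5 * delta_i) normalized to sum to 1,
    where delta_i is the AIC difference from the best (lowest-AIC) model."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


def drop_uninformative_models(models: Dict[FrozenSet[str], float]) -> Dict[FrozenSet[str], float]:
    """Keep a model only if no simpler nested model has an equal or lower AIC.

    `models` maps a set of predictor names to the model's AIC; a model is
    nested in another if its predictors are a proper subset of the other's.
    """
    kept = {}
    for terms, aic in models.items():
        beaten_by_nested = any(
            other_terms < terms and other_aic <= aic
            for other_terms, other_aic in models.items()
        )
        if not beaten_by_nested:
            kept[terms] = aic
    return kept


# Made-up AIC values, for illustration only.
candidates = {
    frozenset({"temperature"}): 100.0,
    frozenset({"temperature", "pm25"}): 101.5,        # more complex, higher AIC: dropped
    frozenset({"temperature", "population"}): 90.0,   # more complex, lower AIC: kept
}
retained = drop_uninformative_models(candidates)
weights = akaike_weights(list(retained.values()))
```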
Model selection analyses can be heavily affected by collinearity among predictors. In our dataset, temperature and humidity showed a very strong positive correlation (Fig. S3 and Table S7); furthermore, population density was strongly positively related to PM2.5, and health expenditure was positively related to population 65+ (Table S7). Therefore, temperature and humidity, and health expenditure and population 65+, could not be considered together in the same models 40,46. Population density and PM2.5 were never included in top models (Tables S1-S3), thus their collinearity should not bias our conclusions. All other predictors showed weak correlations 40 (Table S7). We therefore repeated the model selection for different combinations of predictors. First, we considered temperature, PM2.5, population size, population density and health expenditure as predictors. Then we repeated the analysis using humidity instead of temperature. Finally, we repeated model selection using population 65+ instead of health expenditure.
LMMs were fitted using the lmer function of the lme4 R package 47, while test statistics were calculated using the lmerTest package 48. The generalized R2 (variance explained by fixed effects) and semi-partial R2 for individual predictors were computed from fitted LMMs using the r2glmm package 49. To confirm that spatial autocorrelation did not bias the outcome of our analyses, we calculated the spatial autocorrelation (Moran’s I) of the residuals of best-fitting models using the EcoGenetics R package 50 at lags of 1000 km up to a maximum distance of 5000 km. Model residuals did not show significant spatial autocorrelation at any lag (in all cases, |Moran’s I| < 0.15 and P > 0.2), suggesting that spatial autocorrelation was not a major issue 51.
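For readers unfamiliar with Moran’s I at distance lags, a minimal sketch is given below. It is not the EcoGenetics implementation: it assumes binary weights (1 for pairs of locations whose great-circle distance falls within the lag, 0 otherwise), computes distances with the haversine formula, and omits the significance test. All names and the toy data are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def morans_i(values: List[float], coords: List[Tuple[float, float]],
             d_min: float, d_max: float) -> Optional[float]:
    """Moran's I for pairs whose distance d satisfies d_min < d <= d_max (km).

    coords holds (latitude, longitude) pairs. Returns None when no pair
    falls in the lag or when the values have zero variance.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = haversine_km(coords[i][0], coords[i][1], coords[j][0], coords[j][1])
            if d_min < dist <= d_max:
                cross_products += deviations[i] * deviations[j]
                weight_sum += 1.0
    sum_squares = sum(dev * dev for dev in deviations)
    if weight_sum == 0.0 or sum_squares == 0.0:
        return None
    return (n / weight_sum) * cross_products / sum_squares


# Toy residuals alternating in sign along the equator (1 degree apart).
toy_values = [1.0, -1.0, 1.0, -1.0]
toy_coords = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
toy_i = morans_i(toy_values, toy_coords, 0.0, 150.0)
```

In the toy example only adjacent points fall within the 150 km lag and their residuals always have opposite signs, so the statistic indicates perfect negative autocorrelation. Applied to model residuals, the same function would be called once per 1000 km lag up to 5000 km.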
Data Availability
All the relevant data are submitted along with the manuscript.
Author contributions
The authors jointly conceived the work, analyzed data and wrote the manuscript.
Competing interests
None
Data and materials availability
All relevant data have been submitted as supplementary file
Acknowledgments
We thank the colleagues of the DISPArati group, especially G. Scarì, for insightful and stimulating discussions. We also thank M. Venegoni, G. Venegoni, V. Longoni, J. G. Cecere and L. Serra for commenting on a previous draft of our manuscript.