Abstract
Seasonality plays an essential role in the dynamics of many infectious diseases. In this study, we use statistical methods to show how to detect the presence of seasonality in a pandemic at the beginning of the seasonal period and that seasonality strongly affects SARS-CoV-2 transmission. We measure the expected seasonality effect in the mean transmission rate of SARS-CoV-2 and use available data to predict when a second wave of COVID-19 will happen. In addition, we measure the average global effect of social distancing measures. The seasonal force of transmission of COVID-19 increases in October in the Northern hemisphere and in April in the Southern hemisphere. These predictions provide critical information for public health officials to plan their actions to combat the new coronavirus disease and to identify and measure seasonal effects in a future pandemic.
1. Introduction
During the COVID-19 pandemic, many public authorities made their decisions based on predictions drawn from epidemiological compartmental models. The most famous of these models is also one of the simplest, the basic SEIR model. It can be seen as a qualitative epidemic model, as it is useful for understanding the qualitative behavior of the dynamics of an epidemic. However, using such a simple model to make quantitative predictions, mainly for long-term variables such as the total size of an epidemic, seems like an oversimplification.1
The real-world COVID-19 pandemic is a complex phenomenon in which many other factors must be considered to obtain qualitative understanding and quantitative predictive power. The exponential growth of a SEIR model is better suited to modeling a single epidemic in a homogeneous closed system like a small town. The dynamics of interconnected open systems with several cities and several countries, as we have in a pandemic, require more sophisticated models such as meta-population models or agent-based models.5 The existence of several subgroups with considerably different epidemiological characteristics, such as mobility, invalidates the homogeneity assumption. The social distancing measures used to decrease the transmission rates around the world added more difficulty to these predictive models. Many other factors also appear to influence the dynamics of the COVID-19 pandemic, but there is a well-known essential epidemiological phenomenon that is lacking in most of the models used by scientists and health officials so far: COVID-19 seasonality.
Many infectious diseases, particularly viral infectious diseases with respiratory transmission, have a seasonal pattern of transmission at some level.2,3 This implies that there is a period of the year when the transmission rate is highest and major epidemics are observed, in contrast with a complementary period of the year when the transmission rate is significantly lower.
Influenza, pneumonia, rotavirus, cholera, measles, dengue and infections caused by other coronaviruses are some of the many infectious diseases where seasonality has an important effect on the transmission rate.2–10
Seasonality should not be confused with temperature. Although temperature is an important factor correlated with seasonality, there are many other factors that also influence this complex phenomenon. Climate factors, host behavior factors and biological factors, among others, can be associated with seasonal forces: precipitation, human mobility, school calendar, immunity and many others.1,2,5,6 In addition, these factors vary from place to place and therefore must be put in a relative perspective. For example, an average temperature of 15 degrees Celsius is associated with winter days in tropical areas, whereas it is associated with summer days at higher latitudes. Hence, careful analysis is necessary when quantifying the relationship between temperature and seasonal transmission force with data from various locations.4
Respiratory syndromes have a common pattern in which the transmission rate is typically higher in autumn and winter and lower in spring and summer.3,4 Note that the months with the highest number of cases are reversed in the Northern and Southern hemispheres.
When seasonal forces are considered in modeling the dynamics of the disease, the transmission rate β is no longer a constant positive real number, as in the basic SEIR model, but a non-constant function of time β(t). For clarity, consider that β(t) assumes only two values, βmax and βmin, where βmax > βmin > 0. That is, we divide the year into two periods: one where the transmission force is greater, with transmission rate βmax, and another where the transmission force is lower, with transmission rate βmin. We can also associate other epidemiological parameters, such as the basic reproduction number R0, with βmax and βmin. If R0 is measured during a high season period, where β(t) = βmax, then we obtain R0 = R0max. However, if R0 is measured outside the seasonal period, where β(t) = βmin, we obtain R0 = R0min, where R0max > R0min.
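The two-valued transmission rate described above can be written down directly. The following is a minimal sketch, not the model used in this study: the function name, the numerical rates and the month boundaries (high season from October to March in the Northern hemisphere and from April to September in the Southern hemisphere) are assumptions made for illustration only.

```python
from datetime import date


def seasonal_beta(day: date, beta_min: float, beta_max: float,
                  northern: bool = True) -> float:
    """Two-valued transmission rate: beta_max in high season, beta_min otherwise."""
    if not beta_max > beta_min > 0:
        raise ValueError("expected beta_max > beta_min > 0")
    # Assumed high season: October-March in the north, April-September in the south.
    october_to_march = day.month >= 10 or day.month <= 3
    high_season = october_to_march if northern else not october_to_march
    return beta_max if high_season else beta_min


# Hypothetical rates, chosen only to illustrate the switch between hemispheres.
north_november = seasonal_beta(date(2020, 11, 1), beta_min=0.2, beta_max=0.5)
south_november = seasonal_beta(date(2020, 11, 1), beta_min=0.2, beta_max=0.5,
                               northern=False)
print(north_november, south_november)
```

On the same calendar day the two hemispheres receive different rates, which is the property the later hemisphere comparison relies on.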
Seasonality is fundamental in the dynamics of seasonal diseases and it is vital for long-term forecasts.1 Estimating the transmission function β(t), by obtaining estimates for the seasonal period and for the parameters βmax and βmin or for associated parameters such as R0max and R0min, is crucial for public health authorities in planning for and preventing a seasonal disease.1 In particular, second wave forecasts are notably influenced by seasonality (see details in the supplementary appendix).
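To illustrate why a two-valued transmission rate matters for second wave forecasts, the sketch below integrates a basic SEIR model with a forward Euler scheme. It is an illustration under stated assumptions and not the analysis of the supplementary appendix: the population size, the rates sigma and gamma, the values of βmin and βmax and the day indices of the low season are all hypothetical. For this basic SEIR model R0 = β/γ, so the chosen values give R0max = 2.1 and R0min = 0.84.

```python
def simulate_seir(beta_of_day, days, n=1_000_000.0, e0=10.0,
                  sigma=1 / 4, gamma=1 / 7, dt=0.1):
    """Forward-Euler SEIR; returns the number of infectious people on each day."""
    s, e, i, r = n - e0, e0, 0.0, 0.0
    infectious = []
    steps_per_day = round(1 / dt)
    for day in range(days):
        infectious.append(i)
        beta = beta_of_day(day)
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
    return infectious


def two_level_beta(day, beta_min=0.12, beta_max=0.30, low_start=60, low_end=240):
    """Hypothetical schedule: beta_min on days [low_start, low_end), beta_max otherwise."""
    return beta_min if low_start <= day < low_end else beta_max


curve = simulate_seir(two_level_beta, days=400)
first_wave_peak = max(curve[:240])
second_wave_peak = max(curve[240:])
print(round(first_wave_peak), round(second_wave_peak))
```

In this toy setting the first wave is cut short when β drops to βmin, leaves most of the population susceptible, and a larger wave follows once β returns to βmax.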
2. Results
We show below that COVID-19 transmission is highly affected by seasonality. We initially estimated that the global mean seasonal period of COVID-19 coincides with the mean seasonal period of other respiratory syndromes, particularly with the mean seasonal H1N1 period, which runs from October to March in the Northern hemisphere and from April to September in the Southern hemisphere.10 In these seasonal periods, the transmission rate is higher and larger epidemics are expected. Formally, we call the time interval where β = βmax the seasonal period. We define the seasonal moment of reversal as the beginning of the seasonal period, that is, the expected moment when β increases from βmin to βmax.
However, there is some natural variability in data from endemic diseases. Thus, we also use data from another pandemic, for which epidemiological data from several countries are available in a synchronized way. The 2009 H1N1 pandemic data for the Northern and Southern hemispheres are shown in Figure 1, taken from the World Health Organization database.11 We can see that October was indeed a seasonal moment of reversal for the H1N1 pandemic, as the number of cases in the Northern hemisphere increased during this month. Thus, we estimate this to be the next moment of reversal for the COVID-19 pandemic and we set April 15 and October 15 as the approximate dates when the COVID-19 transmission rate changes in both hemispheres.
Figure 1. From the curve of the Southern hemisphere, we observe a consistent increase in the number of cases since the beginning of the data, in week 17, corresponding to the second half of April. From the curve of the Northern hemisphere, we can see an increase in week 41, which corresponds to the second week of October, although an increase in the number of cases can already be seen in week 36, the first week of September. Likewise, we can expect a second seasonal period for COVID-19 in the Northern hemisphere starting in September in some countries, with a general spread in October. Graphics obtained from the WHO website.12
It is important to emphasize that this global seasonal period is an average of the seasonal periods of the countries around the world. The seasonal period varies from one location to another and the global seasonal period can be seen as an expected value for a country chosen at random. Therefore, in some places the transmission rate will increase before this expected period, while in others it will increase after the expected seasonal period.
In addition, it is very important to distinguish the seasonal period, where transmission is greatest, from the epidemic period, where the number of cases is highest. The two are highly correlated in seasonal diseases, but the seasonal period usually starts earlier; it influences the epidemic period but does not determine it. Many other factors influence the size and duration of epidemic periods, such as the proportion of the population that is susceptible. The seasonal period begins when epidemics accelerate, that is, we look at the variation in the number of cases and not at the number of cases itself. In the epidemic period, by contrast, we look at the number of cases itself. Typically, seasonal periods start a few weeks before the epidemic periods and end a few weeks later, especially when the proportion of susceptible population is small.
Before obtaining estimates for the seasonal effect on COVID-19 transmission, we first estimate the effect of the social distancing measures. In this pandemic, social distancing measures have been widely adopted, influencing the transmission rate of SARS-CoV-2 in most countries in the world. As we expected the reversal of seasonality to occur in April, it is crucial to take into account the effect of social distancing measures to properly estimate seasonal effects.
Measuring the effect of Social Distancing
Besides being of intrinsic importance, the effect of global social distancing measures must be estimated in order to separate it from the seasonal effect: coincidentally, the social distancing measures were taken at the end of March, very close to April, when we expect the seasonal moment of reversal.
For each of the 50 countries with the largest epidemics from March 1 to May 1, we collected the date when social distancing interventions began to be adopted. For each country, this date was obtained from two different websites, drawn from various sources on the internet. Details can be found in the supplementary appendix.
The average start date for social distancing measures was March 19, with a standard deviation of 6.6 days. The countries in the Northern and Southern hemispheres had similar starting dates, with both means on March 19. We take the mean effect of social distancing (MESD) as the difference between the slopes for periods of 10 days before and after the measures were expected to affect reported cases. Formally, let B be the slope of the regression line of the mean rate of cases from March 17 to March 26, and let A be the slope of the regression line of the mean rate of cases from March 27 to April 5. The gap of about one week between March 19 and the beginning of the second interval reflects the delay between the adoption of a control measure and its impact on reported cases, as the disease has a median incubation period of 4 days and there are some days of delay between the laboratory test and its result. The total delay varies between countries, but we take 7 days as a rough estimate. The length of each interval is 10 days because the social distancing measures started very close to the expected seasonal moment of reversal, which should be somewhere in April. Therefore, we must keep the interval as short as possible to avoid confusing the social distancing effect with the seasonality effect. We define MESD = A − B.
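The definition MESD = A − B can be sketched in a few lines. The code below is an illustration under stated assumptions, not the script used for this study: the function names are hypothetical, the series is synthetic, and the regression is an ordinary least-squares fit of the daily rate against the day index.

```python
def ols_slope(values):
    """Least-squares slope of `values` against the day index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


def slope_change(series, split, length=10):
    """Return (B, A, A - B) for the `length` days before and after index `split`."""
    before = series[split - length:split]
    after = series[split:split + length]
    b, a = ols_slope(before), ols_slope(after)
    return b, a, a - b


# Synthetic daily mean case rates: 10 days rising by 0.20 per day,
# followed by 10 days rising by 0.04 per day.
series = [0.20 * d for d in range(10)] + [2.0 + 0.04 * d for d in range(10)]
B, A, effect = slope_change(series, split=10)
print(round(B, 4), round(A, 4), round(effect, 4))
```

With real data, `series` would hold the mean daily case rate and `split` the index of the first day of the second interval (March 27 in the definition above).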
Figure 2 shows the graph of the global average COVID-19 case rates with the regression lines before and after March 19. We obtain slope estimates and 95% confidence intervals given by B = 0.2143 (CI = [0.1720, 0.2567]) and A = 0.0379 (CI = [−0.0135, 0.0893]), so that MESD = −0.1765, which represents a relative reduction of 82.3%. Note that A and B are not independent: the closer the regression lines are to each other, the more positively correlated we can suppose A and B to be. As the upper limit 0.0893 for A is less than the lower limit 0.1720 for B, we reject the hypothesis that A = B at the 95% confidence level. Thus, there is consistent statistical evidence that the social distancing measures decreased, at least in the short term, the global average growth rate of COVID-19 cases, with an estimated relative reduction of 82.3% in the speed of growth.
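The arithmetic behind the relative reduction and the interval comparison can be checked with a short sketch. The helper names below are hypothetical, and the inputs are the rounded global estimates quoted above, so the last digit may differ slightly from figures computed from unrounded slopes. Comparing two intervals for overlap is used here only as the simple criterion described in the text, not as a formal test of A = B.

```python
def relative_change(before, after):
    """Change from `before` to `after` as a fraction of |before|."""
    return (after - before) / abs(before)


def disjoint(ci_1, ci_2):
    """True when two (low, high) intervals do not overlap."""
    return ci_1[1] < ci_2[0] or ci_2[1] < ci_1[0]


# Global slopes and 95% confidence intervals as reported in the text.
B, CI_B = 0.2143, (0.1720, 0.2567)
A, CI_A = 0.0379, (-0.0135, 0.0893)

mesd = A - B
reduction = relative_change(B, A)
print(f"MESD = {mesd:.4f}")
print(f"relative change = {reduction:.1%}")
print(f"intervals disjoint: {disjoint(CI_A, CI_B)}")
```

The same two helpers apply unchanged to the hemisphere-level slopes; a relative change below −100% corresponds to a slope that changes sign.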
Figure 2. The black curve shows the global mean rate of cases per 100k inhabitants. The left red line is the linear regression line for a period of 10 days immediately before the effects of social distancing measures were expected to appear. The right red line is the linear regression line for a 10-day period starting one week after March 19, the average start date for social distancing measures.
To ensure that the seasonality effect is not confounding this analysis, we repeat it separately for the Northern and Southern hemispheres to see whether the two groups behave differently. For the Northern hemisphere, B = 0.2610, A = 0.0431 and MESD = −0.2178, which represents a relative reduction of 83.5%. For the Southern hemisphere, B = 0.0681, A = 0.0130 and MESD = −0.0552, which represents a relative reduction of 81.0%. Hence, the qualitative behavior was the same in both hemispheres in the periods immediately before and after the adoption of social distancing measures and we conclude that the reduction in the global growth rate at the end of March was not due to seasonality. For more details, see the supplementary material.
Note: The union of the Northern and Southern groups does not coincide with the Global group, because we added Argentina and New Zealand to the Southern group although they are not among the 50 largest epidemics for the measured period.
Measuring the seasonality effect
We are now ready to estimate the effect of seasonality on COVID-19 transmission. First, we consider the seasonal effect for each hemisphere as the variation in the slope of the mean pooled rate at the expected moment of seasonal reversal, which we estimate as April 15. As we did for the effect of social distancing, we define the mean seasonal effect in the Northern hemisphere (MSEN) to be A − B, where B is now the slope of the regression line of the mean rate of cases from March 27 to April 5 and A is the slope of the regression line of the mean rate of cases from April 16 to May 1. We define the mean seasonal effect in the Southern hemisphere (MSES) in the same way.
Figure 3 below gives a clear picture of how the mean daily rate of cases changed in different directions just after the estimated moment of seasonal reversal. To quantify this difference, we obtain the slope estimates and 95% confidence intervals for the Northern hemisphere, given by B = 0.0478 (CI = [−0.0138, 0.1094]) and A = −0.0586 (CI = [−0.0957, −0.0215]), so that MSEN = −0.1064, which represents a relative reduction of 222.5%. We interpret a relative reduction greater than 100% as a reduction that changes a positive slope to a negative one. Note that A and B are not independent: the closer the regression lines are to each other, the more positively correlated we can suppose them to be. Since the upper limit −0.0215 for A is less than the lower limit −0.0138 for B, we reject the hypothesis that A = B at the 95% confidence level. There is consistent statistical evidence that the mean seasonal effect in the Northern hemisphere is smaller than zero (MSEN < 0).
Figure 3. The black curves show the mean rate of cases per 100k inhabitants for countries in the Northern and Southern hemispheres, respectively, from March 5 to May 1. The left red lines are the linear regression lines immediately before the expected seasonal reversal moment and the right red lines are the linear regression lines immediately after it, for both hemispheres.
For the Southern hemisphere, slope estimates and 95% confidence intervals are given by B = 0.0130 (CI = [−0.0081, 0.0341]) and A = 0.1089 (CI = [0.0500, 0.1678]), so that MSES = 0.0959, which represents a relative increase of 740.3%. As the upper limit 0.0341 for B is less than the lower limit 0.0500 for A, we reject the hypothesis that A = B at the 95% confidence level. There is consistent statistical evidence that the mean seasonal effect in the Southern hemisphere is greater than zero (MSES > 0).
Consider the plausible hypothesis that the social distancing effect would have produced a greater drop in the slope if we had considered a period longer than the 10 days used in its definition. As a consequence, part of the decrease in the slope of the Northern hemisphere that we attribute to the seasonal effect could be due to the effect of social distancing measures. However, if this were true, a similar decreasing effect would be occurring in the Southern hemisphere and the absolute seasonality effect there would be even greater. This is a contradiction, since we would have a very small seasonal effect in one hemisphere and a very large seasonal effect in the other.
In addition, consider the hypothesis that a fatigue effect could explain the increase in the Southern mean rate. The average time interval between the start of social distancing measures (March 19) and the start of the data interval that we used to measure the seasonal effect (April 16) is less than a month. It seems unlikely that fatigue could be responsible for increasing the rate of cases in the Southern hemisphere in such a short time. Besides, we would again have an opposite effect in the Northern hemisphere, which seems a contradiction. Another important possible confounding factor, the socioeconomic factor, is analyzed in detail in the supplementary material. These and other factors that could affect rates in reported data, such as heterogeneous transmission and an increased rate of testing, typically produce similar effects across countries, especially in large groups such as the Northern and Southern hemispheres. Although quantitative differences are expected, a similar qualitative effect is expected in both hemispheres for these factors.
Seasonality, on the contrary, produces opposite effects in the Northern and Southern hemispheres, like the ones we observe in the COVID-19 pandemic data from the end of April. Hence, there is sufficient statistical evidence pointing to a consistent seasonality effect in the COVID-19 pandemic.
To confirm that the difference between the growth speeds from the end of April and the end of March is due to seasonality and not to confounding effects, we run a multiple linear regression analysis. We take as response variable Y the seasonality effect (A − B) for each country. The explanatory variables are the seasonal factor XHP, the social distancing factor XSD and the income factor XIC. We set XHP as the indicator variable of whether the country belongs to the Northern hemisphere, XSD as a discrete score varying from 0 to 2 according to the level of social distancing measures adopted (low/none, moderate, high/lockdown) and XIC as the country's gross domestic product (GDP) per capita. After re-scaling the variables to a comparable scale, we obtained for the additive model the estimate Y = 0.20 − 0.18 XHP − 0.06 XSD − 0.09 XIC. All three factors contribute to a reduction in Y, but the seasonal factor XHP had the largest absolute effect.
To assess dependence between factors, we run a multiple linear regression model with interaction terms, obtaining for the re-scaled variables the estimate Y = 0.11 − 0.16 XHP + 0.03 XSD + 0.01 XIC − 0.01 XHP·XSD − 0.01 XHP·XIC − 0.09 XSD·XIC. Note that the absolute value of the individual coefficient of the seasonal factor XHP remains the largest and its estimate is very close to the estimate obtained in the additive model. The interaction coefficients between the seasonal factor and the other two factors are considerably small, which indicates that the seasonal factor affected all countries independently of the levels of the social distancing and income factors. The individual coefficients for the social distancing and income factors changed from negative to positive, which indicates that individually these factors did not contribute to reducing the growth rate of cases. The reduction measured previously in the additive model is contained in the interaction between the social distancing factor and the income factor. This implies that the effectiveness of social distancing is highly correlated with income and that its impact was bigger in high-income countries than in low-income countries. Details and further regression analysis can be found in the supplementary appendix.
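The structure of the two regression models can be sketched as follows. This is a minimal illustration on synthetic data, not a reproduction of the fits reported above: the country-level data and the re-scaling used in the study are not given here, so the "countries" are a hypothetical grid of factor levels, the responses are generated without noise from the additive estimate, and the helper functions are written from scratch for self-containment.

```python
from itertools import product


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(m[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (m[k][n] - tail) / m[k][k]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)


def design_row(hp, sd, ic, interactions=False):
    """[1, XHP, XSD, XIC], plus the three pairwise products if requested."""
    row = [1.0, hp, sd, ic]
    if interactions:
        row += [hp * sd, hp * ic, sd * ic]
    return row


# Hypothetical grid: hemisphere indicator x distancing score x re-scaled income.
countries = list(product([0.0, 1.0], [0.0, 1.0, 2.0], [0.0, 0.5, 1.0]))
# Noise-free synthetic responses built from the additive estimate quoted in the text.
y = [0.20 - 0.18 * hp - 0.06 * sd - 0.09 * ic for hp, sd, ic in countries]

additive = fit_ols([design_row(*c) for c in countries], y)
with_interactions = fit_ols(
    [design_row(*c, interactions=True) for c in countries], y)
print([round(v, 3) for v in additive])
print([round(v, 3) for v in with_interactions])
```

Because the synthetic responses are purely additive, the fitted interaction coefficients vanish here; with real data, non-zero interaction coefficients are what signal dependence between factors.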
Lastly, to assess the variability of the seasonal effect, we show in Figure 4 the box-plots of the seasonal effects for the Southern and Northern hemispheres. This emphasizes the difference between the distributions of the effects in the two hemispheres, which corroborates the hypothesis of seasonality.
Figure 4. The boxplot for the seasonal effects of countries in the Northern hemisphere is shown on the left (red boxplot). The boxplot for the seasonal effects of countries in the Southern hemisphere is shown on the right (light blue boxplot).
3. Discussion
In this study we establish that seasonality strongly affects COVID-19 transmission and that its seasonal period follows the autumn/winter pattern typical of other viruses with respiratory transmission. We also measured the social distancing effect, distinguishing between the seasonal effect, the social distancing effect and the income effect.
Our results partially contradict the results obtained by Flaxman et al.12 and Islam et al.13 In the first study, the decrease in daily cases in 11 European countries in late April and early May is attributed exclusively to social distancing interventions without considering seasonality or other possible confounding factors. In the second, worldwide data are analyzed, but again the entire effect is attributed to social distancing measures without considering other factors such as seasonality.
Prediction of second waves and consequences for decision-making
The seasonal force of transmission drives the second waves in seasonal diseases. Once the seasonal effect is established for COVID-19 and its seasonal periods are known, we can predict the appearance of future epidemics. In particular, a second wave is expected to begin around September or October in the Northern hemisphere. In contrast, a significant reduction in the transmission rate is expected to begin in countries of the Southern hemisphere in the same period.
We estimated that the seasonal periods occur from April to September (high season in the south) and from October to March (high season in the north). These are periods when the transmission rate is highest in each hemisphere. They are highly related to the epidemic periods, but the first should not be confused with the second one. The seasonal moments of reversal are good estimates for the beginning of the increase in the number of cases in the countries of the respective hemisphere. Therefore, we expect a general increase in the number of COVID-19 cases in most countries in the Northern hemisphere in October and a general decrease in the number of cases in most countries in the Southern hemisphere to begin at the same time.
Nevertheless, recall that epidemic periods also depend on other variables, such as the percentage of the population that is susceptible, and that each city has its own seasonal transmission rates βmax and βmin. Thus, places where major epidemics have occurred will be less impacted by this change in the transmission rate. Some possible examples are Sweden, Belgium and some cities in the United States, Spain and Italy that have already had major epidemics and where the increase in the transmission rate will have less effect on the size of the epidemic because of the smaller proportion of susceptible population. On the other hand, some countries with very large proportions of susceptible population will be more affected if they do not control their epidemics. Some examples are most countries in Europe and Asia.
The first wave in most of Europe and the rest of the Northern hemisphere lasted mainly just two months or less, from mid-February to mid-April. The social distancing measures had to be adopted for around two months before the seasonal period ended. In May, when the majority of European countries began to make social distancing measures more flexible, the transmission rate was lower and the epidemics were controlled with less stringent measures. Cities with R0min < 1 did not see the continuation of the epidemic. Some other locations, like some cities in the United States, certainly have R0min > 1 and therefore, the epidemic continued to increase. Nonetheless, it is important to note that they have not been affected by the seasonal effect.
In the Southern hemisphere, the social distancing measures had an effect similar to that in the Northern hemisphere, producing a stabilization in the number of cases in the first two weeks of April. But the seasonal effect produced a significant increase in the transmission rate, which went from βmin to βmax, and despite the social distancing measures the number of cases increased consistently from April 15. Moreover, more restrictive measures such as lockdown, though effective, could not be maintained for the entire seasonal period, which extends over 6 months. An example is Argentina, which kept its epidemic controlled by lockdown measures, like most of Europe, but when the measures became more flexible (either officially or due to population fatigue) the number of cases increased, unlike what happened in Europe, because the transmission rate was βmax instead of βmin. From October 2020 to March 2021 the transmission rate in the Southern hemisphere will decrease from βmax to βmin. In places where R0min < 1, the epidemic will be controlled, and in places with smaller proportions of susceptible populations, such as most of South America and South Africa, we also expect the epidemic to be controlled. Nevertheless, some countries with large proportions of susceptible populations, like New Zealand and Australia, may experience major epidemics in cities where R0min > 1 if no control measures are adopted.
With the seasonality effect taken into account, we predict that many (but not all) countries in the Northern hemisphere will have second waves, which will begin more generally during September and will increase in October and November. In fact, at the end of August, when this article was finalized, a trend of increasing numbers of cases could already be seen in some countries in the Northern hemisphere. Restrictive social distancing measures such as lockdown may not be as effective as they were in the first wave, because the high season will last its entire six-month span. Other control measures, such as vaccination of a part of the population, must be adopted. Until vaccines are available, other control measures such as active contact tracing will have to be adopted; otherwise effective social distancing would have to last until either a vaccine is available or mid-April. The other option is mitigating measures, such as less restrictive social distancing, and waiting for herd immunity.
Detection of Seasonality in future pandemics
We briefly describe below the steps to detect the presence of seasonal effects in future pandemics:
The comparison between aggregated slopes in the Southern and Northern hemisphere curves can be used to detect seasonality at the beginning of a pandemic, even with data from a short period of time. Care must be taken with confounding factors in this preliminary analysis.
Monitoring abrupt changes in the slopes of the chosen parameter (either R0t or βt) close to the estimated seasonal moment of reversal (the expected beginning of the seasonal period) is a way to confirm this detection.
Measuring changes in the data for the chosen parameters before and after the seasonal moment of reversal provides an estimate of the seasonal effect.
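The steps above can be sketched as a short routine that measures the slope change around the assumed moment of reversal in each hemisphere and checks whether the two changes have opposite signs. This is an illustrative outline under stated assumptions: the function names, the window length and the synthetic series are hypothetical, and a sign comparison alone neither establishes statistical significance nor rules out confounding factors.

```python
def ols_slope(values):
    """Least-squares slope of `values` against the day index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


def seasonal_effect(series, reversal, window):
    """Slope after the reversal index minus slope before it (A - B)."""
    before = series[reversal - window:reversal]
    after = series[reversal:reversal + window]
    return ols_slope(after) - ols_slope(before)


def compare_hemispheres(north, south, reversal, window=10):
    """Return both effects and whether they point in opposite directions."""
    effect_north = seasonal_effect(north, reversal, window)
    effect_south = seasonal_effect(south, reversal, window)
    return effect_north, effect_south, effect_north * effect_south < 0


# Synthetic mean case rates: the north rises and then falls,
# the south is nearly flat and then rises.
north = [0.05 * d for d in range(10)] + [0.5 - 0.06 * d for d in range(10)]
south = [0.01 * d for d in range(10)] + [0.1 + 0.11 * d for d in range(10)]

effect_n, effect_s, opposite = compare_hemispheres(north, south, reversal=10)
print(round(effect_n, 4), round(effect_s, 4), opposite)
```

Effects of opposite sign are the pattern expected under seasonality, whereas factors acting similarly everywhere would tend to shift both slopes in the same direction.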
Our method is particularly useful for detecting seasonality during a pandemic, since a large amount of data from both hemispheres are available in this case. In minor epidemics where data from many countries are not available, data from a small number of countries can still be used to apply this methodology as long as data from countries in both hemispheres are available. In the event of an epidemic in a given country, data on similar diseases could be used to make the comparison and provide possible evidence of seasonal forces in transmission.
Note that we can measure the seasonal force of infection by obtaining the reproduction numbers R0min and R0max for the low and high seasonal periods. Since R0 (R0t) is a very popular measure, it is tempting to calculate it from pooled data for both hemispheres or, alternatively, to calculate it for each country separately and then take averages. In both cases we believe this method would be technically incorrect. To be properly calculated, it must be obtained from the mean of the R0 values of each city in the sample, not of each country. We did not perform this valuable analysis because of the limited time available, given the urgency of the pandemic. Therefore, this important methodology is left to future work.
Funding
No funding.
Competing interests
The author declares no competing interests.
Data and materials availability
The data were extracted from the Johns Hopkins University website https://coronavirus.jhu.edu/. More details and information are available in the supplementary material.
Acknowledgements
The author thanks Rodrigo M.C. Dias for his support with the data and figures and Fernanda Di Genio for fruitful discussions.
Footnotes
Minor corrections. Multiple linear regression included. Supplementary appendix included.