Abstract
Successive waves of infection by SARS-CoV-2 have left little doubt that this virus will transition to an endemic disease 1,2. Projections of the endemic seasonality of SARS-CoV-2 transmission are crucial to informed public health policy 3. Such projections are not only essential to well-timed interventions and the preparation of healthcare systems for synchronous surges with other respiratory viruses 4, but also to the elimination of seasonality as a confounder in the identification of surges that are occurring due to viral evolution, changes in host immunity, or other non-seasonal factors. However, the less than two-year duration of SARS-CoV-2 circulation, pandemic dynamics, and heterogeneous implementation of interventions have grievously complicated evaluations of its seasonality 5. Here we estimate the impending endemic seasonality of SARS-CoV-2 in global population centers via a novel phylogenetic ancestral and descendent states approach 6 that leverages long-term data on the incidence of circulating coronaviruses. Our results validate a major concern that endemic COVID-19 will typically surge coincident with other high-morbidity and -mortality respiratory virus infections such as influenza and RSV 7. In temperate locales in the Northern Hemisphere, we identify spatiotemporal surges of incidences that range from October through January in New York to January through March in Yamagata, Japan. This knowledge of likely spatiotemporal surges of COVID-19 is fundamental to optimal timing of public health interventions that anticipate the impending endemicity of this disease and mitigate SARS-CoV-2 transmission.
The current COVID-19 pandemic has resulted in over 5.5 million deaths worldwide 8. Sustained transmission is predicted to continue into the foreseeable future 3,9, rendering COVID-19 a global endemic disease 1. Changes in case numbers have been observed in different regions and at different times during the last year; it is unclear whether these changes have been consequences of public health regulations, behavior change, virulence evolution, or interactions of these factors under local environmental conditions 5. Analyses of data on pandemic spread have also associated higher rates of infection and mortality with low UV light, low humidity, and temperature 10,11. However, given the global variability in public health measures throughout the pandemic and the short duration since SARS-CoV-2 emergence, there is little applicable data upon which to directly determine the seasonality of the virus 12. This absence of annual SARS-CoV-2 infection data without interventions has hampered efforts to determine COVID-19 seasonality and resulted in contradictory estimates of seasonal trends 13–16. Therefore, an alternate approach to estimating the seasonality that will not be confounded by public health interventions or pandemic transmission dynamics is necessary for global preparation and pandemic policy decision-making.
Seasonality of SARS-CoV-2 can be inferred by analogy to the annual variation in transmission of other respiratory viruses, including multiple coronaviruses 16–20. However, such analogies lack rigorous quantification. As an alternative to analogy, data on human-infecting coronavirus relatives of SARS-CoV-2 can be analyzed via a well-established quantitative comparative phylogenetic approach. This approach utilizes available trait data from close evolutionary relatives, the rate at which a trait evolves, and information on evolutionary divergences to estimate ancestral and descendent states 21,22. Such evolutionary inference has already yielded crucial estimates of the durability of immunity against reinfection by SARS-CoV-2 6. Here we apply comparative phylogenetics to extensive long-term incidence data on other coronaviruses (HCoV-OC43, HCoV-NL63, HCoV-HKU1, HCoV-229E) across major population centers. This analysis provides an unconfounded means for estimation of the seasonal force of infection that is not dependent on isolation of interventions or identification of underlying mechanisms. Our projections of endemic SARS-CoV-2 seasonality are fundamental to the anticipation and optimization of public policy for high-risk periods and to the preparation of healthcare providers for localized surges.
Results
Our systematic review regarding seasonal patterns of endemic coronavirus incidence identified 14 studies that met the criteria of providing at least one year of data on at least three circulating human-infecting coronaviruses within a locale. These studies spanned three continents across the Northern Hemisphere (Table 1): North America (Datasets i & ii), Europe (Datasets iii–vii), and Asia (Datasets viii–xiv). In temperate regions, endemic coronaviruses typically exhibited pronounced seasonality (Supplementary Figs. S1–S4).
Table 1. Datasets on seasonal coronavirus incidence
For each location, we pruned the phylogeny of major coronavirus lineages from Townsend et al. 6 to include only the endemic human-infecting coronaviruses with sample data and SARS-CoV-2 (Table 1, Fig. 1A). To generate maximum-likelihood estimates of the spatiotemporal incidences of SARS-CoV-2, we conducted analyses of ancestral and descendent states on the relative monthly incidences for each coronavirus (Fig. 1B–E). All four endemic coronaviruses contributed to our projection of the relative monthly incidence of SARS-CoV-2 (Fig. 1F). However, the late-diverging HCoV-OC43 and HCoV-HKU1 provide more phylogenetic information than the early-diverging HCoV-NL63 and HCoV-229E. Application of this evolutionary analysis to Trøndelag, Norway provides projections that late fall and winter months will exhibit significantly higher levels of SARS-CoV-2 incidence than summer and early fall months (Fig. 1F).
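One way to see why closer relatives carry more weight is the standard best linear unbiased predictor under a Brownian model of trait evolution (the model named in Materials and Methods), written here for the proportion of cases in a single month. This is a sketch under stated assumptions, not a statement of the exact computation performed by the software. The symbols are introduced here for illustration: y is the vector of observed proportions for the endemic coronaviruses, C is the matrix of branch lengths they share with one another (root to most recent common ancestor), c is the vector of branch lengths each shares with SARS-CoV-2, 1 is a vector of ones, and mu-hat is the generalized least squares estimate of the root state.

```latex
\hat{y}_{\text{SARS-CoV-2}}
  = \hat{\mu} + \mathbf{c}^{\top}\mathbf{C}^{-1}\left(\mathbf{y} - \hat{\mu}\,\mathbf{1}\right),
\qquad
\hat{\mu} = \frac{\mathbf{1}^{\top}\mathbf{C}^{-1}\mathbf{y}}{\mathbf{1}^{\top}\mathbf{C}^{-1}\mathbf{1}}
```

Every observed virus enters through the root estimate, so all four contribute. The entries of c are larger for viruses that share more evolutionary history with SARS-CoV-2, so deviations of HCoV-OC43 and HCoV-HKU1 from the root estimate shift the prediction more than deviations of HCoV-NL63 and HCoV-229E.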
Fig. 1. Phylogenetic inference of relative monthly incidence of SARS-CoV-2 under endemic conditions. (A) Time tree based on the phylogenetic divergence of circulating human-infecting coronaviruses. Empirical relative monthly incidences of HCoV (B) -NL63, (C) -229E, (D) -HKU1, and (E) -OC43, and (F) ancestral- and descendent-states analytical estimates of relative monthly incidences of SARS-CoV-2 in Trøndelag, Norway.
This lower incidence in the summer and surrounding months is largely generalizable to much of the temperate Northern Hemisphere (Fig. 2). Specifically, significantly higher SARS-CoV-2 incidence is projected in late fall and winter months in New York City (Fig. 2A). A similar seasonality is projected for Tampere, Finland; Gothenburg and Stockholm in Sweden (Fig. 2B); as well as multiple locales in Asia, including Yamagata, Japan; Guangzhou, China; and South Korea (Fig. 2C). However, in each Northern Hemisphere continent, there are regional deviations from this seasonal pattern. In Denver, incidence is projected not to rise until the late winter, peaking in early spring (Fig. 2A). Incidence in Amsterdam is similarly projected to decline in late spring. In contrast to Denver, incidence in Amsterdam rises earlier, during the late fall (Fig. 2B). In Asia (Fig. 2C), incidence in Sarlahi, Nepal is projected to surge in early winter. In coastal, subtropical Hong Kong, the projected apex is in late fall, but monthly variation in incidence is muted across other seasons. Seasonality for tropical Nakhon Si Thammarat, Thailand is also projected to be muted relative to the temperate Northern Hemisphere locations, and the seasonality of incidence in the megalopolis of Beijing, China appears atypical with no distinct pattern. In all cases, these results were robust to the phylogenetic inference method, the underlying molecular dataset, and the use of a chronogram or molecular evolutionary tree (Supplementary Figs. S5–S8).
Fig. 2. (A) New York City and Denver, USA; (B) Amsterdam, Netherlands; Gothenburg and Stockholm, Sweden; Trøndelag, Norway; and Tampere, Finland; (C) Beijing, China; Sarlahi, Nepal; Guangzhou, China; Nakhon Si Thammarat, Thailand; Hong Kong, China; Yamagata, Japan; and South Korea (nationwide).
Discussion
Here we analyzed monthly incidence data of the currently circulating endemic coronaviruses HCoV-NL63, -229E, -HKU1, and -OC43 to quantify seasonality of incidence of these viruses in regions that span a broad range of predominantly temperate localities across North America, Europe, and Asia. We conducted ancestral- and descendent-states analyses, projecting the seasonality of SARS-CoV-2 as it becomes endemic. Across much of the temperate Northern Hemisphere, SARS-CoV-2 can be expected to transition to a seasonal pattern of incidence that is high in late fall and winter months relative to late spring and summer. Our expected incidences through time also reveal geographic heterogeneity. This heterogeneity often manifested as a syncopation of the general northern hemispheric trend—a delay in rise to peak incidence, or a prolonged duration of higher levels of incidence relative to other areas. These temporal transmission patterns of SARS-CoV-2 provide fundamental insights for the determination of local public health policies, enabling preparedness and consequent mitigation of seasonal rises in incidence.
Several previous studies have taken on the challenge of predicting seasonality of SARS-CoV-2 based on direct analysis of incidence across seasons during the initial pandemic spread 23–25. During a zoonotic pandemic, out-of-phase emergence, regional variations in public health intervention, and stochastic pulses of local transmission can obscure the signature of seasonality in surveillance data 12. Such concerns have made these analyses controversial 26,27. To avoid such concerns, our analyses are based on multi-year endemic coronavirus incidence data and are not subject to the biases introduced by pandemic emergence and large-scale public health interventions. Results from our analysis are broadly consistent with the seasonal incidence trends observed for common human-infecting respiratory viruses in the Northern Hemisphere 17.
Our results on the seasonality of SARS-CoV-2 impart expected incidence trends under endemic conditions. Through two alternative mechanisms, seasonality during its pandemic phase might be either greater or lesser than that expected during subsequent endemicity. On the one hand, the absence of previous exposure and a corresponding naive immune response that are associated with overall higher transmission in a pandemic have the potential to exacerbate peaks and troughs of transmission. In this context, seasonality can be further amplified by an overwhelmed and lagging public health response. As such, we could observe heightened seasonal differences in incidence relative to those seen during endemic spread, overlaid onto peaks and troughs caused by the out-of-phase emergence of pandemic disease 28. On the other hand, the mechanisms that are driving the seasonality of coronavirus infections might exert slight influences that are magnified by pathogen population dynamics year on year 29. This resonation to convergence could underlie the observed seasonality of endemic coronaviruses (Supplementary Figs. S1–S4). With a smaller forcing factor that is amplified by pathogen population dynamics, we would expect less seasonality for SARS-CoV-2 during pandemic spread than would be seen in its eventual endemic incidence. In this context, it is possible that not enough time has elapsed for endemic seasonality to be fully realized. Regardless of how the seasonal dynamics will manifest during this transition from its pandemic phase, our projections provide the expected endemic seasonality.
The seasonal coronavirus incidences in each location were collected in studies that monitored disease in distinct time spans and that may have been subject to a number of annually varying factors that can drive seasonal trends of respiratory infections. However, in many cases the incidences were obtained across multiple years of sampling. For example, the Stockholm, Sweden dataset 30 encompasses 2,093 samples spanning a full decade. Consequently, it is unlikely that the month-to-month average incidences of these long-term datasets are substantially affected by anomalous years. Our results project a seasonal rhythm of SARS-CoV-2 that is broadly similar to the trends observed among many major human-infecting respiratory viruses 31–33. This well-known seasonal trend toward greater respiratory incidence in the winter has been ascribed to a number of factors: temperature 17,34–36, humidity 37–40, solar ultraviolet radiation 41, and host behavior 42. This trend is typically considered to be muted in the tropics, and reversed in the Southern Hemisphere 33,43.
Expanded global surveillance of endemic seasonal coronavirus incidence—especially in the undersampled tropics and Southern Hemisphere—will enhance our understanding of coronavirus seasonality and facilitate preparedness. Denser sampling will enable more precise regional estimates. Sampling in the tropics would enable testing of the muted seasonality that appears there; sampling in the Southern Hemisphere would enable testing of a hypothesis of inverted seasonality compared to the Northern Hemisphere. This information would strengthen the foundation for forecasting not only endemic coronavirus seasonality, but also the seasonality of deadly emergent coronaviruses such as SARS-CoV-2.
Both public health interventions and evolutionary change impact whether the projected seasonality of SARS-CoV-2 will be observed. Transmission could be dampened by the acceleration of vaccination efforts around the world that, like other interventions, have the potential to disrupt erstwhile seasonality. Alternatively, the emergence of novel variants with elevated transmissibility, such as the Delta or Omicron variants 44–46, has the potential to thwart public health efforts and impact seasonal trends. Our results suggest that surges of novel COVID-19 variants will frequently coincide with other seasonal endemic respiratory viruses including influenza and respiratory syncytial virus 47,48, potentially overwhelming healthcare facilities. Our projections affirm the need for systematic, prescient public health interventions that are cognizant of seasonality.
Foreknowledge of seasonality will enable informed, advanced public health messaging regarding seasons of high concern that could help to overcome barriers of nonadherence. Even with widespread vaccination efforts, SARS-CoV-2 is poised to join HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1 as a circulating endemic coronavirus 49. For epidemiological inferences such as seasonality that require long-term datasets, evolutionary biology can provide the theoretical foundation to deliver swift, quantitative, and rigorous insight into how novel threats to human health may behave. Our approach provides guidance for myriad public health decisions until the pandemic phase of SARS-CoV-2 spread has passed and collection of long-term data on endemic COVID-19 incidence becomes feasible. Moreover, in future research it can be broadly applied to seasonal data from any group of viruses to forecast the endemic traits of any emergent threat.
Author contributions
JPT and AD conceived the project and designed the study; ADL and CN performed literature review with contributions from AD and AAN; ADL, CN, and AAN accessed, processed, and curated seasonality data; ADL performed formal analyses with guidance from AD, JPT, and HBH; AD, ADL, and JPT designed and implemented data visualizations; JPT and AD wrote the manuscript; ADL, PS, and AAN contributed components of the manuscript; and all authors reviewed the manuscript before submission. JPT and AD were responsible for the decision to submit the manuscript. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication. Data were verified by ADL.
Competing Interests
The authors declare that they have no competing interests.
Funding
National Science Foundation of the United States of America RAPID 2031204 (JPT and AD), NSF Expeditions CCF 1918784 (JPT and APG), and support from the University of North Carolina, Charlotte to AD.
Data and materials availability
All data, inferred phylogenetic trees, imputed monthly proportions, and code underlying this study are publicly available on Zenodo: DOI:10.5281/zenodo.5274735.
Supplementary Information
Supplementary Information is available for this paper.
Materials and Methods
Study Design
We performed a comparative evolutionary analysis on monthly verified cases of HCoV-NL63, HCoV-229E, HCoV-HKU1, and HCoV-OC43 infection within populations across the globe. We applied ancestral and descendent states analyses on reconstructions of the evolutionary history of human-infecting coronaviruses to estimate the expected annual changes in cases at different geographic locales. These analyses provide a global-scale projection of the likely endemic seasonality of SARS-CoV-2.
Data acquisition
Phylogenetic tree topologies. Phylogenetic relationships of SARS-CoV-2 and the endemic human-infecting coronaviruses were based on data from 58 Alphacoronavirus, 105 Betacoronavirus, 11 Deltacoronavirus, and three Gammacoronavirus as analyzed in Townsend et al. 6 (Fig. 1A). These estimates of the phylogenetic topology were consistent with previous hypotheses of evolutionary relationships among coronaviruses (62–66) and were congruent across multiple methods of inference with strong (100% bootstrap) support for all nodes. Tree topologies were inferred by multiple maximum-likelihood (ML) analyses of the concatenated DNA sequence alignment. Results were robust to alternative phylogenetic likelihood search algorithms, namely IQ-TREE v2.0.6 (67) and RAxML v7.2.8 (68); to branch-length differences arising from different approaches to divergence time estimation, namely IQ-TREE v2.0.6 (67), Relative Times (RelTime; 69) in MEGA X v10.1.9 (70), and TreeTime v0.7.6 (71); and to a potential history of recombination among or within genes, as assessed by phylogenetic analyses using an alignment of the putative non-recombining blocks (72). All trees from Townsend et al. 6 were pruned of SARS-CoV-1 and MERS-CoV branches because temporal trends of infection by these viruses reflect short-term outbreaks and not seasonal endemic circulation.
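The pruning step can be illustrated with a short Python sketch. It is not the code used in the study: the `Node` class, the `prune` and `tip_depths` helpers, and the toy tree with its branch lengths are all hypothetical and chosen only to show the idea. When a tip is removed, the internal node left with a single child is collapsed and the two branch lengths are added, so the root-to-tip distances of the remaining taxa are unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    """A tree node; tips have a name and no children."""
    name: Optional[str] = None
    length: float = 0.0  # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the tree restricted to the tips named in `keep`."""
    if not node.children:
        return Node(node.name, node.length) if node.name in keep else None
    kept = [c for c in (prune(ch, keep) for ch in node.children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Collapse a unary node: merge its branch with its only child's branch.
        only = kept[0]
        return Node(only.name, node.length + only.length, only.children)
    return Node(node.name, node.length, kept)


def tip_depths(node: Node, depth: float = 0.0) -> Dict[str, float]:
    """Root-to-tip distance for every tip below `node`."""
    depth += node.length
    if not node.children:
        return {node.name: depth}
    out: Dict[str, float] = {}
    for child in node.children:
        out.update(tip_depths(child, depth))
    return out


# Hypothetical toy tree (branch lengths are illustrative, not estimates).
toy_tree = Node(None, 0.0, [
    Node(None, 0.4, [Node("HCoV-NL63", 0.6), Node("HCoV-229E", 0.6)]),
    Node(None, 0.2, [
        Node(None, 0.4, [Node("HCoV-HKU1", 0.4), Node("HCoV-OC43", 0.4)]),
        Node(None, 0.3, [
            Node(None, 0.2, [Node("SARS-CoV-1", 0.3), Node("SARS-CoV-2", 0.3)]),
            Node("MERS-CoV", 0.5),
        ]),
    ]),
])

keep = {"HCoV-NL63", "HCoV-229E", "HCoV-HKU1", "HCoV-OC43", "SARS-CoV-2"}
pruned = prune(toy_tree, keep)
depths = tip_depths(pruned)
print(sorted(depths))
```

The same function could be called with a smaller `keep` set for a location where one of the endemic coronaviruses lacks sample data.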
Seasonal infection data. We conducted a literature search using the PubMed and Google Scholar databases, searching for terms related to coronavirus, seasonality, and the known seasonal endemic human-infecting coronaviruses (HCoV-NL63, HCoV-229E, HCoV-HKU1, and HCoV-OC43). Searches were conducted in English between October 2020 and August 2021, using the names of each coronavirus lineage as a key term in addition to all combinations of: coronavirus, seasonality, environmental, incidence, infection, prevalence, latitude, temperature, humidity, weather, global, cases, with no language restrictions imposed. Seasonal infection data were extracted from published, peer-reviewed research papers that reported monthly or finer seasonal case data for three or more coronaviruses, spanning at least one year.
Estimating the seasonality of SARS-CoV-2
To estimate the seasonality of infections by SARS-CoV-2, we first extracted the average numbers of cases per month testing positive for HCoV-NL63, -229E, -HKU1, and -OC43 for each location. We scaled these case counts by the annual total to yield proportions of the cases sampled in each month. We then performed a phylogenetically informed ancestral and descendent states analysis, executing Rphylopars v0.2.12 (18) on the monthly proportions of cases to estimate the proportion of yearly infection by SARS-CoV-2 in each month for each location. This approach takes known trait values (here, monthly proportions of cases for endemic coronaviruses) and applies a Brownian motion model of trait evolution and a phylogeny to estimate unobserved trait values for a taxon or taxa, providing best linear unbiased predictions that are mathematically equivalent to universal kriging (Gaussian process regression). Phylogenetic ancestral and descendent analyses were repeated across all topologies resulting from different inference approaches (molecular trees, relative phylogenetic chronograms, and non-recombinant alignment) to assess the impact of phylogenetic inference methods on our estimation of seasonality.
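The two steps described above, scaling counts to monthly proportions and predicting the unobserved values for SARS-CoV-2, can be sketched in Python with NumPy. This is a minimal illustration and not the Rphylopars implementation. The function names `monthly_proportions` and `bm_blup`, the case counts, and the shared-branch-length values in `C_obs` and `c_target` are hypothetical. The sketch also assumes that each month can be predicted separately with the textbook Brownian-motion predictor, which may differ from how the software handles the twelve months jointly.

```python
import numpy as np


def monthly_proportions(counts):
    """Scale monthly case counts by the annual total (last axis = months)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=-1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("each series needs a positive annual total")
    return counts / totals


def bm_blup(C_obs, c_target, y_obs):
    """Predict trait values for one unobserved tip under Brownian motion.

    C_obs    : (n, n) shared branch lengths among observed tips
    c_target : (n,)   branch lengths each observed tip shares with the target
    y_obs    : (n, k) observed trait values (here k = 12 months)
    """
    ones = np.ones(C_obs.shape[0])
    # Generalized least squares estimate of the root state, one per month.
    mu = (ones @ np.linalg.solve(C_obs, y_obs)) / (ones @ np.linalg.solve(C_obs, ones))
    # Shift the root estimate by the phylogenetically weighted residuals.
    return mu + c_target @ np.linalg.solve(C_obs, y_obs - mu)


# Hypothetical monthly counts (Jan..Dec) for NL63, 229E, HKU1, OC43.
counts = np.array([
    [30, 25, 15, 8, 4, 2, 1, 2, 5, 10, 18, 28],
    [20, 22, 18, 10, 5, 3, 2, 2, 4, 8, 12, 18],
    [35, 28, 14, 6, 3, 1, 1, 1, 3, 9, 20, 33],
    [40, 30, 16, 7, 3, 2, 1, 2, 4, 12, 25, 38],
])
props = monthly_proportions(counts)

# Hypothetical shared branch lengths on a tree of total height 1.0.
C_obs = np.array([
    [1.0, 0.4, 0.0, 0.0],
    [0.4, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])
c_target = np.array([0.0, 0.0, 0.2, 0.2])  # history shared with SARS-CoV-2

pred = bm_blup(C_obs, c_target, props)
print(np.round(pred, 3))
```

Because the root estimate is a weighted average of rows that each sum to one, and the residuals in each row sum to zero, the predicted proportions also sum to one across the twelve months. The predictor does not by itself constrain individual months to be non-negative.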
Acknowledgments
We thank Dan Warren for helpful discussion at the inception of this work.
References cited
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.