ABSTRACT
Seasonal peaks in infectious disease incidence put pressure on health services. Early warning of the timing and magnitude of peak activity during seasonal epidemics can therefore help public health practitioners take appropriate action. Whilst many infectious diseases have predictable seasonality, newly emerging diseases and the impact of public health interventions can result in unprecedented seasonal activity. We propose a machine learning process for generating short-term forecasts in which models are selected on their ability to correctly forecast peaks in activity. In contrast to traditional modelling, this process can be useful during such atypical seasonal activity. We have validated our forecasts against both typical and atypical seasonal activity, using respiratory syncytial virus (RSV) activity during 2019-2021 as an example. During the winter of 2020/21 the usual winter peak in RSV activity in England did not occur but was ‘deferred’ until the spring of 2021.
We compare a range of machine learning regression models, with alternative models including different independent variables, e.g. with or without seasonality or trend variables. We show that the best-fitting model, which minimises daily forecast errors, is not the best model for forecasting peaks when the selection criterion is based on peak timing and magnitude. Furthermore, we show that the best-fitting models for typical seasons contain different variables to those for atypical, ‘black swan’ seasons. Specifically, including seasonality in models improves performance during typical seasons but worsens it during atypical seasons. In conclusion, we have found that including seasonality in forecast models can result in overfitting when the models are required to be used out-of-season or during atypical seasons.
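The difference between selecting a model on daily forecast error and selecting it on peak timing and magnitude can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the toy activity series, the model names `model_a` and `model_b`, and the hypothetical `peak_error` score (relative magnitude error plus a weighted timing error in days).

```python
from statistics import mean


def daily_mae(observed, forecast):
    """Mean absolute error across every day in the forecast window."""
    return mean(abs(o - f) for o, f in zip(observed, forecast))


def peak_error(observed, forecast, timing_weight=1.0):
    """Hypothetical peak score (illustrative, not the study's criterion):
    relative error in peak magnitude plus weighted error in peak day."""
    obs_day = max(range(len(observed)), key=observed.__getitem__)
    fc_day = max(range(len(forecast)), key=forecast.__getitem__)
    magnitude_err = abs(observed[obs_day] - forecast[fc_day]) / observed[obs_day]
    timing_err = abs(obs_day - fc_day)
    return magnitude_err + timing_weight * timing_err


def select_model(observed, forecasts, score):
    """Return the name of the forecast with the lowest score."""
    return min(forecasts, key=lambda name: score(observed, forecasts[name]))


# Toy data (invented): one observed epidemic curve and two candidate forecasts.
observed = [10, 20, 40, 80, 40, 20, 10]
forecasts = {
    "model_a": [10, 20, 40, 50, 40, 20, 10],  # close day to day, underestimates the peak
    "model_b": [20, 30, 50, 80, 50, 30, 20],  # biased day to day, matches the peak
}

best_daily = select_model(observed, forecasts, daily_mae)
best_peak = select_model(observed, forecasts, peak_error)
print(best_daily, best_peak)
```

In this toy case the two criteria pick different models: `model_a` has the lower daily error, while `model_b` reproduces the peak exactly. The weighting between timing and magnitude is a design choice that would need to reflect how the forecasts are used in practice.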
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
RAM and AJE receive support from the National Institute for Health Research (NIHR) Health Protection Research Unit (HPRU) in Emergency Preparedness and Response. AJE receives support from the NIHR HPRU in Gastrointestinal Infections. The views expressed are those of the author(s) and not necessarily those of the NIHR, UK Health Security Agency or the Department of Health and Social Care.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Requests to access relevant anonymised health data included in this study should be submitted to the UKHSA Office for Data Release.