Abstract
Background The COVID-19 pandemic has driven demand for forecasts to guide policy and planning. Previous research has suggested that combining forecasts from multiple models into a single “ensemble” forecast can increase the robustness of forecasts. Here we evaluate the real-time application of an open, collaborative ensemble to forecast deaths attributable to COVID-19 in the U.S.
Methods Beginning on April 13, 2020, we collected and combined one- to four-week-ahead forecasts of cumulative deaths for U.S. jurisdictions in standardized, probabilistic formats to generate real-time, publicly available ensemble forecasts. We evaluated the point prediction accuracy and calibration of these forecasts compared to reported deaths.
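As an illustration of what combining probabilistic forecasts can look like, the sketch below averages each quantile across models. The abstract does not state the combination rule, so the unweighted mean used here is an assumption, and the function name `ensemble_quantiles`, the model names, and the numbers in the usage note are hypothetical.

```python
from statistics import mean
from typing import Dict

# A forecast maps quantile levels (e.g. 0.025, 0.5, 0.975) to predicted
# cumulative deaths for one jurisdiction and one horizon.
QuantileForecast = Dict[float, float]


def ensemble_quantiles(model_forecasts: Dict[str, QuantileForecast]) -> QuantileForecast:
    """Combine forecasts by taking the unweighted mean of each quantile.

    Assumption: equal weights for every model. Only quantile levels
    reported by all models are kept, so every average uses every model.
    """
    if not model_forecasts:
        raise ValueError("need at least one model forecast")
    shared_levels = set.intersection(*(set(f) for f in model_forecasts.values()))
    return {
        level: mean(f[level] for f in model_forecasts.values())
        for level in sorted(shared_levels)
    }
```

For example, two hypothetical models reporting `{0.025: 100.0, 0.5: 120.0, 0.975: 150.0}` and `{0.025: 110.0, 0.5: 130.0, 0.975: 170.0}` would yield an ensemble median of 125.0 with a 95% interval of 105.0 to 160.0.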
Results Analysis of 2,512 ensemble forecasts made April 27 to July 20, with outcomes observed in the weeks ending May 23 through July 25, 2020, revealed precise short-term forecasts, with accuracy deteriorating at longer prediction horizons of up to four weeks. At all prediction horizons, the prediction intervals were well calibrated, with 92-96% of observations falling within the rounded 95% prediction intervals.
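The two kinds of evaluation named above can be sketched as follows: interval calibration is measured as the share of observations that fall inside their prediction intervals, and point accuracy as mean absolute error. This is a minimal sketch, not the study's evaluation code; the class `ScoredForecast`, both function names, and the choice of mean absolute error as the accuracy metric are assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScoredForecast:
    """One forecast paired with its eventual outcome (hypothetical layout)."""
    point: float     # point prediction of cumulative deaths
    lower: float     # lower bound of the 95% prediction interval
    upper: float     # upper bound of the 95% prediction interval
    observed: float  # reported cumulative deaths


def interval_coverage(forecasts: Sequence[ScoredForecast]) -> float:
    """Fraction of observations inside their interval (bounds inclusive)."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    covered = sum(f.lower <= f.observed <= f.upper for f in forecasts)
    return covered / len(forecasts)


def mean_absolute_error(forecasts: Sequence[ScoredForecast]) -> float:
    """Average absolute gap between point prediction and observation."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum(abs(f.point - f.observed) for f in forecasts) / len(forecasts)
```

A well-calibrated 95% interval should give a coverage value close to 0.95. Computing both functions separately for each prediction horizon shows how coverage and error change as the horizon lengthens.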
Conclusions This analysis demonstrates that real-time, publicly available ensemble forecasts issued in April-July 2020 provided robust short-term predictions of reported COVID-19 deaths in the United States. With the ongoing need for forecasts of impacts and resource needs for the COVID-19 response, the results underscore the importance of combining multiple probabilistic models and assessing forecast skill at different prediction horizons. Careful development, assessment, and communication of ensemble forecasts can provide reliable insight to public health decision makers.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
ELR, NGR, EC and others were supported by the US Centers for Disease Control and Prevention (U01IP001122). NGR, NW, and others were supported by the National Institute of General Medical Sciences (R35GM119582). JB was supported by the Helmholtz Foundation via the SIMCARD Information & Data Science Pilot Project. The content is solely the responsibility of the authors and does not necessarily represent the official views of CDC or NIGMS. BAP and others were partially supported by the National Science Foundation (Expeditions CCF-1918770, CAREER IIS-1750407, RAPID IIS-2027862, Medium IIS-1955883, NRT DGE-1545362), funds from Georgia Tech Research Institute (GTRI) and funds/computing resources from Georgia Tech. The content is solely the responsibility of the authors and does not necessarily represent the views of the funding agencies, Georgia Tech, GTRI, University of Iowa, UIUC or IQVIA. SC and others, as well as computing resources, were funded by the University of Michigan. The content is solely the responsibility of the authors and does not necessarily represent the views of the University of Michigan. PK and others were supported in part by the William W. George and the Virginia C. and Joseph C. Mello endowments at Georgia Tech, the NSF grant MRI 1828187, and research cyberinfrastructure resources and services provided by the Partnership for an Advanced Computing Environment (PACE) at Georgia Tech. ECL and others were supported by the State of California, the U.S. Department of Health and Human Services and the U.S. Department of Homeland Security. This work was additionally supported by the Office of the Dean at the Johns Hopkins Bloomberg School of Public Health, the Johns Hopkins Health System, and with computing service credits from Amazon Web Services. These thoughts and opinions are our own and do not represent the views of the U.S. federal government. YW and others were supported by NIH grant GM124104.
MBN and others were supported by the Defense Threat Reduction Agency (Award Number: HDTRA1-19-D-0007) and by the National Science Foundation (Award Number: 2031536). KS and others were supported by the Wellcome Trust [210758]. JC and others were supported by National Science Foundation awards 2035360 and 2035361. GE and others were supported by NSF RAPID award 2027718. RCR and others were supported by the Bill & Melinda Gates Foundation and National Science Foundation award 2031096. GEG and others were supported by funding provided by the US Army Corps of Engineers Geospatial Task Force.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
UMass-Amherst IRB.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data and code referred to in the manuscript are publicly available.
https://github.com/reichlab/covid19-forecast-hub/