Abstract
The novel coronavirus (COVID-19) pandemic, first identified in Wuhan, China, in December 2019, has profoundly affected daily life, society, healthcare systems, and global health policies. More than half a billion human infections and more than 6 million deaths globally have been attributed to COVID-19. Although treatments and vaccines to protect against COVID-19 are now available, people continue to be hospitalized and to die from COVID-19 infections. Real-time surveillance of population-level infections, hospitalizations, and deaths has helped public health officials better allocate healthcare resources and deploy mitigation strategies. However, producing reliable, real-time, short-term disease activity forecasts (one or two weeks into the future) remains a practical challenge. The recent emergence of robust time-series forecasting methodologies based on deep learning has led to clear improvements in multiple research fields. We propose a recurrent neural network model named Fine-Grained Infection Forecast Network (FIGI-Net), which uses a stacked bidirectional LSTM structure designed to leverage fine-grained county-level data to produce daily forecasts of COVID-19 infection trends up to two weeks in advance. We show that FIGI-Net improves on existing COVID-19 forecasting approaches and delivers accurate county-level COVID-19 disease estimates. Specifically, FIGI-Net is capable of anticipating upcoming sudden changes in disease trends, such as the onset of a new outbreak or the peak of an ongoing outbreak, a skill that multiple existing state-of-the-art models fail to achieve. This improved performance is observed across locations and periods. Our enhanced forecasting methodologies may help protect human populations against future disease outbreaks.
Competing Interest Statement
Mauricio Santillana has received institutional research funds from the Johnson and Johnson Foundation, Janssen Global Public Health, and Pfizer. The other authors declare no competing financial or non-financial interests.
Funding Statement
K.L. and M.S. were supported by the National Institutes of Health (NIH) under award number R35GM133725. M.S. was also partially supported by the NIH under award number R01GM130668. M.S. has been funded (in part) by contract 200-2016-91779 and cooperative agreement CDC-RFAFT-23-0069 with the Centers for Disease Control and Prevention (CDC). The findings, conclusions, and views expressed are those of the author(s) and do not necessarily represent the official position of the CDC.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
The Acknowledgement and Conflict of Interests sections have been updated. Fig. 7 in the previous version is now Fig. 2 in the revised version.
Data Availability Statement
The data used in this study are publicly available and consist of daily cumulative COVID-19 infection and death counts reported for U.S. counties. The dataset was obtained from the Johns Hopkins Center for Systems Science and Engineering (CSSE) Coronavirus Resource Center and spans January 21st, 2020, to April 16th, 2022 [15]. It can be accessed directly from the Johns Hopkins CSSE COVID-19 data repository (https://github.com/CSSEGISandData/COVID-19). Researchers interested in using the data for further analysis can refer to the original source for detailed documentation on data collection methods and definitions. For additional information or inquiries about the dataset, please visit the repository or contact the Johns Hopkins CSSE Coronavirus Resource Center.
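Because the counts in this dataset are cumulative, anyone reusing it for daily forecasting would typically first convert each county series into daily increments. The following is a minimal sketch of that conversion in Python, not the authors' preprocessing: the function name `daily_new_cases`, the example values, and the choice to clip negative increments (which arise when a cumulative count is revised downward) to zero are all assumptions made for illustration.

```python
from typing import List, Sequence


def daily_new_cases(cumulative: Sequence[int]) -> List[int]:
    """Convert a cumulative case series into daily new-case counts.

    Assumptions (not taken from the paper):
    - the first value is treated as an increment from zero;
    - downward revisions of the cumulative series, which would give
      negative increments, are clipped to zero.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(max(total - previous, 0))
        previous = total
    return daily


# Hypothetical cumulative counts for one county over six days.
example_cumulative = [0, 3, 3, 10, 9, 15]
example_daily = daily_new_cases(example_cumulative)
print(example_daily)
```

Clipping is only one way to handle reporting corrections. Redistributing the correction over earlier days, or smoothing the series, are alternatives that may suit a given analysis better.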