Abstract
The COVID-19 pandemic has generated an enormous amount of data, providing a unique opportunity for modeling and analysis. In this paper, we present a data-informed approach for building stochastic compartmental models that is grounded in the Markovian processes underlying these models. Our initial data analyses reveal that the SIRD model – susceptible (S), infected (I), recovered (R), and death (D) – is not consistent with the data. In particular, the transition times observed in the dataset do not obey exponential distributions, implying that there exist unmodeled (hidden) states. We use the available epidemiological data to inform the location of these hidden states, allowing us to develop an augmented compartmental model that includes states for hospitalization (H) and end of infectious viral shedding (V). Using the proposed model, we characterize delay distributions analytically and match model parameters to empirical quantities in the data to obtain a good model fit. Insights from an epidemiological perspective are presented, as well as their implications for mitigation and control strategies.
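The link between non-exponential transition times and hidden states can be illustrated with a minimal sketch. In a Markov compartmental model, the time spent before a single transition is exponentially distributed, so its coefficient of variation (CV) is 1. If the path instead passes through an intermediate state, the total delay is a sum of exponential stages and its CV drops below 1. The rates below (0.1 and 0.2 per day), the function names, and the use of the CV as a diagnostic are all assumptions chosen for illustration; they are not taken from the paper or its data.

```python
import math
import random
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by sample mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def direct_transition_times(rate, n, rng):
    """Delays for a single Markov transition (e.g. I -> R): exponential."""
    return [rng.expovariate(rate) for _ in range(n)]


def two_stage_transition_times(rate_1, rate_2, n, rng):
    """Delays for I -> (hidden state) -> R: sum of two independent exponential stages."""
    return [rng.expovariate(rate_1) + rng.expovariate(rate_2) for _ in range(n)]


def two_stage_cv(rate_1, rate_2):
    """Exact CV of the two-stage delay (always below 1 for finite rates)."""
    mean = 1.0 / rate_1 + 1.0 / rate_2
    variance = 1.0 / rate_1 ** 2 + 1.0 / rate_2 ** 2
    return math.sqrt(variance) / mean


# Hypothetical rates (per day); both paths have a mean delay of 10 days.
rng = random.Random(0)
n = 50_000
cv_direct = coefficient_of_variation(direct_transition_times(0.1, n, rng))
cv_staged = coefficient_of_variation(two_stage_transition_times(0.2, 0.2, n, rng))

print(f"CV, direct transition:    {cv_direct:.3f} (exponential theory: 1)")
print(f"CV, via one hidden state: {cv_staged:.3f} (theory: {two_stage_cv(0.2, 0.2):.3f})")
```

Under these assumptions, an empirical CV well below 1 for an observed delay is one simple sign that a single exponential transition is inadequate and that an intermediate state may be missing.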
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Research supported in part by the C3.ai Digital Transformation Institute sponsored by C3.ai Inc. and the Microsoft Corporation, in part by the Jump ARCHES endowment through the Health Care Engineering Systems Center of the University of Illinois at Urbana-Champaign, and in part by the National Science Foundation grant NSF-ECCS 20-32321.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Only the publicly available data were used in this study. Therefore, an IRB review was not required.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Added ORCID for coauthor Prashant Mehta
1 Although this is a useful approximation, the transitions I → R and I → D may also be affected by the population, e.g., if the health care system is strained as a result of a large number of infected agents.
2 Anonymized line-list data consist of information on start and end date of certain epidemiological stages at an individual level.
Data Availability
The data referred to in this study were collected from publicly available resources. Centers for Disease Control and Prevention, COVID-19 Case Surveillance Public Use Data, https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf/data, 2020. M. Kraemer, Epidemiological data from the nCoV-2019 outbreak: Early descriptions from publicly available data, https://virological.org/t/epidemiological-data-from-the-ncov-2019-outbreak-early-descriptions-from-publicly-available-data/337, 2020. midas network, COVID-19, https://github.com/midas-network/COVID-19/blob/master/data/cases/global/line_listings_imperial_college/international_cases_2020_08_02.csv, 2020. Y. Xu, COVID19 inpatient cases data, https://doi.org/10.6084/m9.figshare.12195735.v3, 2020. ThisIsIsaac, Data-Science-for-COVID-19, https://github.com/ThisIsIsaac/Data-Science-for-COVID-19/blob/master/Covid19_Dataset/patients.csv, 2020. mrc ide, COVID19_CFR_submission, https://github.com/mrc-ide/COVID19_CFR_submission/blob/master/data/deaths_integrated_with_linelist_17feb.csv, 2020. Public line list and summaries of the COVID-19 outbreak in South Korea, https://github.com/parksw3/COVID19-Korea/blob/master/COVID19-Korea-2020-04-06.xlsx, 2020. Novel Coronavirus 2019 time series data on cases, https://github.com/datasets/covid-19, 2020. I. Dorigatti, L. Okell, A. Cori, N. Imai, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cucunuba, G. Cuomo-Dannenburg, R. FitzJohn et al., Report 4: Severity of 2019-novel coronavirus (nCoV), Imperial College London, London, 2020.