Abstract
COVID-19 and future novel pandemics challenge the modeling community, specifically in establishing suitable relations between lifecycles, scales, and existing methods. Herein we demonstrate transitions between models across space and time (individual to community, county to city), along with models that trace the course of infection from exposure, to symptom manifestation, to community transmission. To that end, we leverage publicly available data to compose a chain of Graphical Models (GMs) for predicting infection rates across communities, space, and time. We anchor our GMs against the more expensive yet state-of-the-art Agent-Based Models (ABMs). Insight obtained from designing novel GMs calibrated to ABMs furnishes reduced yet reliable surrogates for the end-to-end public health challenge of community contact tracing and transmission. Further, this research combines information integration and informatics, leading to an advance in the science of GMs. Insight into the data lifecycle gained through properly coarse-grained modeling will broaden the toolkit available to public health specialists and, we hope, empower governments and health agencies, here and abroad, to address the profound challenges in disease and vaccination campaigns posed by COVID-19 and future pandemics.
In this proof-of-principle study, which focuses on GM methodology development, we show, first, how a static GM of the Ising model type emerges from a dynamic GM of the Independent Cascade type, introduced and studied in computer and network sciences mainly in the context of the spread of social influence. In the static GM, nodes represent communities, or census tracts within a given city; pair-wise interactions between nodes are related to traffic and communications between them, and each node carries a local infection bias. Second, we formulate the problem of inference in epidemiology as an inference problem in the Ising model setting. Specifically, we pose the challenge of computing the Conditional A-posteriori Level of Infection (CALI), which provides a quantitative answer to the question: what is the probability that a given node in the GM (a given census tract within the city) becomes infected as a result of injection of the infection at another node, e.g. due to the arrival of a super-spreader agent or the occurrence of a super-spreader event in the area? Answering this question exactly is not feasible for a model of any realistic size (larger than 30-50 nodes). We therefore adopt and develop approximate inference techniques, of the variational and variable elimination types, developed in the GM literature. To demonstrate the utility of the methodology, which appears to be new for public health applications, we build a 123-node model of Seattle, as well as its 10-node and 20-node coarse-grained variants, and then conduct proof-of-principle experimental studies. The experiments on the coarse-grained models helped us validate the approximate inference by comparing it with exact inference. The experiments also led to the discovery of interesting and most probably universal phenomena.
In particular, we observe (a) a strong sensitivity of CALI to the location of the initial infection, and (b) strong alignment of the resulting infection probabilities (values of CALI) observed at different nodes in the regime of moderate interaction between the nodes. We then speculate how these and other observations drawn from the synthetic experiments can be extended to a more realistic, data-driven setting of actual operational importance. We conclude the manuscript with an extensive discussion of how the methodology should be developed further, both in devising realistic GMs from observational data (and enhancing them with microscopic ABM modeling and simulations) and in using the GM inference methodology for more complex problems of pandemic mitigation and control.
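To make the CALI question concrete, the following is a minimal sketch of exact inference by brute-force enumeration on a toy Ising-type model. It is an illustration under stated assumptions, not the manuscript's implementation: the sign convention (state +1 means infected, probability proportional to exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j)), the reading of CALI as the conditional marginal P(x_j = +1 | x_source = +1), and the names `cali_exact` and `chain_couplings` are all hypothetical choices made here. The cost grows as 2^n, which is why exact answers are only practical for small models such as the coarse-grained variants.

```python
import itertools
import math


def cali_exact(J, h, source):
    """Return [P(x_j = +1 | x_source = +1) for each node j] by enumeration.

    Assumed model (illustrative convention, not taken from the manuscript):
    P(x) is proportional to exp(sum_i h[i]*x[i] + sum_{i<j} J[i][j]*x[i]*x[j]),
    with x[i] in {-1, +1} and +1 meaning "infected".
    """
    n = len(h)
    infected_weight = [0.0] * n  # unnormalized mass of states with x_j = +1
    total = 0.0                  # unnormalized mass of all states with x_source = +1
    for state in itertools.product((-1, 1), repeat=n):
        if state[source] != 1:
            continue  # condition on the source node being infected
        log_w = sum(h[i] * state[i] for i in range(n))
        log_w += sum(J[i][j] * state[i] * state[j]
                     for i in range(n) for j in range(i + 1, n))
        w = math.exp(log_w)
        total += w
        for j in range(n):
            if state[j] == 1:
                infected_weight[j] += w
    return [w / total for w in infected_weight]


def chain_couplings(n, coupling):
    """Symmetric coupling matrix for a hypothetical n-node chain of communities."""
    J = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        J[i][i + 1] = J[i + 1][i] = coupling
    return J


# Toy example: 4 communities in a chain, no local bias, infection injected at node 0.
J = chain_couplings(4, 0.5)
h = [0.0, 0.0, 0.0, 0.0]
for node, p in enumerate(cali_exact(J, h, source=0)):
    print(f"node {node}: conditional infection probability = {p:.3f}")
```

In this toy chain the conditional probability decays with distance from the injection node, and moving `source` changes the whole profile, which is the kind of sensitivity to the location of the initial infection described above. For models beyond a few tens of nodes the enumeration would have to be replaced by approximate inference.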
- Machine Learning
- Statistics
- Applied Mathematics
- Graphical Models
- Agent-Based Models
- Data Lifecycle
- Model Reduction
- Inference
- Learning
- Public Health
- Pandemic
- COVID-19
- Mitigation
- Computational Geometry
- Combinatorics
- Convexity
- Submodularity
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work is partially supported by the NSF RAPID award "Infer and Control Global Spread of Corona-Virus with Graphical Models".
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
N/A
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data used in this manuscript are from FluTE, a publicly available stochastic influenza epidemic simulation model: Dennis L Chao, M Elizabeth Halloran, Valerie J Obenchain, and Ira M Longini Jr. FluTE, a publicly available stochastic influenza epidemic simulation model. PLoS Comput Biol, 6(1):e1000656, 2010.