Abstract
COVID-19 poses a dramatic challenge to health, community life, and the economy of communities across the world. While the properties of the virus are similar from place to place, the impact has been dramatically different from place to place, due to such factors as population density, mobility, age distribution, etc. Thus, optimum testing and social distancing strategies may also be different from place to place. The Epidemiology Workbench provides access to an agent-based model in which a community’s demographic, geographic, and public health information together with a social distancing and testing strategy may be input, and a range of possible outcomes computed, to inform local authorities on coping strategies. The model is adaptable to other infectious diseases, and to other strains of coronavirus. The tool is illustrated by scenarios for the cities of Urbana and Champaign, Illinois, the home of the University of Illinois at Urbana-Champaign. Our calculations suggest that massive testing is the most effective strategy to combat the likely increase in local cases due to mass ingress of a student population carrying a higher viral load than that currently present in the community.
1 Introduction
Mathematical models of infectious disease epidemiological dynamics can be provide valuable assistance to public health officials and health care providers in assessing the likely seriousness of an epidemic or its potential to grow into a pandemic, and later in allocating resources to counter the spread of the disease [105]. Simulations that trace either prior or projected time courses make use of various mathematical and computational techniques, including (non-exhaustively) differential equation models, statistical regression and curve fitting, network propagation dynamics, and direct representation of human actors and their actions by means of agent-based models. The SIR model in particular along with its various adaptations [100] has remained successful, at least in part, due to its universality as evidenced by its empirical adequacy across multiple epidemics, and by its formal robustness when connecting microscale host-pathogen related events and macroscale disease observables [28, 10].
Stochastic versions of the SIR model show that adding noise to the system changes the predicted onset of an epidemic [113], the stability of its endemic equilibrium [123], the value of its effective reproduction number [57] or its duration [72] when contrasted against the deterministic one. This is significant not only at the theoretical level when studying the stability and asymptotic representativeness of deterministic vs stochastic SIR models under various noise regimes, but at the policy making level where computational epidemiology may form the basis of informed decisions under policy, budgetary and other types of constraints [46].
Moreover, the SIR has been extended spatially to account for the diffusion-like properties associated with geographic patterns observed during epidemics. While the qualitative (and some quantitative) properties of the traditional and the spatial SIR models remain largely shared, spatial versions appear to be numerically susceptible to how these capture spatial interactions [101]. As with any other diffusion process in some space, we are usually interested on the ability of a disease to cover larger shares of the population as time marches on. Detailed analysis of deterministic and stochastic SIR models with spatial components [16] indicates the existence of solutions corresponding to traveling waves that propagate the disease among point-like processes. It has also been shown that spatial SIR models can account for the effects of long-distance travel by replacing diffusion operators containing local processes with appropriate integro-differential ones that capture non-local dispersal processes [117].
Agent-based models (ABMs) constitute a family of models where sets of active entities (i.e. agents) interact collectively by following prescribed individual rules intended to portray the emergent dynamics of a real social system [20]. In computational epidemiology, ABMs have been used and comparatively evaluated against SIR models of various kinds. Analysis of the behavior of both classes of models suggests that for many purposes the two classes will give qualitatively the same result, but that agent-based models have an advantage in ease of accounting for heterogeneity in subpopulations where that is significant [5, 96]. Furthermore, the SIR differential equations model is derivable from asymptotic limit of an SIR ABM model through diffusion approximation [17].
A notable coordinated effort to develop agent-based models of a flu pandemic was the Models of Infectious Disease Agents study funded by the National Institutes of Health [51]. More recently the attention of the world was dramatically drawn to the need for public health interventions in the case of COVID-19 by a simulation model projecting 2.2 million deaths from COVID-19 in the United States, and 510,000 in the U.K., in the absence of such interventions [41]. Since then, models continue to be refined as more data are analyzed [3]. It is important to note that there are enormous local geographic variations in the incidence of, and deaths from, COVID-19 [23]. These local variations imply a need for local models, to enable local authorities to construct appropriate strategies of social distancing and testing for mitigation of the effects of COVID-19. The work described in this paper is designed to meet this need.
2 COVID-19 poses a wicked policy problem
Wicked policy problems are characterized by 1) complexity of elements to be accounted for and their relationship to each other, 2) uncertainty in relationship to the description of the problem and the consequences of actions, and 3) divergence within the affected community of values and interests [55]. By all three measures of wickedness –complexity, uncertainty, and divergence-COVID-19 is a highly wicked problem and will continue to be at least until there is an effective and universally available vaccine.
Dimensions of complexity in COVID-19 emerge from all of the multiple ways in which people interact with each other in such a way as to breathe the same air, and from the consequent trade-offs. These trade-offs involve public health, economics, every aspect of community life, and levels of emotional stress in individuals—a multi-level hierarchy of perspectives involving psychology, sociology, economics, health care, and politics. Correspondingly, we expect COVID-19 modeling efforts to include a growing number of these concerns while remaining actionable and scientifically useful. We expect the complexity of such models to grow, but to do so in a manner that remains intellectually transparent [38] about what is stated in them. To this extent, models become critical components within the top-level decision support system necessary to regain situational control during the current pandemic. Dimensions of uncertainty abound. As noted above, there is enormous geographic variation in the documented impact of the disease, and variation even in the apparent fundamental parameters of the virus –transmissibility, latency, and virulence-for reasons that are not yet understood. Contributors to the uncertainty may be genetic variation in human populations [26], genetic variation in the virus as it continues to evolve [107], variation in childhood vaccine regimens from one nation to another [82], variations in weather patterns [61], and variations in testing rates and disease reporting accuracy and criteria [60]. Also, there is an element of pure chance –whether or not a particular community was “seeded” with infectious individuals, and how many.
Dimensions of divergence are in some ways clear, and in some ways complicated. The clearest divergence is between the imperative to save lives by social distancing and the costs of social distancing to the community—both economic costs [15, 110] and also costs that are less tangible due to how wealth moves across the global economy [79]. Early on in this crisis most of the world appears to have made the choice of that we would throw our economies into Depression [12] and restrict many community activities we value [40, 97] in order to save the lives of the probably less than 1% of the population who would die from infection should no social distancing constraints be imposed. At the time of writing, this choice is constantly being reconsidered, or at least recalibrated [50]. More, and more open discussions are being held on increasing economic activity even at the acknowledged cost of more infections and even deaths. One is reminded of a famous comedy routine when comedian Jack Benny, whose comic persona was as a notorious cheapskate, is held at gunpoint by a robber who demands “Your money or your life!” This is followed by a long silence, a repeated demand, and a response by Benny, “I’m thinking!”. It seems that COVID-19 has the whole world thinking about the trade-offs between economics –and other aspects of community life- and lives. With respect to divergence, COVID-19 seems as wicked as possible. The economic dimensions of community life can be measured in dollars. The many other dimensions have no common units of measurement, so their relative value is literally incalculable. And yet we are forced to decide about what to value.
Another way to look at a wicked policy problem is a one where the space of potential solution alternatives contains far more social dilemmas than solutions. We may thus define a social dilemma is a situation where multiple agents have (explicit or implicit) stakes in the resolution of a problem, stakes are tied to multiple value systems (and not just shared, “objective” technical considerations for example), and a proposed solution contains value contradictions that get translated to unacceptable potential losses were that solution to materialize; at the same time, the social dilemma also provides consequences if a solution fails to appear. Conversely, a perfect solution to a wicked problem is a point of total satisfaction of constraints at all levels of representation of the problem. This is, of course, an idealization; in practice some constraints must be eliminated, relaxed or ignored to find a collective solution. In summary, a social problem is wicked if the density of true solutions is low in the space of all solution alternatives and the search for them can be described as unstructured or even counter-incentivized at best.
Thus, the final element of COVID-19 as a truly wicked problem is that, although it is insoluble, we must make our best effort to solve it. The consequences of not trying to solve this intractable problem, of simply guessing at answers guided only by intuition, are far worse than the consequences of being guided by imperfect models.
3 Building a multi-objective model for COVID-19: the agent-based route
Based on the discussion above, our current research efforts have focused on the development of an integrated simulation model capable of a) accurately reflecting known dynamics of the current pandemic and the qualitative results of other models, b) simulating data-driven stochastic heterogeneity across agent populations to more realistically reflect the variability of underlying human populations when the model is applied, c) integrating economic considerations in association with observable features of the pandemic, d) allowing detailed simulation of known public policy measures at different times, intensities and dates, and e) providing a simple interface for non-expert users to configure and interpret.
In relation to the latter point (e), we envision assisting the decision-making process in two steps: first by providing some metaphor or visual proxy for users to construct intuitions by running specific scenarios, and then by translating some of these intuitions into fullyfledged computing and analysis tasks. Succinctly, we wish to facilitate decision making processes that are both robust and adaptive [66] while helping decision makers to avoid falling into the “illusion of control”, or the false belief in the causal relation between computing consequences with a model and immediately improving their decision-making abilities [63]. At the same time, we remain painfully aware of the intrinsic difficulties posed by imperfect data, imperfect implementation of public policy measures, and uncertain timelines for when and for how long to apply measures under unknown timelines for availability of vaccines [104]. Ee expect our model to be beneficial when a) decision makers are fully aware of the underlying simplifications we have made, b) model outcomes are contrasted and adjusted with incoming data during an unfolding situation, c) experts assisting decision makers carefully determine and document how data produced by these simulations is analyzed and translated into tentative recommendations [70].
To this extent, we have focused our efforts on providing modeling tools for population centers with 100,000 inhabitants or less. Our choice is motivated by the geographic distribution of cities and towns across the US [45] and by the apparent inverse correlation between population size and rurality. This is significant since the push for urbanization seems to have driven rural cities and towns to more precarious health systems than their urban counterparts [95]: one can expect COVID-19 propagation to be slower due to lower population densities, but the impact to be at least similar or stronger due to age distribution and availability of health care facilities [8], with a special emphasis on availability of ICUs [58].
3.1 Generalities
In our model, agents interact and traverse a discrete 2D torus composed of connected lattice points that represent geographic locations. Agent actions and decisions are governed by random variables with suitable distributions. A single execution (i.e. a scenario run) of a parametrization of the model corresponds to a possible world, while a simulation (i.e. a scenario) comprises an ensemble of multiple executions with the same parametrizations where outcomes correspond to distributions of agent states and observable quantities must be computed through averages.
The choice of geometry presupposes that agents move across a common landscape at all times, and no agents enter or leave it. This simplification of the geographical landscape, while in general unrealistic, is not uncommon [93, 9] and provides two advantages: a) it naturally matches intuitions behind interactive particle systems driven by mass action principles [75] such as in the SIR model and b) when translated into code, no boundary checks need to be performed by the agents. Lattice sites are connected, from the perspective of an agent, by a Moore neighborhood in an effort to reduce the effect of discretization artifacts [64]. We note here that our model presupposes a homogeneous population density as a means of ensure representativeness of processes within the geographical domain. Although accounting for variable population density areas in the same scenario is possible, our approach simplifies implementation aspects and prevents artifacts for model outcomes that may be strongly density dependent [121] in both epidemiological and economic aspects.
Our model does not explicitly contain a rich representation of locations where agents are drawn to and act as temporal sinks. Instead, we chose to model agent tropism through randomized agent dwell times. Dwell time (τdw) refers here to the minimum amount of time an agent susceptible to COVID-19 contagion needs to spend in a given location to acquire the ≥ virus. Based on existing estimates [36], we have chosen τdw = 15 minutes, which corresponds to one time step ts, s 0. Average total dwell time for an agent corresponds to the (integer) average number of steps an agent will dwell on a single location. Since our model assumes a day as a usual reporting unit in decision-making activities, all parameters stated in days are internally rescaled to τdw units (i.e. 1 day = 96 simulation steps). Agent dwell times are set at creation time using a random deviate from Poisson(k; λ = kdw).
3.2 Core parameterization
To configure of a scenario, a collection of demographic and disease parameters must be specified. Age structure appears to be strongly associated with differences in COVID-19 fatality rates [19, 32, 37, 98]. The model requires estimates both of the distribution and observed fatality proportions and per sex (i.e. male and female) and age (i.e. every f f ten-year intervals) respectively. Co-morbidities are introduced in a similar manner by means of the age and sex structure of the population as a collection of positive multipliers per relevant condition, and aggregate clinical data about their prevalence per age and sex.
The total number of agents N at the onset of the simulation remains constant at all times, except when an influx of new agents is simulated. To do so, the number of new agents entering the population correspond to a proportion pnew of the existing ones. In addition, one must specify when the agents will be introduced ts = Tnew and how long it takes them to enter the space τnew. In this manner our model can account for seasonal population increases driven by, for instance, agricultural production or the start of semesters in university towns. After time ts = Tnew + τnew, the simulation will contain approximately N′ = N + pnew N agents.
To account for population density ρpop, the model specifies the width W and height H of the lattice that will contain the agents. We presuppose that agents move across the lattice one jump at a time if their dwell time is exhausted. The size of the lattice should be also adjusted based on the mobility and transportation patterns of individuals within the enclosed region of interest –that is, excluding realistic density fluctuations due to commuters that spend most of their time outside of the simulated region. The effect of such adjustment is equal to re-scaling the population density by a mobility pre-factor kmobl. After these calculations have been performed, the model should ensure for Tnew = ∞ that
Intuitively, the effect of greater mobility is equivalent to increased population density, or correspondingly, to traversing a smaller space. While our model does not include the effect of road networks or vehicle use, a carefully constructed average for kmobl can provide a sufficiently adequate approximation.
Disease-wise, the model comprises six critical parameters. First, the initial proportion of agents piexp that are exposed to the disease. The model assumes that their introduction occurs at the onset of the disease incubation period. Then, the incubation period τincb and the recovery time τrecv are inputs correspond to average observed or estimated values in clinical patients [69]. In the simulation, each agents is initialized with individual incubation and recovery times drawn from Poisson(k; λ = τincb) and Poisson(k; λ = τrecv) respectively. Our reasoning behind this choice rests on the fact that a) the time at which symptoms manifest across patients depends on a common organismal response to the pathogen dependent on the activation of known (and yet unknown) molecular pathways [2, 103, 111], and at the same time on the intra-population variations that are found across individuals due to their specific genetic make-up and context. Thus, we treat symptoms onset as a homogeneous Poisson process for simplicity even when the described coupling exists. An improvement to our current model would entail, if the reasoning above holds, computing observables using a general Poisson point process by assuming that the Radon-Nikodym density exists [29].
Fourth, the proportion of individuals who remain asymptomatic pasym is accounted for and utilized when stochastically deciding the fate of exposed agents. The role of asymptomatic patients remains heavily investigated [11, 85] and appears to play a crucial role in disease mitigation for COVID-19 [18, 33, 43, 122]. From literature data, the proportion of asymptomatic patients appears to vary greatly across countries and demographics (e.g. [4, 6]), although the matter is far from settled. In this sense, our model is intended to be applied by using data starting at the most local level if possible, and only moving to larger geographical instances when data cannot be obtained by means of statistically robust antibody testing.
Fifth, the proportion of severe cases psev is significant for the model due to its relation to hospital capacity. The definition of severity used here is that reported in [118]; we suggest similar guidelines on estimating the proportion severe cases should be followed. Another proxy for severity may comprise the number of non-ICU and ICU admissions, and their ratio [86]; in general, the need of hospitalization implies that clinical evaluation of a patient raises enough concerns as to consider the possibility of transitioning from non-ICU stage to the ICU stage of care [94]. To account for saturation of health care services, the model males use of the proportion of beds proportional to population density pbeds, and we assume that only severe cases are hospitalized. If a severe patient cannot be hospitalized due to saturation, then its probability of fatality rises by a factor to be determined empirically; for reference, our model presupposes a four-fold increase. At present, our model does not provide estimates of ICU occupancy.
Finally, the model utilizes the probability of contagion Pcont per agent per interaction. This quantity can be obtained from field data or other (more coarsed-grained) epidemiological models at the onset from estimates of the basic reproduction number R0 by observing that where ⟨nS,I⟩t is the average number of contacts between susceptible (S) and infectious (I) agents, and d is the duration of infectiousness of the disease. Recalling the SIR model we observe that γ = d−1 and
Our view of R0 is that of a preliminary estimate for initial calibration at the onset of the period of interest. For reporting purposes, we favor the effective reproductive number R(t), and consequently provide information about the observable for nS,I(t) such that
Since one agent represents multiple individuals in the region of interest, it becomes necessary to compensate for this renormalization process. For such purpose, we provide a scaling factor associated to the representativeness of the model kr given with a scale 1:R by
We found empirically γ ≈ 1.58489, such that rescaling the probability of contagion to obtain leads to
In addition, R(t) ∝ ρpop (Eq. 1), which implies that kmob must also modulate R(t). The final expression becomes
3.3 Agent dynamics
At the model level, observables are interrogated across the agent population, agent step actions scheduled and the step number updated. At each step, agents move one space across the 2D torus after exhausting their dwell time per location. All agents posses an internal state σ that stores information relevant to the disease, its economic aspects and various control structures. Their motion is driven by a random walk within their Moore neighborhood, unless their state has been set to isolation. Isolation means not moving across the space regardless of dwell times. More than one agent can inhabit one lattice site, which forms the basis for determining when a susceptible agent becomes exposed and the infectious cycle starts. Prior to performing stage-dependent computations from the disease perspective, agents compute the consequences of policy measures and adjust various elements of their internal states relevant to epidemiological and economic actions to follow.
3.3.1 Epidemiology
Our model departs from the usual compartments of the SIR and extends it in order to account for a more fine grained variety of significant infection stages. Agent disease states are as follows:
Susceptible
All agents (except those marked as initially exposed) start as the susceptible population. When susceptible agents share the same lattice site, these may come in contact with other exposed, symptomatic (i.e. due to voluntary or involuntary breaking of quarantine with probability 1 − Pisoeff) or undetected asymptomatic agents. If at least one agent is infectious, the agent changes its state σ to exposed as dictated by Bernoulli(σ, Pcont). Unless quarantined due to a policy measure, susceptible agents move freely across the lattice.
Exposed
In the absence of any policy measures impacting exposed agents, these continue to explore lattice sites until their incubation period given by Poisson(x, λ = τincb) is exhausted. At that time, agents become asymptomatic as dictated by Bernoulli(σ, pasym) or symptomatic detected, otherwise.
Asymptomatic
Asymptomatic agents continue moving through the space and remain infectious until the recovery period given by Poisson(x, λ = τrecv) is exhausted. At that point, the agent enters the population of those recovered.
Symptomatic
When an agent becomes symptomatic, it is immediately marked as detected and quarantined. Regardless of stringency of testing policies, the definition of confirmed case depends at the minimum on being both symptomatic and a positive identification via some form of testing (i.e. RT-PCR qualitative or quantitative, serological [91]). Symptomatic agents follow two possible trajectories. In the first one, agents convalesce without becoming severe until their recovery time is exhausted an recuperate. In the second one, agents become severe as dictated by Bernoulli(σ, psev). In terms of the impact of saturation of health care services, research is needed to determine lethality of patients outside hospitals and other facilities. However, using existing ethical guidelines that provide heuristics of fair resource allocation for beds and ventilator equipment [39] as a proxy, we estimate that lethality increases (conservatively) by at least a factor of 5.
Asymptomatic Detected
Asymptomatic agents detected by some widespread systematic testing strategy are immediately quarantined in place, and wait until their recovery time is exhausted. At that point, they join the population of recovered agents.
Severe
Agents that enter the severe stage represent patients that require some form of hospitalization, and for some of them, use of mechanical ventilators; these remain under perfect quarantine. Lethality, computed per age and sex population structures as is used to determine whether a severe agent becomes a fatality as given by Bernoulli(σ, pf). Agents, on the other hand, may survive until recovery.
Recovered
Agents that have recovered leave quarantine and move again freely across the lattice. Our model does not consider the probability of re-infection, but this may need to be included in the future [92]. At the moment of writing, this aspect of COVID-19 remains speculative and uncertain for human patients [34, 90, 115] despite encouraging evidence obtained from experiments using rhesus macaques [14].
Deceased
Agent that count as fatalities do not interact with other agents or undergo any further epidemiological significant events until the end of the simulation.
3.3.2 Inbound infectious cases
In order to increase model realism, we consider the effect of inbound infectious cases of two sorts: people that live within the community but have to travel and work outside of it in a steady stream daily that maintains the overall population density stable, and people that move seasonally within the community, potentially bringing in more cases that have a different viral load. This last case describes, for example, the opening of university campuses for instruction where several people relocate to adjacent towns.
For the first situation, the model includes the probability of susceptible people becoming infected with a daily rate that determines the infectious stage depending on Bernoulli(x, ribnd). In the second type of infectious cases, at time Tmass a proportion pmass of the current population will enter the simulation space during a time period τmass with a probability of being in the exposed state of Pmass.
3.4 Policy measures
COVID-19 has forced a frantic search for public policy measure combinations capable of containing viral transmission, and ideally quelling its progression altogether [13, 42, 47, 56, 67, 71, 89, 99, 106, 115, 116, 120]. All measures reviewed and emerging across literature roughly belong to four main classes of measures: 1) those that aim to reduce at any instant the density of individuals at locations with potentially high concentration of people by means of imposed self-isolation of non-essential workers and cancellation of activities involving massive amounts of people, 2) those that reduce the likelihood of viral exposure for those qualified as essential workers for whom close social interactions are inescapable, 3) those intended to detect and isolate positive virus carriers through application of molecular or serological testing and 4) those that seek to reconstruct interaction histories in which a positive infectious patient may have had an active role in unknowingly spreading the disease.
It has become increasingly clear that these measures are essential yet hard to sustain for long periods of time. On the one hand, various degrees of negative psychological impact have been recently reported [31], in particular impacts that decrease adherence to public policy measures [59] which are expected to naturally arise after periods of prolonged confinement under a collective crisis; World War II critics of air raid shelter policies constitute a significant precedent [80]. On the hand, mounting concerns on lasting economic impacts [76, 83] materialize the wickedness of the COVID-19 pandemic and the cost of measures to address it [22], concerns that which include social protection of workers [44], the labor market [30], patterns of work [65], gender equality [7], nation-state economics [74], and monetary policy [21] to name a few.
Motivated by the latter, our work attempts to model the individual variability expected when these types of measures are implemented, their various impacts in terms of disease and economy collective observables, and the potential outcomes of combining them in various manners. We note that societal and economic impacts of COVID-19 differ from those in other pandemics due to the tight coupling of global events and the effect of near-instantaneous digital communication. We are changing the pandemic while living it. While our model does not provide mechanisms to state the associated control problem in cybernetics terms, emerging literature (e.g. [35, 48, 87, 119]) suggests that such approach may be possible, and even essential to provide solutions that account for the complexities involved in politically and socially driven environments.
3.4.1 Self-isolation
Self-isolation in our model is captured by establishing a period in which a proportion pisol of agents in mobile states (i.e. susceptible, exposed, asymptomatic) remains at a fixed location for a well-defined period of time. Self-isolation starts at time Tisol and extends for a period τisol, after which motion across space is restored. Agents isolate with effectiveness Pisoeff, representing the probability that when in contact with another infective agent the final probability of contagion becomes .
3.4.2 Social distancing
Social distancing is modeled as a distance-dependent adjustment constant δ(ℓ) that adjusts the probability of contagion Pcont depending on linear distance ℓ assumed between agents within the same cell such that . Based on recent experiments on the effect of air turbulence on droplet dispersion [24], we assume a decreasing sigmoid profile after 1.5 meters. Hence, with where K is a constant that adjust the decrease rate of the sigmoid function. For the purposes of the COVID-19 model, K = 10.0. The value of ℓ can be adjusted also to account for the effect of other interventions that decrease contagion probability per contact by decreasing the effective viral load, such as the use of various types of face masks [62].
3.4.3 Testing
Similar to self-isolation, testing in the simulation operates by distributing a target percentage of the population into a given period. Susceptible and asymptomatic agents can be tested; in our simulation, we do not re-test those who recover. A symptomatic case is treated immediately as tested, representing the case where a patient reaches a health provider and a test is applied to determine the correspondence between symptoms and the disease.
Testing proceeds in the following manner. Once time Ttest is reached, symptomatic and asymptomatic agents are selected with a probability Ptest proportional to the period τtest. This testing process simulates massive testing policies without any statistical design or underlying population structure.
3.4.4 Contact tracing
We simulate forms of automated contact tracing by means of a set of known prior contacts. We assume a delay of two days for contact follow-up once an infected patient has been discovered. Once an agent is marked as positive, all of its contacts are evaluated and classified either as susceptible (negative), symptomatic or asymptomatic detected. Contact tracing utilizes the same start time and period as testing.
3.5 Estimation of economic impact
Along with an epidemiological model, we have included economic factors tied to disease stages. After some investigation around statistical theories in economics [27], we decided to implement a simple model where value creation in terms of exchanges between money and products or services [84] are computed in connection with the progression of the disease. Our model is an oversimplification of economic systems, and as such, its goal is to provide a numerical intuition about the immediate effects on the accumulation of capital by individuals and the public sector during the period of interest. Thus, the economic model makes no assumptions about wealth distribution, wealth inequality or other societal factors, and as such only aims at portraying the impact of an epidemic on transactional capital gains or losses at the private and public levels.
Regarding public value [81], we observe that the complex web of actions across individuals and institutions make construction of a detailed model expensive in connection with infectious disease dynamics. Growing literature on the subject is indicative of the latter [25, 53, 78, 112]. To the best of our current experience, proper treatment of the current situation would require a different class of model based on fractional operators coupling epidemiological [1] and econometric aspects [109, 73] capable of accounting for short- and long-term memory macroeconomic effects [108].
3.5.1 A disease-economy input-output system
Our take on the matter can be stated through the following somewhat intuitive principles:
Disease stages that allow interactions should cause non-linear positive externalities in terms of collectively amplified public value. The more frequent interaction exchanges, the higher the materialization of public value as a function of non-linear effects of reaching throughput efficiencies that both maximize economies of scale and translate into deep capital accumulation and redistribution across the public sector.
Disease stages that forbid interactions but do not put individual lives at risk should cause linear negative externalities associated to both increased unemployment ripple effects and inability to reach throughput thresholds capable of amplifying value creation.
Disease stages that both forbid interactions and put individual lives at risk should cause non-linear negative externalities due to increased unemployment ripple effects, inability to reach throughput thresholds capable of amplifying value creation and the saturation of alternatives under an increasingly severe public health crisis.
In addition, we estimate the cost of performing one test as part of the public cost. The purpose of this observable is to provide an account of testing as a public measure versus other actions for which their cost is harder to account for.
In order to materialize the above principles, our approach is inspired by that of input-output matrices utilized to account for combined economy-ecosystem calculations [52]. The equivalent of the matrix M in our model expresses economic outputs per disease stages. Viewed as a matrix computation, the components of the input vector u correspond to proportions of the agent population per infectious stage, while the output vector υ is of the form υ = (υpriv, υpub). Depending on the disease stage, vector components are computed aggregating the value of interactions or individually. By a suitable homotopic transformation 𝒯 [88], we approximate non-linear effects of market dynamics, thus the final value of υ becomes . The matrix components m and both α and α are model inputs.
3.6 Model outcomes
Our model captures during its execution various observables every time step. All values are aggregates and no particular agent information is stored; in the future, this can be of value if the model is extended with network information. After a simulation has completed, the following classes of observables are recorded in a CSV format:
Simulation
Step number, population size
Disease (fraction)
Susceptible, exposed, asymptomatic, symptomatic quarantined, asymptomatic quarantined, severe, recovered, diseased
Measures
Self-Isolated, tested, traced
Epidemiology
Effective reproductive number
Economics
Cumulative private value, cumulative public value
3.7 Implementation
The Epidemiology Workbench is implemented using Python (v 3.6) using the Mesa agentbased simulation library [77]. In batch processing mode, our model receives a JSON file with all the parameters described above and a number indicating the number of cores to use during the simulation. Once a sanity check is performed, the parameters are used in conjunction with the multiprocessing batch running features provided by Mesa.
We also provide a parameterizable web-based dashboard to explore individual runs. The objective behind this corresponds to enabling decision-making users to progressively gain intuitions behind each parameter, and not to provide operational information. Our code is openly accessible through GitHub1.
3.8 Calibration
Model calibration requires the best possible clinical data estimates for three critical parameters: the average incubation time, the average recovery time, and the proportion of asymptomatic patients. For the average incubation time, hospital triaging of COVID-19 cases can provide information leading to estimates, or values provided by trusted sources at the most local level possible can be used. Incubation time is likely to be blurred by multiple confusion variables captured in our model as a Poisson distribution. Despite its stochasticity, the model supposes that large population sizes group tightly towards a mean value. This assumption may need to be revised retrospectively later. For the proportion of asymptomatic patients, contact tracing and structured antibody testing can provide specific local information.
As a general calibration protocol, we suggest the following steps:
Adjust grid size based on fixed population density until R0 matches the best known value for the area in the first days of the model (steps = 96) and > 30 runs. Population density in the model loosely includes average mobility patterns, and cell sizes reflect the distance traversed every 15 minutes. Also –but not recommended- the probability of contagion may be used to calibrate. This may imply unknown population conditions and should be used only to test hypotheses about individual variations that manifest in the ability of COVID-19 to spread.
Execute the model to the point where the number of symptomatic agents corresponds to one representative agent. For example, with an ABM of 1000 agents and a population of 100k individuals, the critical infected agent-to-population ratio is 1:100. Use the point in time rounded to the nearest integer day as point of departure for policy measures. In practice, this implies executing the model in excess of as many days as the longest known incubation period.
If policy measures have been introduced, use date above as the reference point for their introduction.
To develop scenarios, we strongly recommend starting from the most recently calibrated model that includes policies as well as using a model without measures as a basis for counterfactual arguments.
4 Case study: the cities of Urbana and Champaign after reopening UIUC, Fall 2020
The cities of Urbana (pop. 42,214)2 and Champaign (pop. 88,909)3 comprise a population of approximately 132,000 inhabitants. Figure 1 describes the current percent distribution per age group. In addition, distribution per sex is 50.5% males and 49.5% females. However, the population across both cities undergoes a seasonal decrease on mid May due to the end of the Spring semester and an increase in early August with the start of the Fall semester in the University of Illinois at Urbana-Champaign. From the more than 50,000 students enrolled on campus on May4, we estimate that around 30,000 leave during summer to return for the fall semester. Effectively, the summer population amounts to around 100,000 inhabitants. We used COVID-19 age- and sex-dependent mortality values as reported by CDC until June 10.
Our interest rests on simulating the effect of various measures once mobility restrictions remain at values similar to the present one (phase 4 reopening) and combinations of testing and automated contact tracing when an estimate of 30,000 incoming students with a higher average viral load reach both cities, on an increase of 30% of the population present in summer.
4.1 Calibration
COVID-19 data was obtained from the Champaign-Urbana Public Health District (CU- PHD)5 and CDC for mortality data6. Sex-dependent mortality was established at 61.8% for males and 38.2% for females. The first local case in the community was reported on March 8, and state-wide shelter-in-place measures were applied on March 217. Later on, mandatory mask usage was established on May 18. By April 21, cumulative cases had reached 0.1% of the population, and by July 8 it had increased to 1%. The local R(t) value at the peak period between April 21 and May 17 reached an estimated value of 1.2; after these measures, it remained around 0.91. We used a probability of contagion per interaction of 0.004 every 15 minutes if there is at least one person infected at the same location as the susceptible one. This value, although computed from data, appears to reflect compliance with various sanitation practices among the population. Based on the fluctuation in local data, we have estimated an inbound probability of 0.0002 new cases due to members of the community becoming exposed elsewhere. CU-PHD performs strict contact tracing across all cases, hence for all simulations we assume contact tracing remains active across all simulations.
Google Mobility data were used to estimate the average effective shelter-in-place value to a 45% of the population with 0.8% efficacy of self-isolation. The severity was similarly estimated at 5% based on local case information. Similarly, a starting value of R(t) = 1.2 corresponded to a grid of 190 225 cells and 1000 agents and kmob 0.4781; this last value appears to be consistent with the regular use of public transport in the area and local observed shelter-at-home patterns. At the start of the simulation, the agent-to-inhabitant ratio is approximately 1:100. Hence, the calibration starts with one exposed agent. Differences between R(t) in April and July correspond to ℓ = 1.89. Intuitively, this implies that the effect of wearing a mask may roughly translate into increasing the distance 0.39 meters beyond what WHO recommends for proper social distancing. However, this cannot and should not be interpreted as to relax mask usage in any manner. In regard to asymptomatic patients, we have established a conservative value of 35% based on prior studies [49]. The following sequence of events was assumed toward the start of classes this Fall:
Exposure of the first representative agent on April 15 (simulation day 0)
First symptomatic representative agent on April 21 (simulation day 6)
Mask order from the State of Illinois on May 1 (simulation day 16)
Students start massive ingress two weeks between August 9-22 (simulation days 116- 129)
Community members are tested between August 23-29 (simulation days 130-136)
Simulation continues without testing for two weeks between August 30-September 12 (simulation days 137-153)
4.2 Simulation targets
Our goal for simulating the impact of mass ingress was to determine, based on various measures, the viability of preserving public health under variations of measures a week after a significant portion of the population has been tested. To determine the latter, we obtained data for R(t), symptomatic, asymptomatic, severe and deceased fractions of the population. In addition, we collected information about economic impact using our input-output matrix model. A total of 6 simulations explore the following parameter settings:
shelter-in-place continued/removed on day 117,
massive testing performed to 25%, 50% and 75% of the population in the enlarged community.
We also computed a counterfactual case corresponding to no massive testing and lifting of shelter-at-home orders to compare against a backdrop without measures. Model calibration results (Figure 2) correspond to the parametrization publicly available at the GitHub project repository9.
4.3 Results and Discussion
We computed a total of 7 scenarios (shelter-at-home times testing levels plus counterfactual), each one with an ensemble of 30 independent runs. Execution of these scenarios was performed on Amazon EC2 infrastructure using a c5a.8xlarge non-dedicated instance with 36 processors and 64 GB RAM. We used an Ubnutu 18.04 86 image as the choice of operating system. Calibration CPU time with an ensemble of similar size for the first 40 days is, on average, 16.3 0.4 minutes, and the average execution time of a complete scenario is 65.2 1.3 minutes. Preliminary profiling indicates that random number generation using the SciPy library [114] explains most of the execution time. No attempts were made to further speed up our code by compiling it using Cython [102] or Numba [68]. Our goal in these simulations was not to reproduce exactly the case curves observed in the community, but to obtain a picture that remains quantitatively and qualitatively rigorous. Confidence intervals are calculated at 95% when present. We provide scripts to automate the setup of the Amazon EC2 instance with model installation10, the execution of all scenarios11 and their visualization12 for reproducibility purposes.
The following convention is applied to all figures: dark red corresponds to 25% testing, teal to 50% and dark blue to 75%; the counterfactual case is colored in purple. A dotted line corresponds to the start of mass ingress, a dot-dash line to the end of mass ingress and the start of massive testing, and a dashed line to the end of massive testing. All figures related to disease stages start at day 80.
4.3.1 Public health
Model results indicate that outbound exposed individuals coupled with local fluctuations appear to drive the behavior of this pandemic for the cities of Urbana and Champaign. The combination of contact tracing and public health management by CU-PHD, compliance with health and sanitary measures and rapid implementation of shelter-in-place measures have prevented the pandemic from escalating in the region. Considering both cities as a closed system, adequate health management appears to be ultimately responsible for the small number of severe cases and hospitalizations in the region. The effect of masks, based on the information obtained from our simulation, appears to have a significant effect on the value R(t) according to Fig. 2.
Our simulation of the reopening of the University of Illinois local campus with a higher viral load suggests that an increase in cases should be observable independent of the testing regime or relaxation measures. The future impact of this increase, however, is not. Figure 3 indicates that lifting all shelter-at-home restrictions and performing testing has a significant impact regardless of the testing level, while preserving shelter-at-home measures along with any testing level can drastically sustain R(t) slightly below pre-mask order values. We note that R(t) decreases in all cases, which can be explained by contact tracing-based testing. This suggests that even hybrid education modes (in presence + online) constitute significantly better alternatives than full campus reopening.
Testing intensity matters. In terms of the outcome after day 137, testing intensity determines the last value of active cases during two weeks after testing. Note that even when the testing instant itself has passed, capturing a larger and larger number of positive cases (particularly asymptomatic ones) drastically reduces the infectious population (Figure ??). Even when shelter-at-home measures have been lifted, testing reduces the fraction of the population classified as active cases at least higher but close to its value prior to mass ingress (25%), roughly equal its prior value (50%) or lower than the prior value (75%). Lifting shelter-at-home measures has a significant impact on the magnitude of both the peak of active cases and the value after two weeks have passed since testing occurred. Plans to test the student population once per week at scale, although not simulated here, appear to be a most effective solution to further tame the curve.
The main mechanism massive testing addresses in general, according to our simulations, is the removal of infectious individuals from the population, in particular those who are asymptomatic. In general, the estimated proportion of asymptomatic patients is a significant driver of the contagion in our model. When the population increases, many more individual contacts are possible within the same geographical area, and the lag induced by the incubation time translates into observing the impact of testing at least a week later. Figure 5 compares the asymptomatic fraction of the population across a baseline simulation without massive testing or some degree of sheltering. As in the previous case, shelter-inplace measures have an effect on the growth rate of the asymptomatic population, but even a testing intensity of 25% appears to lower it significantly compared to the counterfactual case. At case severity of 5%, more hospitalizations may be expected six days after day 130 (mass ingress), particularly if shelter-at-home orders have been lifted (Figure 6). The impact of testing, while it cannot be fully distinguished due to the overlap of confidence intervals in our simulations
4.3.2 Economic impact
Our analysis of economic impact focuses on two per-capita averages: cumulative private value (without any egress) and cumulative public value. While this part of our proposed model is experimental and requires further analysis, we proceed to state the current results.
First, we studied the effect of the pandemic only on individual accumulation of private value (Figure 7). Mass ingress appears to renormalize temporarily the distribution and removing shelter-at-home measures, predictably, increases the final value at day 153, but only by approximately five units. All testing levels appear to comprise a relatively tight bundle, which can be interpreted either as an artifact due to the model being simplistic, or as the fact that the impact of testing levels on cumulative private value in the context of a pandemic under control (as in Urbana and Champaign cities) is limited. Even when these differences are small, they do exist. Testing at low levels (i.e. 25% and 50%) reduces the number of isolated people compared to testing at a broader scale (i.e. 70%). However, when an epidemic process remains under control, public health benefits largely appears overshadow individual losses, contingent on the validity of this approach.
In the case of per-capita cumulative public value, the model predicts negative outcomes even for an epidemic under control (Figure). This appears to align with public expenditure needed to mitigate any economic and societal lockdown necessary to stop the spread of the disease, as well as the negative externalities leading to less public transactions taking place on systems that were designed to support a certain minimum load to remain profitable: public finances tend to be, in general, inelastic for that reason. When shelter-at-home measures are lifted, a natural order of solutions arises from worst-case (our counterfactual, purple) to best-case (75% testing, blue): strong testing reduces the long-term impact of active, asymptomatic and severe cases. In this situation, however, the effect of mass ingress appears to be to re-bundle the behavior as R(t) increases again. If shelter-at-home is preserved, an interesting situation arises: doing nothing appears to be a good economic solution. We believe the main reason behind this result to be that, for the case modeled here, the social and economic cost of the lockdown is higher due to the situation being under control; however, the material gains computed by the model between are rather small, and preserving public health over economics is a better-long term strategy. Despite this, strong massive testing still provides a next best solution from the point of view of economics, and the best strategy from the public health perspective; this is evidenced by the diverging curves in Fig. 8(b). We speculate, based on our current simulation outcomes, that the ordering in the economic cost profile of a pandemic during its exponential phase should be similar to that of Fig. 8(a), but with the divergence observed in (b).
5 Conclusions
We reported here the construction of an agent-based workbench using the Mesa modeling framework capable of capturing epidemic processes alongside public policy measures. The model is fully stochastic, entailing the computation of observables of different kinds. While computationally expensive, its formulation allows to easily obtain quantities that appear to be useful in the process of combating an epidemic. We applied our workbench to understanding the possible epidemiological profile of two cities, Urbana and Champaign, in the context of the reopening of the University of Illinois local campus next Fall. Our simulations indicate that at least 50% testing of the local population is needed to sustain the pressure of mass ingress of individuals with a higher viral load compared to the local one. More generally, contemporary management of an epidemic demands changing the mode of interaction across as much as the population as possible from those requiring physical proximity to those that do not. Although digital technologies provide mechanisms to preserve safe spatial distancing, temporal distancing can also be intelligently used to reduce the probability of contagion. In terms of economics, public health measures must be privileged over financial concerns, since the panorama appears similarly bleak during the early phases of an epidemic, and strict measures possibly provide the best solution during the exponential phase.
Our model contains the following key limitations. First, only one measure per type can be specified at the moment, instead of a sequence of dates paired with values corresponding to the measured (or expected) effect of measures of the same type. In the example above, we used an approximation of shelter-in-place for the entire simulation period (April 21– September 12), even when the State of Illinois ordered phase 4 re-openings on June 26. Another critical element missing in our model corresponds to preferential shelter-in-place per age group. Even when mortality appears to more strongly impact the elderly, local mortality is low. This may be due to higher compliance of that population with shelter-at- home and other sanitary measures, including wearing masks in public. Variable viral loads per disease stage [54] are missing in our model, but these are harder to calibrate due to the biosafety and time elements involved in quantitative PCR-RT. Nevertheless, we do foresee situations where this may be possible and pertinent. Finally, our model assumes agents have an effectively infinite memory to remember whom they have had contact with. Contact tracing has a stringent limit when performed manually, which can be expanded greatly by means of various information technologies. Hence, our model is not realistic in this sense, since it does not distinguish between these two cases: when an epidemic has reached certain critical mass, active cases will be underestimated.
Computationally, the Epidemiology Workbench is limited by the lack of true concurrency in Mesa, thereby impacting scaling properties of our simulation. Distributing the agents across multiple compute nodes in Python requires architectural changes in Mesa beyond the scope of our work. At present, efforts are under way to develop an Elixir-based ABM platform capable of addressing this limitation as part of the SPEC collaboration in the Computational Social Science community. Another critical bottleneck is random number generation, for which various strategies may be applied, including the use of (approximately) irrational numbers as coefficients of Fourier series.
A final aspect underscored by our research, in particular in the context of COVID-19, is the need for anticipatory mechanisms driving public policy measures. In this sense, simulation methods such as the one presented here or others lack convenience when introducing external events and the execution cost of recomputing complex models, and appear to make sense only at early stages of a epidemic process: once the disease spread has reached its exponential phase, the need moves from prediction to probabilistically qualified estimations of short term measure effectiveness. This suggests a different class of stochastic methods that not only predict expected trends but also recommend measures based on their effectiveness, similar to those used in high-speed trading of financial derivatives and futures. The complexity of the socio-technical character of the global economy and society demands these more powerful methods to successfully address the wicked character of a pandemic, in particular this one.
As of now, our efforts concentrate on seeking viable ways to package and deploy the Epidemiology Workbench across various cyberinfrastructure resources in order to make it available to other small cities (including training resources) and, particularly, to communities with strong presence of underrepresented minorities whose public health planning resources are heavily constrained.
Data Availability
The authors have provided all code and computer experiment data online at a public GitHub repository, including the script to setup a cloud computing instance.
Acknowledgments
The authors wish to thank the following individuals for their contributions and support: Julie Pryde and Awais Vaid (Champaign-Urbana Public Health District), public health data and situational knowledge; Tomas de Camino Beck (ULead), epidemiology and mathematical biology; Carlos Rau’l Guti’errez-Guti’errez (ICANN/ISOC), economics and econometrics; Jacqueline Kazil and Tom Pike (Mesa/George Mason University), Mesa programming and Python multiprocessing; Rajesh Venkatachalapathy (Portland State University), Srikanth Mudigonda (Saint Louis University), Milton Friesen (Cardus, U Waterloo) and Jeff Graham (Susquehanna University) on agent-based modeling strategies (SPEC collaborative); Galen Arnold (NCSA UIUC), Python optimization strategies, and William Gropp and Jim Glasgow for institutional support. This work was supported by Illinois Informatics and the ACM/Intel SIGHPC Computational and Data Science Fellowship, 2017 cohort.
Footnotes
↵2 US Census Bureau population estimates for 2019. See: https://bit.ly/3hj1PnC
↵3 Idem. See: https://bit.ly/2WGcOjq
↵4 University of Illinois News Bureau. See: https://bit.ly/39jYj9K
↵5 See: https://bit.ly/3eOlISa
↵6 CDC COVID-19 Death Data and Resources. See: https://bit.ly/3hpap4A
↵7 State of Illinois. See: https://bit.ly/2WHU6I5
↵8 Idem. See: https://bit.ly/3hsxm6D
↵9 See: https://bit.ly/2BhCuv3
↵10 See: https://bit.ly/30wgS6M
↵11 See: https://bit.ly/3jvR2Zw
↵12 See: https://bit.ly/2BkMh3v
References
- [1].↵
- [2].↵
- [3].↵
- [4].↵
- [5].↵
- [6].↵
- [7].↵
- [8].↵
- [9].↵
- [10].↵
- [11].↵
- [12].↵
- [13].↵
- [14].↵
- [15].↵
- [16].↵
- [17].↵
- [18].↵
- [19].↵
- [20].↵
- [21].↵
- [22].↵
- [23].↵
- [24].↵
- [25].↵
- [26].↵
- [27].↵
- [28].↵
- [29].↵
- [30].↵
- [31].↵
- [32].↵
- [33].↵
- [34].↵
- [35].↵
- [36].↵
- [37].↵
- [38].↵
- [39].↵
- [40].↵
- [41].↵
- [42].↵
- [43].↵
- [44].↵
- [45].↵
- [46].↵
- [47].↵
- [48].↵
- [49].↵
- [50].↵
- [51].↵
- [52].↵
- [53].↵
- [54].↵
- [55].↵
- [56].↵
- [57].↵
- [58].↵
- [59].↵
- [60].↵
- [61].↵
- [62].↵
- [63].↵
- [64].↵
- [65].↵
- [66].↵
- [67].↵
- [68].↵
- [69].↵
- [70].↵
- [71].↵
- [72].↵
- [73].↵
- [74].↵
- [75].↵
- [76].↵
- [77].↵
- [78].↵
- [79].↵
- [80].↵
- [81].↵
- [82].↵
- [83].↵
- [84].↵
- [85].↵
- [86].↵
- [87].↵
- [88].↵
- [89].↵
- [90].↵
- [91].↵
- [92].↵
- [93].↵
- [94].↵
- [95].↵
- [96].↵
- [97].↵
- [98].↵
- [99].↵
- [100].↵
- [101].↵
- [102].↵
- [103].↵
- [104].↵
- [105].↵
- [106].↵
- [107].↵
- [108].↵
- [109].↵
- [110].↵
- [111].↵
- [112].↵
- [113].↵
- [114].↵
- [115].↵
- [116].↵
- [117].↵
- [118].↵
- [119].↵
- [120].↵
- [121].↵
- [122].↵
- [123].↵