Abstract
Despite the United States Center for Disease Control (CDC)’s May 2023 expiration of the declared public health emergency pertaining to the COVID-19 pandemic (Silk 2023), approximately 3 years after the first cases of SARS-CoV-2 appeared in the United Sates, thousands of new cases persist daily. Many questions persist about the future dynamics of SARS-CoV-2’s in the United States, including: will COVID continue to circulate as a seasonal disease like influenza, and will annual vaccinations be required to prevent outbreaks? In response, we present an Agent Based Networked Simulation of COVID-19 transmission to evaluate recurrent future outbreaks of the disease, accounting for contact heterogeneity and waning vaccine-derived and natural immunity. Our model is parameterized with data collected as part of the Berkeley Interpersonal Contact Survey (BICS; Feehan and Mahmud 2021) and is used to simulate time series of confirmed cases of and deaths due to SARS-CoV-2, paying special attention to seasonal forces and waning immunity (Kronfeld-Schor et al. 2021; X. Liu et al. 2021; Nichols et al. 2021). From the BICS ABM model we simulate SARS-CoV-2 dynamics over the 10-year period beginning in 2021 with waning immunity and inclusion of annual booster doses under a variety of transmission scenarios. We find that SARS-CoV-2 outbreaks are likely to occur frequently, and that distribution of booster doses during certain times of the year—notably in the late winter/early spring—may reduce the severity of a wintertime outbreak depending on the seasonal epidemiology of the pathogen.
1 Introduction
Three years after the first cases of SARS-CoV-2—the pathogen responsible for the COVID-19 pandemic— appeared in the United States, many control measures put in place during the early phase of the pandemic have been eliminated (Silk 2023) in favor of a desire to return to ‘business as usual’. This includes mask mandates, shelter-in-place and work from home ordinances, physical distancing guidelines, and recommendations to isolate symptomatic cases. While vaccines for COVID-19 were a source of optimism through early 2021, it became clear over the subsequent months that waning natural and vaccine-derived immunity and the pathogen’s immunity-evading mutations rendered the vaccines less effective at ending the pandemic than initially hoped (Levin et al. 2021,Centers for Disease Control and Prevention 2021). Additionally, uptake of booster doses has lagged far behind targets (National Center for Immunization and Respiratory Diseases (NCIRD) 2022a; National Center for Immunization and Respiratory Diseases (NCIRD) 2022b). Over time, ‘pandemic fatigue’ has set in as compliance with disease-preventing behavioral mandates, especially mask usage and contact limitation, has slipped and such ordinances have been lifted (Reicher and Drury 2021). The current phase of the pandemic is substantially different from the initial phase, characterized by a highly transmissible but less severe form of the illness owing in part to many factors acting in different directions, including: highly transmissible but less severe later variants (Davies et al. 2021; Strasser et al. 2022; Yang et al. 2022), higher levels of partial or full immunity from vaccine or prior infection (Clarke 2022), higher levels of social contacts approaching pre-pandemic levels, low or absent rates of mask usage and physical distancing (Crane et al. 2021), and better treatments reducing the probability of death or severe illness after infection (National Institutes of Health 2022). While better treatments, milder variants, and prior immunity has resulted in a far lower case fatality ratio than the early days of the pandemic, avoiding the illness is still a persistent challenge for those who remain at elevated risk of severe illness and death due to COVID-19, including the elderly and people with chronic health conditions.
Understanding the various drivers of future SARS-CoV-2 outbreaks can help to plan for future interventional strategies, including vaccine distribution, school or work closure, and other non-pharmaceutical interventions. Infectious disease models can help to plan for these future outbreak scenarios by helping understand how SARS-CoV-2 will exist in our medium to long-term future. Here, we consider the effect of recurrent outbreaks, annual distribution of booster doses, and seasonal change in transmission of SARS-CoV-2. It is presently unknown if SARS-CoV-2 will demonstrate a strong seasonal pattern; however, it is hypothesized to follow the seasonal patterns of influzena and other coronaviruses (Kronfeld-Schor et al. 2021).
To answer these questions, we use a stochastic Agent-Based Network Model paramaterized with contact data from the Berkeley Interpersonal Contact Survey (BICS; Feehan and Mahmud 2021). Using BICS data allows us to consider how contact heterogeneity, household structure, and other network dynamics play into the periodicity and size of future outbreaks. Our model also includes seasonal forcing of transmission parameters, waning immunity from vaccines and prior infection, and variable-rate case importation to capture interaction with counter-seasonal populations (i.e., travel between the hemispheres experiencing opposite seasons).
Agent-Based Models are alternative to compartmental models and allow for more flexible and dynamic transmission dynamics, including network structure (Ajelli et al. 2010; Bansal, Read, et al. 2010). Roubenoff, Feehan, and Mahmud 2023 utilized a compartmental model for analyzing SARS-CoV-2 transmission. While these types of models are heavily utilized for analyzing disease transmission, one particular limitation of these models with contact data is their requirement for relatively few and well-defined categorizations. ABMs are more flexible and contrast with compartmental models by keeping track of the disease status for each individual in the simulation, rather than the tally of individuals in a particular compartment. In place of differential equations describing flows between compartments, ABMs use highly explicit (either deterministic or stochastic) rules governing interaction (Bonabeau 2002; Bruch and Atwell 2015; J. D. Baker et al. 2013; He, Ionides, and King 2010). Interactions are governed by some aspect of the simulated population contained within an objective function, such that interactions between nodes of certain values are more likely that others. As a group, ABMs are free of many of the analytic requirements of compartmental models—especially the need for of explicit transition properties between states, only an objective function for optimization— earning them the descriptor ‘plug and play’ (He, Ionides, and King 2010). Importantly for our purposes, ABMs allow for flexibility in how the population mixes, allowing for contact inequality between simulated agents through either a spatial or network component. Network models are a particular type of agent-based model that assume an explicit network structure for disease-transmitting contacts. In network models, instead of a homogenous or even matrix-structured contact pattern employed by compartmental models, disease transmission is simulated as occurring over a graph representing network connections (Bansal, Bryan T Grenfell, and Meyers 2007; Danon et al. 2011; Matt J Keeling and Eames 2005).
A plethora of agent-based and network simulations of the COVID-19 pandemic have been published and tools released. The flexible yet highly specific ways that ABMs can be used to model social interactions is ideal for testing network and behavioral interventions for COVID-19. These models include a wide set of techniques, including models that explicitly account for how individuals navigate geographic space (Cuevas 2020) and social networks (Hunter and J. D. Kelleher 2021). Agent-based models have used to estimate parameters for the COVID-19 outbreak in France (Hoertel et al. 2020), Ireland (Hunter and J. D. Kelleher 2021), and Colombia (Gomez et al. 2021). Models can utilize existing contact data, such as Moghadas et al. 2021 and Sah et al. 2021, who use data from the POLYMOD survey (Mossong et al. 2008) as inputs to an ABM to evaluate willingness to vaccinate. Holmdahl et al. 2020 test a series of behavioral interventions in nursing homes using a two-cohort ABM (patients and caregivers), finding that testing frequency and isolation are the most effective ways to limit the spread of disease. We draw on a number of general ABMs developed for COVID-19, including Covasim (Kerr et al. 2021), an aspatial model that combines Erdos-Reyni Poisson random networks with SynthPops networks that are generated from empirical contact data, and OpenABM (Hinch et al. 2021), that simulates social network interaction through stochastic network simulation at the household, occupational, and random connectivity additively.
Although it remains to be seen, research has suggested that SARS-CoV-2 may exhibit seasonality similarly to influenza and other coronaviruses, which exhibit higher incidence in the colder months (Nichols et al. 2021). Indeed, in the United States, the highest number of cases were observed in winters 2020-2021 and 2021-2022. The periodicity and severity of future SARS-CoV-2 outbreaks is currently unknown, largely since the rate of mutations and long-term vaccine-derived and natural immunity is unknown, but many mechansims are theorized (Kronfeld-Schor et al. 2021). Seasonal forcing of respiratory diseases involves a consideration of multiple temporal factors relevant to modeling the transmission of SARS-CoV-2 including seasonal changes in host behavior and immune function (Altizer et al. 2006; Grassly and Fraser 2006). Although modeling studies suggest that climate may mediate the timing and peak incidence of SARS-CoV-2 outbreaks, susceptible supply driven by population immunity is the primary driver of such dynamics (R. E. Baker et al. 2020). Many childhood diseases, namely measles, exhibit seasonal cycles driven by the birth rate and increased contact between non-immune children during the school year (Metcalf et al. 2009). In some situations, it can be assumed that the pathogen is circulating at a very low level locally in between outbreaks; in others, like influenza, seasonal human-animal interactions and case importation between populations in opposite hemispheres experiencing counter-cyclical temperature-forced outbreaks may drive timing (Lofgren et al. 2007; Lowen and Steel 2014).
We use a stochastic agent-based network simulation of SARS-CoV-2 transmission parameterized with data from the Berkeley Interpersonal Contact Survey (Feehan and Mahmud 2021) and include seasonality, annual vaccination, waning immunity, and demography. To our knowledge, this represents the first agent-based model for examining COVID-19 endemic outbreak cycles and seasonality. We find that outbreaks are likely to occur regularly and that annually-distributed booster doses can be an effective tool to eliminate regular outbreaks. Depending on seasonal epidemiology of the pathogen, booster doses are most effective when distributed at certain times of year; in the absence of seasonality, booster doses are most effective when distributed in the first half of the year, but in a seasonally-forced transmission scenario distributing vaccines in early fall is more successful at eliminating major annual outbreaks.
2 Methods
2.1 Data
Like Roubenoff, Feehan, and Mahmud 2023, we also utilize here contact survey data collected by Feehan and Mahmud 2021 as part of the ongoing Berkeley Interdisciplinary Contact Survey (BICS), which captures disease-transmitting behavior during the COVID-19 pandemic. The BICS survey, collected in several waves beginning in March 2020, is an online survey aimed at capturing the frequency and nature of respondents’ face-to-face contact over a 24-hour period. Respondents to the BICS survey are recruited through a quota sample using an online survey panel provider, Lucid. Respondents are asked to report the total number of close, face-to-face contacts they had over the previous 24 hours, and to elaborate on three such contacts in detail. Respondents are also asked to report information regarding their demographic information, household structure, and other questions regarding their behavior. We utilize responses from Wave 6 (n = 5418, 12 May - 25 May 2021) of the BICS national (U.S.) survey to capture post vaccination contact patterns.
2.2 Model: BICS ABM
Using Hunter, Mac Namee, and J. D. Kelleher 2017’s taxonomy for categorizing Agent Based Models, the simulation model used here is disease-specific to COVID-19 and society-specific to the behaviours captured from respondents in the BICS sample frame. Behavior is modeled on networks and is without transportation and without environment. The BICS ABM simulation population is constructed of individuals (also referred to as agents, nodes, or vertices) within households. We simulate interaction and disease spread among a population of 1000 households (approx. 3200 individuals) representative of the U.S. according to the procedure described below and in the model supplement. Each agent in the simulation directly corresponds to a respondent in the BICS or POLYMOD surveys sampled with survey weights to match the distribution of age and sex of the US population, and the agents’ demographic and behavioral data is derived from the corresponding survey respondent. The simulation includes three types of social contacts: household contacts with household members, school contacts for children below the age of 18, and non-household ‘random’ contacts. As employment data are not available for this wave of the survey, ‘random’ contacts are designed to include employment contacts for adults. Household contacts and school contacts are drawn randomly at the start of the simulation according to the procedure described below and are maintained throughout the simulation; random draws of graphs representing random non-household contacts are taken during each daytime time step. In this way, the total network is dynamic as it changes throughout the course of the simulation.
The graph of household contacts is drawn according to the procedure described in the model supplement, which is similar to the SynthPops procedure utilized in COVASIM (Kerr et al. 2021; Mistry et al. 2021). Briefly, a supplied number of households are created with the following two-step procedure. First, BICS survey respondents are repeatedly sampled with replacement and adjustment for survey weights to be heads of household. Heads of household are chosen to match the age- and sex-distribution of adults in the United States using 2021 American Community Survey estimates (US Census Bureau 2022). Then, households are filled by sampling BICS respondents (again, with replacement and adjustment for survey weights) who match the household head’s reported household members’ age and gender, until each household is the proper size. Respondents under the age of 18 were not ascertained in the BICS survey; instead, they are sampled uniformly from the POLYMOD UK survey (Mossong et al. 2008). Throughout the simulation, each node’s behavior is derived from the corresponding BICS survey respondent’s responses; nodes derived from POLYMOD respondents are derived from the corresponding fields in the POLYMOD survey1.
The model progresses in hourly steps through simulation time. During morning and evening hours (6pm-8am), agents have contact with all members of their household. During daytime hours, random connections occur between members of the simulation population governed by the degree (number of non-household contacts) supplied by each survey respondent. A daily contact network is drawn using the Network Configuration Model, described below. Each such contact is designated to begin at a random time during the day chosen uniformly between 8am and 6pm. Each contact has a randomly chosen duration sampled according to the following probabilities: respondents to wave 6 of the BICS survey indicated that 17.1% of contacts were less than one minute, 45.2% were between 1 minute and 15 minutes, 18.7% were less than one hour, and 18.9% were more than 1 hour. Here, we choose to use the marginal distribution rather than individual-level responses due to computational limitations. During the duration of each contact, respondents are disconnected from other members of their household and reconnected after the conclusion of the random contact. If a node is clinically infectious, they may enter isolation for the duration of symptoms; if asymptomatic, they continue to mix as before. Isolation is incorporated by a parameter representing the multiplicative reduction in random daily contacts: for example, an isolation parameter of 0.1 means that a node normally with 10 daily non-household contacts with would have 1 such contact while in isolation.
Random contacts are drawn using the Network Configuration Model, which generates a random graph of contacts that preserves each node’s degree—here, the number of daily non-household contacts. The network configuration model creates random networks using only a provided degree distribution as an input. The configuration model works through a two-step procedure (Albert-ászó Barabási 2021):
1. First, assign a degree to each node in the network such that the distribution of degrees matches the desired distribution. For each node, assign a number of ‘stubs’ or half-edges equal to the degree of that node.
2. Second, randomly and uniformly join stubs to create edges until there are no stubs remaining in the network.
Without alteration, the model may produce self-edges (a node connected to itself), or multi-edges (multiple edges connecting a pair of nodes). Sampling ‘simple’ graphs that lack self- or multi-edges is a computationally intensive procedure and non-uniform in its graph-generating process. In our application, we continue to sample graphs uniformly and remove self-edges, but maintain multi-edges. Realizations of this type are likely to have total degree less than would be implied by the supplied degree distribution.
2.3 Transmission of SARS-CoV-2
All agents begin susceptible and vaccine roll-out begins at the beginning of simulation time. At a given time T 0 a supplied number of index cases are chosen randomly to be exposed to SARS-CoV-2. In addition, a vector representing the number of cases imported daily is provided as input to the simulation—in the present application, one case weekly is imported to ensure that SARS-CoV-2 is constantly circulating at a low level. At exposure, each agent is assigned a randomly drawn number of hours spent as exposed and infectious; they then proceed to either symptomatic or asymptomatic infection with a supplied probability. Baseline probability of transmission—before considering vaccine efficacy, contact duration, non-pharmaceutical inter-ventions, and asymptomatic reduction in transmisison probability—from an infected node to a susceptible node occurs with probability β(t), where t is the simulation’s tth day in the year. The value of β(t) thus represents the probabiltiy of transmission during an hour-long contact between one clinically symptomatic node and one susceptible, unvaccinated node without mask usage.
Various factors multiplicatively reduce the probability of transmission. First, transmission is reduced proportional to the duration of contact in fractions of an hour; a 15-minute contact is 1/4-th as likely to result in transmission as an hour-long contact. As well, transmission from an asymptomatic node to a susceptible node occurs at a reduced probability α relative to symptomatic nodes. The susceptible node’s vaccination status reduces the probability of transmission by the corresponding vaccine efficacy; the infectious node’s vaccination status is not assumed to affect transmission probability. As detailed further below, we allow for reinfection after a set amount of time; previous infection offers protection for the recovered node as a reduction in the transmission probability. Finally, the model includes a single Non-Pharmaceutical Intervention (NPI), designed to capture the combined disease-blocking effectiveness of masks, physical distancing, and other preventative measures. If BICS respondents corresponding to both nodes in a random contact report any mask usage, the probability of transmission is proportionally reduced by a supplied effectiveness. If NPI effectiveness is set to 0, the simulation is effectively in the absence of NPI usage. We assume that NPIs are not used among household contacts. At the conclusion of the infectious period, asymptomatic nodes all recover; symptomatic nodes die with a supplied age-based probability.
Unlike compartmental models that often hinge on parameter R0—the basic reproduction ratio, representing the average number of secondary cases caused by an index case in a fully susceptible population, estimated as the product of the transmission probability, contact rate, and duration of infectiousness—no such closed form solution for R0 in an ABM necessarily exists. Although the ABM’s ability to model contact and transmission through separate processes and objective functions allows for for increased flexibility, including time-variable and stochastic transmission probability, heterogeneous contact rates and network structure, variable duration of contact, isolation of infectious cases, and uneven NPI and vaccination usage, the parameters that govern the overall transmission dynamics are not well-defined in closed form. Instead, the R0 must be estimated from the the contact rate or incidence curve generated by the simulation itself (Hunter, Namee, and J. Kelleher 2018; Venkatramanan et al. 2018; Hunter and J. D. Kelleher 2021; Hoertel et al. 2020). We estimate R0 by observing the hourly contact rates ĉhin the initial 7 days of the simulation, using the expression: which estimates the average hourly number of edges weighted by the duration and effectiveness of non-pharmaceutical interventions, if used during each contact. Then, the average contact rate is taken to be the average of the hourly contact rates ĉh. Finally, R0 is taken to be: where parameter is the average transmission probability over the course of the simulation period, is the average duration of infectiousness, N is the size of the population, and the expression ρ +(1 − ρ)α represents the reduction in infectiousness among subclinical cases. Since R0 is not necessarily known until the conclusion of the simulation, we instead determine the overall transmission rate by varying the transmission probably β, then estimating R0 post-facto.
2.4 Vaccine effectiveness, waned immunity, and reinfection
An important feature of the model is waning natural and vaccine-derived immunity. Vaccination occurs uniformly in the population assuming that all eligible members of the population are equally able to access the vaccine (differing from the prioritization procedure taken by Roubenoff, Feehan, and Mahmud 2023), at a baseline 70% uptake rate. A set number of vaccine doses are available daily, and are distributed at 8am each day according to optional priority rules. Nodes are eligible for a second dose of the vaccine after 25 days. After a supplied amount of time, immunity wanes, and nodes are eligible for booster doses; booster doses are made available at a set date every year and are distributed with the same rate and priority schedule as primary doses. Only nodes with waned immunity are eligible for booster doses. All vaccination statuses (1st dose, 2nd dose, waned, and booster) have fixed proportional reductions in transmission probability.
As well, we include reinfection in the model with a similar procedure. Nodes that have recovered from infection remain completely protected from infection for a fixed amount of time; after this, nodes are assigned disease status ‘Recovered-Waned’ indicating that they may be re-infected, yet have some protection against future infection. Waned immunity is assumed to be the same for clinical and subclinical infections, and the protection offered does not depend on the number of previous infections.
At present, the duration of immunity after infection and vaccination is not known, but is estimated to be approximately 6 months of near-complete immunity followed by a steady decrease over time (Centers for Disease Control and Prevention 2021). This is implemented in our model with a pair of parameters: first, the duration of complete immunity, governing the time after infection or vaccination that an individual experiences the full effect of vaccine-derived or natural immunity; second, the wanted immunity effectiveness. For simplicity, these parameters are held to be the same for both vaccine-derived and post-infectious immunity.
2.5 Incorporation of Seasonality
Seasonal environmental changes are known to affect infectious disease transmission in predictable, annually recurrent cycles. Although seasonality is well documented in many infectious diseases, the underlying mechanisms are frequently poorly understood or difficult to tease out from other compounding effects (Fisman 2012). For respiratory infections transmitted between humans via the airborne pathway such as SARS-CoV-2 and influenza, seasonal effects can be grouped in three broad areas: environmentally-driven changes in host behavior, such as summertime school closings or increased wintertime indoor gatherings; the pathogen’s ability to survive outside of the human host adapted to certain climatic conditions, in turn affecting fitness for transmission; and seasonal changes in the host’s immune response, possibly due to changes in temperature or sunlight exposure (Altizer et al. 2006; Grassly and Fraser 2006; Dowell 2001; Kronfeld-Schor et al. 2021; Buonomo, Chitnis, and d’Onofrio 2018; Held and Paul 2012). Additionally, seasonal migration—especially temporary domestic migration with an annual cyclical pattern—may fundamentally change the landscape of interactions and population at risk (Buckee, Tatem, and Metcalf 2017).
Incorporating seasonality into a compartmental model is typically done by adding sinusoidal temporal forcing to the transmission parameter β as β(t) = β0(1 + β1cos(2πt)), through a binary indicator in the case of seasonal school closings as β(t) = β0(1 +β1term(t)), or other time-dependent functional form (Grassly and Fraser 2006; Matt J. Keeling, Rohani, and Bryan T. Grenfell 2001). Here, the basic reproductive number R0 represents the average number of secondary cases from a single index case introduced at a random time of the year, and is defined as where D is the average duration of infection. Forcing functions can be easily extended to include age-structured contact (Bolker and B. Grenfell 1993), and time-series methods allow for modeling of outbreak dynamics without fitting a functional form to the transmission parameter (Metcalf et al. 2009; Finkenstádt and B. T. Grenfell 2000). Extending beyond compartmental models, seasonality can be incorporated into modeling of incidence data; Held and Paul (2012) demonstrate how seasonal incidence can be decomposed into an endemic and epidemic component with independent temporal structure.
The long-term seasonal patterns and drivers of COVID-19 are still unknown and not necessarily possible to disentangle from control efforts, especially noting non-pharmaceutical interventions like shelter-in-place ordinances and mask usage. Weaver et al. 2022 note in a review that COVID-19 may be more stable and more transmissible in cooler environments, consistent with influenza2, although both stability in high humidity and low humidity have been observed (Morris et al. 2021; Marr et al. 2019; Matson et al. 2020; Dabisch et al. 2021). SARS-CoV-2’s preference for colder, drier conditions is consistent with climatic effects observed with influenza (Lofgren et al. 2007; Lowen and Steel 2014; Shaman and Kohn 2009). Another review article by Mecenas et al. 2020 finds a similar conclusion. While climate may affect SARS-CoV-2’s transmissibility directly, the indirect effect of climate’s effect on human behaviors has been demonstrated to be a much stronger effect (Susswein, Rest, and Bansal 2023; Damette, Mathonnat, and Goutte 2021; Weaver et al. 2022). Indeed, research on contact patterns that relate to COVID-19 have are known to be a substantial driver of outbreak dynamics (Feehan and Mahmud 2021).
While many applications of ABMs to infectious disease focus on investigating the interaction-level, network, or transportation aspects of infectious disease, few have focused directly on seasonality. Arduin et al. 2017 incorporate seasonal forcing of pneumococcal infections linked to influenza infection using a fixed multiplication of the transmission probability during the flu season, similar to the school-term forcing of the transmission probability used in compartmental models by Matt J. Keeling, Rohani, and Bryan T. Grenfell 2001. Similarly, Williams et al. 2022 incorporate seasonal forcing to an ABM used to study influenza by adding a multiplicative effect to the transmission parameter related to the temporal distance of each time period from the winter solstice, which is ‘intended to account for factors that may influence transmissibility across a range of seasons due to variability in factors such as temperature, humidity, and changes in contact rates.’ In an application of ABMs to COVID-19, Krivorotko et al. 2022 use an time-series model to decompose incidence counts into a time series effect, a seasonal effect, and a noise component; the seasonal effect and the time-series effects are specified to have a temporally autocorrelative function. ABMs have also been used to study seasonality in non-human diseases (Dawson et al. 2018; Oraby et al. 2014).
We incorporate seasonality in two ways: in the transmission probability β(t) and in the number of nonhousehold contacts. We allow for seasonal forcing of the transmission probability to capture how the transmissibility of the pathogen may change with weather, modeled as: where β0 represents the average transmission probability and β1 represents the amplitude of seasonal forcing (X. Liu et al. 2021; Grassly and Fraser 2006). Second, we include school contacts between children under 18, which are derived from the POLYMOD survey (Mossong et al. 2008). School contacts are drawn once at the start of the simulation and maintained through simulation time. Schoolchildren are taken to have school contacts during the same business hours as random contacts between September 1st and June 1st annually; children who are clinically infectious with SARS-CoV-2 are kept home from school until they recover.
2.6 Incorporation of Demography
Optional in the model is the inclusion of basic demographic vital rates in the form of age-specific fertility and mortality data. At baseline, we draw from the CDC’s 2021 estimates (Martin, Hamilton, and Osterman 2022; Xu et al. 2022) summarized to the age categories used in the model (0-18, 18-25, then in 10 year increments through age 85). The rates used at baseline are shown in figure 12. These rates are used to randomly introduce new susceptibles into the population and randomly remove members of the population, representing deaths due to non-SARS-CoV-2 causes. Each birth represents a new, fully susceptible and unvaccinated child in the population in the household of the birthing parent; they are assumed to have the number of non-household and school contacts equal to the population average. Demography is incorporated into the model once monthly as a Bernoulli random draw for each member of the population with rate equal to the supplied rate, divided by 12 (for males, fertility rate is 0).
2.7 Simulation Procedure and Parameters
We identify a set of baseline transmission parameters in line with those used by Roubenoff, Feehan, and Mahmud 2023, adapted to fit the parametric needs of our Agent Based Model. For all simulations, the number of households is fixed at 1, 000, producing approximately 3, 200 individuals at the start of the simulation. Simulations are run for 10 years and are seeded with 5 index cases at time t = 0, intended to mimic the wintertime outbreak of early January 2021. During the course of the simulation, one case is imported weekly to ensure that all SARS-CoV-2 is constantly circulating at a low level. To account for a wide range of transmission scenarios, we consider three levels of transmissibility: low, with β0 = 0.01 (implying R0 ≈ 1.3); moderate, with β0 = 0.025 (R0 ≈ 3.4); and high, with β0 = 0.05 (R0 ≈ 6.5). Baseline simulations are without seasonal forcing of transmission or contact but with school contacts included during school-year weekdays (Monday-Friday, 9am-3pm). Seasonal forcing of both transmission and contact parameters is introduced with low (10%), moderate (25%), or extreme (50%) seasonal amplitude as described above. To introduce stochasticity into the model, each infected case in the simulation contains a randomly drawn duration of latent period of between 48 and 96 hours after transmission; this is followed by a randomly drawn infectious period of between 72 and 120 hours. These distributions are held constant across all simulations. We assume that each case has a 20% chance of being subclinical—fewer than used by Roubenoff, Feehan, and Mahmud 2023 (derived from Johansson et al. 2021), but in line with recent meta-review estimates by Buitrago-Garcia et al. 2022. That same analysis identified seven studies comparing the secondary attack rate of asymptomatic cases and symptomatic cases with an average ratio of 32%. At baseline we assume that symptomatic cases have all of their normal random contacts and do not isolate, for a ‘business as usual’ scenario; we test the effect of isolation in sensitivity analysis. However, children are assumed to always be kept at home when ill. NPI effectiveness is set to zero, equivalent to the absence of NPIs or masks, but is varied during sensitivity analysis.
Vaccine effectiveness is taken to be 80% after one, dose, 90% after two doses, and 80% after three doses, consistent with estimates published in 2021 and 2022 (Tenforde 2021,Thompson 2021; Thompson 2022) and values used by Roubenoff, Feehan, and Mahmud 2023. Unclear at the present moment is the duration of immunity after infection and vaccination and the effectiveness of waned immunity; a 2021 CDC brief (updated 2022) estimates 6 months of nearly complete immunity that diminishes over time (Centers for Disease Control and Prevention 2021). We take 6 months to be the baseline assumption but vary this duration in monthly increments from 6 months to two years in a sensitivity test. We assume a pessimistic 25% efficacy of waned natural- and vaccine-derived immunity.
All main simulations are run for 10 years in replicates of 10, and we report a number of summary values for all simulations. These include the total number of clinical and subclinical cases, deaths due to SARS-CoV-2, and the timing and size of all outbreaks after year 5. All estimates are standardized to the population size to account for populations that vary randomly in size. Outbreaks are found using the Python library scipy’s find peaks function on the daily sequence of clinical cases, for a minimum incidence threshold of 5% of the population infected over a 30-day window.
3 Results
To elucidate future outbreaks of SARS-CoV-2, we simulate outbreaks at various levels of transmissibility, and test the distribution of annual booster doses in the absence of and presence of seasonality. We find that the optimal date of booster dose distribution for reducing the number of clinical infections is different for the simulations with and without seasonality; in the absence of seasonality booster doses in the first half of the year are most effective at eliminating a large annual outbreak, but with seasonality booster doses are most effective when distributed in early fall.
Infectiousness of COVID-19 and the duration of immunity after infection and vaccination have a strong effect on the dynamics of outbreaks. In a moderate transmission scenario, where the base probability of transmission for an hour-long contact the absence of NPIs or vaccination β0 = 0.025 (corresponding to an R0 of approximately 3.2), an average of 6.16 clinical infections occur per capita over a 10-year period. Over this period, an average of 33.6 outbreaks occur, each infecting an average of 17.7% of the population.
However, when β0 is raised to 0.05 (corresponding R0 ≈ 6.5), outbreaks are fewer (an average of 16.0 over the 10-year period) but more severe, with an average outbreak size of 59.63% and about 9.75 infections per capita—nearly one per person per year. These simulations are summarized in figure 3 and trajectories are shown in figure 4. We observe this dynamic throughout many simulations: when SARS-CoV-2 is more transmissible or less mitigated, outbreaks are fewer but more severe. Mortality in the high-transmission scenario is higher, proportional to the number of clinical infections—2.63% on average compared to 1.67% in the moderate transmission scenario and 1.1% in the low-transmission scenario. These simulations suggest that the public health planning and response for future variants may differ based on their epidemiology. More transmissible variants are likely to have fewer, larger outbreaks that may overwhelm the healthcare system capacity, but show few cases outside of the season; less transmissible variants may require a year-round response, but with less severe outbreaks.
Distributing annual booster shots consistently lowers the rates of SARS-CoV-2 for all simulations, but may independently induce a seasonal pattern in outbreaks. The ability of booster doses to successfully limit SARS-CoV-2 outbreaks is dependent on the timing of their distribution and whether or not seasonal forcing of transmission is included in the model. In the absence of seasonal forcing of transmission and contact (figures 5 and 6), distributing booster doses earlier in the year is most effective at reducing the size of outbreaks: assuming a 75% update of vaccinations distributed annually between January and May, about 1.68 − 2.35 clinical infections occur per capita (average outbreak size ranging from 4.25% to 6.13% of the population); when distributed after July 1st, this rises to 4.34 − 4.76 clinical infections per capita (average outbreak size ranging from 13.21% to 15.61%). In a high-vaccine uptake scenario (90% uptake, shown in the model supplement), the overall rates of SARS-CoV-2 are lower only when vaccines are distributed in the first half of the year; when vaccines distributed in the latter half of the year, the total cases and severity of outbreaks is comparable to the regular-uptake scenario. Outbreaks are observed to generally occur around 6 months—which is also the duration of full immunity after vaccination—after the date that booster doses are made available, with the strongest seasonal patterns observed with June-September distribution inducing a strong wintertime and October - December inducing an early spring outbreak. Indeed, as observed in figure 6, the peak out the outbreak appears to be ‘shifted’ approximately 6 months after the date of booster dose distribution, the duration of of full immunity after vaccination. This phenomenon—whereby January-May boosters nearly eliminate annual outbreaks in the steady state but later-year boosters fail to do so, albeit shift the outbreak timing—is likely driven by the initial outbreak (set to occur in January of 2021 to capture the large winter outbreak in the United States), which establishes the clock for a sufficient number of individuals with waned immunity to appear for an outbreak to occur predictably after. These simulations indicate that, in the absence of seasonality, the timing of booster dose distribution may have the power to govern the timing of the primary annual outbreak.
We also tested the distribution of vaccines in the presence of seasonal forcing of the transmission parameters (β1 = 0.5), shown in figure 7 and 8. Unlike when distributing booster doses in the absence of seasonal forcing, in which the timing of booster dose distribution shifts the timing of the main outbreak, in these simulations with seasonal forcing a substantial wintertime outbreak occurs at nearly the same time every year. However, the size of this outbreak, as well as the presence of secondary outbreaks throughout the year, depends on the timing of booster doses. When boosters are distributed in the first half of the year— January 1st through May 1st—a moderate-sized fall outbreak occurs. When boosters are distributed by July 1st, this fall outbreak nearly doubles in size; a September 1st distribution day results in a less predictable situation, with multiple (2-3) smaller outbreaks throughout the year. However, distributing doses too late (November 1st) results in a large summertime outbreak, despite the relatively lowered transmission rate during the summer months. These dynamics are similar when seasonal forcing is present in the transmission parameters and the contact rate, shown in the supplementary material. Overall, simulations where vaccines are distributed between January and June resulted in 2.76 − 3.36 infections per capita—1-1.5 fewer than when vaccines are distributed in the highest months of July or December (4.82, 5.19 respectively). However, distributing booster doses in September or October results in fewer infections, comparable with simulations when vaccines distributed earlier in the year, and smaller outbreaks (8.69%−9.59% of the population infected on average per outbreak).
As a sensitivity test, we varied the duration of immunity after infection and vaccination, and find that this parameter has a substantial effect on the timing and size of outbreaks. As this parameter governs the rate and which susceptibles are effectively re-introduced into the population, our results are onsistent with M. G. Baker, Peckham, and Seixas 2020. Recurrent outbreaks of SARS-CoV-2 are driven by the seasonality included in the model but also by the effect of waning natural and vaccine-derived immunity, such that even models without seasonal forcing and booster dose distribution may exhibit predictable outbreaks (figure 9). Rather than these outbreaks occurring at specific times throughout the year, they occur a certain amount of time after the previous outbreak—generally equal to the duration of complete immunity. At 6 month immunity, generally two outbreaks occur per year, spaced slightly more than 6 months apart, with periodic secondary outbreaks between; at one year of full immunity, large outbreaks occur slightly more than one year apart, without secondary outbreaks. These simulations indicate that preparations for outbreaks should include evaluation of the previous major outbreak.
We also explored the possibility of isolation as a means to control SARS-CoV-2, despite it being unlikely to be used as a general control strategy in the future. Isolation remains an effective way to limit the spread of SARS-CoV-2 in the steady state, however only the higher degrees of isolation— reducing non-household contacts among clinically infectious individuals to less than 50% of their normal amounts as compared to a ‘business as usual’ scenario— has a substantial effect on transmission dynamics (figure 10). At 50% isolation, an average of 2.72 clinical infections occur per capita, down from 4.84 when clinically infectious nodes have 75% of their normal random contacts and 6.22 in the complete absence of isolation. With 50% isolation, the outbreak size drops dramatically to 6.05% of the population infected during an average of 40.8 outbreaks, down from from 18.7% of population infected in an average of 32.2 outbreaks in the absence of isolation. More extreme isolation reduces the severity of outbreaks even further: at 25% of normal contacts, clinical infections average one per capita; with perfect isolation, clinical infections are less than 0.5 per capita.
4 Discussion
Across all simulations, we observed frequent and predictable SARS-CoV-2 outbreaks over a 10-year period, even with the annual distribution of booster doses as the primary disease-averting intervention. Depending on the epidemiology of the pathogen—namely, should SARS-CoV-2 exhibit seasonality in transmission probability and contact—outbreaks may occur at predictable times of the year, and distribution of booster doses may be able to mitigate the worst of seasonal outbreaks. Our results are consistent with R. E. Baker et al. 2020, who find that outbreak cycles are primarily determined by the levels of susceptibility in the population, although seasonality is an important moderator in outbreak dynamics. Different vaccination campaigns may be needed in areas that exhibit stronger transmission seasonality. We find that distributing booster doses in the first half of the year—January through May—may be an effective strategy at limiting recurrent outbreaks depending on the seasonality exhibited by the pathogen. We find that in simulations without seasonal forcing, distributing booster doses in the first half of the year is most effective at limiting outbreaks; however, with the inclusion of seasonal forcing of transmission and contact, distributing booster doses in September or October may limit the average outbreak size the most. In these simulations, distributing booster doses in the late fall ‘shifts’ the outbreak to the summer, when transmissibility is lower. Since influenza vaccines are distributed in the fall, including booster doses for SARS-CoV-2 at the same time may be easiest in implementation, but less successful than Springtime distribution in limiting outbreaks should
SARS-CoV-2 fail to exhibit transmission seasonality
In addition to illustrating how vaccination interventions can avert the burden of illness and death due to SARS-CoV-2, our simulations further understanding of how to prepare for future SARS-CoV-2 outbreaks. Although distributing booster doses in early fall—like annual vaccinations for influenza—avoids a large wintertime outbreak in a seasonally forced environment, multiple smaller outbreaks may occur throughout the year. This may be preferable to avoid exceeding the treatment capacity of the health system. However, this strategy may not be effective in a less seasonally-variable climate, where booster dose timing merely shifts the main outbreak to later in the year after immunity wanes. Overall, we generally observe variance in mortality proportional to the number of clinical infections that occur in the simulations, indicating both that the most effective strategies for limiting clinical infections also limit deaths. Although we focus on the number of infections as the primary outcome, reducing the number of deaths due to SARS-CoV-2 may be possible with the interventional strategies outlined here.
While agent-based models have been used in a variety of applications for COVID-19, ours represents one of the first to examine long-term dynamics using real population contact data. Our results hint at a rather bleak outlook for the future of SARS-CoV-2: that outbreaks are extremely likely to continue given that natural and vaccine-derived immunity wanes over time. However, annual booster doses—especially when timed properly—and isolation of infectious cases may be effective control strategies. The investigation above tells us that outbreaks can be expected to occur more frequently than annually, and the epidemiology of the pathogen—namely, the base transmission probability and the duration of immunity after infection and vaccination— determines the frequency and severity of outbreaks more than vaccination timing and within a reasonable range of effectiveness. As our approach to SARS-CoV-2 shifts from eradication to management, the strategies presented here show how we can time the distribution of doses to minimize strain on the healthcare system and limit chronic complications from SARS-CoV-2 infections.
Our model has a number of shortcomings that limit its generalizability. First, we are limited by computing resources to a population of 1000 households (approximately 3, 200 individuals). During development, we found that using too few households resulted in simulations that were highly unstable; with a larger population, simulation results were more consistent between runs but took much longer to complete. As a result, we chose the number of households to balance numerical stability while maintaining a reasonable run time of approximately 4 minutes per simulation. Larger populations may exhibit different dynamics as the spread of infection may take longer; as such, it is not known presently if this chosen population size is representative of a larger population or is limited in generalizability to smaller communities. As well, like Roubenoff, Feehan, and Mahmud (2023), we borrow contact patterns for the youngest age group from the POLYMOD UK survey (Mossong et al. 2008), which may not be representative of that age group in the US during the COVID-19 pandemic. However, since our application here is in consideration of the long-term patterns of SARS-CoV-2,’ we believe that these contacts represent a return to ‘business as usual’ for school children.
A key feature of our model is in the random network generation, that produces daily draws of random contacts according to the network configuration model parameterized with degree and duration from the BICS survey. Like any random network generation, the network structure of contacts produced by this model may affect transmission dynamics. Future work should draw inspiration from the COVASIM agent based model (Kerr et al. 2021), which allows for comparison of outbreak dynamics between simulations with Poisson random networks, Hybrid networks, or SynthPops networks depending on the needs of the simulation and data available. As well, we do not include any assortative mixing as emphasized by the model used by Roubenoff, Feehan, and Mahmud 2023 or serial (repeat) contacts, instead choosing a network generation function that maintains the degree of each node. Employment contacts, like school contacts for children under the age of 18, provide a consistent set of individuals having contact most days; inclusion of these such contacts may affect outbreak in unpredictable ways. Future development of the model should include associativity by age and employment status; however, doing so was outside of the scope of the present analysis.
Implementation Overview
The BICS ABM model is implemented in the C++ language and with use of the iGraph-C library (G. Cśardi and Nepusz, T. 2006; Gábor Cśardi et al. 2023), and a Python 3.8 API. The core C++ implementation was chosen over other languages, like Python or R, to maximize speed. Full implementation details are given in the supplementary material. Each simulation of 1000 households for 10 years completes in approximately 4 minutes on an Apple MacBook Air M1, and the entire suite of simulations completes in approximately 8 hours. Replication code is publicly available at https://github.com/eroubenoff/BICS_ABM.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
5 Model Supplement
The model described here is a stochastic Agent-Based Network Simulation for COVID-19 transmission that utilizes data collected as part of the Berkeley Interpersonal Contact Survey to simulation population structure and contact networks. The core algorithm is written in C++, compiled using Apple Clang++17, and utilizes CMake 3.16 and utilizes the igraph 10.4 library (G. Cśardi and Nepusz, T. 2006; Gábor Cśardi et al. 2023), and includes a Python 3.8 user API. The BICS ABM C++ library is compiled to a dynamic library for linkage to the Python API and can be used with an external C++ program.
The BICS ABM C++ library hinges on two input data structures and their Python equivalents: Params, a C-struct that contains the input parameters, documented in table 1; and an array of the simulation population. The input population requires a strict format: each row is an individual and each column represents individual-level data passed to the simulation, flattened to one dimensional array of floats in column-major (Fortran-style) order. Currently, 8 fields are required for each simulated node: the node’s household id; age, represented as a categorical index 0-8; gender, where 0 corresponds to male and 1 corresponds to female; number of non-household contacts; number of school contacts, which is taken to be zero for adults; number of times left home (unused presently, but maintained for legacy purposes); vaccine priority (see 5.1.5); and a boolean indicating if the node uses NPIs or not. Parameters must be supplied indicating the dimensionality of the dataset. The simulation returns a trajectory, a C-struct containing an hourly time-series of all disease states.
The Python API contains a class BICS ABM, which is a wrapper around all of the above utilizing the ctypes library for cross-language functionality. Parameters can be passed to the model through the Python API identically as to the C++ library and we recommend interacting with the simulation through the Python API. The class constructor for BICS ABM takes any of the arguments to Params, runs the simulation, and saves as fields the components of the resultant trajectory as a numpy.ndarray; as such, the trajectory of clinical cases can be accessed as BICS ABM.Cc. The population can be accessed through BICS ABM. pop.
5.1 Model Pseudocode
Psuedocode for the model is given below:
1. In Python API: Establish the following parameters governing the population structure: number of households nhh, survey wave, and vaccine priority. Establish all other transmission parameters and store them in an object params.
2. Generate nhhhouseholds from the corresponding survey wave using the procedure described below. Assign each household a unique household identifier.
(a) First, the distribution of adult ages and sex is derived from ACS data, and a set of N HH households are sampled from this distribution.
(b) A ‘head’ of household is randomly chosen from the BICS survey data. A respondent is eligible to be head of household hi if they are the corresponding age and sex for the sampled household head. Eligible household heads are chosen with replacement and with probability adjusted for survey weights.
(c) Finally, households are filled by sampling (with replacement and adjustment for survey weights) from the set of BICS respondents who mach each of hi’s reported household members’ age, gender, and household size. Children under 18 are not ascertained in the survey; children are instead sampled from the POLYMOD survey. The max size of a household is 6 as as respondents were only asked to report 6 of their household members.
3. Assign vaccine priority to all nodes in the network based off of the rules provided as input, as elaborated in section 5.1.5.
4. Create and pass the params and population object to the C++ core algorithm.
5. In C++ core: Determine the household contact network assuming that all nodes have contact with all members of their household. Randomly draw a school contact network for children under the age of 18.
6. Repeat the following procedure representing one ‘day’ of simulation time, where each day contains 24 ‘hours’ of simulation time, until either no nodes are exposed or infectious OR the simulation has occurred for a supplied maximum number of days.
(a) If hour == 0 (midnight) and an index case is supplied to appear on the current simulation day, transition one node at random into ‘Exposed’ status.
(b) Each hour between midnight and 8am: assume that all nodes have contact with all members of their household. Execute transmission and decrement procedures for all nodes.
(c) 8am: Distribute vaccine doses to n vax nodes awaiting any dose of the vaccine.
(d) 8am: generate a random graph of daily contacts using the procedure outlined in section 5.1.3; assign a random duration and start time for all contacts.
(e) Each hour between 8am and 6pm: connect all random contacts; disconnect each node having a random contact from their household nodes; transmit; and decrement. Reconnect nodes after termination of random contact.
(f) Each hour between 6pm and midnight: transmit within households assuming that all nodes have contact with all members of their household. Execute transmit and decrement procedures.
7. At the conclusion of the simulation, return trajectories of each disease and vaccination status.
5.1.1 Update Handlers
The C++ program contains a centralized method for handling and dispatching changes to nodes, edges, and the graph itself. Classes exist for five types of changes can be executed: UpdateGraphAttribute, CreateEdge, DeleteEdge, UpdateEdgeAttribute, and UpdateVertexAttribute. Vertices cannot be created or destroyed using the update handler. All updates are stored in wrapper class UpdateList, which contains an overloaded method UpdateList.add update() to add an update of each type. Updates are then dispatched with the method UpdateList.add updates to graph(igraph t*), which takes a reference to the graph object as an argument to perform the updates.
The centralized update handlers were developed to streamline the attribute interface. Although the igraph API contains methods for updating individual node or edge attributes, it is more computationally efficient to pull all of the attributes, make all changes, then push them back to the graph. Since the igraph API involves many dynamically-allocated objects, this meant keeping track of many pointers, being sure to free all used vectors of attributes. With a project of this size, adding a centralized way of dispatching updates helped with debugging many issues with memory management, at the cost of a slight function overhead. As well, the object-oriented interface allows for saving of all, say, household edges in a single object to be connected and disconnected as needed.
5.1.2 Decrement procedure
The decrement procedure is among the most important function in the C++ core for tracking the progression of nodes through simulation time. Each ‘event’ that can occur to a node (infection, development of clinical or subclinical infectiousness, recovery, waning immunity, becoming vaccinated, etc) is accompanied by a duration of the event. The decrement procedure decrements the time remaining at each status, and for some events (for example, recovery from infectiousness) will automatically trigger a status change; for other events (like eligibility for vaccination), eligible nodes are placed in a queue for the next event to occur.
5.1.3 Random contacts
Random contacts are drawn for the daytime hours, 8am-6pm, using the network configuration model (as described in the main text; Albert-ászó Barabási 2021). First, the number of daily-nonhousehold contacts is supplied for each simulation node; this number is first multiplied by the isolation multiplier if the node is clinically infectious and contact multiplier if included in the model, then taken to be a random draw from a Poisson distribution with rate parameter equal to the product of all three terms. This is done to allow for minor stochasticity in the model. Each node is assigned a number of stubs equal to this Poisson random draw; if the total number of stubs subs to an odd number, one stub is randomly deleted until the sum is an even number. Should this sum be zero stubs then the procedure aborts. The configuration model is drawn using the igraph degree sequence game from the igraph library; the graph is then simplified to remove self-edges but not multi-edges.
5.1.4 Non-Pharmaceutical Interventions
We include a single generic Non-Pharmaceutical Intervention (NPI) intended to capture the combined transmission-preventing power of mask usage, gloves, and physical distancing. In wave 6, about 60% BICS respondents reported the usage of any of these possible NPIs in any of their reported contacts; any simulation nodes representing these respondents are given an NPI status of True. During the daytime simulation
5.1.5 Vaccine Distribution
Nodes in the population are assigned a discrete priority level for vaccination before the simulation begins. A set number of vaccines are distributed daily among nodes with the highest priority level until no nodes in that priority level remain; remaining doses are distributed among nodes with the next highest priority level, repeating until the day’s number of vaccines are exhausted. A priority level of −1 indicates that the node declines or is ineligible for vaccination. Nodes are eligible to receive the second dose of the vaccine 25 days after they received first dose. After a period of time, vaccine efficacy is assumed to wane; at this point, nodes are eligible to receive an additional booster dose.
5.1.6 Demographic Rates
Fertility and all-cause mortality are incorporated in the simulation according to published age-specific rates, aggregated for each age group in the simulation:
5.2 Supplementary Figures
5.2.1 Booster Dose Distribution Day, High Uptake
Footnotes
↵1 Some fields collected as part of the BICS survey are not available in the POLYMOD data and are imputed. For the present application, only mask/NPI usage is unavailable, and is taken to be the False.
↵2 See Weaver et al’s review of the following studies, among others: Chin et al. 2020; J. Liu et al. 2020; Ma et al. 2021; Matson et al. 2020; Morris et al. 2021; Nottmeyer and Sera 2021; Raiteux et al. 2021; Riddell et al. 2020; Sera et al. 2021; Smith et al. 2021; Wu et al. 2020