Bayesian Inference of State-Level COVID-19 Basic Reproduction Numbers across the United States

Abhishek Mallela; Jacob Neumann; Ely F. Miller; Ye Chen; Richard G. Posner; Yen Ting Lin; William S. Hlavacek

doi:10.1101/2021.09.27.21264188

Abstract

Although many persons in the United States have acquired immunity to COVID-19, either through vaccination or infection with SARS-CoV-2, COVID-19 will pose an ongoing threat to non-immune persons so long as disease transmission continues. We can estimate when sustained disease transmission will end in a population by calculating the population-specific basic reproduction number ℛ₀, the expected number of secondary cases generated by an infected person in the absence of any interventions. The value of ℛ₀ relates to a herd immunity threshold (HIT), which is given by 1 − 1/ℛ₀. When the immune fraction of a population exceeds this threshold, sustained disease transmission becomes exponentially unlikely (barring mutations allowing SARS-CoV-2 to escape immunity). Here, we report state-level ℛ₀ estimates obtained using Bayesian inference. Maximum a posteriori estimates range from 7.1 for New Jersey to 2.3 for Wyoming, indicating that disease transmission varies considerably across states and that reaching herd immunity will be more difficult in some states than others. ℛ₀ estimates were obtained from compartmental models via the next-generation matrix approach after each model was parameterized using regional daily confirmed case reports of COVID-19 from 21-January-2020 to 21-June-2020. Our ℛ₀ estimates characterize infectiousness of ancestral strains, but they can be used to determine HITs for a distinct, currently dominant circulating strain, such as SARS-CoV-2 variant Delta (lineage B.1.617.2), if the relative infectiousness of the strain can be ascertained. On the basis of Delta-adjusted HITs, vaccination data, and seroprevalence survey data, we find that no state has achieved herd immunity as of 20-September-2021.

Significance Statement

COVID-19 will continue to threaten non-immune persons in the presence of ongoing disease transmission. We can estimate when sustained disease transmission will end by calculating the population-specific basic reproduction number ℛ₀, which relates to a herd immunity threshold (HIT), given by 1 − 1/ℛ₀. When the immune fraction of a population exceeds this threshold, sustained disease transmission becomes exponentially unlikely. Here, we report state-level ℛ₀ estimates indicating that disease transmission varies considerably across states. Our ℛ₀ estimates can also be used to determine HITs for the Delta variant of SARS-CoV-2. On the basis of Delta-adjusted HITs, vaccination data, and serological survey results, we find that no state has yet achieved herd immunity.

Introduction

Vaccines to protect against coronavirus disease 2019 (COVID-19) became available in the United States (US) in December 2020 (1). As of September 20, 2021, 181,728,072 persons have been fully vaccinated, an additional 30,307,256 persons have been partially vaccinated, and an uncertain number of persons have acquired immunity through infection (2). The entire US population does not need to be vaccinated to end sustained COVID-19 transmission because of the phenomenon of herd immunity (3), which is reached when a critical fraction of the population becomes immune. This fraction is called the herd immunity threshold (HIT).

The HIT for a population relates to the basic reproduction number, ℛ₀, as follows (3): HIT = 1 − 1/ℛ₀. ℛ₀ is defined as the expected number of secondary infections arising from a primary case in the absence of any immunity or intervention. As is well known, ℛ₀ and HIT are population-specific (4-5), which means that the effort required to control the local COVID-19 epidemic may vary from community to community. However, knowledge of the HIT for a given region is insufficient to determine when disease transmission within the region will end. One also needs to know the fraction of the population that has immunity. Estimating the immune fraction is difficult, because we cannot simply count the number of persons who have been vaccinated or the number of persons detected to be infected. Immunity is acquired not only through vaccination but also through infection (6), and case detection is imperfect. Insight into the immune fraction can be obtained from seroprevalence surveys, which use blood tests to identify persons who have antibodies against the SARS-CoV-2 virus (acquired through vaccination or infection).

Various estimates of ℛ₀ for transmission of COVID-19 have been provided in the literature (7). The estimates that have received the most attention are those given for China and Italy (8-12), which were among the first regions to be impacted by COVID-19. However, the relevance of these estimates for populations within the US (or elsewhere outside of China and Italy) is unclear. Several studies have estimated ℛ₀ for the US at the national level (13-15), the state level (16-18), and the county level (19-20). The usefulness of a national estimate is unclear given the heterogeneity of the US, and none of the county-level estimates are comprehensive. Some state-level estimates are also incomplete (16, 18). Because responses to COVID-19 within the US have been and continue to be driven mainly by governors of US states (21), we undertook a study to generate comprehensive state-level ℛ₀ estimates through Bayesian inference. With this approach, we were able to quantify uncertainty in each estimate through a parameter posterior distribution.

In earlier work, we developed a compartmental model for COVID-19 transmission dynamics that reproduces surveillance data and generates accurate forecasts for the 15 most populous metropolitan statistical areas (MSAs) in the US (22). Here, for each of the 50 states, we found a state-specific parameter posterior conditioned on this model from state-level COVID-19 surveillance data available from January 21 to June 21, 2020 (23). From these parameter posteriors, we then obtained region-specific ℛ₀ and HIT posteriors and maximum a posteriori (MAP) estimates. The MAP estimates for HITs together with other data—vaccination tracking data (24), serological survey data (25-26), and quantitative estimates of the increased transmissibility of the recently introduced SARS-CoV-2 variant Delta (lineage B.1.617.2) (27-28)—provide insight into the progress of each state toward herd immunity.

Materials and Methods

Model

To obtain regional ℛ₀ and HIT estimates, we used a compartmental model developed previously (22). We found region-specific parameterizations that allow the model to reproduce surveillance data (daily reports of new confirmed COVID-19 cases) available for each region of interest over a defined period (e.g., January 21 to June 21, 2020). The model is able to account for a variable number of social-distancing periods. We considered versions of the model accounting for one, two, and three social-distancing periods. The number of social-distancing periods deemed best (i.e., to provide the most parsimonious explanation of the data) for a given time period was determined using the model selection procedure described by Lin et al. (22). As in the study of Lin et al. (22), the model has 14 parameters with universal fixed values (applicable to all regions). The model also has 3(n + 1) + 3 parameters with region-specific adjustable values determined through Bayesian inference, where n + 1 denotes the number of social-distancing periods. In this study, for a given region, we censored case-reporting data whenever the cumulative reported case count was less than 10 cases. We also specified the onset time of the first social-distancing period as the earliest day on which the cumulative reported case count was 200 cases or more. A full description of model parameters is given in Lin et al. (22).
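The parameter count and the two data-handling rules described above can be sketched in a few lines of Python. This is a minimal illustration and not the code of Lin et al. (22); the function names and the daily case counts are hypothetical.

```python
from itertools import accumulate

def n_adjustable_parameters(n):
    """Region-specific adjustable parameters for n + 1 social-distancing periods."""
    return 3 * (n + 1) + 3

def first_day_reaching(daily_cases, threshold):
    """Index of the earliest day on which the cumulative count is >= threshold."""
    for day, total in enumerate(accumulate(daily_cases)):
        if total >= threshold:
            return day
    return None  # threshold never reached

# Hypothetical daily case reports (illustrative values only).
daily = [0, 1, 2, 4, 5, 20, 60, 150, 300]

first_kept_day = first_day_reaching(daily, 10)        # data before this day are censored
distancing_onset_day = first_day_reaching(daily, 200)  # onset of first distancing period
fitted_data = daily[first_kept_day:]

print(n_adjustable_parameters(0), n_adjustable_parameters(1), n_adjustable_parameters(2))
print(first_kept_day, distancing_onset_day)
```

With n = 0, 1, and 2, the count is 6, 9, and 12 adjustable parameters, respectively.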

Simulations

Each region-specific model consists of a coupled system of ordinary differential equations (ODEs), which are given by Lin et al. (22). The ODEs were numerically integrated using the SciPy (29) interface to LSODA (30) and the BioNetGen (31) interface to CVODE (32). Python code was converted to machine code using Numba (33). The initial conditions were determined as in Lin et al. (22).

Calculation of epidemic parameters ℛ₀ and λ

To find the basic reproduction number ℛ₀, we considered a reduced form of the model of Lin et al. (22), which is given in Eqs. 1-8 of the Supplementary Information (SI). The reduced model omits consideration of interventions, including social distancing, quarantine, and self-isolation, which are all considered in the full model. From the reduced model, we derived an expression for ℛ₀ by applying the next-generation matrix method (34). In this procedure, ℛ₀ is determined as the spectral radius of the so-called next-generation matrix. Denoting this matrix as 𝒩, the (i, j) entry of 𝒩 is the expected number of new infections in the i-th compartment produced by persons initially in the j-th compartment. The expression for ℛ₀ given in the Results section below was obtained using Mathematica (35). The matrix 𝒩 was obtained using Mathematica’s LinearSolve function and ℛ₀ was computed as the dominant eigenvalue of 𝒩.
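To make the next-generation matrix procedure concrete, the sketch below applies it to a toy SEIR model with one exposed and one infectious compartment. The toy model, its parameter values, and the helper functions are assumptions chosen for illustration; they are not the reduced model of the SI, and the study itself used Mathematica rather than Python. Here F holds the rates of new infections, V holds the transition rates, and the next-generation matrix is F V⁻¹.

```python
def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def spectral_radius(m, iterations=200):
    """Power iteration with max-norm scaling; adequate for this non-negative toy matrix."""
    v = [1.0] * len(m)
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]
        estimate = max(abs(x) for x in w)
        if estimate == 0.0:
            return 0.0
        v = [x / estimate for x in w]
    return estimate

# Toy SEIR parameters (hypothetical): transmission, progression, recovery rates.
beta, sigma, gamma = 0.5, 0.2, 0.25

F = [[0.0, beta], [0.0, 0.0]]        # new infections enter E, caused by I
V = [[sigma, 0.0], [-sigma, gamma]]  # E -> I at rate sigma, I removed at rate gamma

ngm = mat_mul(F, inverse_2x2(V))
r0 = spectral_radius(ngm)
print(r0)  # for this toy model the closed form is beta / gamma
```

For the toy model the spectral radius reduces to β/γ, which gives a simple check on the numerical result.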

To characterize the initial rate of exponential growth for a local epidemic within a given region, we computed the epidemic growth rate λ as the dominant eigenvalue of the Jacobian of the reduced model linearized at the disease-free equilibrium (36). The derivation of λ is provided in the SI.
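As an illustration of this calculation, consider a toy SEIR model (an assumption made here for brevity, not the reduced model of the SI). Its Jacobian at the disease-free equilibrium, restricted to the exposed and infectious compartments, is the 2×2 matrix [[−σ, β], [σ, −γ]], so the dominant eigenvalue has a closed form. The function name and parameter values are hypothetical.

```python
import math

def seir_growth_rate(beta, sigma, gamma):
    """Dominant eigenvalue of [[-sigma, beta], [sigma, -gamma]] (toy SEIR at the DFE)."""
    trace = -(sigma + gamma)
    det = sigma * gamma - beta * sigma
    # The discriminant equals (sigma - gamma)**2 + 4*beta*sigma, so it is non-negative.
    return (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0

growing = seir_growth_rate(beta=0.5, sigma=0.2, gamma=0.25)    # R0 = beta/gamma = 2
critical = seir_growth_rate(beta=0.25, sigma=0.2, gamma=0.25)  # R0 = 1
decaying = seir_growth_rate(beta=0.1, sigma=0.2, gamma=0.25)   # R0 = 0.4
print(growing, critical, decaying)
```

In this toy setting the growth rate is positive exactly when β/γ exceeds 1, which mirrors the general relationship between λ and ℛ₀.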

Bayesian inference

To infer region-specific values of adjustable model parameters (and ℛ₀ and HIT estimates), we followed the Bayesian inference approach of Lin et al. (22). In inferences, we used all region-relevant confirmed COVID-19 case-count data available in the GitHub repository maintained by The New York Times newspaper (23) for the period starting on 21-January-2020 and ending on 21-May-2020, 21-June-2020, or 21-July-2020 (inclusive dates). Markov chain Monte Carlo (MCMC) sampling was performed using the Python code of Lin et al. (22) and a new release of PyBioNetFit (37), version 1.1.9, which includes an implementation of the adaptive MCMC method used in the study of Lin et al. (22). Inference job setup files for PyBioNetFit, including data files, are provided for each of the 50 states online (https://github.com/lanl/PyBNF/tree/master/examples/Mallela2021States). Results from the Python code of Lin et al. (22) and from PyBioNetFit were found to be consistent (SI Fig S1). To ensure that MCMC sampling procedures converged, we visually inspected trace plots for log-likelihood (SI Fig S2) and parameters (SI Fig S3) and pairs plots (SI Fig S4). We also performed simulations using maximum likelihood estimates (MLEs) for parameter values to assess agreement of the simulations with the training data (SI Fig S5).

The maximum a posteriori (MAP) estimate of a parameter is the value of the parameter corresponding to the peak of its marginal posterior distribution, where probability density is highest. Because we assumed a proper uniform prior distribution for each of the adjustable parameters, as in the study of Lin et al. (22), the MAP estimates are MLEs.

Results

Bayesian uncertainty quantification

Following the Bayesian inference approach of Lin et al. (22), we quantified uncertainty in predicted trajectories of confirmed case counts for all 50 states, using data from January 21 to June 21, 2020. As illustrated in Fig 1 for the states of New Jersey, Wyoming, Florida, and Alaska, we find that each region-specific model parameterized on the basis of our MCMC sampling procedure reproduces the corresponding surveillance data over the period of interest. Results for the remaining states are shown in SI Fig S5. At the end of each MCMC sampling procedure, we obtained a marginal posterior distribution for β (the rate constant in the model for disease transmission) which provides a probabilistic characterization of region-specific SARS-CoV-2 transmissibility. If the marginal posterior is narrow, we have high confidence in the MAP estimate of β; if it is wide, we have less confidence in its value. Each state-specific marginal posterior yields a MAP estimate for β.

Figure 1.

Bayesian predictive inferences for daily confirmed case counts of COVID-19 in (A) New Jersey (B) Wyoming (C) Florida (D) Alaska, from January 21 to June 21, 2020 (inclusive dates). The compartmental model (22) accounts for an initial social distancing period followed by n additional periods. We considered n = 0, 1 and 2 and selected the best n using the model selection procedure of Lin et al. (22). Plus signs indicate daily case reports. The shaded region indicates the prediction uncertainty and inferred noise in detection of new cases. The color-coded bands within the shaded region indicate the median and different credible intervals (e.g., dark purple band corresponds to the median, the band with lightest shade of yellow corresponds to the 95% credible interval, and gradations of color between these two extremes correspond to different credible intervals as indicated in the legend). In each panel, the vertical broken line indicates the onset time of the first social-distancing period. For states with n = 1 (Alaska and Florida), there is an additional broken line, which indicates the onset time of the second social-distancing period. The model was used to make forecasts of new case detection for 14 days after June 21, 2020. The last prediction date was July 5, 2020.

We can propagate the uncertainty in β into uncertainty in ℛ₀ and HIT estimates, using the formula for ℛ₀ given below and HIT = 1 − 1/ℛ₀. In Fig 2, we show marginal posterior distributions for ℛ₀ and HIT for the states of New Jersey, Wyoming, Florida, and Alaska. We provide MAP estimates of the model parameters for all states in SI Table S1. Model parameters were found to be identifiable in practice. (We have no proof of identifiability.) MAP estimates for ℛ₀ and HIT for all 50 states are provided in SI Table S2. These tables also provide 95% credible intervals. These estimates characterize the infectiousness of SARS-CoV-2 ancestral strains in each region of interest.
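Because HIT = 1 − 1/ℛ₀ is a strictly increasing function of ℛ₀ for ℛ₀ > 1, posterior samples of ℛ₀ can be mapped to HIT samples one by one, and interval endpoints carry over directly. The sketch below illustrates this with synthetic draws. The sample values, the helper names, and the simple order-statistic rule used for the interval are assumptions made for illustration; they are not taken from the study's MCMC output.

```python
import random

def hit_from_r0(r0):
    """Herd immunity threshold, HIT = 1 - 1/R0 (meaningful for R0 > 1)."""
    if r0 <= 1.0:
        raise ValueError("HIT is only meaningful for R0 > 1")
    return 1.0 - 1.0 / r0

def equal_tailed_interval(samples, mass=0.95):
    """Equal-tailed interval from samples using a simple order-statistic rule."""
    ordered = sorted(samples)
    n = len(ordered)
    lower = int((1.0 - mass) / 2.0 * (n - 1))
    upper = int((1.0 + mass) / 2.0 * (n - 1))
    return ordered[lower], ordered[upper]

rng = random.Random(0)
# Synthetic stand-in for posterior draws of R0 (hypothetical, not study output).
r0_samples = [max(1.01, rng.gauss(7.0, 0.3)) for _ in range(5000)]
hit_samples = [hit_from_r0(r0) for r0 in r0_samples]

r0_interval = equal_tailed_interval(r0_samples)
hit_interval = equal_tailed_interval(hit_samples)
print(r0_interval, hit_interval)
```

The HIT interval obtained from the transformed samples coincides with the transformed endpoints of the ℛ₀ interval, which is why a credible interval for ℛ₀ determines the one for HIT.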

Figure 2.

Marginal posterior distributions of ℛ₀ (left panels) and HIT (right panels) for ancestral strains of SARS-CoV-2 in four US states: (A, B) New Jersey, (C, D) Wyoming, (E, F) Florida, and (G, H) Alaska. Inferences are based on daily reports of new cases from January 21 to June 21, 2020. Each ℛ₀ posterior was obtained from the corresponding marginal posterior for β and Eq. 1. Each HIT posterior was obtained from the relation HIT = 1 − 1/ℛ₀ and the corresponding marginal posterior for ℛ₀. The 95% credible intervals for ℛ₀ are as follows: (6.44, 7.67) for New Jersey, (2.26, 2.47) for Wyoming, (5.20, 6.41) for Florida, and (2.26, 2.45) for Alaska. The 95% credible intervals for the HIT estimates are as follows: (0.84, 0.87) for New Jersey, (0.56, 0.59) for Wyoming, (0.81, 0.84) for Florida, and (0.56, 0.59) for Alaska. For each panel, the endpoints of the corresponding credible interval are indicated with vertical broken lines.

Region-specific basic reproduction numbers and herd immunity thresholds

To calculate the herd immunity threshold (HIT) for a specific region, we need to know the corresponding region-specific value of the basic reproduction number ℛ₀, which is given by Eq. 1 (obtained as described in Materials and Methods and SI). In Eq. 1, β characterizes the rate of transmission attributable to contacts between persons who are not protected by social distancing, f_A denotes the fraction of infected persons who never develop symptoms (i.e., the fraction of asymptomatic cases), c_A characterizes the rate at which asymptomatic persons recover during the immune clearance phase of infection, c_I characterizes the rate at which symptomatic persons with mild disease recover or progress to severe disease, ρ_E is a constant characterizing the relative infectiousness of presymptomatic persons compared to symptomatic persons (with the same behaviors), ρ_A is a constant characterizing the relative infectiousness of asymptomatic persons compared to symptomatic persons (with the same behaviors), m denotes the number of stages in the incubation period, and k_L characterizes disease progression, from one stage of the incubation period to the next and ultimately to an immune clearance phase. The value of ℛ₀ depends on one inferred region-specific parameter, β, and seven fixed parameters, which have values taken to be applicable for all regions (i.e., f_A, c_A, c_I, ρ_E, ρ_A, k_L, and m). Estimates of these fixed parameters were taken from Lin et al. (22).

The SARS-CoV-2 variant Delta (lineage B.1.617.2) has been estimated to be 1.64 times more infectious than variant Alpha (lineage B.1.1.7) (28), which has been estimated to be 1.50 times more infectious than ancestral strains (27). Assuming that Delta is the dominant circulating SARS-CoV-2 strain throughout the US (as of September 20, 2021) and that β for Delta is 1.64 × 1.50 = 2.46 times greater than β for ancestral strains (with other parameters in Eq. 1 remaining the same), the MAP estimate of the Delta-adjusted ℛ₀ ranges from 5.6 for Wyoming to 18 for New Jersey (from the multiplier given above and SI Table S2). The population-weighted Delta-adjusted ℛ₀ for the US is 12. These estimates indicate that the herd immunity threshold (HIT) for the Delta variant of SARS-CoV-2 ranges from 82% to 94%.
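The Delta adjustment amounts to scaling ℛ₀ by the transmissibility multiplier before applying HIT = 1 − 1/ℛ₀. A minimal sketch follows, using the rounded ancestral MAP values quoted in this paper for Wyoming (2.3) and New Jersey (7.1); the function and constant names are hypothetical, and the unrounded values in SI Table S2 may give slightly different results.

```python
# Delta vs. Alpha (28) times Alpha vs. ancestral strains (27).
Y_DELTA = 1.64 * 1.50

def delta_adjusted_hit(r0_ancestral, multiplier=Y_DELTA):
    """HIT for a variant whose R0 is `multiplier` times the ancestral R0."""
    r0_variant = multiplier * r0_ancestral
    return 1.0 - 1.0 / r0_variant

for state, r0 in (("Wyoming", 2.3), ("New Jersey", 7.1)):
    print(f"{state}: Delta-adjusted HIT = {delta_adjusted_hit(r0):.2f}")
```

With these rounded inputs the thresholds come out at about 0.82 and 0.94, in line with the range stated above.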

Estimates of initial region-specific epidemic growth rates

HIT estimates are directly determined by estimates of the basic reproduction number, which are related to the initial growth rate of the epidemic in a given region. Here, our ℛ₀ estimates are conditioned on a compartmental model that has been parameterized to reproduce case-reporting data available for each region over a five-month period (January 21 to June 21, 2020). We can use parameter estimates obtained for each region to calculate the initial epidemic growth rate λ, which is directly comparable to early surveillance data (Fig 3 and SI Fig S6). We provide MAP estimates and 95% credible intervals for λ, ℛ₀, and HIT for selected states in Table 1. MAP estimates and 95% credible intervals for λ, ℛ₀, and HIT for all states are provided in SI Table S2. These estimates are based on state-specific marginal posteriors for the parameter β of our compartmental model. State-specific MAP estimates and 95% credible intervals for β (and other adjustable model parameters) are given in SI Table S1. As can be seen (e.g., in Fig 3), our λ estimates are consistent with early case reporting data during the exponential takeoff phase of disease transmission.

Figure 3.

Consistency of model-derived λ estimates with empirical growth rates during initial exponential increase in disease incidence in (A) New Jersey, (B) Wyoming, (C) Florida, and (D) Alaska. In each panel, the initial slope of the solid curve corresponds to λ (calculated as described in Materials and Methods), the crosses indicate empirical cumulative case counts, and the broken line is the model prediction based on MAP estimates for adjustable parameters. The solid curve is derived from the reduced model (Eqs. 1-8 in the SI). This curve shows cumulative case counts had there not been any interventions to limit disease transmission. As can be seen, the initial slopes of the solid and broken curves are comparable. We selected n = 0 for New Jersey and Wyoming and n = 1 for Florida and Alaska. Among 35 states with n = 0, New Jersey has the largest inferred λ value (0.45) and Wyoming has the smallest inferred λ value (0.13). Among 15 states with n = 1, Florida has the largest inferred value of λ (0.39) and Alaska has the smallest inferred value of λ (0.13). It should be noted that, in contrast with Fig 1, the y-axis here indicates cumulative (vs. daily) number of cases on a logarithmic (vs. linear) scale.

Table 1.

Maximum a posteriori (MAP) estimates and 95% credible intervals for epidemic parameters (β, λ, ℛ₀, HIT, and Delta-adjusted HIT) for the states of New Jersey, Wyoming, Florida, and Alaska.

Sensitivity of β to the surveillance data used in inference

For each state, we used data from January 21 to June 21, 2020 to infer the MAP estimate of β (and the values of the other region-specific adjustable model parameters). Thus, our estimates are derived from a particular subset of the available surveillance data. To check the robustness of MAP estimates for β to variations in training data, we performed a sensitivity analysis wherein we inferred β using data collected over three distinct periods in 2020: 1) January 21 to May 21, 2) January 21 to June 21, and 3) January 21 to July 21. By visualizing our estimates with a rank order plot (Fig 4) and conducting pairwise two-sample Kolmogorov-Smirnov tests (38), we found that the 4-, 5-, and 6-month training datasets yielded estimates for β that were not statistically significantly different from each other. The MAP estimates for β obtained using the 4-, 5-, and 6-month datasets are listed in SI Table S3. We assessed sensitivity by computing the relative error between the β estimates obtained from the 5-month dataset and the average β estimate over all datasets considered. We found that none of the state-level MAP estimates for β showed sensitivity (i.e., a relative error exceeding 100% in magnitude) to variations in the training data (SI Table S4). The largest relative error was 12% (for Kansas).
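The relative-error check described above can be sketched as follows. The text does not spell out the normalization, so dividing by the average β is an assumption here, as are the function name and the β values, which are hypothetical and not taken from SI Table S3.

```python
def relative_error(beta_reference, betas):
    """Signed relative deviation of beta_reference from the mean of betas.

    Assumes normalization by the mean; the study's exact definition may differ.
    """
    mean_beta = sum(betas) / len(betas)
    return (beta_reference - mean_beta) / mean_beta

# Hypothetical MAP estimates of beta from the 4-, 5-, and 6-month datasets.
beta_4mo, beta_5mo, beta_6mo = 0.8, 1.0, 1.1

error = relative_error(beta_5mo, [beta_4mo, beta_5mo, beta_6mo])
print(f"relative error of the 5-month estimate: {100 * error:.1f}%")
```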

Figure 4.

MAP estimates of the basic reproduction number ℛ₀ for ancestral strains of SARS-CoV-2 in all 50 US states. The different symbols refer to different training datasets used to estimate ℛ₀. Open triangles correspond to surveillance data collected from January 21 to May 21, 2020, filled circles correspond to surveillance data collected from January 21 to June 21, 2020, and open squares correspond to surveillance data collected from January 21 to July 21, 2020. Estimates of ℛ₀ are sorted by state from largest to smallest values according to the ℛ₀ estimates derived from the surveillance data collected for January 21 to June 21, 2020. The whiskers associated with each filled circle indicate the 95% credible interval (inferred from the 5-month dataset). States are indicated using two-letter US postal service state abbreviations (https://about.usps.com/who-we-are/postal-history/state-abbreviations.pdf).

Global asymptotic stability of the disease-free equilibrium

The model of Lin et al. (22) has a globally asymptotically stable disease-free equilibrium (DFE) if ℛ₀ < 1, which can be deduced by following the approach of Shuai and van den Driessche (39). As a consequence, the model predicts that the epidemic will be extinguished as the system dynamics are attracted to the DFE.

To confirm that the model behaves as expected around the HIT, we conducted a perturbation analysis for the states of New York (Figs 5A and 5B) and Washington (Figs 5C and 5D). We simulated disease dynamics starting from an arbitrarily chosen initial condition near the HIT number of persons, S_h, given by the following formula: S_h = HIT × S₀, where S₀ denotes the population size of the region considered. We defined the size of our perturbation as ε = 0.2 × S_h for Figs 5A and 5C and as ε = −0.2 × S_h for Figs 5B and 5D. The initial condition was S₀ − S_h − 1 + ε susceptible persons, 1 infected person, and S_h − ε recovered persons. As expected, when the initial number of recovered persons is below the threshold (S_h − ε < S_h; Figs 5A and 5C), the number of infectious persons grows over time, whereas when it is above the threshold (S_h − ε > S_h; Figs 5B and 5D), the number of infectious persons decays over time.
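The initial conditions used in this perturbation analysis can be written out explicitly. The sketch below builds them and, as a rough proxy for growth versus decay, evaluates the effective reproduction number ℛ₀ × S/S₀ that holds for simple well-mixed models. That proxy is an assumption made for illustration and is not a simulation of the full model of Lin et al. (22); the ℛ₀ value, the population size, and the function names are hypothetical.

```python
def perturbed_initial_condition(r0, population, relative_eps):
    """Susceptible, infected, and recovered counts near the herd immunity threshold."""
    hit = 1.0 - 1.0 / r0
    s_h = hit * population    # threshold number of immune persons
    eps = relative_eps * s_h  # perturbation size
    susceptible = population - s_h - 1.0 + eps
    infected = 1.0
    recovered = s_h - eps
    return susceptible, infected, recovered

def effective_reproduction_number(r0, susceptible, population):
    """Well-mixed approximation: R_eff = R0 * S / N (illustrative proxy only)."""
    return r0 * susceptible / population

r0, population = 5.0, 1_000_000  # hypothetical values

below = perturbed_initial_condition(r0, population, 0.2)   # fewer recovered than S_h
above = perturbed_initial_condition(r0, population, -0.2)  # more recovered than S_h

r_eff_below = effective_reproduction_number(r0, below[0], population)
r_eff_above = effective_reproduction_number(r0, above[0], population)
print(r_eff_below, r_eff_above)
```

A value above 1 corresponds to initial growth in the number of infectious persons and a value below 1 to decay, which matches the qualitative behavior reported for the two perturbation directions.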

Figure 5.

Perturbation analysis using the full model of Lin et al. (22) for the states of New York (panels A and B) and Washington (panels C and D). In each panel, the black solid line represents the number of infectious persons (initially 1), the black broken line represents the threshold number of persons required for herd immunity (i.e., S_h), and the gray broken line represents the number of recovered persons (initially S_h − ε, obtained as described in Results). Simulations are based on MAP estimates for model parameters obtained using surveillance data collected from January 21 to June 21, 2020.

Progress toward herd immunity

From our state-specific HIT estimates and other information (discussed below), we were able to calculate percent progress toward herd immunity for each state (Fig 6, SI Table S5). We estimated the percent progress of each state’s population toward herd immunity, 𝒫 ∈ [0%, 100%], using Eq. 2 (the derivation of which is given in the SI). In Eq. 2, ℛ₀ is the population-specific basic reproduction number that we estimated for ancestral strains (SI Table S2), Y_Delta is a multiplier that accounts for the increased transmissibility of SARS-CoV-2 variant Delta, f_r denotes the fraction of the population with immunity acquired through infection, f_v is the fraction of the population that has been vaccinated (24), ε_r is the fraction of infected persons who are protected against productive infection (i.e., an infection that can be transmitted to others), and ε_v is the fraction of vaccinated persons who are protected against productive infection. Recall that we use Y_Delta = 2.46 (27-28). We estimate that ε_r = 1.0 (40) and ε_v = 0.66 (41). We obtain four different estimates for f_r as follows. In the first case, we obtain f_r as the cumulative number of detected cases within a population divided by the population size. In the second case, we adjust our previous estimate for f_r by a multiplier of 5.8 (42). In other words, we assume that the true disease burden is 5.8 times higher than the detected number of cases. In the third case, we obtain f_r as the fraction of the population that has been infected according to the latest serological survey results reported online at Ref. (25). In the fourth case, we assume f_r = f_r,0 /(1 − f_A), where f_r,0 denotes the estimate of seroprevalence in a given region and f_A denotes the fraction of all cases that are asymptomatic. With this approach, we are assuming that asymptomatic cases are not detected in serological testing (43). We adopt the estimate of Lin et al. (22) that f_A = 0.44.
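The four estimates of f_r described above can be computed side by side. In the sketch below, the function name and the input values are hypothetical; the multiplier 5.8 and f_A = 0.44 are the values stated in the text. The results are not capped at 1, since the text does not say how such cases are handled.

```python
UNDERCOUNT_MULTIPLIER = 5.8  # true burden relative to detected cases (42)
F_A = 0.44                   # asymptomatic fraction, from Lin et al. (22)

def infection_immunity_fractions(detected_cases, population, seroprevalence):
    """Return the four scenario estimates of f_r as a dictionary."""
    case_fraction = detected_cases / population
    return {
        "detected cases": case_fraction,
        "detected cases x 5.8": UNDERCOUNT_MULTIPLIER * case_fraction,
        "seroprevalence": seroprevalence,
        "seroprevalence / (1 - f_A)": seroprevalence / (1.0 - F_A),
    }

# Hypothetical inputs for a single state (illustrative values only).
scenarios = infection_immunity_fractions(
    detected_cases=100_000, population=1_000_000, seroprevalence=0.20
)
for label, f_r in scenarios.items():
    print(f"{label}: f_r = {f_r:.3f}")
```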

Figure 6.

Percent progress toward herd immunity in each of the 50 US states. Percent progress 𝒫 indicates the percentage of the immune persons required for herd immunity that a state has attained. 𝒫 was calculated using Eq. 2. Black bars (Panel A) correspond to the first scenario (i.e., f_r estimated as the number of detected cases divided by population size), gray bars (Panels A and C) correspond to the second scenario (i.e., f_r estimated as the number of detected cases within a population divided by the population size, adjusted for lack of detection of undiagnosed SARS-CoV-2 infections), black bars (Panel B) correspond to the third scenario (i.e., f_r given by seroprevalence survey results), and gray bars (Panels B and D) correspond to the fourth scenario (i.e., f_r given by seroprevalence survey results adjusted for lack of detection of asymptomatic cases). Estimates for 𝒫 are sorted by state from largest to smallest values according to the second scenario (Panels A and C) and the fourth scenario (Panels B and D). North Dakota was omitted from Panels B and D because a recent estimate of seroprevalence was not available at Ref. (25). States are indicated using two-letter US postal service state abbreviations (https://about.usps.com/who-we-are/postal-history/state-abbreviations.pdf).

As can be seen in Fig 6C, which is based on case reporting data, 18 of the 50 states have reached herd immunity. However, in Fig 6D, which is based on serological survey data, none of the states have reached herd immunity. South Dakota is closest to herd immunity, with 84% of the immune persons required for herd immunity. Idaho is furthest from herd immunity, with 45% of the immune persons required for herd immunity. The mean (median) progress toward herd immunity, across all states, is 63% (63%).

Discussion

One of our most important findings is quantification of how COVID-19 transmissibility, in terms of the basic reproduction number ℛ₀, varies across the 50 US states. The MAP value of ℛ₀ for ancestral strains of SARS-CoV-2 ranges from 2.3 for Wyoming to 7.1 for New Jersey. The population-weighted mean for the US is 4.7. These estimates indicate that the herd immunity threshold (HIT) for the Delta variant of SARS-CoV-2 ranges from 82% to 94%, assuming that Delta is 2.46 times more transmissible than ancestral strains. The uncertainty in each ℛ₀ estimate was quantified: 95% credible intervals are indicated in Figure 4. The 95% credible intervals for ancestral HIT estimates are given in SI Table S2. Because we can estimate the relative effort required to reach herd immunity across the US (in terms of HIT), resources for vaccination campaigns can be targeted to those areas where it is more difficult to achieve herd immunity.

Our ℛ₀ and HIT estimates differ from estimates given in previous studies. For example, various researchers derived point estimates for ℛ₀ from data using tools from time-series analysis, without assuming an underlying mechanistic model (13, 15). These tools depend on slope estimation and thus can be expected to depend sensitively on noise and errors in early case-reporting data. Ives and Bozzuto (16) provided state-level estimates for ℛ₀ (in 36 states), and Fellows et al. (17) used a Bayesian framework to obtain state-level estimates for ℛ₀ (in all 50 states). For the 30 states that are considered in Ives and Bozzuto (16), Fellows et al. (17), Milicevic et al. (18), and the present study, our estimates for ℛ₀ were most similar to those of Milicevic et al. (18) (SI Table S6). Milicevic et al. (18) provided state-level ℛ₀ point estimates (for 45 states) that are statistically consistent with our MAP estimates of ℛ₀ for ancestral strains of SARS-CoV-2. The main points of difference between these earlier studies and the present study are as follows. Our ℛ₀ and HIT estimates were obtained from a model consistent with new case-reporting data, as illustrated in Figs 1 and 3. We were able to provide estimates for all 50 states (Fig 4, SI Table S2), and we were able to obtain a Bayesian quantification of the uncertainty in each estimate (Fig 4, SI Table S2).

In the face of Delta, the estimates of Fig 6C (based on case reporting data) suggest that a majority of states have yet to achieve herd immunity, and the estimates of Fig 6D (based on serological survey results) suggest that no state in the US has achieved herd immunity as of September 20, 2021. In either case, persons in the US lacking immunity are still at risk (44). The perspective provided by Fig 6D is consistent with the study of Moghadas et al. (45) indicating that only 62% of persons in the US had some form of immunity as of July 15, 2021 (either through infection or vaccination). Given that percent progress toward herd immunity according to Fig 6D ranges from 45% for Idaho to 84% for South Dakota ∼20 months (counting from January 2020) into the COVID-19 pandemic and ∼9 months after vaccines became widely available, it seems that this situation will persist for months, if not years. How can the US accelerate the approach to herd immunity?

Policies that encourage infection of children and vaccinated persons who have healthy immune systems may be rationalized because such persons seem to be well-protected against severe (but not mild) disease (46) and infected persons seem to have greater protection against productive infection (40). However, this approach has obvious drawbacks, starting with the risks of infection. Another is that non-immune persons may not be able to self-identify as such. Unfortunately, it seems that we cannot rely on currently available vaccines to stop community transmission. Delta-adjusted HITs are mathematically impossible to achieve through vaccination alone because these HITs are close to 1 (SI Table S2) and vaccine protection against productive infection is imperfect (i.e., ε_v is significantly less than 1) (41). Thus, use of Delta-targeted vaccines may be needed to accelerate the approach to herd immunity and to minimize COVID-19 impacts.
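The claim that Delta-adjusted HITs cannot be reached through vaccination alone follows from simple arithmetic: if vaccination were the only source of immunity, the immune fraction would be ε_v × f_v, which cannot exceed ε_v. The sketch below makes this explicit for the ends of the Delta-adjusted HIT range (0.82 and 0.94) with ε_v = 0.66. Treating vaccination as the sole source of immunity is a simplifying assumption made here, and the function name is hypothetical.

```python
def required_vaccine_coverage(hit, vaccine_protection):
    """Coverage f_v needed so that vaccine_protection * f_v reaches the HIT.

    Assumes vaccination is the only source of immunity (illustrative simplification).
    """
    return hit / vaccine_protection

EPSILON_V = 0.66  # protection against productive infection (41)

for hit in (0.82, 0.94):
    coverage = required_vaccine_coverage(hit, EPSILON_V)
    status = "unattainable" if coverage > 1.0 else "attainable"
    print(f"HIT = {hit:.2f}: required coverage = {coverage:.2f} ({status})")
```

A required coverage above 1 means that even vaccinating the entire population would fall short of the threshold under this assumption.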

One potential benefit of our comprehensive state-level ℛ₀ estimates is that they quantify how differences in social structure and contact patterns across the US (the factors presumably underlying the spatial heterogeneity in β and ℛ₀) influence the spread of an aerosol-transmitted virus (47, 48). This information, by identifying the regions in the US where transmission is likely to be highest, could be useful for responding to future pandemics caused by viruses similar to SARS-CoV-2.

Our study has several notable limitations. Our HIT estimates are potentially biased downward because of general awareness within the US of the impacts of COVID-19 in other countries (e.g., China and Italy), which could have resulted in a fraction of the US population changing their behaviors to protect themselves from COVID-19 before the start of the local epidemic. In addition, our estimation of percent progress toward herd immunity crucially depends on seroprevalence estimates of the true disease burden. These estimates are associated with some uncertainty (49-51). As illustrated in Fig 6, percent progress toward herd immunity is underestimated if serological tests fail to detect all cases of infection. The reader must also be cautioned that our analysis depends on a number of assumptions. For example, we considered a compartmental model in which populations are taken to be well-mixed and to lack age structure. This is clearly a simplification. More refined estimates could be obtained by making the model more realistic, but this would have the drawback of increasing the complexity of inference, which at some point would make inference impracticable.

Data Availability

Inferences were obtained using problem-specific code. The functionality of the code has been added to a freely available open-source software package (PyBioNetFit, version 1.1.9). We have confirmed that the results of the problem-specific code are reproduced by PyBioNetFit.

https://github.com/lanl/PyBNF

Author contributions

A.M., R.G.P., Y.T.L., and W.S.H. designed research; A.M., J.N., E.F.M., Y.C., R.G.P., Y.T.L., and W.S.H. performed research; A.M., J.N., Y.T.L., and W.S.H. analyzed data; and A.M. and W.S.H. wrote the paper.

Supplementary Information

Reduced Model

We derive ℛ₀ from a simplified form of the compartmental model of Lin et al. (1). The reduced model is obtained by omitting variables and terms for interventions, including social distancing, quarantine, and self-isolation. Thus, the reduced model describes disease transmission dynamics in the absence of interventions. The equations of the reduced model are as follows:

where t denotes time, β, S₀, ρ_E, ρ_A, k_L, f_A, f_H, f_R, c_A, c_I, and c_H are positive-valued time-invariant parameters, as defined in Lin et al. (1), and m denotes the number of stages taken to comprise the incubation period. Here and in the study of Lin et al. (1), m = 5. The values of β (a rate constant characterizing disease transmission) and S₀ (the total population) are taken to be region-specific; the other parameters have values that are taken to be universal (i.e., applicable to all regions of interest).

The variable S denotes the population of susceptible persons. The variables E₁ to E_m denote populations of exposed persons, e.g., persons incubating virus but not symptomatic. As noted earlier, the incubation period is divided into m stages. The variable A denotes the population of persons who have progressed through the incubation period but will never develop symptoms (i.e., persons with asymptomatic infections). The variable I denotes the population of persons with mild symptomatic disease. The variable H denotes the population of persons with severe disease who are hospitalized or isolated at home. The variable R denotes the population of recovered persons, and the variable D denotes the population of deceased persons.

Basic Reproduction Number

The basic reproduction number, ℛ₀, is defined as the number of secondary infections caused by an infected person during the entire period of infectiousness when that person is introduced into a population consisting of susceptible persons only and no interventions are in place to limit disease transmission. Here, we use the next-generation matrix method to compute ℛ₀ (2). The model has a disease-free equilibrium (DFE) x₀ with S = S₀, where S₀ is the total population and the remaining populations (E₁, E₂, …, E_m, A, I, H, R, D) are equal to 0.

To use the next-generation matrix method, we let x = (E₁, E₂, E₃, E₄, E₅, A, I) denote the vector of state variables corresponding to compartments containing infected persons. For each infected compartment i, we define f_i as the rate of entry of newly infected persons into compartment i and v_i as the net transfer of persons out of the i^th compartment. Then, we have dx_i/dt = f_i(x) – v_i(x). Now, we let F and V denote the Jacobians of f and v evaluated at the disease-free equilibrium x₀. The (i, j) entry of the matrix F is the rate at which infected persons in the j^th compartment produce a new infection in the i^th compartment. The (j, k) entry of the matrix V⁻¹ is the expected amount of time that a person introduced to the k^th compartment will spend in a single visit to the j^th compartment. The matrix F, which is non-negative, is defined as follows:

The matrix V, which is non-singular (i.e., invertible), is defined as follows:

We find ℛ₀ as the spectral radius (i.e., the dominant eigenvalue) of the matrix FV⁻¹ (2), which is given by Eq. 1 in the main text.
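The spectral-radius calculation described above can be sketched numerically. The example below is a minimal illustration, not the matrices of the reduced model: it uses a hypothetical two-compartment (exposed, infectious) system with made-up rate constants `beta`, `sigma`, and `gamma`, for which the next-generation matrix method is known to give ℛ₀ = β/γ. The function name `basic_reproduction_number` is likewise an assumption introduced here.

```python
import numpy as np


def basic_reproduction_number(F, V):
    """Return the spectral radius of the next-generation matrix F V^{-1}."""
    F = np.asarray(F, dtype=float)
    V = np.asarray(V, dtype=float)
    ngm = F @ np.linalg.inv(V)  # next-generation matrix
    return float(np.max(np.abs(np.linalg.eigvals(ngm))))


# Hypothetical toy model (NOT the reduced model of the paper):
# infected compartments x = (E, I), evaluated at the disease-free equilibrium.
beta = 0.6    # transmission rate constant (illustrative value)
sigma = 0.2   # rate of leaving the exposed compartment (illustrative value)
gamma = 0.25  # rate of leaving the infectious compartment (illustrative value)

# F[i, j]: rate at which persons in compartment j produce new infections in i.
F = np.array([[0.0, beta],
              [0.0, 0.0]])
# V: net transfer out of each infected compartment (non-singular).
V = np.array([[sigma, 0.0],
              [-sigma, gamma]])

r0 = basic_reproduction_number(F, V)
print(f"R0 = {r0:.3f}")
```

For this toy system the result equals β/γ; for the reduced model, the same function would be applied to the 7 × 7 matrices F and V, assuming they have been assembled from the model equations.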

Epidemic growth rate

The epidemic growth rate λ is defined as the dominant eigenvalue of the Jacobian of the reduced model linearized at the disease-free equilibrium (DFE). Thus, λ is the largest root of the characteristic polynomial for the 7-dimensional Jacobian matrix J, which is equivalent to F − V. We used the CharacteristicPolynomial function in Mathematica (3) to find the characteristic polynomial of J:

The largest root was found numerically. Solutions were based on state-specific estimates for β and the estimates of Lin et al. (1) for other parameters in Eq. 9.
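As a rough illustration of this procedure, the sketch below forms J = F − V, obtains the coefficients of its characteristic polynomial, and takes the largest root. It uses NumPy rather than Mathematica, and the 2 × 2 matrices and parameter values are hypothetical stand-ins for the 7-dimensional system; the function name `epidemic_growth_rate` is an assumption introduced here.

```python
import numpy as np


def epidemic_growth_rate(F, V):
    """Largest root (by real part) of the characteristic polynomial of J = F - V."""
    J = np.asarray(F, dtype=float) - np.asarray(V, dtype=float)
    coeffs = np.poly(J)       # characteristic polynomial coefficients of J
    roots = np.roots(coeffs)  # numerical roots, i.e., the eigenvalues of J
    return float(np.max(roots.real))


# Hypothetical toy model with infected compartments (E, I); values are illustrative.
beta, sigma, gamma = 0.6, 0.2, 0.25
F = np.array([[0.0, beta],
              [0.0, 0.0]])
V = np.array([[sigma, 0.0],
              [-sigma, gamma]])

lam = epidemic_growth_rate(F, V)
print(f"growth rate = {lam:.4f} per unit time")
```

In this toy case λ is positive because β/γ exceeds 1; calling `np.linalg.eigvals` on J directly would be an equivalent route to the same eigenvalues.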

Progress toward herd immunity

In this section, we explain the assumptions and derive the formula for our metric of progress toward herd immunity (Eq. 2 in the main text). First, we define the variables used in our analysis. For a given region, S₀ denotes the total population size, N_d denotes the cumulative number of cases detected, N_a denotes the cumulative number of asymptomatic cases, N_v denotes the cumulative number of vaccinations completed, N_v,s denotes the number of persons who were susceptible at the time of vaccination, N_v,r denotes the number of persons who had recovered from infection at the time of vaccination, N_c denotes the cumulative number of all cases, ε_v denotes the fraction of vaccinated individuals protected from productive infection (i.e., an infection that can be transmitted to others), ε_r denotes the fraction of recovered individuals protected from productive infection, N_r denotes the number of individuals who have recovered from infection, HIT denotes the herd immunity threshold for ancestral strains, Y_Delta denotes the infectiousness of SARS-CoV-2 variant Delta relative to ancestral strains, S_h ≡ HIT × S₀ denotes the threshold number of persons with immunity needed for herd immunity (in the face of ancestral strains), S_i denotes the estimated number of persons with immunity, f_A ≡ N_a/N_c denotes the fraction of all cases that are asymptomatic, f_r ≡ N_r/S₀ denotes the fraction of the population with immunity acquired through infection, and f_v ≡ N_v/S₀ denotes the fraction of the population that has been vaccinated.

We assume that S₀ is constant. We take N_r = N_c to be a good approximation. We assume that we know S₀, N_d, and N_v. We assume that susceptible and recovered individuals have the same probability of being vaccinated. From our assumption that susceptible and recovered individuals have the same probability of being vaccinated, it follows that N_v,s = (1 − f_r)N_v and N_v,r = f_rN_v. These relations are consistent with N_v ≡ N_v,s + N_v,r. The number of individuals with immunity (protection from productive infection) is given by

We assume that Y_Delta gives the value of β for SARS-CoV-2 variant Delta relative to β for ancestral strains. We assume all other model parameters are the same for Delta. Thus, Y_Deltaℛ₀ is the basic reproduction number in the face of Delta. We define 𝒫, percent progress toward herd immunity, as

Using the expression given above for S_i (Eq. 10), 1 − 1/(Y_Deltaℛ₀) as the Delta-adjusted HIT, and S_h = HIT × S₀, we find Eq. 2 in the main text.
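Two ingredients of this derivation are simple enough to sketch in code: the threshold 1 − 1/(Y_Deltaℛ₀) and the split of vaccinated persons into previously susceptible and previously recovered groups. The example below is only an illustration. The ℛ₀ values 7.1 and 2.3 are the MAP estimates quoted in the abstract for New Jersey and Wyoming, whereas the value of `y_delta`, the vaccination count, and the recovered fraction are hypothetical placeholders. The sketch does not reproduce the expression for S_i or Eq. 2.

```python
def herd_immunity_threshold(r0, y_delta=1.0):
    """HIT = 1 - 1/(y_delta * r0); y_delta = 1 gives the ancestral-strain HIT."""
    return 1.0 - 1.0 / (y_delta * r0)


def split_vaccinated(n_v, f_r):
    """Return (N_v,s, N_v,r), assuming susceptible and recovered persons
    are equally likely to be vaccinated."""
    return (1.0 - f_r) * n_v, f_r * n_v


# MAP estimates of R0 quoted in the abstract.
r0_by_state = {"New Jersey": 7.1, "Wyoming": 2.3}
y_delta = 2.0  # hypothetical relative infectiousness of Delta, for illustration only

for state, r0 in r0_by_state.items():
    hit = herd_immunity_threshold(r0)
    hit_delta = herd_immunity_threshold(r0, y_delta)
    print(f"{state}: HIT = {hit:.3f}, Delta-adjusted HIT = {hit_delta:.3f}")

# Hypothetical inputs: 1,000 completed vaccinations, 20% of the population recovered.
n_vs, n_vr = split_vaccinated(1000, 0.2)
```

The two returned counts always sum to N_v, consistent with N_v ≡ N_v,s + N_v,r.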

SI Figure Legends

Figure S1. Consistency of results obtained from different codes used to perform Markov chain Monte Carlo (MCMC) sampling. Shown here are 1-dimensional marginal posteriors of parameters for Wyoming (n = 0) derived using the Python code of Lin et al. (1) (blue) and PyBioNetFit (4) (red).

Figure S2. Markov chain log-likelihood trace plots for each of the 50 US states. Bayesian inference was conditioned on the compartmental model of Lin et al. (1). Bayesian inference was performed as described by Lin et al. (1) except that training data consisted of daily COVID-19 case counts for states (vs. case counts for metropolitan statistical areas). The compartmental model accounts for an initial social distancing period followed by n additional periods. We considered n = 0, 1 and 2 and selected the best n using the model selection procedure described by Lin et al. (1). The number of epochs (or iterations) used for each state was chosen so that convergence was achieved in each case. Inferences are based on daily reports of new cases of COVID-19 from January 21 to June 21, 2020.

Figure S3. Parameter trace plots for each of the 50 US states. These parameter trace plots are matched to the likelihood trace plots of Fig S2. It should be noted that the number of parameters varies across the states depending on the selected value of n. See the caption of Fig S2 for additional details.

Figure S4. Matrix of 1- and 2-dimensional marginalizations of the posterior samples obtained for the adjustable parameters associated with the compartmental model for each of the 50 US states. Inferences are based on daily reports of new cases of COVID-19 from January 21 to June 21, 2020. Plots of marginal posteriors (1-dimensional marginalizations) are shown on the diagonal from top left to bottom right. Other plots are 2-dimensional marginalizations (presented as histograms) indicating the correlations between parameter estimates. Brightness indicates higher probability density. A compact bright area indicates low correlation. An extended, asymmetric bright area indicates high correlation. The pairs plots shown here are matched to the trace plots of Figs S2 and S3. See the caption of Fig S2 for additional details.

Figure S5. Posterior predictive checking. The time-dependent predictive posterior distribution for daily number of COVID-19 cases detected is visualized for all states except New Jersey, Wyoming, Florida, and Alaska, which are considered in Fig 1 of the main text. Inferences are based on daily reports of new COVID-19 cases from January 21 to June 21, 2020 (inclusive dates). The compartmental model (1) accounts for an initial social distancing period followed by n additional periods. We considered n = 0, 1 and 2 and selected the best n using the model selection procedure of Lin et al. (1). Crosses indicate observed daily case reports. The shaded region indicates the prediction uncertainty and inferred noise in detection of new cases. The color-coded bands within the shaded region indicate the median and different credible intervals (e.g., dark purple corresponds to the median, the lightest shade of yellow corresponds to the 95% credible interval, and gradations of color between these two extremes correspond to different credible intervals as indicated in the legend). In each panel, the vertical broken line indicates the onset time of the first social-distancing period. For states with n = 1, there is an additional (rightmost) broken line, which indicates the onset time of the second social-distancing period. The model was used to make forecasts of new case detection for 14 days after June 21, 2020. The last prediction date was July 5, 2020.

Figure S6. Consistency of model-derived λ estimates with empirical growth rates during initial exponential increase in disease incidence in 46 states of the US (i.e., excluding New Jersey, Wyoming, Florida, and Alaska; see Fig 3 in the main text). In each panel, the initial slope of the solid curve corresponds to λ (calculated as described in Materials and Methods), the crosses indicate empirical cumulative case counts, and the broken line is the model prediction based on MAP estimates for adjustable parameters. The solid curve is derived from the reduced model (Eqs. 1-8 in the SI). This curve shows cumulative case counts had there not been any interventions to limit disease transmission. As can be seen, the initial slopes of the solid and broken curves are comparable. It should be noted that, in contrast with Fig S5, the y-axis here indicates cumulative (vs. daily) number of cases on a logarithmic (vs. linear) scale.

Acknowledgements

A.M. was supported by the 2020 Mathematical Sciences Graduate Internship program, which is sponsored by the Division of Mathematical Sciences of the National Science Foundation. E.F.M., J.N., Y.C., R.G.P., and W.S.H. were supported by NIH/NIGMS Grant R01GM111510. Y.T.L. was supported by the Laboratory Directed Research and Development (LDRD) program at Los Alamos National Laboratory. Computational resources for this study consisted of the FARM cluster, a Linux-based supercomputing cluster for the University of California at Davis, and Northern Arizona University’s Monsoon cluster, which is funded by Arizona’s Technology and Research Initiative Fund.

References

1. J. Gee et al., First month of COVID-19 vaccine safety monitoring—United States, December 14, 2020-January 13, 2021. MMWR Morb Mortal Wkly Rep. 70, 283–288 (2021).
2. National Center for Immunization and Respiratory Diseases (NCIRD), Data from Centers for Disease Control and Prevention (CDC). https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-Jurisdi/unsk-b7fc. Accessed 20 September 2021.
3. P. Fine, K. Eames, D. L. Heymann, “Herd immunity”: a rough guide. Clin Infect Dis. 7, 911–916 (2011).
4. B. Ridenhour, J. M. Kowalik, D. K. Shay, Unraveling R0: Considerations for public health applications. Am J Public Health. 108, S445–S454 (2018).
5. L. Temime et al., A conceptual discussion about the basic reproduction number of severe acute respiratory syndrome coronavirus 2 in healthcare settings. Clin Infect Dis. 72, 141–143 (2021).
6. J. M. Dan et al., Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science. 371, eabf4063 (2021).
7. C.-J. Yu et al., Assessment of basic reproductive number for COVID-19 at global level: A meta-analysis. Medicine. 100, e25837 (2021).
8. A. J. Kucharski et al., Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet. 20, 553–558 (2020).
9. R. Li et al., Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 368, 489–493 (2020).
10. L. Ferretti et al., Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science. 368, eabb6936 (2020).
11. M. D’Arienzo, A. Coniglio, Assessment of the SARS-CoV-2 basic reproduction number, ℛ₀, based on the early phase of COVID-19 outbreak in Italy. Biosaf Health. 2, 57–59 (2020).
12. S. Sanche et al., High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. Emerg Infect Dis. 26, 1470–1477 (2020).
13. E. O. Romero-Severson, N. Hengartner, G. Meadors, R. Ke, Change in global transmission rates of COVID-19 through May 6 2020. PLOS ONE. 15, e0236776 (2020).
14. R. Ke, E. O. Romero-Severson, S. Sanche, N. Hengartner, Estimating the reproductive number ℛ₀ of SARS-CoV-2 in the United States and eight European countries and implications for vaccination. J Theor Biol. 517, 110621 (2021).
15. J. D. Kong, E. W. Tekwa, S. A. Gignoux-Wolfsohn, Social, economic, and environmental factors influencing the basic reproduction number of COVID-19 across countries. PLOS ONE. 16, e0252373 (2021).
16. A. R. Ives, C. Bozzuto, State-by-State estimates of R0 at the start of COVID-19 outbreaks in the USA. medRxiv [Preprint] (2020). https://www.medrxiv.org/content/10.1101/2020.05.17.20104653v3 (accessed 4 September 2021).
17. I. E. Fellows, R. B. Slayton, A. J. Hakim, The COVID-19 pandemic, community mobility and the effectiveness of non-pharmaceutical interventions: The United States of America, February to May 2020. arXiv [Preprint] (2020). https://arxiv.org/abs/2007.12644 (accessed 8 September 2021).
18. O. Milicevic et al., PM2.5 as a major predictor of COVID-19 basic reproduction number in the USA. Environmental Research. 201, 111526 (2021).
19. A. R. Ives, C. Bozzuto, Estimating and explaining the spread of COVID-19 at the county level in the USA. Commun Biol. 4, 1–9 (2021).
20. K. T. Sy, L. F. White, B. E. Nichols, Population density and basic reproductive number of COVID-19 across United States counties. PLOS ONE. 16, e0249271 (2021).
21. C. S. Weissert, M. J. Uttermark, K. R. Mackie, A. Artiles, Governors in control: Executive orders, state-local preemption, and the COVID-19 pandemic. Publius. 51, 396–428 (2021).
22. Y. T. Lin et al., Daily forecasting of regional epidemics of coronavirus disease with Bayesian uncertainty quantification. Emerg Infect Dis. 27, 767–778 (2021).
23. The New York Times COVID-19 Data Team, Data from The New York Times. https://github.com/nytimes/covid-19-data. Accessed 20 September 2021.
24. The Covid Act Now COVID-19 Data Team, Data from Covid Act Now. https://covidactnow.org/data-api. Accessed 20 September 2021.
25. Surveillance Review and Response Group, Data from Centers for Disease Control and Prevention (CDC). https://covid.cdc.gov/covid-data-tracker/#national-lab. Accessed 20 September 2021.
26. K. L. Bajema et al., Estimated SARS-CoV-2 Seroprevalence in the US as of September 2020. JAMA. 181, 450–460 (2021).
27. H. Fort, A very simple model to account for the rapid rise of the alpha variant of SARS-CoV-2 in several countries and the world. Virus Res. 304, 198531 (2021).
28. H. Allen et al., Increased household transmission of COVID-19 cases associated with SARS-CoV-2 Variant of Concern B.1.617.2: a national case-control study. (2021). https://khub.net/documents/135939561/405676950/Increased+Household+Transmission+of+COVID-19+Cases+-+national+case+study.pdf/7f7764fb-ecb0-da31-77b3-b1a8ef7be9aa (accessed 9 July 2021).
29. P. Virtanen et al., SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 17, 261–272 (2020).
30. L. Petzold, Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM J. Sci. Comput. 4, 136–148 (1983).
31. M. L. Blinov, J. R. Faeder, B. Goldstein, W. S. Hlavacek, BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics. 20, 3289–3291 (2004).
32. S. D. Cohen, CVODE, a stiff/nonstiff ODE solver in C. Computers in physics. 10, 138–143 (1996).
33. S. K. Lam, A. Pitrou, S. Seibert, “Numba: A LLVM-based Python JIT compiler” in Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, (Association for Computing Machinery, New York, NY, 2015), pp. 1–6.
34. O. Diekmann, J. A. Heesterbeek, M. G. Roberts, The construction of next-generation matrices for compartmental epidemic models. J R Soc Interface. 7, 873–875 (2010).
35. S. Wolfram, Mathematica: A System for Doing Mathematics by Computer (Addison Wesley Longman Publishing Co., Inc., Boston, MA, 1991).
36. H. J. Wearing, P. Rohani, M. J. Keeling, Appropriate models for the management of infectious diseases. PLOS Med. 7, e174 (2005).
37. E. D. Mitra et al., PyBioNetFit and the Biological Property Specification Language. iScience. 19, 1012–1036 (2019).
38. F. J. Massey, The Kolmogorov-Smirnov test for goodness of fit. J. Am. Stat. Assoc. 46, 68–78 (1951).
39. Z. Shuai, P. van den Driessche, Global stability of infectious disease models using Lyapunov functions. SIAM J. Appl. Math. 73, 1513–1532 (2013).
40. I. Dorigatti et al., SARS-CoV-2 antibody dynamics and transmission from community-wide serological testing in the Italian municipality of Vo’. Nat Commun. 12, 1–11 (2021).
41. A. Fowlkes et al., Effectiveness of COVID-19 vaccines in preventing SARS-CoV-2 infection among frontline workers before and during B.1.617.2 (Delta) variant predominance—eight US locations, December 2020-August 2021. MMWR Morb Mortal Wkly Rep. 70, 1167–1169 (2021).
42. H. Kalish et al., Undiagnosed SARS-CoV-2 seropositivity during the first 6 months of the COVID-19 pandemic in the United States. Sci. Transl. Med. 13, eabh3826 (2021).
43. S. Takahashi, B. Greenhouse, I. Rodríguez-Barraquer, Are seroprevalence estimates for severe acute respiratory syndrome coronavirus 2 biased? J Infect. Dis. 222, 1772–1775 (2020).
44. H. E. Randolph, L. B. Barreiro, Herd immunity: understanding COVID-19. Immunity. 5, 737–741 (2020).
45. S. M. Moghadas, P. Sah, A. Shoukat, L. A. Meyers, A. P. Galvani, Population immunity against COVID-19 in the United States. Ann Intern Med. doi:10.7326/M21-2721 (2021).
46. Science Brief: COVID-19 vaccines and vaccination. (2021). https://www.cdc.gov/coronavirus/2019-ncov/science/science-briefs/fully-vaccinated-people.html (accessed 8 Sep 2021).
47. V. Stadnytskyi, C. E. Bax, A. Bax, P. Anfinrud, The airborne lifetime of small speech droplets and their potential importance in SARS-CoV-2 transmission. Proc Natl Acad Sci U.S.A. 117, 11875–11877 (2020).
48. M. Echternach et al., Impulse dispersion of aerosols during singing and speaking: A potential COVID-19 transmission pathway. Am J Respir Crit Care Med. 202, 1584–1587 (2020).
49. D. B. Larremore, B. K. Fosdick, S. Zhang, Y. H. Grad, Jointly modeling prevalence, sensitivity and specificity for optimal sample allocation. bioRxiv [Preprint] (2020). https://www.biorxiv.org/content/10.1101/2020.05.23.112649v1 (accessed 8 September 2021).
50. A. Gelman, B. Carpenter, Bayesian analysis of tests with unknown specificity and sensitivity. J R Stat Soc Ser C Appl Stat. 69, 1269–1283 (2020).
51. E. Bendavid et al., COVID-19 antibody seroprevalence in Santa Clara County, California. Int J Epidemiol. 50, 410–419 (2021).

References

1. Y. T. Lin et al., Daily forecasting of regional epidemics of coronavirus disease with Bayesian uncertainty quantification. Emerg Infect Dis. 27, 767–778 (2021).
2. O. Diekmann, J. A. Heesterbeek, M. G. Roberts, The construction of next-generation matrices for compartmental epidemic models. J R Soc Interface. 7, 873–875 (2010).
3. S. Wolfram, Mathematica: A System for Doing Mathematics by Computer (Addison Wesley Longman Publishing Co., Inc., Boston, MA, 1991).
4. E. D. Mitra et al., PyBioNetFit and the Biological Property Specification Language. iScience. 19, 1012–1036 (2019).

Posted September 28, 2021.

Subject Area

Epidemiology

[1] 1.↵
J. Gee et al., First month of COVID-19 vaccine safety monitoring—United States, December 14, 2020-January 13, 2021. MMWR Morb Mortal Wkly Rep. 70, 283–288 (2021).
OpenUrl CrossRef PubMed

[2] 2.↵
National Center for Immunization and Respiratory Diseases (NCIRD), Data from Centers for Disease Control and Prevention (CDC). https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-Jurisdi/unsk-b7fc. Accessed 20 September 2021.

[3] 3.↵
P. Fine, K. Eames, D. L. Heymann, “Herd immunity”: a rough guide. Clin Infect Dis. 7, 911–916 (2011).
OpenUrl

[4] 4.↵
B. Ridenhour, J. M. Kowalik, D. K. Shay, Unraveling R0: Considerations for public health applications. Am J Public Health. 108, S445–S454 (2018).
OpenUrl

[5] 5.↵
L. Temime et al., A conceptual discussion about the basic reproduction number of severe acute respiratory syndrome coronavirus 2 in healthcare settings. Clin Infect Dis., 72, 141–143 (2021).
OpenUrl CrossRef

[6] 6.↵
J. M. Dan et al., Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science. 371, eabf4063 (2021).
OpenUrl Abstract/FREE Full Text

[7] 7.↵
C.-J. Yu et al., Assessment of basic reproductive number for COVID-19 at global level: A meta-analysis. Medicine. 100, e25837 (2021).
OpenUrl

[8] 8.↵
A. J. Kucharski et al., Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet. 20, 553–558 (2020).
OpenUrl

[9] 9.
R. Li et al., Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 368, 489–493 (2020).
OpenUrl Abstract/FREE Full Text

[10] 10.
L. Ferretti et al., Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science. 368, eabb6936 (2020).
OpenUrl Abstract/FREE Full Text

[11] 11.
M. D’Arienzo, A. Coniglio, Assessment of the SARS-CoV-2 basic reproduction number, ℛ₀, based on the early phase of COVID-19 outbreak in Italy. Biosaf Health. 2, 57–59 (2020).
OpenUrl

[12] 12.↵
Sanche et al., High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. Emerg Infect Dis. 26, 1470–1477 (2020).
OpenUrl PubMed

[13] 13.↵
E. O. Romero-Severson, N. Hengartner, G. Meadors, R. Ke, Change in global transmission rates of COVID-19 through May 6 2020. PLOS ONE. 15, e0236776 (2020).
OpenUrl CrossRef PubMed

[14] 14.
R. Ke, E. O. Romero-Severson, S. Sanche, N. Hengartner, Estimating the reproductive number ℛ₀ of SARS-CoV-2 in the United States and eight European countries and implications for vaccination. J Theor Biol. 517, 110621 (2021).
OpenUrl CrossRef

[15] 15.↵
J. D. Kong, E. W. Tekwa, S. A. Gignoux-Wolfsohn, Social, economic, and environmental factors influencing the basic reproduction number of COVID-19 across countries. PLOS ONE. 16, e0252373 (2021).
OpenUrl

[16] 16.↵
A. R. Ives, C. Bozzuto, State-by-State estimates of R0 at the start of COVID-19 outbreaks in the USA. medRxiv [Preprint] (2020). https://www.medrxiv.org/content/10.1101/2020.05.17.20104653v3 (accessed 4 September 2021).

[17] 17.↵
I. E. Fellows, R. B. Slayton, A. J. Hakim, The COVID-19 pandemic, community mobility and the effectiveness of non-pharmaceutical interventions: The United States of America, February to May 2020. arXiv [Preprint] (2020). https://arxiv.org/abs/2007.12644 (accessed 8 September 2021).

[18] 18.↵
O. Milicevic et al., PM2.5 as a major predictor of COVID-19 basic reproduction number in the USA. Environmental Research. 201, 111526 (2021).
OpenUrl

[19] 19.↵
A. R. Ives, C. Bozzuto, Estimating and explaining the spread of COVID-19 at the county level in the USA. Commun Biol. 4, 1–9 (2021).
OpenUrl

[20] 20.↵
K. T. Sy, L. F. White, B. E. Nichols, Population density and basic reproductive number of COVID-19 across United States counties. PLOS ONE. 16, e0249271 (2021).
OpenUrl PubMed

[21] 21.↵
C. S. Weissert, M. J. Uttermark, K. R. Mackie, A. Artiles, Governors in control: Executive orders, state-local preemption, and the COVID-19 pandemic. Publius. 51, 396–428 (2021).
OpenUrl

[22] 22.↵
Y. T. Lin et al., Daily forecasting of regional epidemics of coronavirus disease with bayesian uncertainty quantification. Emerg Infect Dis. 27, 767–778 (2021).
OpenUrl

[23] 23.↵
The New York Times COVID-19 Data Team, Data from The New York Times. https://github.com/nytimes/covid-19-data. Accessed 20 September 2021.

[24] 24.↵
The Covid Act Now COVID-19 Data Team, Data from Covid Act Now. https://covidactnow.org/data-api. Accessed 20 September 2021.

[25] 25.↵
Surveillance Review and Response Group, Data from Centers for Disease Control and Prevention (CDC). https://covid.cdc.gov/covid-data-tracker/#national-lab. Accessed 20 September 2021.

[26] 26.↵
K. L. Bajema et al., Estimated SARS-CoV-2 Seroprevalence in the US as of September 2020. JAMA. 181, 450–460 (2021).
OpenUrl

[27] 27.↵
H. Fort, A very simple model to account for the rapid rise of the alpha variant of SARS-CoV-2 in several countries and the world. Virus Res. 304, 198531 (2021).
OpenUrl

[28] 28.↵
H. Allen et al., Increased household transmission of COVID-19 cases associated with SARS-CoV-2 Variant of Concern B.1.617.2: a national case-control study. (2021). https://khub.net/documents/135939561/405676950/Increased+Household+Transmission+of+COVID-19+Cases+-+national+case+study.pdf/7f7764fb-ecb0-da31-77b3-b1a8ef7be9aa (accessed 9 July 2021).

[29] 29.↵
P. Virtanen et al., SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 17, 261–272 (2020).
OpenUrl CrossRef PubMed

[30] 30.↵
L. Petzold, Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM J. Sci. Comput. 4, 136–148 (1983).
OpenUrl

[31] 31.↵
M. L. Blinov, J. R. Faeder, B. Goldstein, W. S. Hlavacek, BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics. 20, 3289–3291 (2004).
OpenUrl CrossRef PubMed Web of Science

[32] 32.↵
S. D. Cohen, CVODE, a stiff/nonstiff ODE solver in C. Computers in physics. 10, 138–143 (1996).
OpenUrl

[33] 33.↵
S. K. Lam, A. Pitrou, S. Seibert, “Numba: A LLVM-based Python JIT compiler” in Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, (Association for Computing Machinery, New York, NY, 2015), pp. 1–6.

[34]
O. Diekmann, J. A. Heesterbeek, M. G. Roberts, The construction of next-generation matrices for compartmental epidemic models. J R Soc Interface. 7, 873–885 (2010).

[35]
S. Wolfram, Mathematica: A System for Doing Mathematics by Computer (Addison Wesley Longman Publishing Co., Inc., Boston, MA, 1991).

[36]
H. J. Wearing, P. Rohani, M. J. Keeling, Appropriate models for the management of infectious diseases. PLOS Med. 2, e174 (2005).

[37]
E. D. Mitra et al., PyBioNetFit and the Biological Property Specification Language. iScience. 19, 1012–1036 (2019).

[38]
F. J. Massey, The Kolmogorov-Smirnov test for goodness of fit. J. Am. Stat. Assoc. 46, 68–78 (1951).

[39]
Z. Shuai, P. van den Driessche, Global stability of infectious disease models using Lyapunov functions. SIAM J. Appl. Math. 73, 1513–1532 (2013).

[40]
I. Dorigatti et al., SARS-CoV-2 antibody dynamics and transmission from community-wide serological testing in the Italian municipality of Vo’. Nat Commun. 12, 1–11 (2021).

[41]
A. Fowlkes et al., Effectiveness of COVID-19 vaccines in preventing SARS-CoV-2 infection among frontline workers before and during B.1.617.2 (Delta) variant predominance—eight US locations, December 2020-August 2021. MMWR Morb Mortal Wkly Rep. 70, 1167–1169 (2021).

[42]
H. Kalish et al., Undiagnosed SARS-CoV-2 seropositivity during the first 6 months of the COVID-19 pandemic in the United States. Sci. Transl. Med. 13, eabh3826 (2021).

[43]
S. Takahashi, B. Greenhouse, I. Rodríguez-Barraquer, Are seroprevalence estimates for severe acute respiratory syndrome coronavirus 2 biased? J Infect Dis. 222, 1772–1775 (2020).

[44]
H. E. Randolph, L. B. Barreiro, Herd immunity: understanding COVID-19. Immunity. 52, 737–741 (2020).

[45]
S. M. Moghadas, P. Sah, A. Shoukat, L. A. Meyers, A. P. Galvani, Population immunity against COVID-19 in the United States. Ann Intern Med. doi:10.7326/M21-2721 (2021).

[46]
Science Brief: COVID-19 vaccines and vaccination. (2021). https://www.cdc.gov/coronavirus/2019-ncov/science/science-briefs/fully-vaccinated-people.html (accessed 8 Sep 2021).

[47]
V. Stadnytskyi, C. E. Bax, A. Bax, P. Anfinrud, The airborne lifetime of small speech droplets and their potential importance in SARS-CoV-2 transmission. Proc Natl Acad Sci U.S.A. 117, 11875–11877 (2020).

[48]
M. Echternach et al., Impulse dispersion of aerosols during singing and speaking: A potential COVID-19 transmission pathway. Am J Respir Crit Care Med. 202, 1584–1587 (2020).

[49]
D. B. Larremore, B. K. Fosdick, S. Zhang, Y. H. Grad, Jointly modeling prevalence, sensitivity and specificity for optimal sample allocation. bioRxiv [Preprint] (2020). https://www.biorxiv.org/content/10.1101/2020.05.23.112649v1 (accessed 8 September 2021).

[50]
A. Gelman, B. Carpenter, Bayesian analysis of tests with unknown specificity and sensitivity. J R Stat Soc Ser C Appl Stat. 69, 1269–1283 (2020).

[51]
E. Bendavid et al., COVID-19 antibody seroprevalence in Santa Clara County, California. Int J Epidemiol. 50, 410–419 (2021).


Calculation of epidemic parameters ℛ₀ and λ