Abstract
The United States has the highest number of confirmed cases of COVID-19 in the world. The early hot spot states were New York, New Jersey, and Connecticut. The workforce in these states was required to work from home except for essential services. It was necessary to evaluate an appropriate date for the resumption of business, since reopening the economy prematurely would lead to a broader spread of COVID-19, while reopening too late would cause greater economic loss. To reflect the real-time risk of the spread of COVID-19, it was crucial to estimate the number of infected individuals who had not yet been, or never would be, confirmed, because of the pre-symptomatic and asymptomatic transmission of COVID-19. To this end, we proposed an epidemic model and applied it to evaluate the real-time risk of the epidemic for the states of New York, New Jersey, and Connecticut. We used California as the benchmark state because California began a phased reopening on May 8, 2020. The dates on which the estimated numbers of unidentified infectious individuals per 100,000 for New York, New Jersey, and Connecticut were close to those in California on May 8, 2020, were June 1, 22, and 22, 2020, respectively. Following the practice in California, New York, New Jersey, and Connecticut might consider reopening their businesses. Meanwhile, according to our simulation models, to prevent a resurgence of infections after reopening the economy, it would be crucial to maintain sufficient social distancing measures after the resumption of business. This precaution turned out to be critical: the situation in California deteriorated quickly after our analysis was completed, and its interventions after the reopening of business were not as effective as those in New York, New Jersey, and Connecticut.
1 Introduction
The outbreak of novel coronavirus disease (COVID-19) has spread to over 200 countries since December 2019 (National Health Commission of the People’s Republic of China, 2020). It is unprecedented to have over 7 million cumulative confirmed cases of COVID-19 worldwide at the beginning of June 2020 (World Health Organization, 2020c). The “battle” against COVID-19 in China has provided experience with, and likely outcomes of, certain interventions for the areas that are currently hard hit. Because COVID-19 was a novel and acute infectious disease, its transmission mechanisms were unknown at the early stage of the epidemic, and the Chinese government implemented relatively strict non-pharmaceutical interventions in the hot spot areas: public transportation within and out of the cities in Hubei province was suspended from January 23, 2020 (Chinese Center for disease control and prevention, 2020). Residents nationwide were advised to stay at home except for essential needs. The holiday season of the Chinese Spring Festival was prolonged until late February, when essential services gradually resumed operation outside Hubei province (The State Council, The People’s Republic of China, 2020b). In April 2020, a comprehensive resumption of business started in China (The State Council, The People’s Republic of China, 2020a).
In late January 2020, the United States began reporting confirmed cases of COVID-19 (Holshue et al., 2020). There were over 1,000 cumulative confirmed cases on March 13, 2020 (World Health Organization, 2020a), when the White House declared a national emergency concerning the COVID-19 outbreak (The White House, 2020b); it issued “call to action” coronavirus guidelines on March 16, 2020 (The White House, 2020a). The United States has become the country hit hardest by COVID-19, with 366,346, 154,154, 40,468, and 94,743 cumulative confirmed cases in the states of New York, New Jersey, Connecticut, and California, respectively, by May 24, 2020 (The Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, 2020). Making things worse, New York and California are among the largest contributors to the real gross domestic product (GDP) of the United States (Figure 1) (The United States Census Bureau, 2020).
The percentages of Gross Domestic Product of New York, New Jersey, Connecticut, and California in 2018.
The state of New York reported the first confirmed case of COVID-19 on March 1, 2020, and proclaimed an executive order on March 16, 2020, which included reducing the local government workforce by half, allowing the statewide nonessential workforce to work from home starting on March 17, 2020, and closing all schools starting on March 18, 2020 (State of New York Governor Andrew Cuomo, 2020g). Due to the rapidly increasing number of additional cases of COVID-19 in the state, the governor announced an aggressive policy of “New York State on PAUSE” on March 20, 2020 (State of New York Governor Andrew Cuomo, 2020c), and required all people in the state to wear masks or face coverings in public from April 15, 2020 (State of New York Governor Andrew Cuomo, 2020f). On May 1, 2020, it was announced that all statewide K-12 schools and college facilities would remain closed for the rest of the academic year (State of New York Governor Andrew Cuomo, 2020a). The guide to the “NY Forward Reopening” Plan was available on May 11, 2020, and three regions reopened businesses for phase one on May 15, 2020 (State of New York Governor Andrew Cuomo, 2020b). By May 24, 2020, seven regions had reopened for business (State of New York Governor Andrew Cuomo, 2020e).
The state of New Jersey reported the first positive case of COVID-19 on March 4, 2020 (State of New Jersey Governor Phil Murphy, 2020a). The governor of New Jersey recommended the cancellation of statewide public gatherings of over 250 individuals from March 12, 2020, suspended visits to state prisons and statewide halfway houses effective on March 14, 2020, and closed restaurants, bars, movie theaters, gyms, and casinos from March 16, 2020 (State of New Jersey Governor Phil Murphy, 2020e). The governor announced an order including a statewide stay-at-home requirement and the closure of statewide non-essential retail industries on March 21, 2020 (State of New Jersey Governor Phil Murphy, 2020c). State parks and golf courses reopened on May 2, 2020 (State of New Jersey Governor Phil Murphy, 2020g). Among other activities, construction and non-essential retail provisions were allowed to resume on May 18, 2020 (State of New Jersey Governor Phil Murphy, 2020f). In-person sales were authorized to resume at car, motorcycle, and boat dealerships and bike shops, by appointment only and with social distancing measures, on May 20, 2020 (State of New Jersey Governor Phil Murphy, 2020b).
The state of Connecticut reported the first positive case on March 8, 2020 (State of Connecticut Governor Ned Lamont, 2020c). New York, New Jersey, and Connecticut announced measures to slow the spread of COVID-19 throughout the tri-state area on March 16, 2020 (State of Connecticut Governor Ned Lamont, 2020f). The governor signed an executive order for businesses and residents “stay safe, stay home” on March 20, 2020 (State of Connecticut Governor Ned Lamont, 2020e). Statewide K-12 schools were announced to remain closed for the rest of the academic year on May 5, 2020 (State of Connecticut Governor Ned Lamont, 2020b). The restaurants, offices, retail stores, outdoor museum, and zoos were authorized to reopen as the first phase on May 20, 2020 (State of Connecticut Governor Ned Lamont, 2020d) and hair salons and barbershops were aligned to reopen in early June (State of Connecticut Governor Ned Lamont, 2020a). New York, New Jersey, and Connecticut signed a multi-state agreement on the reopening of beaches on May 22, 2020 (State of New Jersey Governor Phil Murphy, 2020d).
The state of California reported two positive cases of COVID-19 on January 26, 2020 (Office of Public Affairs, 2020d). One month later, there was the first possible case of local transmission of COVID-19 in California (Office of Public Affairs, 2020a). The “stay home except for essential needs” order was issued on March 19, 2020, under which all individuals were required to stay at home except for the workforce in 16 critical infrastructure sectors (Office of Public Affairs, 2020b). On May 7, 2020, the state public health officer determined that the entire state would gradually move to Stage 2 of California’s Pandemic Resilience Roadmap, i.e., reopening the lower-risk workplaces and other spaces (Office of Public Affairs, 2020c). California therefore began a phased reopening on May 8, 2020.
Due to the epidemic of COVID-19, the entire social system slowed down and unemployment in the total nonfarm sector increased dramatically (Figure 8 in the supplement). The unemployment rates were 14.7% and 13.3% in April and May 2020 (U.S. Department of Labor, 2020). Thus, the timing of restoring the economy in the United States, especially for the states of New York, New Jersey, and Connecticut, had been the most significant and consequential decision for the president of the United States and the governors of the states. For people to return to work, safety is a key concern. On April 13, 2020, seven states, including New York, New Jersey, and Connecticut, joined an effort to form a multi-state council to reopen the economy while combating COVID-19 (State of New York Governor Andrew Cuomo, 2020d). The timelines of interventions in New York, New Jersey, Connecticut, and California are displayed in Figure 2.
The timeline of interventions against COVID-19 in the states of New York, New Jersey, Connecticut, and California.
2 Methods
2.1 Data collection
This work began in April and was initially completed in May 2020, when there were no widespread protests against social injustice in the United States (Minnesota Daily, 2020). To emphasize the effectiveness of our model for evaluating the real-time risk of the epidemic, we chose to focus on the data before June 2020. This strategy also gave us the opportunity to contrast our predictions and recommendations with what took place after June 2020.
We collected the epidemic data from March 13, 2020 when the national emergency concerning COVID-19 was proclaimed to May 24, 2020 in New York, New Jersey, Connecticut, and California. The data were made available by the New York Times (The New York Times, 2020).
2.2 Bayesian modelling of epidemic
Based on a WHO report, the transmission of COVID-19 could be caused by the individuals infected with the virus before significant symptoms developed (World Health Organization, 2020b), or even the carriers who did not develop symptoms. Pre-symptomatic transmission and asymptomatic transmission interfere with our ability to understand the real magnitude of COVID-19 because of the lag from the time of catching the virus to the time of being confirmed by testing. To overcome this issue, we divided a concerned population into four compartments: susceptible (S), unidentified infectious (I), self-healing without being confirmed (H), and confirmed cases (C).
The susceptible (S) individuals have no immunity to the disease and are the majority of the population at an early stage of the epidemic.
Unidentified infectious (I) individuals are infectious but not confirmed individuals, and can be divided into two groups: those who would eventually develop symptoms and the others who would not develop symptoms called asymptomatic carriers.
Self-healing individuals without being confirmed (H) are assumed to be no longer infectious and resistant to COVID-19.
Confirmed cases (C) include two groups of individuals: patients in the hospital and asymptomatic carriers who are supposed to stay at home. Both groups are assumed to be unable to transmit the disease.
We assumed that individuals in compartment I would eventually move into either compartment H or compartment C.
We introduced the Susceptible (S)-Unidentified infectious (I)-Self-healing without being confirmed (H)-Confirmed cases (C) (SIHC) model to accommodate the four compartments above. Figure 3 presents the flowchart among them.
The transition diagram among compartments S, I, H, and C.
We assumed the following dynamic system in terms of the numbers of individuals in compartments S, I, H, and C at time t, denoted by S(t), I(t), H(t), and C(t), respectively,
where ρ is the transmissibility and θ(t) is the average contact number per person at time t (Blackwood and Childs, 2018), which was assumed to be time-varying; DH is the average duration from catching the virus to self-healing without being confirmed; DC is the average duration from catching the virus to being confirmed by testing; and N is the total population.
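A compartmental system consistent with these definitions can be sketched as follows. The specific form is an assumption inferred from the parameter descriptions (new infections at rate ρθ(t)S(t)I(t)/N, exits from I at rates 1/DH and 1/DC) and should not be read as a quotation of system (1).

```latex
\begin{aligned}
\frac{dS(t)}{dt} &= -\rho\,\theta(t)\,\frac{S(t)\,I(t)}{N},\\
\frac{dI(t)}{dt} &= \rho\,\theta(t)\,\frac{S(t)\,I(t)}{N} - \frac{I(t)}{D_H} - \frac{I(t)}{D_C},\\
\frac{dH(t)}{dt} &= \frac{I(t)}{D_H},\\
\frac{dC(t)}{dt} &= \frac{I(t)}{D_C}.
\end{aligned}
```

In this sketch the four derivatives sum to zero, so S(t) + I(t) + H(t) + C(t) = N is preserved.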
Let α(t) = ρθ(t). θ(t) can be controlled by policy interventions. Typically, it is a constant at an early stage of the epidemic and decreases as the interventions are implemented until the interventions take full effect when it reaches or nears the lowest level. Based on this point of view, we further assumed that α(t) was a monotonically decreasing curve (Tan et al., 2020) with four parameters α0, d, m, and η:
where α0 denotes the maximum of α(t) at an early stage of the outbreak, d is the time it takes for the control measures to start taking effect and for α(t) to start declining, and (1 − η) is the ratio of the minimum of α(t), which is α0(1 − η), to α0. λm was chosen as
so that α(t) first reaches the minimum at d + m. Figure 4 illustrates the shape of α(t).
An illustration of α(t).
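The qualitative shape described above (constant at α0 until day d, decreasing afterwards, and flat at α0(1 − η) from day d + m) can be illustrated with a short sketch. The half-cosine ramp used here is a stand-in chosen for simplicity; it is an assumption and not the functional form of the source, which involves λm. The parameter values in the usage line are hypothetical.

```python
import math

def alpha(t, alpha0, d, m, eta):
    """Illustrative stand-in for alpha(t); the ramp shape is an assumption."""
    if t <= d:
        # Before interventions take effect, the rate stays at its maximum.
        return alpha0
    if t >= d + m:
        # After d + m the rate stays at its minimum alpha0 * (1 - eta).
        return alpha0 * (1.0 - eta)
    # Half-cosine ramp from alpha0 down to alpha0 * (1 - eta) over (d, d + m).
    frac = (t - d) / m
    return alpha0 * (1.0 - eta * 0.5 * (1.0 - math.cos(math.pi * frac)))

# Hypothetical values, for illustration only.
curve = [alpha(t, alpha0=0.4, d=8, m=30, eta=0.8) for t in range(60)]
```

Any monotone ramp with the same endpoints would serve the same illustrative purpose.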
Next, we derived the time-varying reproductive rate Rt (Anderson et al., 1992; Jones, 2007) by our SIHC model:
![Embedded Image](https://www.medrxiv.org/sites/default/files/highwire/medrxiv/early/2020/09/07/2020.05.16.20103747/embed/graphic-7.gif)
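For orientation, a reproductive rate of this kind can be obtained by a standard argument, sketched here under the assumption (not quoted from the source) that new infections arise at rate α(t)S(t)I(t)/N and that individuals leave compartment I at rates 1/DH and 1/DC. An infectious individual then remains in I for an average of 1/(1/DH + 1/DC) days and generates α(t)S(t)/N infections per day. The expression displayed above may differ in detail.

```latex
R_t \;\approx\; \alpha(t)\,\frac{S(t)}{N}\cdot\frac{1}{1/D_H + 1/D_C}
\;\approx\; \frac{\alpha(t)\,D_H\,D_C}{D_H + D_C}
\quad\text{when } S(t)\approx N .
```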
Let Ω = (α0,η,m,DC) and Θ = (Ω,d, DH), where d and DH were given and the others needed to be estimated. Note that
and Zt = (It, Ht, Ct) was assumed to be a latent Markov process:
where
was the evolving operator to determine the values of I, H and C at time (t + 1) given the deterministic dynamic system (1) with the initial value of (N − It − Ht − Ct, It, Ht, Ct) and parameters Θ. Pois(·|υ) is the mass of a multi-dimensional independent Poisson distribution with mean vector υ. Similarly, we defined
as the components of
. Since N, St ≫ It,Ht,Ct, the conditional independent Poisson likelihood of (It+1, Ht+1, Ct+1) is the Poisson approximation for the multinomial likelihood of (St+1,It+1,Ht+1,Ct+1), whose incident rate is
and the total number is N (Tan et al., 2020).
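The latent Markov step described above can be sketched in code: advance the deterministic system by one day to obtain a mean vector, then score the next state (I, H, C) with independent Poisson masses. The one-day Euler update and the function names are assumptions made for illustration; the source's evolving operator may integrate the system differently.

```python
import math

def step_means(i, h, c, n, alpha_t, d_h, d_c):
    """One-day Euler step of an assumed SIHC system; returns means for (I, H, C)."""
    s = n - i - h - c              # susceptibles implied by the other compartments
    new_inf = alpha_t * s * i / n  # new infections during the day
    to_h = i / d_h                 # self-healing without being confirmed
    to_c = i / d_c                 # confirmed by testing
    return (i + new_inf - to_h - to_c, h + to_h, c + to_c)

def poisson_logpmf(k, mean):
    """Log of the Poisson mass at count k; requires mean > 0."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def transition_loglik(z_next, z_now, n, alpha_t, d_h, d_c):
    """Log-likelihood of z_next = (I, H, C) given z_now, with independent Poisson components."""
    means = step_means(*z_now, n, alpha_t, d_h, d_c)
    return sum(poisson_logpmf(k, mu) for k, mu in zip(z_next, means))
```

The sketch assumes I > 0 so that every Poisson mean is strictly positive.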
The observable data were C1:T and N, where T was the length of the observation period. We set H1 = 0 and treated I1:T and H2:T as latent variables. For simplicity, we assumed that α(t) = α(k) if k ≤ t < k + 1, and that the value of DH was given. We employed a Bayesian procedure in our parameter estimation. The posterior distribution of the parameters was (Bolstad and Curran, 2016):
where π(·) represents the prior distribution of corresponding parameter and π(·| *) represents the posterior distribution of corresponding parameter given the observed data “*”.
For prior distributions, notice that DC was governed by the mean of the incubation period. Based on the related literature on the incubation period of COVID-19 (Lauer et al., 2020), we chose an informative prior: the log-normal distribution with a log-mean of log(5.1) and a log-standard deviation of log(1.05) as the prior distribution of DC. Similarly, we chose the log-normal distribution with a log-mean of log(30) and a log-standard deviation of log(1.05) as the prior distribution of m. In other words, the interventions were assumed to take full effect after about one month. The priors of the remaining parameters were chosen to be non-informative or flat priors, i.e., π(α0), π(η) ∝ 1.
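As a minimal sketch of how these priors could be evaluated inside a sampler, the log-prior below combines the two log-normal densities with flat terms for α0 and η. The support restrictions (α0 > 0 and 0 ≤ η ≤ 1) and the function names are assumptions added for illustration.

```python
import math

def lognormal_logpdf(x, log_mean, log_sd):
    """Log density of a log-normal distribution parameterized on the log scale."""
    z = (math.log(x) - log_mean) / log_sd
    return -math.log(x * log_sd * math.sqrt(2.0 * math.pi)) - 0.5 * z * z

def log_prior(alpha0, eta, m, d_c):
    """Log-prior up to an additive constant; flat in alpha0 and eta on an assumed support."""
    if alpha0 <= 0 or not (0.0 <= eta <= 1.0) or m <= 0 or d_c <= 0:
        return -math.inf
    return (lognormal_logpdf(d_c, math.log(5.1), math.log(1.05))
            + lognormal_logpdf(m, math.log(30.0), math.log(1.05)))
```

With a log-standard deviation of log(1.05), the central 95% of the prior for DC spans roughly 4.6 to 5.6 days, so both informative priors are tight around their medians.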
For the fixed parameters d and DH, recall that d was the waiting time for interventions to begin. We chose d as the start of implementing certain interventions, i.e., d = 8 for New York (see “New York State on PAUSE” in Figure 2), d = 9 for New Jersey (see “statewide stay at home” in Figure 2), d = 8 for Connecticut (see “stay safe, stay home” in Figure 2), and d = 7 for California (see “stay home except for essential needs” in Figure 2). DH was assumed to be 9.5 according to the clinical study of asymptomatic cases (Hu et al., 2020).
Since π(Ω,I1:T,H2:T|C1:T, N, d, DH,H1) has no closed form, we used Markov Chain Monte Carlo (MCMC) (Ghosh et al., 2006; Andrieu et al., 2003; Chib and Greenberg, 1995; Soubeyrand, 2016) to approximate the posterior distributions of the parameters Ω and the latent numbers of I and H at each iteration. The Supplement provides the details of the sampling algorithm. The data and the R files relevant to the analysis in this study are available at https://github.com/tingT0929/Resumption-of-business. The point estimates of Ω, I1:T and H2:T were the medians of the posterior distributions, while 95% credible intervals were constructed from the 2.5% and 97.5% quantiles.
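The stated values of d can be reproduced from the order dates given in the Introduction, assuming that March 13, 2020 (the first day of the data window) is counted as day 1. That counting convention is an assumption; it is the one that matches the values above.

```python
from datetime import date

START = date(2020, 3, 13)  # first day of the observation window, counted as day 1

# Dates of the stay-at-home style orders, as reported in the Introduction.
ORDERS = {
    "New York": date(2020, 3, 20),     # "New York State on PAUSE"
    "New Jersey": date(2020, 3, 21),   # statewide stay at home
    "Connecticut": date(2020, 3, 20),  # "stay safe, stay home"
    "California": date(2020, 3, 19),   # "stay home except for essential needs"
}

def day_index(day, start=START):
    """Day number within the observation window, with the start date as day 1."""
    return (day - start).days + 1

d_values = {state: day_index(day) for state, day in ORDERS.items()}
```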
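The posterior summaries described above (median as point estimate, 2.5% and 97.5% quantiles as the credible interval) can be computed from MCMC draws as in the following sketch. The linear-interpolation quantile rule is one common convention and is an assumption here; the R code in the repository may use a different rule.

```python
def quantile(sorted_vals, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def summarize(draws):
    """Posterior median and 95% credible interval from a list of draws."""
    vals = sorted(draws)
    return {
        "median": quantile(vals, 0.5),
        "lower": quantile(vals, 0.025),
        "upper": quantile(vals, 0.975),
    }
```

The same function applies to each parameter in Ω and to each latent It and Ht.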
3 Data analysis
3.1 Model estimation
We used publicly available data to simulate the possible outcomes of the outbreak of COVID-19 by varying the dates when the business reopens. Table 1 presented the parameter estimates. These estimates were then used in our simulation models for potential second waves of COVID-19 while we assessed the risk for people to return to work. The data after our models were built were used to assess the validity of our simulation models.
The estimated parameters.
Briefly, the average relative errors between our predicted and the observed numbers of cases from May 25 to June 20, 2020, were 0.34%, 0.45%, 0.59%, and 3.56% in New York, New Jersey, Connecticut, and California, respectively. These low levels of error suggested accurate prediction from our models. The estimated trends for the four states are given in Figure 5.
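One plausible reading of the average relative error is the mean over days of |predicted − observed| / observed. The sketch below implements that reading; the exact definition is an assumption, and the numbers in the usage line are hypothetical.

```python
def average_relative_error(predicted, observed):
    """Mean of |predicted - observed| / observed over paired daily values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / o for p, o in zip(predicted, observed))
    return total / len(observed)

# Hypothetical cumulative counts, for illustration only.
error = average_relative_error([101, 198], [100, 200])
```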
The state-specific trends of simulated confirmed cases (C) and unidentified infected (I) per 100,000 from March 13 to June 6, 2020. The points represent the observed cumulative confirmed cases.
The estimated rates of infected individuals without being confirmed on May 24, 2020 (i.e., the estimated (It + Ht)/(It + Ht + Ct) with t set to May 24, 2020) were 22.82% with 95% CI (21.86%, 24.68%), 32.38% with 95% CI (31.36%, 33.09%), 34.95% with 95% CI (33.04%, 36.32%), and 37.74% with 95% CI (35.72%, 38.53%) in New York, New Jersey, Connecticut, and California, respectively. It had been reported that the numbers of tests per 100,000 in New York, New Jersey, Connecticut, and California on May 24, 2020, were 9312, 7434, 6321, and 4396, respectively (Jin, 2020). Across these four states, a higher rate of testing went with a lower rate of unidentified infections.
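The relationship between testing and unidentified infections can be checked directly from the eight numbers reported above. The sketch orders the states by tests per 100,000 (highest first) and by estimated unconfirmed rate (lowest first). With only four states, such a check can show a consistent ordering but cannot establish a particular functional form.

```python
# Values as reported in the text for May 24, 2020.
tests_per_100k = {"New York": 9312, "New Jersey": 7434, "Connecticut": 6321, "California": 4396}
unconfirmed_pct = {"New York": 22.82, "New Jersey": 32.38, "Connecticut": 34.95, "California": 37.74}

# States from most to least tested, and from lowest to highest unconfirmed rate.
by_tests = sorted(tests_per_100k, key=tests_per_100k.get, reverse=True)
by_unconfirmed = sorted(unconfirmed_pct, key=unconfirmed_pct.get)

same_ordering = by_tests == by_unconfirmed
```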
To visualize the importance of the unobserved number of unidentified infectious individuals, rather than the observed confirmed cases, for the assessment of epidemic risk, we displayed the comparison between the estimated number of new daily infected individuals and observed new daily confirmed cases in Figure 9 in the Supplement. Note that there were gaps between the number of new daily confirmed cases and new infected individuals in Figure 9 as a result of the pre-symptomatic and asymptomatic transmissions of COVID-19. This phenomenon was the lagging effect for new daily confirmed cases, reflecting the time interval from catching the virus to being confirmed by testing. Figure 9 indicated that we cannot ignore the lagging effect at an early stage of the epidemic, although it seemed to disappear over time.
3.2 Evaluate the risk for reopening the economy
To appreciate the potential risk of COVID-19, we considered the estimated numbers of unidentified infectious individuals per 100,000 for New York, New Jersey, Connecticut, and California; see Figure 6. This figure indicated that the peak of unidentified infectious individuals in New York, New Jersey, and Connecticut had passed. So far, this has remained the case.
California reopened the lower-risk workplaces as Stage 2 on May 8, 2020. This decision of California was used as a reference in our consideration of resuming the business in New York, New Jersey, and Connecticut. Compared with California, the numbers of unidentified infected individuals in New York, New Jersey, and Connecticut were higher before May 8, 2020, but on clearly descending trajectories. However, while the number was low, the steady upward trajectory in California was concerning. This upward trajectory coupled with insufficient interventions might be the major cause for the resurgence of COVID-19 in California after the reopening.
The trend of the numbers of unidentified infectious individuals per 100,000 for states of New York, New Jersey, Connecticut, and California from March 13 to July 20, 2020. The corresponding 95% credible intervals were represented in each state.
To balance the risk of the epidemic against the resumption of business, we considered the Mondays in June 2020 as possible dates of reopening. We used our simulation models to predict the numbers of unidentified infectious individuals per 100,000 for New York, New Jersey, Connecticut, and California on June 1, June 8, June 15, June 22, and June 29, 2020. The results are given in Table 2. Common wisdom is that the risk is regarded as reasonably low for a population when the number of infected persons per 100,000 is close to 20. This guideline has been used to form cross-state travel guidelines in the United States. We therefore used it as a rationale for a safe resumption of business.
The estimated numbers of unidentified infectious individuals per 100,000 in the four states in the different Mondays.
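The selection rule described above can be sketched as: list the candidate Mondays, then pick the first one on which the estimated number of unidentified infectious individuals per 100,000 is at or below a reference level. The estimates in the usage lines are hypothetical placeholders rather than values from Table 2, and the helper names are assumptions.

```python
from datetime import date, timedelta

def mondays_in_june_2020():
    """All Mondays in June 2020, the candidate reopening dates."""
    day, out = date(2020, 6, 1), []
    while day.month == 6:
        if day.weekday() == 0:  # Monday
            out.append(day)
        day += timedelta(days=1)
    return out

def first_date_at_or_below(estimates, reference):
    """Earliest date whose estimate per 100,000 is at or below the reference level."""
    for day in sorted(estimates):
        if estimates[day] <= reference:
            return day
    return None

# Hypothetical estimates per 100,000, for illustration only.
candidates = mondays_in_june_2020()
hypothetical = dict(zip(candidates, [35.0, 28.0, 22.0, 19.0, 15.0]))
chosen = first_date_at_or_below(hypothetical, reference=20.0)
```

The reference can be either a benchmark state's level on its reopening date or a fixed guideline such as 20 per 100,000.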
To appreciate the potential risk of resuming business on different Mondays, we simulated the possible outcomes after people returned to work. We assumed that stringent interventions would be re-enforced one week after the business was reopened. We used the estimated parameters in Table 1 as the parameters underlying the COVID-19 transmission.
The simulated results are given in Figure 7. Briefly, for New York, the simulated numbers of cumulative confirmed cases per 100,000 on July 20 after the business was resumed on June 1, June 8, June 15, June 22, and June 29 were, respectively, 20.26%, 14.76%, 10.37%, 6.98%, and 4.37% higher than those if the business was not resumed. For New Jersey, the corresponding increases were 43.91%, 33.07%, 24.15%, 16.72%, and 10.70%. For Connecticut, the corresponding increases were 37.18%, 28.08%, 20.45%, 14.29%, and 9.12%.
The possible outbreak of epidemic after people returned to work on different Mondays in New York, New Jersey, and Connecticut. The points represented the observed cumulative confirmed cases (per 100,000) in these states.
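The reopening scenario can be illustrated with a toy simulation: the contact rate rises for one week after the reopening day and then returns to its lower level, and cumulative confirmed cases at the horizon are compared with a no-reopening baseline. The daily Euler update, the population size, the initial number of infectious individuals, and both contact rates are hypothetical; only DH = 9.5 and the prior median of 5.1 for DC come from the text. The sketch is not the simulation used for Figure 7.

```python
def simulate_confirmed(days, reopen_day, n=1_000_000, i0=1000.0,
                       alpha_low=0.2, alpha_high=0.4, d_h=9.5, d_c=5.1):
    """Cumulative confirmed cases after `days` under an assumed SIHC update."""
    s, i, h, c = n - i0, i0, 0.0, 0.0
    for t in range(days):
        # Contacts are relaxed for one week after reopening, then restricted again.
        relaxed = reopen_day is not None and reopen_day <= t < reopen_day + 7
        a = alpha_high if relaxed else alpha_low
        new_inf = a * s * i / n
        to_h, to_c = i / d_h, i / d_c
        s, i, h, c = s - new_inf, i + new_inf - to_h - to_c, h + to_h, c + to_c
    return c

def percent_increase(reopened, baseline):
    """Relative increase of the reopening scenario over the baseline, in percent."""
    return 100.0 * (reopened - baseline) / baseline

baseline = simulate_confirmed(60, None)
early = percent_increase(simulate_confirmed(60, 10), baseline)
late = percent_increase(simulate_confirmed(60, 17), baseline)
```

In this toy setting, where the number of infectious individuals is declining before reopening, the earlier reopening produces the larger relative increase, which matches the direction of the results reported above.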
The over-the-month-change of employment in the total nonfarm sectors in the United States.
The magnitude of the outbreak of COVID-19 in New York (a), New Jersey (b), Connecticut (c), and California (d) from March 13 to July 20, 2020. The daily confirmed new cases from March 13 to May 24, 2020, were observed data, and the data from May 25 to July 20, 2020, were projected by our models.
4 Discussion
The decision to resume business is not only a public health issue but also an economic issue. What we focused on was the epidemiological feasibility of returning to work at an early date. There were obviously many other factors to consider (Centers for Disease Control and Prevention, 2020). To analyze the epidemic data of COVID-19 in New York, New Jersey, and Connecticut, we proposed an epidemic model that accounts for the pre-symptomatic and asymptomatic transmission of COVID-19. This model provided estimates of the numbers of new daily infected individuals and new confirmed cases. The higher the number of unidentified infected individuals, the higher the expected risk of resuming business. From Figure 7, we concluded that the resumption of business on June 1, 2020, was premature for New York, New Jersey, and Connecticut. If the governors of those states delayed the resumption of business by one week or more, the simulated magnitude of the second wave of infection was much lower. However, the added benefit of each further week of delay appeared to diminish.
Because California began the process of reopening its economy on May 8, 2020, we used the data in California at that time as a reference for reopening New York, New Jersey, and Connecticut, even though we noted that the trajectory in California was clearly heading in the wrong direction, as opposed to the descending trajectories in New York, New Jersey, and Connecticut. While the three east coast states have become the lowest-risk states, California has been suffering from a resurgence, underscoring the importance of maintaining public health practices after reopening business.
Data Availability
This work began in April and was initially completed in May 2020, when there were no widespread protests against social injustice in the United States. To emphasize the effectiveness of our model for evaluating the real-time risk of the epidemic, we chose to focus on the data before June 2020. This strategy also gave us the opportunity to contrast our predictions and recommendations with what took place after June 2020. We collected the epidemic data from March 13, 2020, when the national emergency concerning COVID-19 was proclaimed, to May 24, 2020, in New York, New Jersey, Connecticut, and California. The data were made available by the New York Times.
https://www.mndaily.com/article/2020/06/pf-the-george-floyd-protests-a-visual-timeline
https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
Supplement
Bayesian computation
Let be the initial samples and
, and
be the k-th samples. N[0,∞](μ, σ2) represents the truncated normal distribution with mean μ and variance σ2 and logN(μ,σ2) represents the log-normal distribution with log-mean μ and log-variance σ2. Algorithms 1, 2, and 3 provide detailed updating steps.
For sampling I1, notice that S1 ≈ N, which implied that
![Embedded Image](https://www.medrxiv.org/sites/default/files/highwire/medrxiv/early/2020/09/07/2020.05.16.20103747/embed/graphic-20.gif)
Then the full conditional distribution of I1 was approximated with
![Embedded Image](https://www.medrxiv.org/sites/default/files/highwire/medrxiv/early/2020/09/07/2020.05.16.20103747/embed/graphic-21.gif)
We use Gamma (I2 + 1, exp[α0 − (1/DC + 1/DH)]) as the proposal distribution of I1 in the Metropolis-Hastings algorithm.
Moreover, there are two strategies to sample I2:T and H2:T: sequential sampling (sample It and Ht one at a time, each in its own Metropolis-Hastings step) and group sampling (sample I2:T and H2:T jointly, and accept or reject the whole sample in one Metropolis-Hastings step). The group sampling is high dimensional if T is large. Hence, moving from one iteration of the Metropolis-Hastings algorithm to the next is computationally intensive. The sequential sampling requires more time for the Markov chain to converge. We utilized the mixture of MCMC kernels (Andrieu et al., 2003) to combine these two sampling strategies and balance the trade-off between the acceptance rate and the convergence time of MCMC.
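The idea of mixing two kernels can be sketched on a toy problem: at each iteration, one of two Metropolis-Hastings kernels is chosen at random, and because each kernel leaves the target distribution invariant, so does the mixture. The target below is a two-dimensional standard normal used as a stand-in for the SIHC posterior, and the random-walk proposals, step sizes, and mixing weight are all assumptions.

```python
import math
import random

def log_target(x):
    # Toy target: independent standard normals (a stand-in, not the SIHC posterior).
    return -0.5 * sum(v * v for v in x)

def mh_accept(current, proposal, rng):
    """Metropolis acceptance test for a symmetric proposal."""
    log_ratio = log_target(proposal) - log_target(current)
    return log_ratio >= 0 or rng.random() < math.exp(log_ratio)

def sequential_kernel(x, rng, step=1.0):
    """Update one coordinate at a time, each with its own accept/reject step."""
    x = list(x)
    for j in range(len(x)):
        prop = list(x)
        prop[j] += rng.gauss(0.0, step)
        if mh_accept(x, prop, rng):
            x = prop
    return x

def group_kernel(x, rng, step=0.5):
    """Propose all coordinates jointly and accept or reject them as a block."""
    prop = [v + rng.gauss(0.0, step) for v in x]
    return prop if mh_accept(x, prop, rng) else list(x)

def mixture_chain(x0, n_iter, weight_group=0.5, seed=0):
    """Run a chain that picks the group kernel with probability weight_group."""
    rng = random.Random(seed)
    x, draws = list(x0), []
    for _ in range(n_iter):
        kernel = group_kernel if rng.random() < weight_group else sequential_kernel
        x = kernel(x, rng)
        draws.append(x)
    return draws

draws = mixture_chain([3.0, -3.0], 20000)
```

The mixing weight trades the cheaper per-iteration cost of block updates against the higher acceptance rate of coordinate-wise updates.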