Modelling to Predict Hospital Bed Requirements for Covid-19 Patients in California

Santanu Basu

doi:10.1101/2020.05.17.20104919

ABSTRACT

A model to predict hospital bed requirements six weeks in advance to treat Covid-19 patients in California is presented. The model also gives prediction for number of cases and deaths during the same period. The model is versatile and can be applied to other countries and regions as well.

INTRODUCTION

The new coronavirus, SARS-CoV-2 for the disease Covid-19 has spread rapidly through many countries in the world in the first four months of 2020 causing widespread deaths. Mathematical modeling of spreading of infectious diseases is a well-established field [1–2]. There have been a large number of recent efforts in modelling Coronavirus spreading as reported in the literature [3–6]. The Covid-19 cases have put significant strain in the hospital systems around the world. It would be of great benefit to the medical care facilities if the Covid-19 patient load could be known six weeks in advance. This would enable distributing case load among hospitals and to adequately equip the hospitals with staffing, PPE and life support equipment. This is the objective of this paper.

A mathematical model has been recently developed [3] to analyze the Covid-19 case data (time dependent number of cases, deaths and hospitalization) in any country or state. We first present the prediction of Covid-19 cases and hospitalization requirements for the state of California, USA using the model. Next we present the methodology and the model in detail.

PREDICTION OF HOSPITAL BED REQUIREMENTS IN CALIFORNIA

Covid-19 case data for California was obtained from [7] and the numbers of regular and ICU hospital beds were obtained from the California Health and Human Services websites [8,9]. The first confirmed case of Covid-19 in California was on January 26^th, 2020. Reliable case numbers became available from March 4^th, 2020 [7]. Reliable numbers for hospital bed usage since March 29^th are available in [8].

The mathematical model used to predict number of cases and hospital loads will be presented in the next section. In this section, we will present the results using the model for the state of California. The model is parametric and transparent so that predictions are based on parameters which are in principle measurable and the predictions can be continuously corrected as more accurate data on parameters become available. Normal curve fitting models which are not based on any real-life parameters tend to have high degree of error in making projections. Also the standard machine learning based and curve-fitting models cannot accurately do what-if experiments, such as what would be the effect of timing of lifting of social distancing measures on the number of cases.

The model has twenty one user-supplied variable parameters, three of which influence the results greatly. These three are (1) the first date of implementation of social distancing protocols since the viral outbreak, (2) the rate at which the rate of transmission decreases by using the government measures and (3) the final rate of transmission in the long run. The California data could be fitted well with March 14^th to be the first date (n₁) of transmission rate decrease, 16 days for exponential time constant for transmission rate decrease (G) and 5% for the final transmission rate (T_f). The comparison between the model predictions until June 30^th and actual data until May 15^th on number of cases and deaths are shown in figures 1 and 2 respectively. The figures demonstrate that the model fits the actual data very well.

Figure 1.

Data and model predictions for Covid-19 cases in California

Figure 2.

Data and model predictions for Covid-19 deaths in California

Having anchored the model to actual data until May 15^th, we are in a position to make reliable projections for six weeks in advance. The model predicts the number of cases to rise to 146,000 (range 136,000-151,000) and the number of deaths to rise to 5,900 (range 5600-6,100) on June 30^th, 2020. The model also predicts the daily number of cases to fall to 1,200 (range 960-1,400) and the number of daily deaths to fall to 45 (range 37-51) on the same date.

Figure 3.

Data and model predictions for Covid-19 hospitalizations in California

Figure 3 shows the comparison between the model predictions until June 30^th and actual data until May 15^th [8] on number of hospitalizations. We analyzed the data in [8] to determine daily variation of hospitalization rate and the ratio of ICU beds to the total number of hospital beds. The model fits the data very well as shown in figure 3 by assuming 35% of the patients require ICU beds. The model predicts the number of hospitalizations to decrease from 3,248 on May 10^th to 2,000 (range 1,600-2,250) on June 30^th, 2020. The number of ICU beds also will decrease from 1,093 on May 10^th to 700 (range 560-790) on June 30^th, 2020. It is to be noted that the model does not take into account anticipated reopening of businesses in California around the last week of May. It will most likely increase the transmission rate parameter in the model that can be sensed by a perturbation in the trend in the number of cases. The model enables one to make what if exercises such as the effect of relaxing of social distancing.

METHODOLOGY

In this section we study the propagation of a new infectious disease through an isolated country’s population, P. With some simple modifications, the procedure can be applied to recurring outbreak of the same virus and also to the normal state in which there is population movement among countries.

Figure 4 is a schematic of our model. Let n be the number of days since the onset of virus infection. The entire population of virus infected people is classified in six categories:

I – people in whom the virus is in incubation

A – people who are infected with the virus but who do not exhibit any symptoms

S – non-hospitalized patients who exhibit mild symptoms of the viral infection

H – hospitalized patients who are in acute condition

D – people who have died from the disease

R – people who have recovered from the illness and awaiting to be declared clinically recovered

R^C – people who are counted as clinically recovered

Figure 4.

Schematic of the mathematical model for the virus infected cases

On any day n, the total number of people who have been infected with the virus is N_n, where

We denote the incubation period for the virus in number of days by j. The model assumes each infected person spends j days in incubation and is capable of infecting others. Let I_n be the total population of infected people who are still in the incubation period on day n. Let be the population of those infected people in the incubation period on day n who have been infected for i number of days (i ranges from 1 to j).

We assume that after the incubation period of j days, each infected person either is asymptomatic or shows symptoms of the disease. Let be the population on day n of those asymptomatic people who have been carrying the virus for j+i number of days (1 ≤ i ≤ q) and let A_n be the total population of asymptomatic people who are capable of infecting others.

Let be the population on day n of non-hospitalized patients who have been showing mild symptoms for i number of days (1 ≤ i ≤ k) and let S_n be the total population of non-hospitalized patients.

For each person with symptoms, there are three outcomes. A fraction, f_H is hospitalized, a small fraction dies at a rate of δ_S per person per day, and the remaining fraction recovers. The average time between showing symptoms and hospitalization is h ≤ k. We assume each patient with symptoms is capable of infecting others for a total of j+k number of days.

Let H_n be the total population of patients who are hospitalized (in acute condition) on day n. Let be the population on day n of those infected patients who have been hospitalized for i number of days (i ranges from (1 ≤ i ≤ m).

Out of the hospitalized patients, a fraction ficu is admitted to the ICU, the remaining fraction (1-f_ICU) requires a regular hospital bed. The number of patients in the ICU is

We denote the probability of death among the hospitalized patients to be per day and the length of hospital stay to be m number of days. The patients who survive m days in the hospital are recovered but generally needs to wait p more days before being counted as clinically recovered.

The total number of people who are currently infected on day n and who are able to transmit the virus is (I_n + A_n + S_n +H_n). Out of this population, only (S_n + H_n) number of people show outward signs of the disease. Ironically in the beginning of the outbreak, when (S_n + H_n) is small, the problem is sometimes underestimated without realizing that the disease is spreading exponentially everyday by the entire infected population (I_n + A_n + S_n +H_n). The ratio (I_n + A_n + S_n +H_n) / (S_n + H_n) can be very high at the onset of the viral outbreak. The time to take swift, decisive and uniform measures to prevent spreading of the disease is when (S_n + H_n) is small (for example less than 10).

We denote the rate of transmission of the virus from an infected person to a healthy person per day by T_n. Another way of quantifying the rate of transmission that has been used in the literature is infection reproductive number which for SARS-CoV-2 is reported to be ~ 2.5 [10–11]. Let N₀ be the number of infected people on the first day. The numbers of infected people on the first two days of the outbreak are:

In general, the number of newly infected people on day n is given by where is the rate of transmission from an hospitalized patient to a healthy person. In a well-managed hospital, . A fraction, f_a, of the newly infected people is asymptomatic of the disease. An estimate of f_a = 30% has been given in [12] which has been used in this model calculations. The number of new patients with mild symptoms on day n is given by

Similarly the number of new asymptomatic patients on day n is given by

The number of new hospitalized patients on day n is given by

For simplicity, we can assume for i ≠ h. The numbers of patients who die and recover from mild symptoms on day n are given by and respectively. Similarly the numbers of hospitalized patients who die and recover from acute conditions on day n are given by and respectively.

The number of deaths caused by the virus on day n is given by

The number of patients who recover from the disease on day n is given by

After an infected person first recovers either from the mild symptom state or after being hospitalized, it takes p number of days for her or him to be counted as clinically recovered which may vary from county to county and state to state. The number of new cases of clinically recovered ex-patients on day n is given by

Before the first person showing symptoms of the disease caused by the new virus and shortly thereafter up to day n₁, the transmission rate T_n is assumed to be constant at an initial value T₀:

Typically the authorities take steps to prevent the spreading of the disease such as patient isolation and testing, contact tracing and social distancing. As a result T_n decreases with time. The model is capable of including actual time dependent transmission data. However at this time such data for the Coronavirus case is not available. A reasonable approximation is exponential decay of the rate of transmission T_n from infected persons to the general population. Let G be the number of days it takes for T_n - T_f to decrease by 63% of T₀ - T_f, and let T_f be the long term steady state value of T_n.

The transmission rate from a hospitalized patient is assumed to be a fraction of T_n

Three parameters n₁, G and T_f in equation (20) were found to be the most significant parameters affecting the number of infected cases and number of deaths from the virus. The data from countries with smaller number of cases could be fit well with smaller n₁, G and T_f. In this paper, we will limit the analysis to the first outbreak of the virus in which T_n reduces to a final value T_f. It may also happen that social distancing measures are lifted due to economic or social necessities before the virus is completely extinguished within the population, which would cause T_n to rise again from T_f which may lead to a recurrence. The recurrence can be modelled in future using the methodology given in this paper.

Let σ_n per day per person be the rate of testing of the entire population, P of the country including the infected people without symptoms (I_n + A_n) on any day n. Let β_n per day per person be the testing rate of patients with symptoms. Except during the first few days of the virus outbreak, when testing is being developed, β_n = 1 in general for people with symptoms.

Cumulative number of cases with symptoms on day n is given by

Cumulative number of reported cases which have been positively tested for the virus is given by

Where e_i is the error in reporting the number of patients on i th day. For this paper, we have assumed that the source of data [7–9,13–16] is accurate as of now, and assume e_i = 0 for all i. Due to imperfection in the process, often times there is error in reporting the number of cases and also the data is modified from time to time as the accounting procedure changes. The model presented here is flexible and it can incorporate changes in the data set as it is corrected from time to time.

However this model is not capable of predicting suppressed case data.

Cumulative number of deaths on day n is given by

Cumulative number of reported deaths is given by

The number of active infected cases which are confirmed by testing and reported on day n can be expressed as:

The number of reported hospitalized patients on day n is given by:

The number of reported hospitalized patients who are in ICU on day n is given by:

In the model, the testing rates a_n and p_n are considered time dependent as follows:

The rationale for choosing different testing rates is as follows. During the first j number of incubation days of the outbreak there is no testing and β_n = 0. During the first outbreak of the virus, it takes several days to identify the new virus and to develop an initial test, during which period the testing rate of the patients showing symptoms is low, denoted by β₀. Let n_β be the number of days to ramp up the testing of the patients with symptoms to 100% final testing rate. The testing rate σ_n for the general population including those who are infected and not showing symptoms is assumed to be zero until n = n₂ + n_β and let W be number of days it takes to build up the testing capability to test the general population at a final rate of σ_f. This recognizes that large scale production of testing supplies and development of infrastructure and resources for administering of tests in large numbers take time.

Model parameters

We test this model to explain the number of Covid-19 cases in a small number of countries at first – Italy, South Korea and New Zealand. These three countries in three continents had very different outcomes of Covid-19 cases and provide a good basis to anchor the model which is designed to be applicable to any country. We collected the data on cumulative total number of cases and deaths for these countries from reliable sources [7–9,13–16] for the period January 1^st, 2020 to May 8^th, 2020. We assumed that the aforementioned data sources were reliable and did not verify the accuracy of the data in the datasets. The actual data used in this paper is included in table I in the appendix for the reader to verify. In some cases there is available data on number of recovered cases and active cases. Due to different definitions used in different countries for the criteria used for recovered cases and active cases, we placed much less emphasis on the data on active and recovered cases.

There are twenty one variable parameters in the model description. These are j, k, m, h, p, N_o, T_o, T_f, n₁, G, β₀, e, n₂, n_β, δ_H, δ_S, σ_f, W, f^TH, f_H and f_ICU. The strength of this model is that it is based on parameters which can be measured. In the beginning of the virus outbreak, the parameter values are not well known. However in the long run, the parameters values can be extracted from the actual data and the errors in model predictions can be reduced. We start with reasonable assumptions for each parameter and then use the model to fit the published data on total and daily numbers of cases and deaths for each country to iteratively optimize the values of some of the parameters. At the end of iterations, the parameter values that best fit the country data are obtained. With these parameters values, we are in a position

to predict the number of cases and required number of hospital beds and
to carry out what if analyses on some of the parameters (such as lifting of social distancing requirement)

The initial values of the parameters j, k, m, h, p, f_H and f_ICU are obtained from data presented in the literature. In this model, we have used j = 5 [18–20], k = 14 [18,20–22], m = 11 [18,20], h = 10 [18,20,22–23], p = 3 [18,21–22], q = 14 [18,20–22], f_H = 17% [19–20,24–25], f_ICU = 33% [20,24–25]. In addition we used e_i = 0 for all i and f_a = 30% [12]. The value of p varies widely due to different standards at different places.

It is to be noted that there is daily variation and probability distribution function for each of these model parameters which in principle can be measured. The time and location dependent data on these parameters for Covid-19 in each country is not available yet. When the daily values of these parameters become available on a later date, these can be inserted in the model. For each parameter such as j and m mentioned above there is a range of values. In the model, we used one constant value for each parameter which is very reasonable approximation at this stage of data availability.

We carry out multivariate optimization to fit the data for each country between Jan 1^st and May 8^th of 2020. This can be considered as the training set. Starting with reasonable initial conditions in the model, iterations are made in the algorithm until the residual errors in fitting the published data during Jan 1^st and May 8^th 2020 for each country are minimized. At that point the final values of all parameters for this predictive model for each country become available. Once the optimization is completed, the model becomes powerful to yield information such as (1) predicted number of future cases, (2) sensitivity analysis such as the effect of delay in taking initial measures, and (3) recurrence of the disease. We discuss the results in the next section.

RESULTS AND DISCUSSION

Figure 5 shows the comparison between the Covid-19 case data and the model prediction for Italy. The first reported Covid-19 case in Italy was on Feb 15^th, 2020. There were 3 total cases on Feb 19 and the first death was on Feb 21. The model fitted the Italy Covid-19 case data with Feb 22 start date for transmission factor decrease when the number of cases was 79 and a transmission decay rate of 1/G, where G= 14. The model fits the data very well. The model predicts the number of cases to rise to between 232,000 and 239,000 and the number of deaths to rise to between 34,000 and 35,000 on June 30^th, 2020.

Figure 5.

Data and model predictions for Covid-19 cases in Italy

Figure 6 shows the comparison of the actual data and the model predictions for South Korea which had a factor of twenty less number of cases than Italy. The daily case data for South Korea has been available since Feb 15^th. There were 28 total cases on Feb 15^th and the first Covid-19 death was on Feb 20^th. The model fitted the South Korea Covid-19 data with Feb 18^th start date for transmission factor decrease when the number of cases was 31 and G= 6.

Figure 6.

Data and model predictions for Covid-19 cases in South Korea

Unless there is a recurrence, the model does not predict any significant rise in number of cases or deaths by June 30^th, 2020. Comparing model parameters for Italy and South Korea, we found that the number of cases were lower when T_n started decreasing and G was smaller for South Korea.

Figure 7 shows the comparison of the actual data and the model predictions for New Zealand which had a factor of seven less number of cases than South Korea. The daily case data for New Zealand became available since Mar 6^th on which date there were 4 total cases. The first Covid-19 death was on Mar 28^th. The model fitted the New Zealand Covid-19 data using the start date of reduction in T_n when the total number of reported cases was only 8, and the transmission factor to decay in G= 10 days. Unless there is a recurrence, the model does not predict any significant rise in number of cases or deaths in New Zealand by June 30^th, 2020.

Figure 7.

Data and model predictions for Covid-19 cases in New Zealand

Comparing data from Italy, South Korea and New Zealand, the model makes it obvious that when a country takes timely, swift and uniform virus containment measures at the point when the total number of cases is small, the number of infected cases and deaths are small. It is very similar to containment strategy for wildfire: sooner to detect, sooner to start containment, the less the spread and the resulting destruction. Even a day of waiting makes a big difference in the case of Coronavirus. In the model, transmission factors started reducing for Italy, South Korea and New Zealand when the total cases were 79, 31 and 8 respectively.

In conclusion, we have developed a mathematical model that provides insight into the Covid-19 viral outbreak cases in any country. The model fits the data very well for Italy, South Korea, New Zealand and California with different levels of cases and casualties. We also presented predictions from the model for hospital bed requirements six weeks in advance which can be very valuable for providing medical services to reduce casualties. Since conditions change daily which causes the model parameters to change, we do not predict more than six weeks in advance. The model is transparent and is based on parameters which are measurable. This allows the model calculations to be updated as the actual data of the parameters become available. Overall the model makes the hopeful conclusion that Covid-19 can be effectively contained by taking timely social distancing and patient isolation measures before ultimately a vaccine becomes available.

The author wishes to acknowledge valuable suggestion from Professor Nigam Shah of Stanford University to look into the hospitalization needs.

Data Availability

The data is available in the paper.

Additional Information

Correspondence and requests for materials should be addressed to the author.

APPENDIX

View this table:

Table I.

Covid-19 case data for South Korea, Italy, New Zealand and California (Jan 1- May 8, 2020)

References

1.↵
Anderson, R.M. and May, R.M., Infectious diseases of humans: dynamics and control (Oxford University Press, Oxford, 1991)
2.↵
Hethcote, H.W. “The mathematics of infectious diseases” Society for Industrial and Applied Mathematics 42, 599–653 (2000)
OpenUrl
3.↵
Basu, S. (2020, April 28) “A Mathematical Model to Analyze Data on Coronavirus Cases” https://doi.org/10.31224/osf.io/95arg
4.
Lin, Q., Zhao, S. et al., “A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China and individual reaction and governmental action” International journal of infectious diseases. 93, 211–216 (2020)
OpenUrl CrossRef
5.
Kucharski, A.J., Russell, T.W. et al., “Early dynamics of transmission and control of COVID-19: a mathematical modelling study”, www.thelancet.com/infection Published online March 11, 2020 https://doi.org/10.1016/S1473-3099(20)30144-4
6.↵
Wu, J.T., Leung, K. and Leung, G.M., “Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study”, www.thelancet.com Vol 395, p. 689, February 29, 2020
7.↵
https://en.wikipedia.org/wiki/COVID-19_pandemic_in_California
8.↵
https://public.tableau.com/views/COVID-19PublicDashboard/Covid-19Hospitals?%3Aembed=y&%3Adisplaycount=no&%3AshowVizHome=no
9.↵
https://data.chhs.ca.gov/dataset/california-covid-19-hospital-data-and-case-statistics
10.↵
He, X. et al., “Temporal dynamics in viral shedding and transmissibility of COVID-19”, Nature Medicine, vol 26, pp. 672–675, May 2020, www.nature.com/naturemedicine
OpenUrl PubMed
11.↵
https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-guidance-management-patients.html
12.↵
Nishiura, H., Miyama, T. et al., “Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19)”, to appear in International journal of infectious diseases (2020).
13.↵
World Health Organization website https://www.who.int/emergencies/diseases/novel-coronavirus-2019
14.
World Health Organization website https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf
15.
U.S. Center for Disease Control and Prevention website https://www.cdc.gov/coronavirus/2019-ncov/index.html
16.↵
https://www.worldometers.info/coronavirus/
17.
https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/index.html
18.↵
Chang, D. et al., “Time Kinetics of Viral Clearance and Resolution of Symptoms in Novel Coronavirus Infection”, DOI: 10.1164/rccm.202003-0524LE on March 23, 2020
OpenUrl CrossRef
19.↵
Li, Q. et al., “Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia”, n engl j med 382; 13 nejm.org March 26, 2020
20.↵
https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-guidance-management-patients.html
21.↵
Sethuraman, N., Jeremiah, S.S. and Ryo, A., “Interpreting Diagnostic Tests for SARS-CoV-2”, May 6, 2020. doi:10.1001/jama.2020.8259
OpenUrl CrossRef
22.↵
Zhou, F. et al., “Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study”, Lancet 2020; 395: 1054–62
OpenUrl CrossRef PubMed
23.↵
https://www.cdc.gov/coronavirus/2019-ncov/hcp/disposition-hospitalized-patients.html
24.↵
CDC COVID-19 Response Team, “Severe Outcomes Among Patients with Coronavirus Disease 2019 (COVID-19) — United States, February 12-March 16, 2020”, MMWR, vol 69(12), March 27, 2020
25.↵
Kashyap, S., Gombar, S. et al., “Measure what matters: counts of hospitalized patients are a better metric for health system capacity planning for a reopening”, medRxiv preprint doi: https://doi.org/10.1101/2020.04.19.20072017