How Many COVID-19 Patients Will Need Ventilators Tomorrow?
==========================================================

* J. D. Dai
* Mark Gluzman
* Alyf Janmohamed
* Yaosheng Xu

## Abstract

This paper develops an algorithm to predict the number of Covid-19 patients who will start to use ventilators tomorrow. The algorithm is intended for use by a large hospital or a group of coordinated hospitals at the end of each day (e.g. 8pm), when the current number of non-ventilated Covid-19 patients and the predicted number of Covid-19 admissions for tomorrow are available. The predicted number of new admissions can be replaced by the numbers of Covid-19 admissions in the previous *d* days (including today) for some integer *d* ≥ 1 when such data is available. In our simulation model, which is calibrated with New York City’s Covid-19 data, our predictions have consistently provided reliable estimates of the number of vent-starts the next day. This algorithm has been implemented through a web interface at covidvent.github.io, which is available for public use.

Using this algorithm, our paper also suggests a ventilator ordering and returning policy. The policy dictates, at the end of each day, how many ventilators should be ordered tonight from a central stockpile so that they arrive by tomorrow morning, and how many ventilators should be returned to the central stockpile tomorrow morning. In 100 runs of our ventilator order and return policy, no patients were denied ventilation and no excessive inventory of ventilators was kept at hospitals.

## 1 Predicting number of new patients on vents

Throughout this paper, the term “patient” refers to a Covid-19 patient and the term “hospital” refers to one large hospital that treats Covid-19 patients or a group of hospitals in a region that have some central coordination on ventilators. Let *S* be the number of new patients who will require ventilator support the next day (tomorrow) at this hospital. We will call these patients “vent-starts” or “vent-start patients”. We provide the estimate of the number of vent-starts at the end of each day, when the hospital’s daily numbers of on-vent and not-on-vent patients become available. We estimate this quantity using the dynamic data (available today from the hospital) and parameters (available historically, not necessarily from the hospital) as described below.

### Dynamic data

The following dynamic data can be observed or estimated daily.

*   (D.1) *L*: the number of hospitalized patients who are not on vents today.

*   (D.2) *A*: an estimate of the next day’s number of new hospital admissions. We consider a time-series method for predicting *A* in Section 4.4.

### Parameters

The following parameters are static and can be estimated from historical data.

*   (P.1) The probability for a newly admitted patient to belong to one of these types:
    
    type 1: never-vent patient; patient will never use a vent.
    
    type 2: vent-cure patient; patient will use a vent and be cured.
    
    type 3: vent-die patient; patient will use a vent and die later.
    
    These probabilities are denoted by ![Formula][1] respectively. We assume that each hospitalized patient belongs to one of these three types even though the type is not observable at admission time.

*   (P.2) The average length-of-stay (LOS) ![Formula][2] for each type of patient. Specifically,
    
    1.  For type 1 (never-vent), the length-of-stay is equal to the number of days in hospitalization, from admission to discharge.
    
    2.  For type 2 (vent-cure) and type 3 (vent-die) patients, the length-of-stay is equal to the number of days in hospitalization *before* ventilation, from admission to ventilation. Therefore, for a type 2 or type 3 patient, the length-of-stay is better called time-to-ventilation.

In this paper we argue that the expected number of vent-starts the next day can be estimated as ![Formula][3] where

![Formula][4]

Of course, the number of vent-starts the next day, *S*, is random. We will argue that *S* can be modeled as a random variable that follows a Poisson distribution with mean ![Graphic][5]:

![Formula][6]

Using this model, one can easily compute an upper confidence bound *U* that satisfies

![Formula][7]

In Section 3, we will use the upper confidence bound *U* to design a policy for ordering and returning ventilators.
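
The prediction step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean is taken in the form $p_L L + p_A A$ given in Section 2, the bound *U* is assumed to be the smallest integer with P(*S* ≤ *U*) at least a chosen confidence level (the 0.95 default is hypothetical), and the input numbers are made up for illustration.

```python
import math

def poisson_cdf(k: int, mean: float) -> float:
    """P(S <= k) for S ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)          # P(S = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i            # P(S = i) obtained from P(S = i - 1)
        total += term
    return total

def upper_bound(mean: float, level: float = 0.95) -> int:
    """Smallest integer U with P(S <= U) >= level (assumed reading of the bound)."""
    u = 0
    while poisson_cdf(u, mean) < level:
        u += 1
    return u

def predict_vent_starts(L: int, A: float, p_L: float, p_A: float,
                        level: float = 0.95):
    """Return the Poisson mean p_L * L + p_A * A and its upper bound U."""
    mean = p_L * L + p_A * A
    return mean, upper_bound(mean, level)

# Hypothetical inputs: 500 not-on-vent patients, 100 predicted admissions.
mean, U = predict_vent_starts(L=500, A=100, p_L=0.035, p_A=0.06)
print(f"expected vent-starts: {mean:.1f}, upper bound: {U}")
```

The cumulative probability is accumulated directly rather than through a statistics library so that the sketch has no dependencies; for very large means the leading term `exp(-mean)` would underflow and a library routine would be the safer choice.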

## 2 Methodologies to justify (1.4)

Suppose that we are at the end of day *t*. We identify two groups of patients who potentially can become vent-start patients on day *t* + 1: hospitalized patients who are not on the vent support (not-on-vent patients) on day *t* and new patients who will be admitted on day *t* + 1.

We denote by $L_t$ the number of non-ventilated patients at the end of day *t* who might need vent support in the future. (We assume $L_t$ does not include non-ventilated patients who have previously been on vent support.) On day *t*, there are $L_t$ not-on-vent patients in hospitals; a fraction of them, ![Graphic][8], will turn into vent-patients on day *t* + 1.

We assume that the number of vent-start patients ![Graphic][9] should be proportional to the number of non-vent patients $L_t$. We model vent-starts ![Graphic][10] as a random variable that follows a binomial distribution with parameters $L_t$ and $p_L$: ![Formula][11] where $p_L$ is a fixed probability that does not change over time. We discuss how to compute the probability $p_L$ in Section 2.2.

In addition, on day *t* + 1, there will be $A_{t+1}$ new hospital admissions, and some number of them, ![Graphic][12], will turn into vent-patients on the same day *t* + 1.

Similarly, we assume ![Graphic][13] is an independent binomial random variable with parameters $A_{t+1}$ and $p_A$: ![Formula][14] where $p_A$ is a fixed probability that does not change over time. We discuss how to compute the parameter $p_A$ in Section 2.3.

Then we estimate the number of patients who start ventilator support on day *t* + 1 as

![Formula][15]

A binomial distribution Binom(*n*, *p*) can be approximated by a Poisson distribution with mean *np* when *n* is large and *np* is moderate. We use this fact to approximate the distribution of the number of vent-start patients as a Poisson distribution with mean $p_L L_t + p_A A_{t+1}$:

![Formula][16]

In the following sections we propose a method for estimating the parameters $p_L$ and $p_A$.
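
The quality of the binomial-to-Poisson approximation used above can be checked numerically. The sketch below computes the total variation distance between Binom(*n*, *p*) and Poisson(*np*); the values *n* = 500 and *p* = 0.035 are hypothetical and chosen only to resemble a large hospital with a small daily conversion probability. Le Cam's inequality guarantees that this distance is at most $np^2$.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial pmf computed in log space to avoid overflow (requires 0 < p < 1)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_pmf(k: int, mean: float) -> float:
    """Poisson pmf computed in log space (requires mean > 0)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def total_variation(n: int, p: float) -> float:
    """Total variation distance between Binom(n, p) and Poisson(n * p)."""
    mean = n * p
    abs_diff = 0.0
    poisson_mass = 0.0
    for k in range(n + 1):
        b, q = binom_pmf(k, n, p), poisson_pmf(k, mean)
        abs_diff += abs(b - q)
        poisson_mass += q
    # Beyond k = n the binomial has no mass, so the Poisson tail counts in full.
    tail = max(0.0, 1.0 - poisson_mass)
    return 0.5 * (abs_diff + tail)

n, p = 500, 0.035                      # hypothetical values
tv = total_variation(n, p)
print(f"total variation distance: {tv:.4f} (Le Cam bound: {n * p * p:.4f})")
```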

### 2.1 Patient types

Each patient is assigned to one of the three patient types: never-vent patient (type 1), vent-cure patient (type 2), and vent-die patient (type 3). The type of a patient is not observable, but does *not* change over time.

We assume that the probability distribution $(p_1, p_2, p_3)$ for a patient to belong to one of these types is known and exogenous, meaning that it does not depend on the congestion level in hospitals (although, in practice, overly congested hospitals increase the death rate).

We assume that the probability of a type 2 patient becoming a ventilated patient tonight is $q_2$. Similarly, we use $q_3$ to denote the probability that a type 3 patient becomes a ventilated patient tonight. To estimate $q_i$ for *i* = 2, 3, we note that

![Formula][17]

A patient cannot change type; an unknown type will eventually be revealed.

### 2.2 Conversion from $L_t$

We need to separate the $L_t$ patients who are not yet on a vent on day *t* into three types: ![Graphic][18], ![Graphic][19], and ![Graphic][20].

Motivated by Theorem 1 of [3] in the setting of $M_t/G/\infty$ queues, one expects that $L_k(t)$ is “close” to a Poisson random variable with mean proportional to ![Formula][21] where $\mathrm{LOS}_1$ is the average length-of-stay for those patients (type 1) who never need a vent, and $\mathrm{LOS}_2 = 1/q_2$ and $\mathrm{LOS}_3 = 1/q_3$ are the average times-to-vent for type 2 and type 3 patients, respectively. Thus, we propose ![Formula][22] where

![Formula][23]

Now we can estimate the parameter $p_L$ in (2.2) as

![Formula][24]

We expect that (2.4) can be properly formulated as a functional strong law of large numbers: for any *t* ≥ 0,

![Formula][25]

where λ > 0 is a scaling parameter representing the “size” of the hospital. The limit in (2.7) is also known as a fluid limit as the “market size” λ goes to infinity. See, for example, [4, 5] for a discussion of “large-capacity” scaling. The limit in (2.7) exhibits one form of “state space collapse”, a term coined by [7], meaning that the three-dimensional process $\{(L_1(t), L_2(t), L_3(t)),\ t \ge 0\}$ is a deterministic multiple of the one-dimensional process $\{L(t),\ t \ge 0\}$ when the “market size” λ is large.
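
As a numerical illustration of this section, the sketch below estimates $p_L$ under one reading of the construction: the share of type *k* patients among the not-on-vent population is taken proportional to $p_k \cdot \mathrm{LOS}_k$, and $p_L$ is the sum over types 2 and 3 of that share times $q_k = 1/\mathrm{LOS}_k$. The displayed formulas are not reproduced here, so this reading is an assumption, and the function names are hypothetical. The parameter values are those listed in Section 4.2.

```python
def not_on_vent_mix(p, los):
    """Assumed type shares among not-on-vent patients: proportional to p_k * LOS_k."""
    weights = [p_k * los_k for p_k, los_k in zip(p, los)]
    total = sum(weights)
    return [w / total for w in weights]

def estimate_p_L(p, los):
    """Probability that a not-on-vent patient starts ventilation the next day."""
    mix = not_on_vent_mix(p, los)
    q2, q3 = 1.0 / los[1], 1.0 / los[2]   # daily conversion probabilities
    return mix[1] * q2 + mix[2] * q3

p = (0.7, 0.06, 0.24)        # (p_1, p_2, p_3) from Section 4.2
los = (10.0, 4.8, 4.8)       # (LOS_1, LOS_2, LOS_3), with LOS_2 = LOS_3 = 4.8
mix = not_on_vent_mix(p, los)
p_L = estimate_p_L(p, los)
print(f"type shares: {[round(m, 4) for m in mix]}, p_L = {p_L:.4f}")
```

Under this reading the expression simplifies to $(p_2 + p_3) / (p_1 \mathrm{LOS}_1 + p_2 \mathrm{LOS}_2 + p_3 \mathrm{LOS}_3)$, which is about 0.0355 for these parameters.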

### 2.3 Conversion from $A_{t+1}$

We classify the $A_{t+1}$ admissions on day *t* + 1 by patient type. Thus, ![Formula][26] are the numbers of admitted type 1, 2, and 3 patients on day *t* + 1, respectively. Following our assumption, ![Formula][27] where $p_1$, $p_2$, $p_3$ are given in (1.1). Among the ![Graphic][28] type *k* patients admitted on day *t* + 1, ![Formula][29] will turn into type *k* vent-patients by the end of day *t* + 1, *k* = 2, 3. Thus, among the $A_{t+1}$ admissions on day *t* + 1, ![Formula][30] will turn into type *k* vent-patients by the end of day *t* + 1, *k* = 2, 3. Therefore, we can estimate the parameter $p_A$ in (2.2) as

![Formula][31]
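
The argument above suggests that $p_A$ is the sum over types 2 and 3 of the admission share $p_k$ times the same-day conversion probability $q_k = 1/\mathrm{LOS}_k$. The short sketch below evaluates that expression with the Section 4.2 parameters; the closed form $p_A = p_2 q_2 + p_3 q_3$ is inferred from the text rather than copied from the displayed formula, and the function name and the 200 admissions are hypothetical.

```python
def estimate_p_A(p2: float, p3: float, los2: float, los3: float) -> float:
    """Probability that a new admission starts ventilation on its admission day."""
    q2, q3 = 1.0 / los2, 1.0 / los3       # daily conversion probabilities
    return p2 * q2 + p3 * q3

p_A = estimate_p_A(p2=0.06, p3=0.24, los2=4.8, los3=4.8)
expected_same_day = p_A * 200             # hypothetical: 200 admissions tomorrow
print(f"p_A = {p_A:.4f}, expected same-day vent-starts: {expected_same_day:.1f}")
```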

## 3 Ordering and returning ventilators

As before, we assume we are at the end of day *t*. In this section we propose a method that provides the recommended number of ventilators $V_{t+1}$ to order or return. This tool is intended to ensure that the medical facilities have enough ventilators on day *t* + 1. The tool can also be used to detect a surplus of ventilators on day *t* + 1, so that they can be returned to a stockpile or transported to other locations that require them.

The number of free and ready-to-use ventilators on day *t* + 1 has to meet the demand of vent-start patients with probability close to 1. According to equation (1.4), the number of vent-starts $S_{t+1}$ follows a Poisson distribution with mean ![Graphic][32]. We denote by $U_{t+1}$ the upper confidence bound on the vent-starts on day *t* + 1 as defined in (1.5).

We also recommend keeping a *safety stock* of *G* ventilators on site for use in emergencies and unforeseen situations. The size of the safety stock is a hyperparameter and should be determined by a hospital manager who takes into account the availability of vent storage facilities, vent delivery speed, etc.

Let $R_t$ be the number of free and ready-to-use ventilators on day *t* at 8pm. Then we recommend ordering ![Formula][33] ventilators.

If $V_{t+1}$ is negative, the hospital can return $|V_{t+1}|$ ventilators to the federal stockpile.
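
A minimal sketch of the order/return rule follows. It assumes the order quantity has the form $V_{t+1} = U_{t+1} + G - R_t$, that is, enough ventilators to cover the upper bound on vent-starts plus the safety stock, net of what is already free. This form is inferred from the surrounding description and may differ from the displayed formula; the function names and the numbers are hypothetical.

```python
def order_quantity(upper_bound: int, safety_stock: int, free_vents: int) -> int:
    """Assumed rule: V = U + G - R. Positive means order, negative means return."""
    return upper_bound + safety_stock - free_vents

def describe(v: int) -> str:
    """Turn the signed quantity into an instruction for the hospital."""
    if v > 0:
        return f"order {v} ventilators tonight"
    if v < 0:
        return f"return {-v} ventilators tomorrow morning"
    return "no order and no return"

# Hypothetical evenings: a shortfall, then a surplus.
print(describe(order_quantity(upper_bound=32, safety_stock=10, free_vents=25)))
print(describe(order_quantity(upper_bound=8, safety_stock=10, free_vents=30)))
```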

## 4 Appendix

In this appendix, we test the accuracy of our predictions of vent-starts: the mean in (1.3) and the upper confidence bound *U* in (1.5). We could not find data that included the daily statistics for vent-start patients. In order to test the accuracy of the vent-starts prediction and the efficacy of the proposed ordering policy, we use the simulation model described in Section 4.1. We also test the effectiveness of the policy for ordering and returning ventilators.

### 4.1 Simulation model

To verify the effectiveness of our proposed prediction model for vent-starts in a real hospital setting, we need to know the number of new patients on vents each day. We could not get access to this information even though many cities, including New York City (NYC), publicize some related hospitalization data for Covid-19 patients. Therefore, we created a simulation model that generates the number of daily vent-starts and daily hospital census numbers. Our simulation model uses actual NYC daily admissions and is calibrated so that it matches the NYC daily hospital census. We demonstrate that our predicted vent-starts, using (1.3) and (1.5), match the vent-starts output by the simulation model.

The simulation model takes 3 inputs:

1.  Number of hospitalized patients who are not on ventilators on day *t* = 1.

2.  Number of hospitalized patients who are on ventilators on day *t* = 1.

3.  New admissions per day (series) for each day.

Every new patient is independently sampled as type 1, 2 or 3 with probability $p_1$, $p_2$, and $p_3$, respectively (see (1.1)). Then, their patient journey (days in hospital, days-to-vent, and days on a ventilator) is independently sampled from the following distributions depending on their type:

*   Geom($\mathrm{LOS}_1$), LOS for never-vent patients,

*   Geom($\mathrm{LOS}_2$) and Geom($\mathrm{LOS}_3$), days-to-vent for type 2 and 3 patients,

*   ![Graphic][34] and ![Graphic][35], days-on-vent for type 2 and 3 patients,

where $\mathrm{LOS}_1$, $\mathrm{LOS}_2$, $\mathrm{LOS}_3$ are defined in Section 1, ![Graphic][36] is the average time on ventilator support for type 2 patients before recovery, and ![Graphic][37] is the average time on ventilator support for type 3 patients before passing away.

We assume that it is impossible to separate type 2 and type 3 patients based on the time from hospitalization to vent support. Therefore, we assume that ‘time-to-vent’ follows the same distribution for both types of patients:

![Formula][38]

For patients who are already in the hospital at the beginning of the simulation, either on ventilators or not, we make the following assumptions. If they are in the hospital but not on ventilators, they are of type *k* with probability ![Graphic][39], *k* = 1, 2, 3, and their patient journey is sampled from the same distribution as that of a new patient, where ![Formula][40] are the same as in (2.5). For patients who are already on ventilators when the simulation begins, we use a formula similar to (2.5) to determine their patient type. Specifically, they are of type *k* with probability ![Graphic][41], *k* = 2, 3, where

![Formula][42]

Their length of stay is a geometric random variable with mean ![Graphic][43] if they are type 2 or ![Graphic][44] if they are type 3. We also assume they are immediately discharged after their duration on a ventilator ends.
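
The sampling of a new patient's journey described above can be sketched as follows. This is an illustrative sketch, not the authors' simulator. It assumes that Geom(*m*) denotes a geometric distribution on {1, 2, …} with mean *m*, and that a ventilated patient leaves the hospital when the time on a ventilator ends. The type probabilities, $\mathrm{LOS}_1$, and the pre-vent length-of-stay come from Section 4.2, while the mean days-on-vent (7 and 9) are hypothetical placeholders because the tuned values are not reproduced in the text.

```python
import random

def geometric(mean_days: float, rng: random.Random) -> int:
    """Sample a geometric number of days on {1, 2, ...} with the given mean (>= 1)."""
    success_prob = 1.0 / mean_days
    days = 1
    while rng.random() >= success_prob:   # fails with probability 1 - success_prob
        days += 1
    return days

def sample_patient(rng, p=(0.7, 0.06, 0.24), los1=10.0, los_pre_vent=4.8,
                   mean_vent_days=(7.0, 9.0)):
    """Sample one new patient's type and journey; mean_vent_days is hypothetical."""
    patient_type = rng.choices((1, 2, 3), weights=p)[0]
    if patient_type == 1:                 # never-vent: only a hospital stay
        return {"type": 1, "days_in_hospital": geometric(los1, rng),
                "days_to_vent": None, "days_on_vent": 0}
    days_to_vent = geometric(los_pre_vent, rng)
    days_on_vent = geometric(mean_vent_days[patient_type - 2], rng)
    return {"type": patient_type,
            "days_in_hospital": days_to_vent + days_on_vent,
            "days_to_vent": days_to_vent, "days_on_vent": days_on_vent}

rng = random.Random(0)                    # fixed seed for reproducibility
cohort = [sample_patient(rng) for _ in range(20000)]
never_vent = [c for c in cohort if c["type"] == 1]
print("share of type 1:", len(never_vent) / len(cohort))
print("mean LOS of type 1:",
      sum(c["days_in_hospital"] for c in never_vent) / len(never_vent))
```

With a large cohort, the share of type 1 patients and their mean length-of-stay should settle near $p_1$ and $\mathrm{LOS}_1$, which gives a quick sanity check on the sampler.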

Our simulation model demonstrates high accuracy when predicting the number of patients on ventilators in New York City using the parameters listed in Section 4.2. Figure 1 shows the expected number of patients on ventilators (red dots) and the expected range according to our simulation model (red bars). The range is the maximum and minimum number of ventilators required on a specific day across 100 runs of the simulation. The green triangles represent the real number of patients on ventilators.

![Figure 1:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F1.medium.gif)

Figure 1: Comparison of the number of patients on ventilators. NYC data is missing on many days.

### 4.2 Simulation Parameters

The parameters for patient type used in the simulation are $p_1 = 0.7$, $p_2 = 0.06$, $p_3 = 0.24$. These numbers are from the CDC website [6] and news reports that 80% of all ventilator patients in NYC died [2]. The parameters for length-of-stay used in the simulation are $\mathrm{LOS}_1 = 10$, $\mathrm{LOS}_{\text{pre-vent}} = 4.8$, ![Graphic][45], ![Graphic][46]. The $\mathrm{LOS}_1$ parameter is from the CDC website for Covid-19 [6], while the other 3 parameters were tuned using the total number of people on ventilators in NYC during March 16 to 21 and March 24 to 30. The parameters were tuned using Bayesian optimization, and the loss function was the mean squared error over 50 scenarios.

### 4.3 Numerical experiments

As an input to the simulation model we use real daily admissions of hospitalized patients with COVID-19 in New York City from March 3 to April 14. We assume that there were no hospitalized patients with COVID-19 before March 3, and we set the initial number of hospitalized patients to zero for the simulation run. The model parameters are specified in Section 4.2. In Figure 2 we show the number of on-vent and not-on-vent patients in the hospital according to a simulation run.

![Figure 2:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F2.medium.gif)

Figure 2: The number of on-vent patients and the number of not-on-vent patients from the simulation model. The simulation output is from one simulation run.

From the same simulated trajectory we can get daily information about vent-starts. In Figure 3 we plot these vent-starts as green triangles. We use these green triangles as benchmarks to test the accuracy of our prediction formulas (1.3) and (1.5). To recapitulate the prediction procedure described in Section 1, at the end of day *t* we observe the number of hospitalized patients $L_t$ and the historical admissions including $A_t$, the admissions on day *t*. Using the historical admissions, one can predict the number of new admissions $A_{t+1}$ on day *t* + 1 using the time series model described in Section 4.4. Given $L_t$ and $A_{t+1}$, one predicts $S_{t+1}$, the number of vent-start patients for the next day, via the Poisson model (1.4). In Figure 3 the red circles show the expected number of vent-starts $S_{t+1}$, computed by equation (1.3), on each day *t* + 1, *t* = 0, …, 42. The top of each red bar corresponds to $U_{t+1}$, the upper confidence bound on the vent-starts computed from (1.5).

![Figure 3:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F3.medium.gif)

Figure 3: Prediction accuracy of the number of vent-start patients. *The green triangles* show the number of vent-start patients according to the simulated trajectory, *the red circles* show the expected number of vent-starts according to (1.3), and *the top of each red error bar* shows the upper confidence bound on the vent-starts according to (1.5).

Next we test the ordering policy proposed in Section 3. Using the upper bound $U_{t+1}$ on the vent-starts on day *t* + 1, we either order $V_{t+1}$ ventilators from a central stockpile, to be delivered by day *t* + 1, when $V_{t+1} > 0$, or return $|V_{t+1}|$ ventilators to the central stockpile when $V_{t+1} < 0$, where $V_{t+1}$ is computed according to formula (3.1). We assume that if $V_{t+1} < 0$, the hospital sends back $|V_{t+1}|$ ventilators on the morning of day *t* + 1. In Figure 4a we show the number of ordered/returned ventilators on each day. We set the safety stock level to *G* = 10.

![Figure 4:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F4.medium.gif)

Figure 4: Efficiency of the recommended policy. (a) The daily number of ventilators ordered/returned according to the policy, (b) surplus of ventilators at the end of each day.

In Figure 4b we show the number of unused ventilators held by the hospital at the end of each day. The plot demonstrates that the hospital always has enough ventilators to satisfy the demand from patients who need ventilator support. On the other hand, the hospital is not oversupplied from the central stockpile.

We run 100 independent simulation runs to test the robustness of the proposed policy. In Figure 5 we show the minimum, average, and maximum number of free (ready-to-use) ventilators observed during each independent simulation run. We observe that the hospital could satisfy the demand for ventilators from vent-start patients and, at the same time, did not accumulate too many free ventilators by the end of each day.

![Figure 5:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F5.medium.gif)

Figure 5: Robustness of the policy for ordering and returning ventilators. No patients were denied ventilation, and no excessive inventory of ventilators was kept at the hospital.

### 4.4 Predicting the number of admissions next day

Since we make a short-term prediction (tomorrow) of the number of hospital admissions, we use a standard time series algorithm, assuming historical daily admission numbers are available. We adopt the ARIMA(*p*, *d*, *q*) algorithm in [1], where the parameters *p*, *d*, *q* are tuned based on the historical data. The algorithm is implemented in many software libraries; for example, Python has a library function that implements it with automatic tuning. We set the algorithm to be adaptive, meaning that every day, as a new data point becomes available, the parameters *p*, *d*, *q* are re-tuned and the algorithm gives a new prediction for the next day. After fitting the ARIMA model on the historical admission data, the prediction for the next day’s hospital admissions closely matches the observed value. This suggests that the ARIMA prediction could be a reliable input for the anticipated number of hospital admissions the next day. Figure 6 is an example of a 3-day prediction of NYC daily hospital admissions, with a 95% confidence interval.
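
To make the adaptive one-step-ahead idea concrete, the sketch below re-fits a model every day and forecasts the next day's admissions. It is a deliberately simplified stand-in and not the auto-tuned ARIMA(*p*, *d*, *q*) procedure described above: the order is fixed at ARIMA(1, 1, 0), the coefficients are fitted by ordinary least squares on the differenced series, and no confidence interval is produced. The function names and the admissions series are hypothetical.

```python
def fit_ar1(diffs):
    """Least-squares fit of d_t = c + phi * d_{t-1} on the differenced series."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:                       # constant differences: pure drift
        return mean_y, 0.0
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - phi * mean_x, phi

def forecast_next(admissions):
    """One-step-ahead ARIMA(1, 1, 0) forecast; falls back to the last value."""
    if len(admissions) < 4:
        return float(admissions[-1])
    diffs = [b - a for a, b in zip(admissions, admissions[1:])]
    c, phi = fit_ar1(diffs)
    return max(0.0, admissions[-1] + c + phi * diffs[-1])

def rolling_forecasts(admissions, start=4):
    """Re-fit each day on the data seen so far and predict the following day."""
    return [forecast_next(admissions[:t]) for t in range(start, len(admissions))]

# Hypothetical daily admissions, for illustration only.
history = [12, 18, 27, 41, 58, 80, 105, 131, 160, 182]
for day, pred in zip(range(4, len(history)), rolling_forecasts(history)):
    print(f"day {day}: predicted {pred:.1f}, observed {history[day]}")
```

In practice one would replace `forecast_next` with a library ARIMA routine that selects the order automatically and reports prediction intervals; the rolling loop around it stays the same.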

![Figure 6:](https://www.medrxiv.org/content/medrxiv/early/2020/05/21/2020.05.18.20105783/F6.medium.gif)

Figure 6: New York City daily new hospital admissions.

## Data Availability

The data is available through a web interface at [http://covidvent.github.io](http://covidvent.github.io).

## Footnotes

*   We thank David Shmoys for coordinating Cornell ORIE Covid-19 projects including this one. We thank Shane Henderson and Gloria Shen for improving the exposition of this paper.

*   Received May 18, 2020.
*   Revision received May 18, 2020.
*   Accepted May 21, 2020.


*   © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/)

## References

1.  Peter J. Brockwell and Richard A. Davis. Introduction to Time Series and Forecasting. Springer, New York, 2002.

2.  Jonah Bromwich, Maria Cramer, Alan Feuer, Michael Gold, et al. N.Y. Virus Deaths Hit New High, but Hospitalizations Slow, 2020. [https://www.nytimes.com/2020/04/07/nyregion/coronavirus-new-york-update.html](https://www.nytimes.com/2020/04/07/nyregion/coronavirus-new-york-update.html).

3.  Stephen G. Eick, William A. Massey, and Ward Whitt. The Physics of the $M_t/G/\infty$ Queue. Operations Research, 41(4):731–742, 1993. doi:10.1287/opre.41.4.731

4.  Constantinos Maglaras and Assaf Zeevi. Pricing and capacity sizing for systems with shared resources: Approximate solutions and scaling relations. Management Science, 49(8):1018–1038, 2003.

5.  Constantinos Maglaras and Assaf Zeevi. Pricing and design of differentiated services: Approximate analysis and structural insights. Operations Research, 53(2):242–262, 2005. doi:10.1287/opre.1040.0172

6.  Division of Viral Diseases, National Center for Immunization and Respiratory Diseases (NCIRD). Management of Patients with Confirmed 2019-nCoV, 2020. [https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-guidance-management-patients.html](https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-guidance-management-patients.html).

7.  Martin I. Reiman. Some diffusion approximations with state space collapse. In Modelling and Performance Evaluation Methodology, pages 207–240. Springer-Verlag, 1984.

 [1]: /embed/graphic-1.gif
 [2]: /embed/graphic-2.gif
 [3]: /embed/graphic-3.gif
 [4]: /embed/graphic-4.gif
 [5]: /embed/inline-graphic-1.gif
 [6]: /embed/graphic-5.gif
 [7]: /embed/graphic-6.gif
 [8]: /embed/inline-graphic-2.gif
 [9]: /embed/inline-graphic-3.gif
 [10]: /embed/inline-graphic-4.gif
 [11]: /embed/graphic-7.gif
 [12]: /embed/inline-graphic-5.gif
 [13]: /embed/inline-graphic-6.gif
 [14]: /embed/graphic-8.gif
 [15]: /embed/graphic-9.gif
 [16]: /embed/graphic-10.gif
 [17]: /embed/graphic-11.gif
 [18]: /embed/inline-graphic-7.gif
 [19]: /embed/inline-graphic-8.gif
 [20]: /embed/inline-graphic-9.gif
 [21]: /embed/graphic-12.gif
 [22]: /embed/graphic-13.gif
 [23]: /embed/graphic-14.gif
 [24]: /embed/graphic-15.gif
 [25]: /embed/graphic-16.gif
 [26]: /embed/graphic-17.gif
 [27]: /embed/graphic-18.gif
 [28]: /embed/inline-graphic-10.gif
 [29]: /embed/graphic-19.gif
 [30]: /embed/graphic-20.gif
 [31]: /embed/graphic-21.gif
 [32]: /embed/inline-graphic-11.gif
 [33]: /embed/graphic-22.gif
 [34]: /embed/inline-graphic-12.gif
 [35]: /embed/inline-graphic-13.gif
 [36]: /embed/inline-graphic-14.gif
 [37]: /embed/inline-graphic-15.gif
 [38]: /embed/graphic-23.gif
 [39]: /embed/inline-graphic-16.gif
 [40]: /embed/graphic-24.gif
 [41]: /embed/inline-graphic-17.gif
 [42]: /embed/graphic-25.gif
 [43]: /embed/inline-graphic-18.gif
 [44]: /embed/inline-graphic-19.gif
 [45]: /embed/inline-graphic-20.gif
 [46]: /embed/inline-graphic-21.gif