1 Abstract
We built a mathematical model for SARS-CoV-2 transmission and analyze it using both a deterministic and a stochastic approach. We used this model to project the burden of the disease in refugee camps characterized by peculiar demographic characteristics and a high level of deprivation, including lack of medical facilities and personnel, as well as limited possibility to implement containment and quarantine measures. Most of the parameters in our model were adapted from published literature but we used our own estimates of the basic reproduction number, R0 as well as the lethality by age group and gender. We projected the burden in terms of number of infections, number of deaths and number of bed-days in hospitalization and intensive care, among others. We conclude that the harsh conditions of refugee camps combined with a high share of young people leads to a relatively mild scenario for the burden of the disease.
2 Executive summary
Objective
The report illustrates the results of a series of simulations of the spread of COVID-19 disease in refugee camps using a demo set of data from the Za’atari refugee camp in Jordan. The purpose of this exercise is to illustrate the type of information that can assist the UNHCR and camps’ administrators in preparing for a COVID-19 outbreak.
Model
Simulations are conducted with a compartmental model that we call “SIZ”. This is a newly developed epidemiological model with eight compartments designed specifically to model COVID-19 outbreaks in closed and highly densely populated communities such as refugee camps. The model segregates those infected into symptomatic and asymptomatic categories and provides simulations using both deterministic and stochastic solutions. Whereas deterministic models are based on transition rates between compartments, stochastic models are based on transition probabilities between them. The main difference is that in a deterministic model an individual ‘splits’ among communicating stages, whereas in a stochastic model the individual moves to only one of those stages. Also, transitions from the current state X(t0) to the state X(t0 + t) are random, as well as the time for the transition, t. This allows for the estimation of standard deviations and confidence intervals. Solutions to the models have been programmed in Matlab, Python and Stata to check and validate results.
Parameters
The basic reproductive number R0 used for these simulations is 3. This is a low-end reproductive number as compared to the literature on COVID-19 and a cross-country estimation prepared by the authors based on publicly available information. The rest of the parameters in the model such as residence time in each compartment and transition probabilities between compartments were estimated directly or indirectly from the existing literature or using information collected from a data set from Mexico [10] which has been used to benchmark lethality rates by age and gender.
Policies
Some plausible control policies were included in the model including use of masks, contact tracing and quarantine, promoting social distance and reducing infection time by self-awareness. Results are presented for four sets of parameters described in the Results sections. Two of the simulations include contact tracing policies and two exclude them, and for each of these options we simulate the baseline contact rate (no reduction in contact rate) and a 33% reduction in contact rate, which can be the result of different distancing or isolation policies. Pharmaceutical treatments are not included in the model since there is no known-effective therapy. Nevertheless, if one wishes to do so, the parameters p and w in the model can be changed to reduce the lethality of the disease. The same applies for vaccines which would require a reduction in the contact rate.
Results of the Deterministic Solution
Results of the deterministic solutions are shown in tables 12 to 15. It is shown that, with no reduction in contact rate and no contact tracing, 95% of the population will be infected, although the mortality will be comparatively low (less than 2 in 1000) mainly because of the large share of young people among refugees (60 % is less than 20 years old in the Za’atari refugee camp). The simulations also show that 3/4 of the infections will be reached in about 10 weeks. Since the time to first death is also about 7 weeks, the epidemic will not be noticeable for the first month and a half and will be running underground, with an estimate of 25 % of infections by week 8.
Most of the disease will manifest itself in hospitals and intensive care units. The simulations reveal that, in the worst case scenario of those studied, 1,806 and 1,052 bed-days1 are required in hospital and Intensive Care (IC) units respectively. With the average time of hospitalization used (three days), 49 beds in hospitals and and 25 in IC units will be required to fulfill most of the demand.
If we perform contact tracing and quarantine 1/3 of the infected individuals, the epidemic size will drop from 94% of the population to 85%. If we reduce the contact rate by 1/3, the epidemic size is reduced to 79% of the population. Both strategies combined will reduce the epidemic size to 58%, although achieving this reduction would be difficult in refugee camps [23].
The analysis suggests that it is important to concentrate efforts on reducing the effective contact rate by encouraging the use of face masks and promoting social distance, whereas testing and contact tracing is a less effective strategy due mainly to the high R0 value. Also, promoting a reduction in the infectious time through education and self-awareness is of little effect, mainly because the fraction of symptomatic individuals requiring hospitalization is small, and the epidemics is clearly driven by those asymptomatic or not requiring hospitalization.
Lock-downs are effective to avoid the virus from entering the camp. But the likelihood that this can be actually implemented in an effective way is dubious. The disease will enter the camp if there is some kind of interaction between the camp and any city or village nearby where an outbreak has began. Since a small fraction of individuals will develop symptoms that require hospitalization, this means that, on average, there will be many infections before the first infected individual is noticed (the time to first death is on average over 6-7 weeks), which makes a lock-down ineffective. Individual isolation is also difficult [23], mainly because it requires the isolation of at least all members of a family.
Results of the Stochastic Solution
The average number of deaths and beddays calculated in the stochastic model are similar to those obtained for the deterministic model.2 Nevertheless, there are important differences in the maximum number of hospitalized in a day as well as in the number of beds required to cover the demand of hospital and ICU services. Other differences are related to the time to some events such as the time to reach a given percentage of infections. In a stochastic model, we are discarding those simulations in which the epidemic vanished quickly with no deaths, thus, the distributions are switched to the right of the deterministic model and the time to the occurrence of an event is larger.
In the results sections, we also present figures illustrating the number of beds required to fulfill the demand for bed-days, also including the margins of error. For example, we can see from Figure 7 (top panel) that, with 83 beds, it would be possible to satisfy 90% of the demand for hospital beds, whereas 48 beds would be required to satisfy 90% of the demand for beds in Intensive Care Units (ICUs, bottom panel). Standard errors are included in all figures and represented by vertical bars.
The Web Application
The stochastic solution of the SIZ model is also available on the web where it is possible to reproduce the results presented in this study, or run additional simulations by varying the parameters of interest. The web app is currently hosted at this address:
Validation of results
In an attempt to benchmark and validate our results, we compare them with one other similar study conducted in a refugee camp in the Cox’s Bazar district of Bangladesh, and with the information emerging on COVID-19 outbreaks in refugee camps.
In our knowledge, there is only one other study that estimated the potential impact of COVID-19 in refugee camps. This is a study conducted by John Hopkins University on the Kutupalong-Balukhali Extension Site, Cox’s Bazar, Bangladesh [23]. We first compare the results of this study with the results of our study on the Za’atari refugee camp in Jordan and then compare the two models used by these studies applying our SIZ model to Kutupalong-Balukhali Extension Site. To note that the JHU-SEIR model also estimates the probability of an outbreak (with probabilities ranging from 61 to 92%), whereas the WB-SIZ model only considers outcomes following an outbreak.
Table 1 reports the key parameters and results of the two studies. The two studies result in very similar estimates on the population shares of infected people at the end of the epidemic. The JHU study for the Cox’s Bazar refugee camp reports 91.1% infection rate with an R0 comprised between 2 and 3 whereas our study reports an infection rate of 94.2% with an R0 of 3. Similarly, our model applied to the Kutupalong-Balukhali Extension Site finds an infection rate of 94.3%. Essentially, all simulations concord in finding that, at the end of the epidemics and with no pre-existing immunity (something that is still debated in the literature), the quasitotality of people are expected to be infected.
However, there are significant differences across the two models in terms of overall numbers of hospitalizations, hospitalizations in Intensive Care Units (ICUs) and deaths. Both studies find relatively small shares of people in need of hospitalization (between 0.6 and 4.4 %), ICUs (between 0.1 and 1.4%), and low death rates (between 0.1 and 0.4 %), which are rates consistent with the latest literature. It is also understood that these low rates are mainly explained by the relatively young age of the population living in refugee camps. However, the the JHU study provides estimates that are much higher than our estimates. The main explanations for this difference are the benchmark data used by the two studies for the transitions between infections, hospitalizations and death and the age adjustments. Our study relies on a comprehensive data set of individual observations from Mexico, whereas the JHU study relies on information available in the literature. As it is uncertain what best applies to refugee camps, the most suitable interpretation of these differences is in terms of ranges of possible outcomes. This implies that preparations for the outbreak should rely on the upper estimates.
It is also possible to compare results of these simulations with the observed evolution of the spread of the disease in the Za’atari refugee camp in Jordan and the Cox Bazaar refugee camp in Bangladesh. According to the refugee response coordination platform in Jordan, testing in the Za’atari refugee camp has been underway and a facility for treatment of COVID-19 cases has been already built in the expectation of an outbreak. However, as of July 13th, 2020, no positive cases were detected3 Therefore, either the virus has not yet entered the camp or it is still in an very early stage when casual testing is unlikely to catch positive cases. In either case, the situation on the field cannot provide any information to validate the simulations proposed in this study.
Some initial comparisons can instead be done between the results of the [23] Kutupalong-Balukhali Site simulations and the evolving situation on the ground. The first refugee COVID-19 case in the Cox’s Bazaar refugee camp was registered around May 15th, 20204, and, according to the WHO “As on 28 June, as many as 2,456 cases of infection have been confirmed in Cox’s Bazar District and 50 refugees have tested positive in the Rohingya camps.”5 Therefore, the virus is widely present in the district and in the camp, particularly if one considers the still limited number of tests per day conducted in the region and the number of observed infections went from 1 on May 15th to 50 on June 28th.
We know from studies worldwide that the true number of infections may be as large as ten times what is observed, even in countries where testing is widespread. According to the simulations provided by [23], “On average, in the first 30 days of the outbreak, we expect 18 (95% prediction interval [PI], 265), 54 (95% PI, 3223), and 370 (95% PI, 41,850) people infected in the low, moderate, and high transmission scenarios, respectively.” The study focused on the Kutupalong-Balukhali Expansion Site which is home of about 600,000 of the more than 1 m refugees in the Cox’s bazaar district of Bangladesh, and we do not know how many of the 50 infected cases were found in this particular zone. However, with a rough estimate of 30-40 cases, it is entirely plausible that in the 43 days that passed between May 15th and June 28th, around 300-400 people have been infected in the camp. This would be within the range found in [23]. The WB-SIZ model applied to the KutupalongBalukhali Expansion Site results in similar infection rates at the end of the epidemic but reaches the end much faster than the JHU-SEIR model. This means that we would have expected the spread of the infection to be more advanced by day 43. This may depend of course on the effectiveness of the measures taken in the camp but, regardless of these measures, we should expect continued growth of infections in the next few months.
These findings warn against complacency with the low numbers of COVID-19 infected persons currently observed in Jordan and Bangladesh. Once the virus enters the camp, the simulations in [23] and our study show that the virus can spread quickly and widely, potentially reaching over 90% of the population. Thanks to the age and gender profile of the camps, the number of hospitalizations and deaths will be low relatively to older populations, but the numbers may still overwhelm hospitals and health care facilities. In this respect, preventive measures such as building COVID-19 field hospitals as in the Za’atari refugee camp should be supported and not regarded as excessive measures.
3 The SIZ model
3.1 Compartments
The model proposed is an epidemiological compartimental model structured into 8 compartments as illustrated in Figure 1.
The compartments are described as follows:
All individuals start as susceptible, S. Once an individual is infected, it becomes infectious immediately, either in category I (symptomatic) or Z (asymptomatic). By symptomatic we define those individuals that will develop eventually severe symptoms that will require hospitalization, whereas an asymptomatic individual refers to those that will develop none or mild symptoms that will not require hospitalization. The probability that an infected starts as symptomatic is p while it starts as an asymptomatic with probability (1 – p).
Infected asymptomatic Z, stay infected and infectious for an average amount of time . When they leave the Z stage they move immediately to stage R (removed) where they are alive, non-infected and non-infectious. Individuals in stage R are assumed to be immune, and they play no role in the dynamics of the epidemic. An individual in stage Z can also be moved to stage R without finishing the infectious stage if tested positive for SARS-Cov-2 virus. We assume that testing is implemented according to a cordon sanitaire where some contacts are tested for SARS-Cov-2. Testing depends on a parameter θ that will be described later.
Infected symptomatic I stay infected and infectious for some time and will develop eventually severe symptoms that will require hospitalization. An individual in stage I will move out of the I stage for two reasons: either this individual developed severe symptoms and moves to hospitalization H which occurs at a rate µ2 or it moves to temporary quarantine RT which occurs at a rate µ2α. Parameter α reflects the possible effect on self-awareness. When individuals are more aware of the epidemic, its more likely that they will attribute classical symptoms of the disease as being infected, thus may reduce contact rate and self-isolate. α = 0 implies individuals keep infectious until they develop severe symptoms and move to hospitalization. α > 0 implies individuals will stop being infectious by self-awareness and self-isolation before the onset of severe symptoms.
Infected symptomatic I may also leave stage I by testing, as it was described for individuals in stage Z. The only difference is that individuals tested positive in stage I do not move to stage R since they are quarantined in a compartment RT temporarily, where they remain until the appearance of severe symptoms that will require hospitalization. Testing depends on a parameter θ that will be described later.
Individuals removed temporarily in RT, move to hospitalization H at a rate µ2, due to the memoryless property of exponential distributions.
Hospitalized individuals H remain hospitalized for an average amount of time and they will need intensive care C with probability q or recover R and become immune with probability 1 – q.
Individuals in intensive care C stay in this stage for an average time and may die with probability w or recover and become immune with probability (1 – w).
3.2 Parameters
The parameters in the model are:
λ, the contact rate, that is, the average number of contacts per unit of time.
, the average time in the compartment Z.
, the average time in the compartment I.
, the average time in hospitalization (not in intensive care).
, the average time in the intensive care unit.
p, the probability that an infected individual becomes a symptomatic.
q, the proportion of hospitalized individuals that will require intensive care.
w, the proportion of individuals in intensive care that die from the infection.
θ, the fraction of infected contacts that are removed due to testing. θ = 0 implies no testing, while θ = 1/2 half of the infected contacts of an individual that has been tested positive, are removed.
α, is the effect on the exit rate from compartment I by symptom awareness and self isolation. α = 0 means no reduction in average in stage I due to self-awareness and isolation. The average time in stage I (excluding testing effects) changes from µ2 to µ2(1 + α).
3.3 Comments on the model and parameters
The model does not consider infections caused by individuals in stages H and C (Hospitalized). This has been documented, but nosocomial infection is reduced in comparison to the rest of the susceptible.
The model assumes implicitly that there is never shortage of beds in hospitals, including intensive care units. It is, therefore, an approximation.
In the model, stages I and Z are separated for three reasons:
Individuals in stage I can reduce the average time in that stage with appropriate information, tending to be alert to the onset of symptoms which will allow them to self-isolate totally or partially. This is controlled by α.
It is important to keep the ratio Z/(I + Z) = 1 – p, that is, we need to observe the fact that a proportion of those infected are asymptomatic.
Separating I and Z would allow to change the infectiousness of the two kind of infected, if necessary.
Although it is clear that the outcome of the epidemics is affected by the age structure of the population (seems that lethality is higher in older people), if we are willing to concede that all age categories are equally likely to become infected [24] but they respond differentially to the disease, then there is no need to write a model that include a compartment for each age category (with the necessary associated parameters) but to apply the distribution of differential lethality a posteriori. This will be explained in detail later in this report.
The model lacks of a latent or exposed class for two reasons:
There are reports [20] that infected individuals become infectious quickly, sometimes in the first 1-2 days.
A latent class does not change R0 nor the final epidemic size, only delay events in the rest of the compartments a little bit.
It is clear that the model cannot be solved explicitly, in the sense that we cannot derive analytic expressions for S(t), Z(t), I(t), H(t), C(t), R(t), RT (t) and D(t). We will solve those expressions numerically and simulate a stochastic version of the model. But we will find analytic expressions for the long range behaviour of R(t) and D(t).
All parameters above will be estimated or deducted from medical literature so far, except λ.
Testing in the model refers to testing for actual infection with SARS-CoV-2, not for IgG or IgM (immunity tests). Laboratory testing always is implemented several weeks after the outbreak when the strains are identified and kits produced, so, it is necessary to set θ = 0 at the beginning of the epidemic.
Parameter θ, the testing capacity needs special explanation: it is assumed that when individuals in stage I develop symptoms, they are tested and some of its contacts are tested. θ is the fraction of all infected contacts that are removed from circulation by testing. This shortens the infectious time of tested individuals (independently if they are in stage I or Z) by half, by property of exponential distributions.
Choosing an appropriate parameter α is as follows: the average time spent in stage I with no testing is µ2 + αµ2 = µ2(1 + α). If we assume that certain informative policy may reduce the average time to a fraction f of its original value, then α = 1/f – 1.
4 The Deterministic Solution of the Model
4.1 Transitions between compartments
All rates between compartments are the product of three elements: an individual event rate, the number of individuals affected by the previous rate and a thinning probability.
Since the individual rate is the rate of a Poisson process, the resulting product is the rate of a Poisson event which is important for the future stochastic simulations. In this model, it is assumed that both types of infected, I and Z are equally infectious.
4.2 The differential equations
The set of differential equations for the model in Figure 1 (eliminated the time index for simplicity) is: with initial conditions S(0) = N – 1, Z(0) = 1, all others equal to 0.
4.3 Adjusted lethality by age group and gender
Parameter p in the model, the probability that an individual will eventually develop severe symptoms that will require medical attention, is one of the most important parameters that define the outcome of the epidemic in terms of fatalities. The relevance of p is that it is a property of the disease. However, p seems to depend on the age of the infected, p being higher for older individuals. In this section, we calculate the lethality for age group which will allow us to estimate a global p at the population level. We will then show how to obtain the p for a population with a different age structure, as in refugee camps.
Tables 3 and 4 show the characteristics of hospitalization, intensive care and deaths by age group for Mexico’s COVID-19 epidemic using Mexico’s IMSS database6. Column f contains the relative frequencies of each age group. Column h shows how the hospitalizations where distributed among the different age groups. Column c shows, for each age group, the proportion of individuals that needed intensive care from those who went to hospital. Column d shows, for each age group, the proportion of individuals that died from those who went to hospital.
From these, we obtain the raw lethality by age group u = h º d. Overall lethality is f’1u. Nevertheless, we cannot estimate the lethality with f’ u since we are including only data from individuals that went to the I compartment and then to hospitalization whereas lethality should consider all infections. In other words, we do not have information on how many people went to compartment Z.
Estimates of the overall lethality are in the range 0.001 – 0.003 [21, 8]. We will use the lethality reported by [8] of 86 in 100, 000 (about 0.001) because it is a more recent study with a large number of blood donors (the possibility that conditions in refugee camps can worsen the severity of the outbreak will be considered later in our model).
Based on this lethality, we need to adjust u so that f /u = 0.001, which implies a normalization factor k = 0.001/(f /u) = 0.0347. We then obtain the adjusted lethality θ as θ = ku. By performing this analysis by gender and age group, we obtain Tables 3 and 4. The interpretation of adjusted lethality is as follows: if we select a male person at random from the population in age group 30 – 39, the probability that this person will die from COVID-19 given that the person is infected is 0.000811, whereas if it is a female, the probability is 0.000381
4.3.1 Adjusted lethality in the Za’atari refugee camp
If the information necessary for the estimation of the adjusted lethality rate is not available, as for areas not yet reached by the virus such as the Za’atari refugee camp, one can use the adjusted lethality of a reference population. For example and for the case of Za’atari camp, we can use the adjusted lethality estimated for Mexico by age group and adjust these lethality rates using the age and gender structure of the Syrian population living in the camp. The composition of the population structure of Syrian refugees in Jordan is shown in Table 5. Using this age and gender structure, the adjusted lethality for the camp is obtained with: which is about 2.3 times smaller than the calculated from the Mexican IMSS data, due to the higher relative frequency of younger ages.
4.4 The parameters p, q and w
The adjusted lethality that we estimated for the Za’atari camp (0.000428) starting from the IMSS data can be expressed as the product pqw. From the same IMSS data, we can also estimate the product qw (the fraction of individuals that will die, given that they entered the I compartment) as: which is 0.1675 (note that the number of recoveries is only among those that went through the I compartment). It is then straighforward to estimate p as p = pqw/qw = 0.000428/0.1675 = 0.0026. This implies about 1 in 400 infected will become symptomatic infected, in this particular camp. The fraction of hospitalized that require intensive care (q) is derived from data on Iceland7, where this parameter is estimated at about 0.25. Finally, from q, we can derive w as qw/q = 0.1675/0.25 = 0.67.
4.5 Estimation of the exit rates µi
The exit rates are the inverse of the average time in a given stage. For the deterministic model, if L is the average time in some state, then the exit rate µ = 1/L.
4.5.1 Infected symptomatic (I)
The most recent and comprehensive report on the duration of the period a person remains infectious comes from the National Centre for Infectious Diseases in Singapore8. This study suggests that this period may begin around 2 days before the onset of symptoms, and persists for about 7 - 10 days after the onset of symptoms. That is a total period of between 9 and 12 days. However, people who are symptomatic become aware of the infection by definition and should be expected to take measures such as isolation to contain the spread of he virus. If this factor is taken into account, one can estimate the period of infectiousness at 5 days as suggested in [13]. See also [2] and [15]. Therefore, we take µ2 = 1/5
There is no data on the variance of this parameter, but if we use a gamma distribution and consider that subjects have no more symptoms after 12 days (see [13]) we can estimate a standard deviation. We need n and γ such that n/γ = 5 and CDFX(n, γ; 12) = 0.95. This yields n = 6, γ = 1.1765, for an average time of 5 days and a standard deviation of 4.33.
4.5.2 Infected asymptomatic (Z)
For the infected asymptomatic, we can use a value of µ1 = 1/11 based on the Singapore study already cited.9, alhtough we are aware than other studies such as [25], [9] and [11] have higher estimates, around 20 days.
There is not much information on the variance of the duration in the asymptomatic period and we will use a standard deviation of 7 days.
4.5.3 Hospitalized (H)
This parameter strongly depends on the availability of human resources, equipment and infrastructure. From IMSS data, we estimate it to be 3 days with S.D. of 2.91 days. Thus our estimate of µ3 = 1/3.
4.5.4 Intensive care unit (C)
From IMSS data, we estimate it to be 7 days with S.D. 4 days, for µ4 = 1/7.
4.5.5 Temporary quarantined (RT)
There is no way to estimate this parameter, since it depends on the time to symptoms onset after the individual has left the I stage by self-awareness of testing. We use the memoryless property of exponential distributions. The individual will remain an exponential amount of time in this stage, with parameter equal to the inverse of the average time in the previous stage, which is 1/5, but reduce the SD to 4 days.
4.6 Estimation of λ and R0
The baseline R0 used for the estimations in the Za’atari refugee camp is equal to 3 and is derived from the following equation:
In the equation above, λ is an arbitrary parameter which is based on the COVID-19 literature and a specific exercise that we devised to estimate λ with publicly available cross-country data. This section explains this exercise and the specific choice of R0 = 3 we made.
The contact rate λ we attempt to estimate is the raw contact rate. That is, it represents the true potential transmission of the disease, and can be measured at the onset of the epidemic. Countries first affected by the pandemic provide a more precise estimate of the pandemic since other countries are expected to be more prepared and are likely to have implemeted measures that may affect λ. The raw contact rate λ is close to the transmission potential of a disease when the individuals do not react to reduce it, by means of pharmaceutical interventions or by reducing exposure activities. It is important, therefore, to estimate λ using existing crosscountry data. This will give us information not only on the true potential of the disease but also on how much λ can be reduced with preparedness. Once the λs are estimated, we can estimate the respective R0s.
To estimate λ, it is customary to use observations on reported cases, mainly infections, which are recorded mostly when individuals report the onset of symptoms. In our model, that would exclude the Z (asymptomatic) individuals and would only consider the I (symptomatic) individuals. In most cases, the date reported as the date of a confirmed case is the day a test was made, or even the day the laboratory results were given, and rarely the date of onset of the symptoms. This adds considerable noise to the timing of the cases. Figure 5 shows an hypothetical example of the lag between the time when individuals enter the I compartment and the reported time.
As an alternative to using the number of infections, here we use the growth rate in compartment D (deaths), which we believe is less prone to error in the timing at which an individual entered that stage. The problem of underreporting is still present but reduced as compared to infections, and the time of death is much more accurate than the time of infection.
In general terms, once all other model’s parameters have been set, we choose an initial λ and solve equations 8. With this numerical solution, we obtain the best fit to the observed data in cumulative deaths and then repeat this procedure until the best λ value is found within a predefined region. The need to seek the best fit of the solution D(t) for a given λ is due to the fact that the beginning of the outbreak is unknown and thus we need to find a k value so that D(t + k) is the best match to the observed data on cumulative deaths.
For this exercise, we use the accumulated number of deaths in 185 countries taken from the data set of Johns Hopkins University [6]. For each country we use the first 15 days of data at the onset of the epidemic starting the day when the accumulated number of deaths was at least 5. Countries with a low R2 were rejected leaving only 40 countries with reliable estimates. For all countries but Germany, the calculated beginning from our model occurred before the date of first confirmed case.
Table (6) contains the main statistics derived from the cross-country estimation of λ and R0. Estimates of λ vary between 0.172 and 0.366 with a mean value of 0.251 with countries that experienced the epidemic at earlier stages showing larger values. From these values of λ one can then derive the R0 as explained above.
Figure 6 shows the relationship between the age of the pandemic and R0. For each country, we took as index the day the epidemic reached 5 confirmed cases. Then we took all dates and subtracted the minimum date, so that the indexes are now relative to the country with the first outbreak (excluding China), which is Italy. There is an almost linear relationship between the age of the pandemic and the R0 in analyzed countries. The fact that R0 is reduced for countries that experienced the epidemic at later stages indicates the importance and impact of preparing for the epidemic.
In conclusion, the cross-country exercise proposed derived estimates of λ that resulted in R0 varying between 2.73 and 5.789, which are values consistent with the COVID-19 literature. The same exercise also shows that higher values correspond to countries that experienced the epidemic at later stages as compared to Italy. These findings should be taken into account when we consider refugee camps still untouched by the pandemic. We should use a baseline value closer to the lower bound estimated with the cross-country exercise. Hence, our choice of a R0 = 3. However, as R0 is an input parameter, all results can be quickly reproduced with alternative R0s.
4.7 The role of demographics and control policies on the parameters
There are factors affecting outcomes that are characteristics of the demographics of the population. As mentioned before, differential morbidity by age and gender has been amply documented for COVID-19. Also, population size and density and other factors that affect camps specially, like poverty and other health issues may be very important. Control policies like the use of face masks, reducing contacts, contact tracing and testing (cordon sanitaire), pharmaceutical treatments, improvement of medical assistance through increasing hospitalization capacity and adding more personnel or even vaccination if it becomes available are all policies that can affect outcomes. In this section, we discuss how all of these changes can be simulated in our model (or are implicit already) without the need of adding more parameters.
Demographics: The model can accomodate demographic changes as already described in section 4.3 for the estimation of the adjusted lethality by age and gender. In essence, demographic changes affect the estimation of p, the likelihood that an individual will become a symptomatic infected. As p is the first parameter to enter the model, all other estimations take the demographic structure into consideration.
Population density: this can be included in the contact rate. There is a linear relationship between contact rate and R0, since this is just the contact rate times the average infectious time. Although we do not exactly know how much the density will affect the contact rate, this effect can be taken into account by multiplying λ for a constant (C3), which is the number of times we believe the contact rate is increased because of crowdedness.
Household size: The household size is an important effect to consider, and there are two reasons why we did not include it into the model explicitly. First, household size is implicit in the population density, so, the larger the household size, the larger the density per square km, and thus we can increase the contact rate, as pointed out before. Second, in the particular case of COVID-19, which is highly contagious, if we are willing to assume that all members of the household will be infected when a single member of the household becomes infected, then it is possiblle to show that R0 increases by a factor equal to the average household size, which will take the epidemic size to a much higher level. Since, as it will be discussed later, we can expect a very large fraction of the population to become infected without modeling households, adding this level of complexity is unnecessary.
Health issues There are several studies on how the lethality of SARS-CoV-2 is increased due to health issues like obesity, diabetes, respiratory diseases or habits like smoking [18, 22, 1]. Most odds ratios in those studies are large with a large SD. Including this differential lethality in our model for RC is impossible because there is no information available on the incidence of these conditions among the refugees. We can only think that the likelihood of becoming symptomatic (parameter p), the chances of moving to ICU while in hospitalization (parameter q), and the chances of dying while in ICU (parameter w) in our model all increase if health issues are present. However, the amount is left to speculation.
Poverty Poverty includes a wide range of factors that affect the outcome of the epidemic in a given community. Poverty is related to reduced education (which affects the ability of camp administrators to implement control measures), diminished nutrition and reduced medical attention. There are no specific studies that analyze how poverty affects the outcome of an epidemic but surely a reduced level of education tends to reduce the perception of the importance of the disease and thus may not reduce the contact rate λ, whereas lack of nutrition and medical attention affect p, q and w in our model. Again, the amount to increase is left to speculation. An analysis of the relationship between the number of doctors in a city as a surrogate of wealth, and the number of cases is shown in [19].
Social distancing and lock-down Social distancing is reflected in the model as a reduction in the contact rate λ. In Figure 6 we saw how the R0 is reduced in time across countries due to learning, which allowed some countries to prepare for the arrival of the pandemic. This implies that it is possible to prepare by reducing the contact rate through reinforcing and encouraging social distancing. However, in refugee camps measures such as social distancing are evidently more difficult to apply. Also, the possibility that the disease will not enter a refugee camp is very low as camps’ supplies strongly depend on the interaction with people outside the camp.
Face masks The effect of face masks is similar to that of an altruistic vaccine. It is not very effective in reducing infection of the mask-user, but reduces the ability to transmit the infection. Therefore, regarding the epidemic size, they act as a vaccine. The use of surgical masks has been reported to be effective in the transmission of the disease [7, 5], and [12] even suggests that frequent mask use in public venues, frequent hand washing, and disinfecting the living quarters were significant protective factors of infections (Odds Ratio 0.36 to 0.58). Nevertheless, the large number of infections reported among nurses and doctors that certainly worked wearing a surgical mask, imply some limit to this protective effect in crowded environments with a lot of infected people. If wearing a mask reduces the emission of coronavirus droplets from an infected person to a fraction f, then lambda is reduced to λf. If we assume that masks protect equally susceptible persons by reducing the virus droplets by the same amount, then λ is reduced to λf2. There is no information on the value of f, and we only can speculate on some positive value. Nevertheless, we must be cautious since by considering that λ reduces to λf2 we are implicitly assuming that a reduction in virus droplets by a half will reduce the likelihood of being infected by a half, which is not necessarily true.
Contact tracing, testing and quarantine As mentioned earlier, the cordon sanitaire consists on locating people that had contact with an infected individual and quarantine them. Parameter θ in our model is an attempt to account for this effect, in the sense that θ is how good we are at detecting and isolating infected contacts. However, θ is a parameter we need to input as a guess estimate.
Vaccination A vaccine that has efficacy β reduces the contact rate to λβ. There is no current candidate vaccine and is seriously challenged that it will be developed within a year. Even if a candidate vaccine is available now, the testing time through phase IV takes over a year, even if all the permits are speeded up. Therefore, modelling the impact of a vaccine is currently premature.
Pharmaceutical interventions Pharmaceutical interventions or any other treatment is included in parameters q and w in the model. Any increase in survival of individuals due to medical treatment should be included as a decrease in q or w.
Herd immunity The herd immunity is just the thinning term S/N in the first two rows of Table 2. The larger the infected group, the smaller the probability that the next contact of an infected person will be with a susceptible person. The highest peak of herd immunity is reached when the product (I +Z)∗S/N is maximum, which is achieved when the number of active infected is equal to the number of susceptible. Although sometimes herd immunity has been referred to as a policy or resource, this is not the case. Exposing the population by not implementing social distancing or lock-down will only accelerate the time to final outcome and will increase the demand of medical services at any time.
There has been some claims that there is no guarantee that immunity is achieved. Still, the number of cases reported at the global level of a second infection are scarce and dubious. If immunity is a fact, then, surveillance for immunity is useful in order to avoid a full collapse of economic activities in the camp. Random testing, and in particular testing those who recovered, is important to measure the size of compartment R at all times. There is the possibility of deducting this by analyzing the ratio Recovered/Deaths. At the worldwide level, on May 19, JHU reported 319, 213 deaths and 1, 805, 093 recovered, that is, for every death there are at least 5.6 recovered. This is a very conservative estimate, because the number of undetected recoveries is larger than the number of undetected deaths, since most of the recoveries detected were more among symptomatic than among asymptomatic, and the first group has a larger mortality than the second.
4.8 Sensitivity analysis
A sensitivity analysis consists in analyzing the effect of changes in the parameters on the output of the epidemic. We concentrate on the effect of the parameters on R0. With LZ = 1/µ1, LI = 1/µ2 where LI and LZ are the respective average times in compartments I and Z, the equation for R0 can be expressed as follows:
We can see in (10) that R0 depends on λ, p, µ1, µ2, θ and α. As mentioned before, the value of p is determined by the nature of the disease and the demographic structure of the population and, in this sense, p can be considered as a fixed and exogenous parameter. Changes in λ are possible by reducing the contact rate with active policies such as social distancing or the use of face masks [7, 5, 12]. Changes in µ1 and µ2 are possible by testing and isolation policies for the infected, which reduce the average time in compartments Z and I, and increase the exit rates from these compartments.
How sensitive is R0 to changes in λ, θ and α? Figure 4 shows changes in R0 when λ = 0.3, LI = 5, LZ = 11, p = 0.0172, θ = 0.25 and α = 0.25. The left-hand panel shows absolute changes whereas the right-hand panel shows percentage changes (elasticities). Results show that λ is the most elastic and important parameter (a 1% increase in λ resuts in a 1% increase in R0), which implies that we need to focus on reducing the contact rate. θ and α are much less elastic, implying that it is much more difficult to contain the spread of the disease with testing, contact tracing or self-isolation due to symptom awareness. This is mainly due to the fact that there is a smaller amount of symptomatic individuals that could reduce their infectious time and have a significant effect on the epidemic.
4.9 Summary of parameters for the Za’atari camp estimations
Table 7 summarizes the values of the parameters discussed in this report and used for the estimation of the SIZ model using 2015 data for the Za’atari refugee camp.
5 The Stochastic Solution of the Model
In a stochastic model, the current state of the process is a vector X(t0) that will change to some state X(t0 + t) in some time t. Whereas in deterministic models the word simulation is used to refer to the numerical solution of a system of differential equations, in stochastic models it has a different interpretation: the transition from the current state X(t0) to the state X(t0 + t) is random, as well as the time for the transition, t. Given some current state of the process, the change in the state that will occur next as well as when the change will take place are calculated using transition rates expressed in terms of probabilities.
The main difference between deterministic and stochastic models reside in the algorithms for the transitions between compartments. Suppose that, after leaving a compartment, a person can move to two different compartments at rates α1 and α2 respectively. In a deterministic model, a fraction α1/(α1 + α2) moves to one compartment and another fraction α2/(α1 + α2) moves to the other. In a stochastic model, the individual moves to each compartment according to probabilities. If we take a look at the model in Figure 1, we can see for instance that an individual in stage H can move to stage C or R at rates µ3q and µ3(1 – q) respectively. Therefore, the probability that this person will move to C is µ3q/(µ3q + µ3(1 – q)) = q, which is the relative share of those rates. To decide if the individual moves to R or C we generate a random number u from a uniform distribution in (0, 1) and if u < q the individual moves to stage C, otherwise this individual moves to R. In the deterministic model, a fraction q of the individual moves to stage C and a fraction 1 – q to stage R.
Given the current state of the process as a vector containing the number of individuals in each class of: we first construct a list of all possible events that can take place at the next step, along with their rates. In our particular model, this list is given in Table 8. Observe that the list is obtained from the rates of the differential equations model. To decide which of the 12 possible events will take place, we take a single observation from a multinomial random variable with parameters where gi = rate of event and G = ∑i gi. Once we decide which event takes place, we update the number of individuals in each compartment and recalculate the rates, then update parameter k in the multinomial distribution and sample it again until ∑i gi = 0 or up to a specific time t. Table 8 provides all possible transitions with the corresponding rates.
5.1 Time to event simulation
In the previous section we explained how to simulate a series of transitions in a continuous-time discrete-state stochastic process, that produces a vector X(t) at each iteration. Nevertheless, we need to simulate the time between transitions, that is, the index t.
In a stochastic model, when an individual is in some compartment and this individual leaves the compartment at a rate µ, it means that the individual will remain in that compartment according to a random variable T that follows an exponential distribution with parameter µ. That is:
If there are n individuals in that compartment, the time for the next individual to leave the compartment is clearly the minimum of n exponential random variables, which is also exponential with parameter nµ, that is, the time for the next individual to leave the compartment is:
Therefore, we can simulate the time to the next individual to leave this compartment using: where u is a random number from an uniform distribution in (0, 1). Using this, the t e to the next transition is an exponential random variable with parameter R = ∑i Ri. Therefore, the time to the next transition can be simulated as:
It is important to notice that the inter-event times are independent of the event that will take place. That is, the time to the next transition and the type of the next transition are independent.
5.2 Accelerating the simulations
Time to event simulations are generally inefficient computationally since only one individual moves at the time. The simulation stops when no more individuals can leave a compartment which in our case is when all individuals are distributed among S, R or D. The number of transitions depends largely on the population size, and for a population of size N it could be up to 5N, since in our model approximately N individuals will move to up to 5 stages.
There is a way to speed up the simulations, based on the relationship between exponential and Poisson distributions. This can be explained by looking at Figure 5. In this figure, there are n = 8 events that take place at the same rate. In time to event simulations the system will change state when the first event occurs, which is event E5 in Figure 5. In our approximation, we select a window of size W, and calculate the probability that an event takes place in that window, call this p. Then, the number of events that takes place in that window is a Binomial random variable with parameters (n, p). Since the number of individuals in a compartment is, in general, large and W can be very small, we can then approximate the binomial distribution with a Poisson distribution with parameter np. Clearly, the limit when W → 0 is the same as performing time to event simulation. In short, instead of simulating one single event at the time, we generate all the events that will occur in a window of size W. The smaller the size of the window, the better the approximation, because the events inside W are not independent.
In our simulations, we take W = 1/24, that is, we change the system at unit steps of an hour. This will require about 365 × 24 = 8, 760 transitions to simulate an outbreak with a duration of one year, instead of the approximate half a million simulations for a refugee camp of 100, 000 people. Another advantage is that this Poisson approximation is independent of the population size.
5.3 Matching the mean and variance in stochastic simulations
Next, we calibrate simulations to reproduce real data. The idea is to match the reported (observed) mean and variance of different stages, for instance, for the Z and I stages. Assuming that the time in a compartment is exponential yields a mean equal to the standard deviation, which is undesirable. For instance, if we assume that the time in stage Z lasts on average 11 days, then an ordinary stochastic approach to the model in Figure 1 will cause an individual to stay in that stage for an average 11 days with a standard deviation of 11 days, which is different from the reported (observed) 7 days that can be found in the literature and are described in the deterministic report.
The method consits of dividing each stage in several compartments with an exponential duration each, so that the passage through all of them has the desired mean and variance. The number of compartments and the individual rates depend on how precise is the approximation we want to those two first central moments. This method of approximation is called Phase distribution. [16, 17, 3, 14, 4]. In theory, it is possible to match as many moments as we want, although in here we only try to match the first two central moments, and we sacrifice precision in order to keep the number of total compartments relatively small. Observe that our method is different from the one reviewed in [4] in which a stage is subdivided in compartments with identical exit rates. Phase distribution methods are not restricted to identical exit rates.
Figure 6 shows the number of compartments required for each stage, Table 9 shows the required rates for each compartment and Table 10 shows the match achieved between the mean and variance reported in the literature and the one achieved with our approximation.
A Results - Deterministic model
B Results - Stochastic Model
6 Acknowledgments
The report is part of the DFID-UNHCR-World Bank program “Building the Evidence on Protracted Forced Displacement: A Multi-Stakeholder Partnership”. The program is funded by UK aid from the United Kingdom’s Department for International Development (DFID), it is managed by the World Bank Group (WBG) and was established in partnership with the United Nations High Commissioner for Refugees (UNHCR). The scope of the program is to expand the global knowledge on forced displacement by funding quality research and disseminating results for the use of practitioners and policy makers. This work does not necessarily reflect the views of DFID, the WBG or UNHCR. We are grateful to Ann Burton (UNHCR) and Paul Spiegel (John Hopkins University) for comments and clarifications.
Footnotes
email: pverme{at}worldbank.org
email: sradyakin{at}worldbank.org
e-mail. efren.murilloza{at}imss.gob.mx
↵1 A bed-day is a day during which a person is confined to a bed and in which the patient stays overnight in a hospital. (OECD Glossary of Statistical Terms https://stats.oecd.org/glossary/detail.asp?ID=194)
2 Note that the parameter dealing with reducing the contact-rate by self-awareness of infection had little effect on the outcome of the model and was excluded from the stochastic simulations.
4 https://www.rescue.org/video/first-covid-19-case-confirmed-coxs-bazar-bangladesh
6 We used a database of suspected and confirmed COVID-19 cases maintained by the IMSS in Mexico (Instituto Mexicano de Seguro Social, or Mexican Institute of Social Security) that serves about 12 million persons, non-government workers, and cover roughly 10% of the Mexican population. The data on SARS-CoV-2 we used contains records of 26,039 confirmed cases from March 2, 2020 to May 25, 2020. We discarded all cases with symptoms not starting in the previous 14 days to the last reported case (May 25, 2020) and from these, included only cases with disease outcome as dead or recovered, leaving a database with 10,330 cases.
8 shorturl.at/mR267
9 shorturl.at/mR267