Abstract
Brazil’s continental dimension poses a challenge to the control of the spread of COVID-19. Due to the country specific scenario of high social and demographic heterogeneity, combined with limited testing capacity, lack of reliable data, under-reporting of cases, and restricted testing policy, the focus of this study is twofold: (i) to develop a generalized SEIRD model that implicitly takes into account the quarantine measures, and (ii) to estimate the response of the COVID-19 spread dynamics to perturbations/uncertainties. By investigating the projections of cumulative numbers of confirmed and death cases, as well as the effective reproduction number, we show that the model parameter related to social distancing measures is one of the most influential along all stages of the disease spread and the most influential after the infection peak. Due to such importance in the outcomes, different relaxation strategies of social distancing measures are investigated in order to determine which strategies are viable and less hazardous to the population. The results highlight the need of keeping social distancing policies to control the disease spread. Specifically, the considered scenario of abrupt social distancing relaxation implemented after the occurrence of the peak of positively diagnosed cases can prolong the epidemic, with a significant increase of the projected numbers of confirmed and death cases. An even worse scenario could occur if the quarantine relaxation policy is implemented before evidence of the epidemiological control, indicating the importance of the proper choice of when to start relaxing social distancing measures.
Introduction
At the end of 2019, the world was taken by the news about the outbreak in China of a new Coronavirus called SARS-CoV-2, which stands for Severe Acute Respiratory Syndrome Coronavirus 2. The associated disease, called COVID-19 by the World Health Organization (WHO), rapidly spread around the world and is currently considered a pandemic. As of May 26, there have been 216 countries and territories with confirmed cases, the majority of them with local transmission [1]. Coronavirus belongs to a group of viruses that are common in humans and are responsible for 15% to 30% of cases of common cold [2]. The SARS-CoV-2 has spread rapidly and has already affected a significant part of the world’s population [3]. The within-host dynamics seem to be very heterogeneous and the severity of the disease varies widely. Based on currently available information, elderly people and people of any age with comorbidities, such as hypertension, cardiovascular disease, and diabetes, are among those at most risk of severe illness from COVID-19. In most locations, the burden of COVID-19 cases requiring medical intensive care exceeds the available medical resources, worsening the situation. A great effort has been made worldwide to better understand the disease and possible mechanisms to control it. Measures such as widespread testing, and reducing social contacts for some age segments, or for the wider population, are known to decrease the speed of the disease spread and the case fatality [4]. However, countries often have to deal with limit testing capabilities for COVID-19 which increases the uncertainty associated with how the disease behaves and is likely to complicate the management of the policies to mitigate and control it. One form to deal with this lack of information is through mathematical modeling, using diversified modelling techniques (see [5] for a review). Population-based models that split individuals into classes are widely used, and is the approach followed in this work.
WHO has already recognized that testing for COVID-19 is a key way to know how the virus spread and to provide insights on how to respond to it, although widespread testing is low in most countries in the world. Current Brazilian policy is to test only severely ill and healthcare practitioners people. Even so, there is great uncertainty in the released data. Sub-notification of infected cases is a major problem: BR currently conducts around 4,000 tests per million of the population [3], one of the lowest rates in the world. Likewise, the number of deaths may be underestimated for the same reason. According to [6], only 8% of the infected cases are reported/diagnosed. Overall, one of the main difficulties in developing predictive mathematical models of the COVID-19 in Brazilian localities is the lack of reliable data to support parameter choices [7, 8]. In the face of the current scenario, we developed a seven-compartment model that implicitly considers the social distancing policy that isolates individuals from the infection for a period of time. We focus on understanding the model response to perturbations/uncertainties. We perform sensitivity analysis and quantify the uncertainties in model outcomes, mainly the cumulative numbers of confirmed and death cases, and the effective reproduction number. We also investigate how to obtain a desired effect/control via a modification to the system subject to uncertainties. We first apply the model to the whole BR scenario and then investigate its behavior for the state of Rio de Janeiro (RJ), which was one of the first states in BR to recommend and implement social distancing policy, beginning on March 17, 2020 [9].
Results
We consider a generalized SEIRD mathematical model containing the following epidemiological compartments: Susceptible (S), Exposed (E), Asymptomatic Infected (A), Symptomatic Infected (I), Recovered or Removed (R), Positively Diagnosed individuals (P) [10], and Cumulative Dead (D). There is also a variable that is not part of the dynamical system and corresponds to the cumulative number of Confirmed Cases (C). Due to the current Brazilian policy only severely ill and healthcare practitioners people are tested. Therefore, it is important a separation in P individuals most likely to need hospitalization, in contrast to a limited number of hospital beds. Moreover, a proportion of S, E, A, and I individuals are removed at a fixed rate to the R compartment, which corresponds to quarantine measures and recovered individuals, as proposed in [11]. This mathematical model is denoted SEAIRPD-Q, in which Q stands for the quarantine (Materials and Methods). In order to address the understanding of quarantine policies, in the next sections we analyze different scenarios related to quarantine measures through modifications in the aforementioned removal rate.
Current stage and trends for COVID-19 in BR and RJ
To properly perform projections and study the quarantine scenarios of interest, we fitted the SEAIRPD-Q model (Materials and Methods) to the data of BR and RJ (separately). We employed a Bayesian calibration of some model parameter values, with remaining parameter values gathered from the literature.
Considering the maintenance of current social distancing policies, the predictions for the COVID-19 pandemic in BR is showed in Fig. 1. According to the result of Fig. 1a, the peak of compartment P will occur on the simulation day 81 (95% CI: 78–83) with around 75.7 thousand (95% CI: 67.7–85.8) people simultaneously infected (denoted “active cases”, hereinafter), and D is expected to be around 34.7 thousand (95% CI: 32.4–37.3) in the considered time range (Materials and Methods). According to Fig. 1a, C in the considered time range will be around 364.2 thousand (95% CI: 325.7–414.8).
The basic reproduction number, , is a parameter that indicates how fast the disease will spread in an epidemic early stage, caused by secondary infections after one individual is infected [12]. We calculated this quantity and the corresponding effective reproduction number, (SI-SAppendix A.3). The calculated by means of MAP for BR is 4.09, and the distribution of along time is displayed in Fig. 1b (black). If the current quarantine measures continue, is expected to be around the simulation day 71 (95% CI: 69–73), which indicates that the disease is controlled [12].
We also fitted the model with the available data for RJ. Considering the current social distancing policy, the projections for COVID-19 infection in RJ are showed in Fig. 2. According to Fig. 2a, the peak of active cases will occur on the simulation day 80 (95% CI: 76–85) with around 7.3 thousand (95% CI: 6.3–8.7) active cases, and D is expected to be around 7.0 thousand (95% CI: 5.8–8.7) in the considered time range. Fig. 2a also shows that C in the considered time range will be around 45.0 thousand (95% CI: 37.7–55.5). The is 2.83 for RJ, and the distribution of along time is displayed in Fig. 2b (black). Similarly as for the broader scale of the Brazilian scenario, if the quarantine measures in RJ continue, is expected to be around the simulation day 72 (95% CI: 67–76).
Sensitivity analysis and influence of uncertainty in early COVID-19 epidemic
We address the sensitivity (Materials and Methods) for both C and D (QoI1 (t)), as well for. Our first order sensitivity coefficient results suggest that uncertainties on ICs can drive significant changes in the early epidemic dynamics. However, the influence of such uncertainties diminishes as the epidemic evolves, indicating that other involved infectious mechanisms have more impact in the long term. Indeed, notice that parameter uncertainties have strong impact on both epidemic numbers (QoI1 (t) and QoI2(t)) along the dynamics. The importance of the recovery and the quarantine removal rates increase as the time evolves with respect to QoI1 (t). Overall, the most influential parameter is the quarantine removal rate (ω), and its impact increases as the dynamics evolves. The second more influential parameter is the proportion of I with symptoms (ρ). We remark that the latter parameter is fixed due to the limited testing capacity in BR.
How would different social distancing measures affect the disease spread?
We have shown that the rate at which S, E, and I are removed due to quarantine measures significantly affect , C, and D. Such average values under the actual social distancing policy are an idealization since actual values are dynamic and spatially heterogeneous. In order to study the qualitative effects of changes in social distancing policies, we propose to model ω as an exponential decay function, which can represent different social distancing relaxation scenarios (Materials and Methods). We consider three cases: (I) a sudden release from quarantine/social distancing, which is modeled as a decay with half-life as t1/2 = 0.1 days; (ii) a gradual release, with half-life as t1/2 = 15 days, and (iii) the current social distancing policy, with constant removal rate. Regarding such idealized scenarios and considering that the beginning of the social distancing release policy is at simulation day 100 after the first time stamp considered in the calibration dataset, Fig. 3 shows the consequences of relaxing social distance measures in terms of C and D for BR. Additional results are displayed in SI-SAppendix A.6 and SI-SAppendix A.7 for both BR and RJ scenarios. Remarkably, a sudden release can induce an increase in C, extending the crisis duration due to a slower decrease in active cases. This effect can be noted in Figs. 1b and 2b. Social distancing release policies make the decrease of far slower, implying in a slow control of the disease. Thus, determining an appropriate moment to begin a social distancing release policy demands special care, since applying such policies in inappropriate times can maintain a crisis status for a longer-term. Moreover, due to the small changes in for a longer time, more cases of C and D will occur, worsening the health damage in the total population.
We also remark the importance of the quarantine release date in the following less favorable scenario. Considering that social distancing is released 20 days before the peak of active cases for BR (simulation day 61), disease spreading can yield a critical scenario even when a smooth and gradual release strategy (half-life as t1/2 = 20 days) is adopted, causing an increase in C and D, and a drastically longer epidemic period, as shown in Fig. 4. In this case, the peak of active cases will occur on the simulation day 95 (95% CI: 88–107) with around 85.7 thousand (95% CI: 73.4–105.0) active cases, and D is expected to be around 84.3 thousand (95% CI: 69.6–105.9) in the considered time range, with C around 0.8915 million (95% CI: 0.6834–1.246).
Discussion
As in other countries that adopted isolation measures, the spread of COVID-19 in BR had a slow start. Despite that, after about two months since the first confirmed case, on May 20, 2020 BR reaches second place among the countries that present more confirmed cases of the disease, only behind the United States of America [3]. The results shown in this work point to the necessity to quantify the effectiveness of social distancing interventions for the control of the disease in BR, specifically in its most affected localities. Besides, it is essential to take rapid responses to the emergence of new cases and their impact on the public health system. The Brazilian continental dimension and great social and demographic differences are factors that hinder the elaboration of public policies for the disease control.
The present paper aims to contribute to the development of a model framework to investigate the expansion of COVID-19 in Brazilian locations and the impacts of different measures of social distancing. We apply the developed approach to model the COVID-19 dynamics in BR and RJ. This high populated Brazilian state was one of the first states to adopt mitigation actions, such as the suspension of classes, cancellation of events, and home isolation [13]. According to the current data released by the Brazilian Ministry of Health, RJ is one of the most affected states, both in the number of registered cases of COVID-19 and in the number of deaths. For this reason, the present study analyzes the spread in this state, as well as in the country as a whole, in order to assess the particular characteristics of the pandemic at those different spatial scales. Extensive research for the spread of COVID-19 in BR with multiple perspectives has been reported, e.g., [7, 13, 8]. The present work takes into account factors that are predominant in the Brazilian reality, such as the current limited testing capacity and the policy to test only severely ill hospitalized individuals. We hope that the present investigation brings some insights or guidelines for public health and policy-makers.
Due to data paucity, parameter identifiability is a major difficulty. To overcome this issue, only four model parameters were calibrated using a Bayesian approach. Other parameters, as well as model ICs, were set based on available information on COVID-19 (SI-Table A.3). Our simulation forecasts that the peak of active cases occurs on May 24, 2020 (95% CI: 21–26) and May 28, 2020 (95% CI: 24–2) for BR and RJ, respectively. This difference can be explained due to the discrepancy of the spatial scale, and the way the disease has spread along the different Brazilian locations, considering, for instance, the different political measures taken and the social and demographic structure of each state. The social measures implemented in RJ at the beginning of the pandemic seemed to be able to flatten the epidemic curve and postpone the peak of active cases [13].
Aiming to evaluate the influence of uncertainties in hard-to-track populations such as undiagnosed infected individuals (I and A), as well those who carry the disease and are unable to transmit (E), we performed a sensitivity analysis to understand which model factors (parameters and ICs) play important roles at the various stages of the epidemic for both BR and RJ. The analysis confirms that a proper understanding of how the disease spreads can provide insights and aids to elaborate containment decisions in order to reduce . In this sense, considering both BR and RJ, sensitivity analysis suggests that the most influential parameter for a long term perspective is the quarantine removal rate parameter (ω).
We have shown that the rate at which S, E, I and A individuals are removed due to quarantine measures significantly affects , C and D. Such average values under the actual social distancing policy are an idealization since actual values are dynamic and spatially heterogeneous, as mentioned in the last paragraph. In order to study the qualitative effects of changes in social distancing policies, we propose to model ω as an exponential decay function, which can represent different social distancing relaxation scenarios. When more abrupt social distancing relaxation is implemented after the occurrence of the peak of active cases, it accompanies a longer extension of the duration of the disease, with approximately 19% increase in the projected numbers of C and D. If implemented before the peak, the consequences can be devastating, as indicated by our results. The hypothetical scenario built by considering a slow and gradual release implemented 20 days before the original peak indicates a delay of 14 days in the occurrence of the peak of active cases, with more than 890 thousand C and about 85 thousand D accumulated over less than nine months of the presence of the disease in BR. Our simulations highlight the importance of relaxing social distancing measures only under a very careful follow-up.
We note that the analysis performed in this paper should be viewed from a qualitative perspective. The conclusions for the considered hypothetical scenarios (with and without distancing relaxation) are based on model predictions and current employed policies. Model simplifications and the calibration procedure of model parameters can explain potential quantitative discrepancies between our predictions and data. Such simplifications include: (I) homogenization of age, social, and spatial structure, (ii) some model parameters are fixed, with values obtained from the literature, and (iii) model parameters are constant along time (except ω, that in some simulations varied in time). Moreover, data have limited information due to sub-notification, since only hospitalized cases are tested in BR. These simplifications are inherent of the modeling procedure, and should be viewed as part of the scientific process of the understanding of the natural phenomena. The present paper can be extended in forthcoming studies, for example, by considering models with spatial heterogeneity, parameter dependence in time and space, and also analysing data considering sub-notification. Data considering other BR states could also be analyzed employing the same procedure adopted in the present paper. Another possible extension could be related to the determination of limit thresholds in the number of P cases under different relaxation social distancing measures.
It is important to highlight the impacts of social distancing relaxation in order to control the epidemic. Adoption of this type of measure directly affects the evolution of , as shown in our results. For both BR and RJ, it implies a much slower stagnation or decrease in from the time the measures are implemented. Since is in a controlled situation for both scenarios, i.e. , a direct consequence is that the disease would need more time to be eradicated. Our analyses suggest that policies based on short-term social distancing are not enough to control the evolution of the pandemic. If social distancing policy measures are released before the optimal time, a second peak should be experienced [7]. Some authors argue that longer or even intermittent social distancing will be necessary to avoid recurrent outbreaks. Specifically, [14] examined a range of likely virus transmission scenarios until 2025 and assessed non-pharmaceutical interventions to mitigate the outbreak. They concluded that if the new coronavirus behaves in the same matter to similar viruses we can expect the disease to return in the coming years, depending on the level and duration of immunity, an aspect that remains to be clarified in the future. The discovery of a vaccine, or new treatments, coupled with the testing of the population could alleviate the need for severe social distancing measures to control the disease. Until then, the need to maintain, even if intermittently, social distancing measures must be carefully addressed.
Materials and Methods
The SEAIRPD-Q model
Here we consider a generalized SEIRD model that includes protective quarantine measures and whose development is based on the following assumptions: (i) the analysis time is small enough such that natural birth and death are disregarded; (ii) all positively diagnosed individuals are hospitalized. The reason is that only badly ill infected individuals are subject to tests in BR; (iii) only symptomatic infected and positively tested (diagnosed/hospitalized) individuals may die due to complications from the disease, (iv) the hospitalized individuals are under treatment and remain isolated, so that they are considered not infectious; (v) quarantine measures are restrictive so that quarantined individuals are isolated and are not likely to be infected; and (vi) the recovered/removed individuals acquire immunity.
The previous assumptions reflect our current knowledge of the COVID-19 and are the base of our SEAIRPD-Q model that has seven compartments, including the population of positively diagnosed (P) individuals who are under medical treatment, and the dead (D) individuals. The infected class is modeled as two separate compartments encompassing individuals with and without symptoms, denoted by I and A, respectively. The removed compartment includes the recovered individuals as well as those under social distancing measures. The mathematical description (differential equations) of the SEAIRPD-Q model and its schematic representation are shown in Fig. 5 while model parameters and related meanings are exhibited in Table 1 and SI-Table A.2. Information regarding the numerical differential equation solver employed can be seen in SI-SAppendix A.2.
Susceptible individuals become exposed to COVID-19 when in contact with infected people. The rate at which this happens may vary depending on whether the infected individual has symptoms or not. After an incubation period, exposed individuals become infected. The fraction of infected people who present symptoms can vary widely. Due to the reduced testing capacity in BR, asymptomatic individuals, as well as symptomatic infected individuals with mild symptoms, are not likely to be hospitalized and therefore will not be diagnosed. Those who present symptoms are either isolated or hospitalized. We assume that only symptomatic infected individuals with stronger symptoms are diagnosed quickly and require hospitalization. This assumption better reflects the Brazilian policy on only testing severely ill individuals. Following the WHO guideline on social distancing, most Brazilian localities are adopting social distancing policies at some extent. An interesting feature of our model is the assumption that susceptible, exposed, and infected individuals can be removed due to quarantine policy. Thus, the removed individuals’ compartment includes individuals who have recovered from the disease as well as those put in quarantine. At a removal rate ω, quarantined individuals are removed and kept isolated. It is known that a quarantine period above 14 days is sufficient to recover exposed and infected individuals, who are idealized immune after such period. Asymptomatic and symptomatic individuals can recover without medical treatment at rates γA and γI, respectively, and hospitalized individuals recover at a rate of γP.
Time dependent quarantine policy modeling
To represent and understand quarantine policy effects on COVID-19 epidemic, we propose a time-dependent removal rate, which we call ωr. This function replaces the constant quarantine removal rate (ω), for each quarantine policy relaxation scenario. In this respect, we consider ωr a smooth decreasing function , with a decay constant λ = ln 2/t1/2, t0 being the time at which the relaxation policy is implemented, and t1/2 the half-life time for the quarantine release policy.
Bayesian calibration
Model calibration (SI-SAppendix A.4) is performed using a Bayesian approach. The code implementation is written in Python language (version 3.7), using PyMC3 as a Probabilistic Programming framework [15]. Due to the fact that solving ODE systems – such as SEAIRPD-Q proposed in the present work – can be computationally intensive, we choose as calibration method the Cascading Adaptive Transitional Metropolis in Parallel [16], a Transitional Markov chain Monte Carlo available in PyMC3. For the sake of reproducibility, code and scripts are publicly provided at [17].
As indicated in Table 1, we calibrated the following parameters: β = μ, ω, dI, and dP. These parameters were fitted using cumulative data for C and D as observed quantities. The model likelihood function assumes a normal distribution considering these observed quantities as its mean values. The standard deviation for C (σC) and D (σD) are tuned as hyperparameters within the calibration procedure.
Data
Since Brazilian policy for testing in COVID-19 is still restricted to severe cases, model fitting aims at matching C and D, without accounting the recovered outflow from P. We gathered data for C and D from [18] (BR) and [19] (RJ). We assessed the epidemiological data of 63 days for BR, ranging from March 5, 2020 to May 6, 2020, and 53 for RJ, ranging from March 10, 2020 to May 1, 2020. These data ranges were chosen considering a minimal requirement of at least five individuals in the C compartment at the initial date.
Sensitivity analysis
In order to quantify how changes in the model parameters and some ICs affect certain quantities of interest (QoI), we apply the Elementary Effects method [20] present in the SALib library [21]. This method allows a global sensitivity analysis (SI-SAppendix A.5) by sampling initial points over all parametric space, computing for each of them how the QoI varies along generated trajectories (which are changes in the paramaters) and computing the associated sensitivity coefficients. The first order sensitivity coefficient of a certain parameter expresses how important it is to the considered QoI. The analyzed quantities are and the normalized sum of the squares of C and D, . A 50% variation around the fixed and MAP values in Table 1 is considered to construct parameter ranges. Regarding the setup of the Elementary Effects procedure, we use a discretization level of p = 4 and a number of generated trajectories m = 40.
Data Availability
All related code and data are available at Zenodo.
Acknowledgments
ACMR, LA, and JVOS would like to thank the support through CNPq projects 132591/20146, 301327/2020-3, and 166171/2018-2. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Finance Code 001.
Appendix A. Supporting Information
This Supporting Information (SI) text is organized as follows. Sections SI-SAppendix A.1, SI-SAppendix A.2, and SI-SAppendix A.3 give additional modeling details to complement the SEAIR-PD-Q description reported in the main document. Sections SI-SAppendix A.4 and SI-SAppendix A.5 expand the Bayesian calibration and sensitivity analysis sections of the main text (Materials and Methods). Sections SI-SAppendix A.6 and SI-SAppendix A.7 expand the results section of the main document with additional figures for the BR and RJ scenarios, respectively. The last part of the document shows the calibration data used in this work.
Appendix A.1. Model Description
Mathematical models can help the investigation of different scenarios, devising alternative strategies to control virus spreading. The SARS coronavirus epidemic in 2002-2003 generated several publications using diversified modeling techniques (see [5] for a review). Some of these publications employed mathematical models to provide some understanding of the propagation pattern and ultimately to infer control measures for the disease [22, 23, 24]. Similar mathematical models were employed for other diseases, for example: (I) the 1918 influenza pandemic using data from cities of the USA [25, 26], England and Wales [27]; (ii) the 2009 influenza pandemic was modeled in local regions in Japan [28]; and (iii) the 2014-2016 Ebola epidemic was modeled for the West African countries with reported cases of the disease [29, 30]. Overall, the modeling efforts for representing virus spreading share common challenges that have been pointed out in the literature recently [31, 32], including those related to the emergence of “novel infectious agents” [33]. Issues like virus genetic diversity, within-host variability, social, demographic, and spatial heterogeneity, among others, can have a significant impact on model predictions and control programs. In any case, model granularity and ability to capture the desired phenomena are linked to available data for model calibration, which may hinder the development and practical use of more complex models.
Several mathematical models have already been published to better understand the disease spread and the effectiveness of different mitigation programs. Two variations of the Susceptible-Infectious-Recovered (SIR) model are used in [7] to describe the early evolution of COVID-19 in Brazil (BR), using data from February 25, 2020 to March 30, 2020. Short and long term forecasts are performed, which highlight the need of keeping social distancing policies for longer periods to actually flatten the epidemic curve, and the necessity of testing to have a better picture of the amplitude of the peak of symptomatic infected individuals. In [34], an eight compartment model with nine age stratification groups was developed to investigate the effect of social distancing policies, specifically those related to specific (older) age groups. Starting from a 60% lockdown scenario and many predefined parameterizations based on available references, their analysis indicates the ineffectiveness of older age group confinement policy as a way of controlling the confinement exit. A similar model approach is developed in [35] to investigate the transmission and isolation measures in São Paulo state. They found that both young and older people must be isolated to reduce hospitalization. The two latter studies explicitly included quarantined individuals, either due to confinement of susceptible individuals or because of a positive testing isolation measure. More recently, the work [36] develops a model that accounts for either reported and unreported cases. Besides analyzing the impact of different public health mitigation measures, the work suggests the use of a mixture approach for estimating model parameters. At the early onset of the disease, confirmed reported cases are assumed to follow an exponential law to ease getting initial conditions (ICs). Later on, a Bayesian approach is used for model calibration. Since the unreported infected cases are major issue to control outbreaks, the approach for identifying them seems promising and has already been successfully applied to the Hong Kong seasonal influenza epidemic in New York City in 1968–1969 for a SIR model [37].
Another class of compartmental models widely used to describe the spread of virus diseases worldwide is SEIR, that consider the total population (N) is split into sub-groups of Susceptible (S), Exposed (E), Infected (I), and Recovered or Removed (R) individuals, such that S+E+I+R = N. The compartments S and I contain all individuals likely to be infected and those already infected, respectively. E individuals are usually considered to have low viral load and are not likely to be infectious, while R individuals can or cannot be infected again depending on the virus disease. This class of dynamical model may be used to ultimately identify the processes that interplay in pandemic waves and has been largely used recently to assess the dynamics of COVID-19 in many countries [38, 39, 40, 41, 42, 43], including BR [44, 7, 34, 35, 36, 45, 46, 13].
Inspired in the generalized SEIR model proposed in [38], Jia et al. [47] employed the SEAIRPQ model with seven compartments for modeling the Chinese outbreak of COVID-19, including populations of positively diagnosed (P) individuals who are under medical treatment and of quarantined (Q) individuals. A schematic description of the SEAIRPQ is shown in Fig. A.6, with its mathematical formulation given by the following system of ordinary differential equations:
Model parameters are non-negative constants and most of them are described in Table A.2. The three remaining are the fraction of infection rates between A and I, θ, the time required for an individual to be quarantined, 1/p, and the quarantine period, 1/λ. An interesting feature of the SEAIRPQ model is the explicit inclusion of a compartment for Q individuals. Longer periods of isolation yields a decrease in S population, ultimately decreasing the infected population. S individuals may be exposed to contact with A and I individuals. For the Chinese considered scenarios, it was assumed that A individuals are less likely to spread the disease, so that θ << 1. The SEAIRPQ model assumes that both A and I individuals are eventually positively diagnosed at rates εa and εI, respectively. They can also recover without medical treatment at rates γa and γI, respectively.
Although SEAIRPQ model was able to describe quite well the outbreak in China, the Brazilian reality may prevent its use in Brazilian localities. BR testing capacity is still very limited, one of the lowest in the world. In April 2020, BR tested around 296 per million people. According to [3], this number increased to 4,104 in May 2020, although it can be significantly lower in some Brazilian localities due to huge demographic and social heterogeneities. Thus, available data on testing is very scarce: the present Brazilian policy is to test only people with severe symptoms. Such evidence has driven the development of the model proposed in this paper.
Here we consider a modification of model (A.1) taking into account protective quarantine measures as an implicit variable. The proposed model, called SEIARPD-Q, has also seven compartments since it includes the compartment of dead (D) individuals to explicitly keep track of them. The notation used indicates that Q individuals are not separated in a specific compartment. Instead, quarantine mechanisms are implicitly considered, with individuals in isolation included in R compartment. Likewise SEIARPQ model, the infected class is modeled as two separate compartments encompassing I and A individuals. R compartment includes individuals who have recovered from the disease as well as those who have been removed to this compartment as a result of isolation measures.
For completeness, we repeat here the mathematical formulation of the SEAIRPD-Q model, which is given by the following set of differential equations:
The model is schematically shown in Fig. A.7. All model parameters meanings are described in Table A.2. Data sources are informed in Table A.3 and Table 1 (Materials and Methods). Information regarding the numerical differential equation solver employed can be seen in SI-SAppendix A.2.
Looking at the mathematical description of model (A.2), it is important to note that the transformation of S into E individuals occurs through interaction with individuals from the infected groups (I and A) according to β and μ rates, respectively. The remaining mechanisms are linear, depending only on the corresponding transfer rates. As already mentioned, an important mechanism introduced in the model describes the application of social distancing measures: individuals from compartments S, E, A, and I are removed to compartment R at a rate ω. In this way, it is possible to exclude R individuals from the infection dynamics. A similar approach was adopted in [11]. We emphasize that we have chosen an implicit quarantine/social distancing approach, that is, we do not model a specific compartment of Q individuals, unlike other studies [47, 38]. The advantage of this implicit representation is to obtain a simpler model, with less computational cost and fewer parameters.
The previous assumptions reflect our current knowledge of the COVID-19 in BR. We remark that the adaptive immunity to SARS-CoV-2 of people who recovered from the disease has not been established yet as well as how long the immunity could eventually last. Actually, a follow-up study on 262 recovered patients in China revealed that 15% were tested positively for SARS-Cov-2. They were all younger than 14-years old and had mild and moderate conditions at the time of the primary infection. During the follow-up, they did not show clinical symptoms and it is speculated if either more sensitive test is required or indicates a sign of relapse rather than reinfection. Another study performed in previously infected rhesus macaques found that reinfection of SARS-Cov-2 could not occur [48]. Overall, the current understanding, based on what is known on coronaviruses in general, is that antibodies produced by infected patients may give them immunity to the specific virus for short term or even years. For example, previously infected SARS patients can become susceptible to reinfection only after around three years [49].
Appendix A.2. Ordinary Differential Equations Solving Method
The mathematical formulation of the SEAIRPD-Q model described by the system of seven ordinary differential equations (A.2) is solved using the LSODA method [50] from SciPy [51]. This package provides an interface for such routine from ODEPACK [52].
Appendix A.3. Evaluation of the Reproduction Number
Following the ideas introduced in [53], we apply the Next Generation Matrix (NGM) method to obtain for the SEAIRPD-Q model. Firstly, we use a reduction process to arrive at a linearized infection model, under the disease-free steady state. The identification of the infected variables (E, A, and I) allows defining the vector xT = [E, I, A], where the superscript T stands for the transposition of a matrix. Thus, we consider that S, R, P, and D individuals are not able to transmit the disease. Denoting by T and Σ the transmission and transition matrices, respectively, the three-dimensional linearized infection sub-system is given by: where
The reproduction number is the dominant eigenvalue of the next generation matrix K = −TΣ−1 [53], which yields:
The corresponding effective reproduction number is:
Appendix A.4. Bayesian Calibration
In this work we adjust our model parameters to make results consistent with available observations (i.e., C and D for both BR and RJ scenarios). With Bayesian calibration, we are capable of determining the most likely uncertainties for input parameters as well as their MAP values described by the a posteriori probability distribution πpost. Considering that our aim is to adjust a set of parameters θ given y observations, Bayes’ theorem states: where πprior is the a priori probability distribution that represents the initial (a priori) knowledge over θ, πlike is the likelihood function, and πevid is the evidence (information) encompassed in the data (see [54] for a more detailed description). We assume that all data follows a Gaussian distribution centered on simulated of observable quantities values with variances and , which yields the following form for the likelihood function:
The standard deviations for the cumulative confirmed (σC) and dead cases (σD) are tuned as hyperparameters within the calibration process. Regarding πprior, a list of prior distributions and MAP estimates of marginal distributions can be seen in Table 1 (Materials and Methods). The frequency histograms for the calibrated parameters of both BR and RJ scenarios are shown in Figs. A.8 and A.12, respectively. The posterior distributions are obtained by a tempered adaptive Bayesian updating procedure proposed in [16], the Cascading Adaptive Metropolis in Parallel, used here through PyMC3 [15] implementation in Python language (version 3.7).
Despite the fact that we have mentioned which observed data are used as observations for the calibration procedure, a proper definition of such quantities is absent. The estimate of C is calculated from the model as: which is the cumulative value of inflow individuals in the P compartment, since only diagnosed individuals are recorded on Brazilian confirmed case databases due to the lack of large-scale testing in the overall population. Using Eq. A.10, we can estimate the amount C at t days after day t0. Analogously, in order to compute the D from SEAIRPD-Q, we calculate: which is the integration of Ḋ (see system (A.2) for reference) over the time range of interest. C and D time-stamps are gathered from data sources specified in the Materials and Methods section. Such quantities are used as observed data in the calibration procedure.
Appendix A.5. Sensitivity Analysis
Due to paucity data and possible model identifiability issues, not every parameter and unknown initial conditions (ICs) are estimated using the Bayesian calibration procedure. Based on previous studies, we assume values for every parameter (with the exception of β, μ, ω, dI, and dP), and IC (with the exception of S(0), P(0), D(0), and R(0)). Relevant information on model settings is provided in Tables 1 (Materials and Methods), A.3 and A.2. In order to quantify how these choices affect the outcome of the dynamics, we perform a sensitivity analysis. Specifically, we are interested in the sensibility of the quantities:
As previously stated in the main text, we employ the Elementary Effects method [20] to carry out this analysis. In contrast to local sensitivity methods, which evaluate how small changes around a singular point affect a QoI [55], the Elementary Effects method is a global sensitivity method. Considering the θ ∈ ℝk as the set of analysed parameters, we construct a sample of m initial points θ(1), ···, θ(m) from a k-dimensional p-level grid. Afterwards, a metric defined as an elementary effect EEi, i = 1,…,k, is computed for every parameter: where is the magnitude of change in the ith parameter in the k-dimensional p-level grid. Values of p as an even number and are usual suggested choices to ensure an equal sampling probability in the parametric space, and are the default option in the SALib [21] library used. After computing the elementary effects, the process ends by calculating the global sensitivity coefficients for each parameter. Our work dedicates the analysis only to the first order sensitivity coefficient:
Parameters with high scores are the most influential to the QoI. In turn, the first sensitivity index values allow ranking the parameters with respect to their order of importance. Sensitivity analysis is performed for both scenarios (BR and RJ) considering all parameters and ICs E(0), A(0), and I(0). The settings p = 4 and m = 40 are used for all experiments. A 50% variation around the fixed and MAP values in Tables 1 (Materials and Methods) and A.3 is considered to construct parameter ranges. The QoIs are observed in days 2, 25, 50, 75, 100, 125, 150, 175, 200, 225, and 250 in order to evaluate the scores si along all simulation period. For the sake of presentation, the scores are normalized (sum to 1).
The sensitivity analysis results for BR are depicted in Fig. A.10. The sensitivity indexes for the cumulative C and D (QoI2(t)) are heavily influenced by the initial conditions in the early stages of the disease propagation, although their influence decreases as time evolves. In contrast, they are not influential for along all the simulation time. The influence of the parameter related to social distancing (ω) on both QoIs is notable. More importantly, its influence increases with the evolution of time, which emphasizes the need for care when establishing social distancing relaxation measures. On the other hand, the proportion of I individuals is also one of the most influential parameters on both QoIs, although it is more relevant considering C and D. This points out the need of having a more widespread testing policy. Regarding the RJ scenario, Fig. A.14 reveals very similar scores. One noteworthy distinction is the rank of influence of E(0) and I(0) associated with QoI2(t). The former is more influential in the BR scenario (Fig. A.10b), while the latter has a higher score for RJ (Fig. A.14b).
Appendix A.6. BR: Additional Results
Fig. A.9a shows the prediction of P, I, A, D, and C in BR. A total of 75.7 thousand (95% CI: 67.7–85.8) active cases are expected on the simulation day 81 (95% CI: 78–83), with D and C expected to be around 34.7 thousand (95% CI: 32.4–37.3) and 364.2 thousand (95% CI: 325.7–414.8), respectively. The posterior distribution of the peak position is displayed in Fig. A.9b, with the vertical dashed lines corresponding to those displayed in Fig. A.9a. Fig. A.9c depicts the temporal changes of and displays two vertical lines identifying the credible interval (95% CI: 69–73) of the time above which . The same lines are depicted in Fig. A.9d that shows the uncertainty associated with that time value.
Fig. A.11 provides the model forecasts for the considered hypothetical scenarios of social distancing relaxation. Fig. A.11a shows what to expect in case of a sudden release from quarantine after the simulation day 100 (t1/2 = 0.1 days as half-life decay). In this case, C reaches 434.8 thousand (95% CI: 372.1–526.8) and D 41.4 thousand (95% CI: 37.8–46.0) at the end of the simulation (day 262), which corresponds to an increase of approximately 19% in both values when compared to the original scenario with ω fixed at its MAP estimate. For a gradual release after the simulation day 100 (t1/2 = 15 days as half-life decay), the numbers of C and D at the end of simulation are 378.0 thousand (95% CI: 334.7–436.0) and 36.0 thousand (95% CI: 33.6–38.9), respectively, as shown in Fig. A.11b. Figs. A.11c and A.11d compare the variability of C and D at the end of simulation in the form of box plots for all scenarios considered. Outliers appear as individual points and the samples medians are depicted in red.
Appendix A.7. RJ: Additional Results
Fig. A.13a shows the prediction of P, I, A, D, and C in RJ. A total of 7.3 thousand (95% CI: 6.38.7) active cases are expected on the simulation day 80 (95% CI: 76–85), with D and C expected to be around 7.0 thousand (95% CI: 5.8–8.7) and 45.0 thousand (95% CI: 37.7–55.5), respectively. The posterior distribution of the peak position is displayed in Fig. A.13b, with the vertical dashed lines corresponding to those displayed in Fig. A.13a. Fig. A.13c depicts the temporal changes of and displays two vertical lines identifying the credible interval (95% CI: 67–76) of the time above which . The same lines are depicted in Fig. A.13d that shows the uncertainty associated with that time value.
Fig. A.15 provides the model forecasts for the considered hypothetical scenarios of social distancing relaxation. Fig. A.15a shows what to expect in case of a sudden release from quarantine after the simulation day 100 (t1/2 = 0.1 days as half-life decay). In this case, C reaches 57.1 thousand (95% CI: 44.4–80.1) and D 8.8 thousand (95% CI: 6.8–12.3) at the end of the simulation (day 252), which corresponds to an increase of approximately 27% in both values when compared to the original scenario with u fixed at its MAP estimate. For a gradual release after the simulation day 100 (t1/2 = 15 days as half-life decay), the numbers of C and D at the end of simulation are 47.8 thousand (95% CI: 39.3–61.1) and 7.4 thousand (95% CI: 6.0–9.5), respectively, as shown in Fig. A.15b. Figs. A.15c and A.15d compare the variability of C and D at the end of simulation in the form of box plots for all scenarios considered. Outliers appear as individual points and the samples medians are depicted in red.
Footnotes
‡ DTV, ACMR, and JVOS performed computer simulations. All authors contributed to the conceptualization of this study, methodology, writing, and approval of the final version of the manuscript.
‡‡ The authors declare no competing interest.
References
- [1].↵
- [2].↵
- [3].↵
- [4].↵
- [5].↵
- [6].↵
- [7].↵
- [8].↵
- [9].↵
- [10].↵
- [11].↵
- [12].↵
- [13].↵
- [14].↵
- [15].↵
- [16].↵
- [17].↵
- [18].↵
- [19].↵
- [20].↵
- [21].↵
- [22].↵
- [23].↵
- [24].↵
- [25].↵
- [26].↵
- [27].↵
- [28].↵
- [29].↵
- [30].↵
- [31].↵
- [32].↵
- [33].↵
- [34].↵
- [35].↵
- [36].↵
- [37].↵
- [38].↵
- [39].↵
- [40].↵
- [41].↵
- [42].↵
- [43].↵
- [44].↵
- [45].↵
- [46].↵
- [47].↵
- [48].↵
- [49].↵
- [50].↵
- [51].↵
- [52].↵
- [53].↵
- [54].↵
- [55].↵
- [56].
- [57].
- [58].