ABSTRACT
The sharp increase in the number of new COVID-19 patients in India in the second half of April 2021 has caused alarm around the world. A detailed analysis of this pandemic storm is still ahead. We present the results of anterior analysis using a generalized SIR-model (susceptible-infected-removed). The final size of this pandemic wave and its duration are predicted. Obtained results show that the COVID-19 pandemic will be a problem for mankind for a very long time.
Introduction
The daily number of new laboratory-confirmed COVID-19 cases in India exceeded 400,000 in the end of April 2021. This huge figure is frightening, but if we take into account the population of India, the number of new cases per capita is not yet higher than the maximum for some other countries, including Ukraine. It is very important to assess the growing trends in the number of new cases and the ability of the Indian medical system to cope with the huge number of patients and deaths.
Any mathematical modeling of the epidemic dynamics will be of particular value if we make an accurate long-term forecast of its duration and number of diseases using statistics data sets obtained immediately after the outbreak. Many authors are trying to predict the Covid-19 pandemic dynamics in many countries and regions [1-72]. We will not analyze these studies in details and only note that the correct mathematical simulation of the Covid-19 pandemic is very difficult for at least two reasons.
First, data on the number of cases are clearly incomplete immediately after the epidemic outbreak, there are quite long hidden periods [67, 70, 73-77]. In particular, first COVID-19 cases probably have appeared already in August 2019 [67, 70]. The reason is the large number of asymptomatic patients and the lack of skills to detect a new disease. It must be noted that large discrepancy between registered and actual number of cases occurred even for later periods of Covid-19 pandemic [72, 78-80].
The second reason for the limited accuracy of long-term forecasts is the constant changes in the conditions of the pandemic (quarantine measures, social behavior, virulence of the pathogen, etc.). Therefore, a prediction made using statistics for a certain time period is not suitable for other periods of time. To solve this problem a generalized SIR-model and the methods of its parameter identification was proposed in [70, 81, 82]. Since the new pandemic wave in India is not the first one we will use the generalized SIR-model and the method of direct parameter identification (without calculations of the previous epidemic waves) [83]. Corresponding results for the first Covid-19 epidemic waves in mainland China, USA, Germany, the UK, the Republic of Korea, Austria, Italy, Spain, France, the Republic of Moldova, Qatar, Ukraine, the city of Kyiv and for the world are already published in [65, 67, 69-72, 81, 82] and showed rather good accuracy. In this article we will apply the same approach to the case of India.
Data
We will use the data set regarding the accumulated numbers of confirmed COVID-19 cases in India (Vj) from Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [84]. The values Vj and corresponding time moments tj (measured in days) are shown in Table 1.
Generalized SIR model and parameter identification procedure
The classical SIR model for an infectious disease [85-87] was generalized in [70, 72, 81] to simulate different epidemic waves. We suppose that the SIR model parameters are constant for every epidemic wave, i.e. for the time periods: . Than for every wave we can use the equations, similar to [85-87] : Here S is the number of susceptible persons (who are sensitive to the pathogen and not protected); I is the number of infectious persons (who are sick and spread the infection); and R is the number of removed persons (who no longer spread the infection). It must be noted that I(t) is not the number of active cases. People can be ill (among active cases), but isolated. In means, that they don’t spread the infection anymore. There are many people spreading the infection but not tested and registered as active cases. The use of number of active cases as I(t) in some papers a principal mistake which may lead to incorrect results. Parameters αi and ρi are supposed to be constant for every epidemic wave.
To determine the initial conditions for the set of equations (1)–(3), let us suppose that at the beginning of every epidemic wave :
In [70, 72, 81] the set of differential equations (1)-(3) was solved with the use of initial conditions and by introducing the function V (t) = I (t) + R(t), corresponding to the number of victims or the cumulative confirmed number of cases. For many epidemics (including the COVID-19 pandemic) we cannot observe dependencies S(t), I (t) and R(t) but observations of the accumulated number of cases Vj corresponding to the moments of time tj provide information for direct assessments of the dependence V (t). The corresponding analytical formulas for this exact solution; the saturation levels Si#x221E; ; Vi∞ = Ni − Si∞ (corresponding the infinite time moment) and the final day of the i-th epidemic wave (corresponding the moment of time when the number of persons spreading the infection will be less than 1) can be found in [70, 72, 81].
The exact solution depends on five parameters - Ni, Ii, Ri,νi,αi. The values Vj, corresponding to the moments of time tj can be used to find the optimal values of this parameters corresponding to the maximum value of the correlation coefficient ri [88]. The details of this approach can be found in [89, 90]. It was successfully used in [65, 67, 69-72, 81-83, 89, 91, 92] to simulate the COVID-19 pandemic dynamics and other phenomena. The exact solution [70, 72, 81] allows avoiding numerical solutions of differential equations (1)-(3) and significantly reduces the time spent on calculations. The new algorithm proposed in [90] allows estimating the optimal values of SIR parameters for the i-th epidemic wave directly (without simulations of the previous waves) with the use of only two independent parameters Ni and νi.
Results
The optimal values of parameters and other characteristics of the severe COVID-19 pandemic wave in India are calculated and listed in Table 2. We have used the number i=2 and the time period: Tc2 - April 10-23, 2021 for SIR simulations of this wave. Corresponding values of Vj and tj are listed in Table 1. The value of the correlation coefficient ri = 0.999959712033103 is very high (see Table 2), nevertheless we are not satisfied with the convergence procedure by isolation of the maximum of this parameter. Probably new simulation with the use of fresher datasets could fix this problem. The estimate of the average duration of the infection spread in India τi = 1/ ρi ≈ 35 days is significantly higher than in Ukraine in the end of March 2021, [72].
Unfortunately the estimations of the pandemic duration in India are very pessimistic (4,434 days or 12.1 years after April 30, 2021). If we suppose that the end of the epidemic corresponds the moment when the number of persons spreading the infection is less than 20, the calculations yield the middle of January 2023. Probably a strict quarantine and vaccination could change this sad trend, but it looks that some new cases of COVID-19 will appear if not always, then for a very long time. The very long duration of the epidemic in India and the large number of cases increase the likelihood of new mutations in the coronavirus, which can make existing vaccines ineffective and pose a threat to all mankind.
Knowing the optimal values of parameters, the corresponding SIR curves can be easily calculated with the use of exact solution [70, 72, 81] and compared with the pandemic observations before and after Tc2. The results are shown in Figure by blue lines: V(t)=I(t)+R(T) – solid; dashed one represents the number of infectious persons multiplied by 5, i.e. I (t) ×5 ; dotted line shows the derivative (dV / dt)×100 calculated with the use of formula: Equation (4) follows from (2) and (3) and yields an estimation of the real daily number of new cases. Red “Circles” and “stars” correspond to the accumulated numbers of cases registered during the period of time taken for SIR simulations Ti2 and beyond this time period, respectively (all taken from Table 1). It can be seen that these values are in good agreement with the theoretical blue solid line.
The blue dotted line shows that the daily number of the new cases will start to diminish after May 10, 2021, but the number of people spreading the infection I(t) will have its maximum only in the end of May, 2021. We can also compare the theoretical curve (4) with the average daily number of new cases which can be calculated with the use of smoothing registered Vj values [70, 81, 82]: and its first derivative: (see, e.g., [70, 81, 82]). The red “crosses” represent the results of calculations of the first derivatives (15) and are in a good agreement with the theoretical dotted line for moments of time before April 22, 2021.
Discussion
Rather large differences between the real (15) and theoretical (13) values of the first derivative after April 22, 2021 can be explained by two factors. The first is related to changes in the dynamics of the epidemic. This is evidenced by the sharp changes in the second derivative of the average number of reported cases, which can be estimated using formulas: and (5) (see, e.g., [70, 81, 82]).
So we can talk about a new wave of pandemic in India, which may be weaker than the one we see in April 2021 due to the sharp decrease in the second derivative (7) after April 20, 2021 (see red dots in the figure). The course of the epidemic and new simulations will help to clarify these findings.
The second reason for discrepancies between formulae (4) and (6) may be the large number of unregistered cases observed in other countries [72, 78-80]. Estimates for Ukraine made in [72] showed that the real number of cases is about four times higher than registered and reflected in official statistics. Similar estimates can be made for the case of India. But the results already obtained show that the COVID-19 pandemic will be a problem for mankind for a very long time.
Data Availability
Data is presented in the text
Acknowledgements
The author is grateful to Dr. Vinod Varghese and Oleksii Rodionov for their help in collecting and processing data.