ABSTRACT
The population, governments and researchers show much less interest in the COVID-19 pandemic. However, many questions still need to be answered: why the much less vaccinated African continent has accumulated 15 times less deaths per capita than Europe? or why in 2023 the global value of the case fatality risk is almost twice higher than in 2022 and the UK figure is four times higher than the global one?
The averaged daily numbers of cases DCC and death DDC per million, case fatality risks DDC/DCC were calculated for 34 countries and regions with the use of John Hopkins University (JHU) datasets and possible linear and non-linear correlations with the averaged daily numbers of tests per thousand DTC, median age of population A, and percentages of vaccinations VC and boosters BC were investigated. Strong correlations between age and DCC and DDC values were revealed. One-year increment in the median age yielded 39.8 increase in DCC values and 0.0799 DDC increase in 2022 (in 2023 these figures are 5.8 and 0.0263, respectively). With decreasing of testing level DTC, the case fatality risk can increase drastically. DCC and DDC values increase with increasing the percentages of fully vaccinated people and boosters, which definitely increase for greater A. After removing the influence of age, no correlations between vaccinations and DCC and DDC values were revealed.
The presented analysis demonstrates that age is a pivot factor of visible (registered) part of the COVID-19 pandemic dynamics. Much younger Africa has registered less numbers of cases and death per capita due to many unregistered asymptomatic patients. Of great concern is the fact that COVID-19 mortality in 2023 in the UK is still at least 4 times higher than the global value caused by seasonal flu.
Introduction
In the fourth year of the COVID-19 pandemic, the population and governments show much less interest in it. In particular, only 39% of countries reported at least one case to WHO in the period from 31 July to 27 August 2023, [1]. Thus accumulated numbers of cases CC and deaths DC per million show stabilization trends [2] and can be used to estimate the impact of different factors on the pandemic dynamics and to answer some important questions. In particular, why the much less vaccinated African continent has accumulated 36 times less cases and 15 times less deaths per capita than Europe (see [1, 2], lines 26 and 27 in Table 1)? Why in 2023 the global value of the case fatality risk is almost twice higher than in 2022 and the UK figure is four times higher than the global one, [3])?
We will apply the linear correlation analysis using accumulated relative characteristics: the numbers of cases and deaths per million (CC and DC), numbers of fully vaccinated people and boosters per hundred (VC and BC), tests per thousand (TC) available in files of John Hopkins University (JHU) [2] for all countries and regions.
The impact of various factors on the COVID-19 pandemic dynamics was estimated in many papers. Some examples can be found in [4-21]. In particular, CC values accumulated as of December 23, 2021 in Ukrainian regions and European countries showed no correlations with the size of population, its density, and the urbanization level, while DC and CFR=DC/CC values reduce with the increase of the urbanization level in European countries, [18]. The increase of income (Gross Domestic Product per capita) leads to increase in CC, VC, BC, and TC values, but DC and CFR demonstrate opposite trend in European countries [19].
Many asymptomatic COVID-19 cases [22 - 26] can cause a big difference between visible (registered) and real pandemic dynamics [15, 27, 28]. That is why the higher testing level can increase the numbers of registered cases. Sometimes the testing level is too low to reveal all the cases predicted by theory. Probably such situation occurred in Japan in summer 2022, [29]. It was shown in [20], that the test per case ratio TC/CC (or test positivity rate CC/TC) is very important characteristic to control the pandemic. Very strong correlation between CC and TC was revealed for values accumulated before August 1, 2022 in European and African countries, [19]. Since the TC values have stopped to be updated by JHU in different days of 2022, [2, 19], in this study, we will investigate a correlation between the averaged daily numbers of cases and test per capita DCC and DTC, respectively.
The severity of SARS-CoV-2 infection increases for older patients [16]; almost half of the infected children can be asymptomatic [30]. With the use of the statistical analysis of 2020 datasets it was shown that younger populations have less clinical cases per capita and it was predicted that “without effective control measures, regions with relatively older populations could see disproportionally more cases of COVID-19, particularly in the later stages of an unmitigated epidemic” [31]. This forecast was confirmed in [21] with the use of CC and DC datasets for 79 countries and regions including 10 so-called Zero-COVID countries [32]. It was shown that one-year increment in the median age yields 12,000-18,000 increase in CC values and 52-83 increase in DC values. To confirm the same trends in 2022 and 2023, we will investigate correlations between median age A and the averaged daily numbers of cases and deaths per capita DCC and DDC, respectively.
The high numbers of circulating SARS-CoV-2 variants [33-35] and re-infected persons [36-38] raise questions about the effectiveness of vaccinations. In particular, many scientists are inclined to think that the pandemic will not be stopped only through vaccination [39]. The non-linear correlations show that CC and DC values increase with the growth of the vaccination level VC, while CFR decreases, [19]. In this study, we will investigate correlations between VC and BC values registered in 2022 and 2023 and DCC, DDC and CFR figures. We will try also to answer the question why the number of cases and death per capita are higher in more vaccinated countries.
Materials and Methods
We will use the accumulated numbers of laboratory-confirmed COVID-19 cases CCi and deaths DCi per million, accumulated numbers of tests per thousand TCi, accumulated numbers of fully vaccinated people VCi and boosters BCi per hundred for 33 countries (shown in Tables 1 and 2) and the world (i =1, 2,.., 34). The have chosen the countries with highest numbers of accumulated cases and death (according to the recent WHO reports [1]), some other countries and regions listed in COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU), [2] (version of file updated on September 28, 2023). Since Chinese statistics shows some contradictions (see, e.g., [40] or compare JHU files updated on September 28 and March 9, 2023), we have used only figures for Taiwan and Hong Kong. In particular, in Table 1 we show corresponding CCi and DCi values from the March-9-version of JHU file (not available on September 28). The CCi and DCi values for Taiwan and Hong Kong for 2023 were calculated with the use of [41-44]. We ignore data from Ukraine and Russia to exclude the influence of military operations on the COVID-19 statistics. Nevertheless, the aggregated figures for large regions, continents and the world include information from these two countries and China ( see lines 25-34 in Tables 1 and 2). To take into account the average age of population, we have used the information about the median ages Ai from [45, 46] (see Table 1).
To calculate the averaged daily numbers of cases DCC and deaths DDC per million in 2022 and 2023 we will use simple formulas: where T1 =365 and T2 =252. The CFR values corresponding to 2022 and 2023 can be calculated as follows: The total average CFRi* levels (during the entire period of the COVID-19 pandemic) can be calculated for every country and region: To estimate the average daily numbers of tests per thousand, we will use the formula: Durations of corresponding periods in days Ti(TC) are listed in Table 2. Unfortunately, the testing data is not available for many countries and regions.
We will use the linear regression to calculate the regression coefficients r and the optimal values of parameters a and b for corresponding best fitting straight lines, [47]: where explanatory variables x will be A, DTC, VC, and BC and dependent variables y will be DCC, DDC, CFR, DTC, VC, and BC.
We will use also the F-test for the null hypothesis that says that the proposed linear relationship (6) fits the data sets. The experimental values of the Fisher function can be calculated using the formula: where n is the number of observations (number of countries and regions taken for statistical analysis); m=2 is the number of parameters in the regression equation, [47]. The corresponding experimental values F have to be compared with the critical values FC (k1, k2 ) of the Fisher function at a desired significance or confidence level ( k1 = m −1, k2 = n − m, see, e.g., [48]). If F / FC (k1, k2 ) < 1, the correlation is not supported by the results of observations. The highest values of F / FC (k1, k2 ) correspond to the most reliable correlation.
We will use also non-linear regression: which can be reduced to the linear one with the use of new variables, [19]:
Results
The results of calculations with the use of formulas (1)-(5) are listed in Table 3 and shown in Figures 1-3 versus median age, testing level DTC, percentage of fully vaccinated persons (for 2022) and (for 2023), and numbers of boosters per hundred (for 2022) and (for 2023). The averaged daily numbers of cases decreased drastically in 2023 in comparison with corresponding values in 2022 (compare DCC (2) and DCC (1) figures in Table 3 or “triangles” and “circles” in Fig. 1; the only exception is Nigeria). The global average daily number of cases has diminished 7.4 times in 2023 (see the last row of Table 3). Turkey has stopped to show new cases in 2023. USA, China, Japan do not report any COVID-19 cases and related deaths since May, 15, 2023, [1].
In 2023 the averaged daily numbers of deaths significantly decreased in all countries an regions (compare DDC (2) and DDC (1) values in Table 3 or “triangles” and “circles” in Fig. 2), yielding 3.6 times decrease in global DDC figures (see the last row of Table 3). Seasonal global influenza mortality is between 294 and 518 thousand in the period from 2002 to 2011, [49]. After dividing the presented figures over the world population 8060.5 million [45] and 365 days, the corresponding averaged daily number of deaths per million DDC(infl) will range between 0.1 and 0.18. The global value of is comparable with the influenza mortality, but in 2023 in many countries (including the UK) the corresponding DDC (2) values are much higher than DDC(infl) (see Table 3).
The global case fatality risk in 2023 is approximately twice higher than in 2022 despite of the increase in percentages of fully vaccinated people and boosters (see the last rows of Tables 2 and 3). In 2023 the CFR values were lower only in India, South Korea, Mexico, South Africa, Nigeria, Vietnam and Africa (compare corresponding columns in Table 3). There are countries with drastic growth of the CFR values in 2023 in comparison with 2022 (for example, almost 7 times for the UK and Peru). In 2023 only Egypt, Peru and Iran have higher case fatality risks than one in the UK (see Table 3; the markers corresponding to the UK are placed inside red circles in Figs. 1-3).
There are countries with traditional high levels of CFR. To smooth temporarily fluctuations, the total average CFRi* values (during the entire period of the COVID-19 pandemic) were calculated with the use of eq. (4), listed in Table 3 and shown in Fig. 3 by blue “stars”. The value CFR11* corresponding to the UK is only slightly higher than the global one. Many countries (e.g., the US, India, Mexico, Peru, Iran, Indonesia, Canada, South Africa, Egypt, Nigeria) have higher CFRi* values.
To remove the influence of the seasonal factors in the UK, let us calculate the values of and for the period January 1, 2022 – May 19, 2022 and the same values and and for the period January 1, 2023 – May 19, 2023 with the use of eqs. (1)-(4) and JHU datasets: In 2023 we can see huge decrease in the number of cases and moderate diminishing in the number of death. As the result, the case fatality risk in the beginning of 2023 exceeded the 2022 figure around 11 times. The reason could be explained by the fact that testing and reporting most cases we stopped in UK since April 2022. The only people who get recorded as cases are those that are tested in hospital, and even there not everyone with respiratory infections gets tested. The death data may include all those where COVID-19 is listed on the death certificate, which might even include those that have not tested positive but where the doctors suspect COVID-19.
It must be noted, the growth of CFR in the UK occurred in the period of increasing the vaccination and booster levels, [2]:
To investigate possible correlations between DCC, DDC and CFR values and explanatory variables A, DTC, VC(1), VC(2), BC(1), and BC(2), the linear regression (6) and Fisher test were used. The results of calculations of optimal values of parameters a and b, correlation coefficients and experimental values of the Fisher function F (eq. (7)) are listed in Table 4. The F values were compared with the critical ones FC (1, n − 2) at the confidence level 0.01. The numbers of observations are different for different correlations due to the absence of data for some countries and regions.
Since many COVID-19 patients are asymptomatic [25-30], the high testing level (DTC or TC) could help to reveal more cases and COVID-19 related deaths. This trend was supported statistically (see rows 4 and 11 in Table 4 and black lines in Figs. 1 and 2). Nevertheless, the linear regression yields unacceptable non-zero values of parameter a, which mean that some cases and deaths could be revealed at zero testing level. To remove this discrepancy, the non-linear approach (8)-(9) was applied for the same countries listed in Table 3 (n=23).
Equations (10)-(12) represent the best fitting curves (see red lines in Fig. 1-3), correlation coefficients and experimental values of the Fisher function: The values of r2 and F for variables z and w are higher than for y and x (compare corresponding values in (10)-(12) and in rows 4, 11 and 18 of Table 4). The relationships (10) and (11) are supported at significance level 0.001 ( Fc (1, 21) = 14.6 ). The similar very strong correlation between the numbers of cases and tests per capita accumulated in European and African countries as of August 1, 2022 was found in [19]: Nevertheless, 16 European countries with the highest testing level (TC>3000) have demonstrated no correlation between CC and TC even at significance level 0.05, [19].
Only 5 countries and territories listed in Table 2 (Hong Kong, France, Italy, the UK, and Israel) had TC values higher than 3,000 in 2022. The DCC(1) values are rather high and vary from 436 to 1,241 in these countries (see Table 3). Nevertheless, many infectious persons were probably not detected. This is evidenced not only by the higher numbers of cases per capita in South Korea (DCC8(1)=1,502.9, Table 3) at lower testing level (compare corresponding DTC values in Table 3), but also by the results of total testing in some countries and institutions, which revealed many previously unregistered COVID-19 patients [24-26]. Taking the maximum of DCC8(j) values (corresponding to South Korea) as estimations real number of cases per capita in 2022 and 2023, we can calculate the visibility coefficients as the ratios of real and registered numbers of cases (similar relationship can be obtained using the accumulated numbers of cases per capita, [19]). For example, figures corresponding to the UK are ; Europe - , and Africa - .
An experimental estimation of the visibility coefficient can be obtained from the results of total testing in Slovakia (89.5% of population was tested on October 31-November 7, 2020 and a number of previously undetected cases, equal to about 1.63% of the population was revealed, [24, 25]). Since the number of detected cases in Slovakia was approximately 1% of population [2], we can estimate the visibility coefficient β ≈ 2.63 for that period. As of September 10, 2023 the ratio of CC values for South Korea and Slovakia (667,207.1/330,868.413, [2]) yields the visibility coefficient 2.02. The results of a random testing in two kindergartens and two schools in Chmelnytskii (Ukraine) revealed the value of visibility coefficient 3.9 in December 2020, [26].
The generalized SIR models and algorithms of their parameter identification [27, 29, 50] allowed theoretical estimating of the visibility coefficients. In particular, values from 3.7 to 20.4 were obtained for Ukraine [15, 27] and 5.4 for Qatar [28] in different periods of the COVID-19 pandemic. The lack of appropriate testing did not allowed detecting the first SARS-CoV-2 cases, which probably appeared long before December 2019 [51]. In particular, theoretical estimates give the date of the appearance of the first case at the beginning of August 2019, [50].
Dependence (12) can be accepted at significance level 0.05 ( Fc (1, 21) = 4.43 ; a similar equation can be obtained by dividing (11) over (10)) and shows that the case fatality risk increases with diminishing of the testing level even in period of the high interest in the SARS-CoV-2 infection (as it was in 2022). In 2023, when the people paid attention to severe cases only and maked tests correspondingly, CFR values can increase drastically. Therefore, one should probably not be afraid of a significant increase of the case fatality risk in the UK in 2023. Of much greater concern is the fact that COVID-19 mortality in this country ( , see Table 3) is still at least 4 times higher than the global value caused by seasonal flu, [49].
Equations (10) and (13) may give the illusion that the low number of cases per capita in Africa is due only to the low testing level typical for low-income countries (see [19]). The visibility coefficients and another characteristic - the ratio of the number of tests to the number of cases DTS-will allow us to understand the situation and draw the right conclusions. High DTS values mean that many persons surrounding the detected infectious patient (e.g., family members, colleagues, neighbors) were tested and isolated (this causes a decrease in the number of new infections, i.e. DCC). For example, very high tests per case ratios (DTS >100) in Hong Kong in 2020 and 2021 allowed controlling the COVID-19 epidemic completely [20] (the smoothed daily numbers of new cases per million did not exceed 20, [2]). After January 18, 2022, the daily numbers of new cases started to increase, but the daily numbers of tests remained almost constant yielding drastically diminishing of the daily tests per case ratio [20] and very high DCC values in February-March 2022, [2].
It follows from (10) that the averaged daily test per case ratio: increases for countries with low DCC figures (in particular, for African ones, see Table 3). The similar relationship follows from (13) for the accumulated characteristic TS=1000*TC/CC. For example, DTS values (calculated using the information available in Table 3) are equal to 26. 9 (the UK); 39.4 (India); 123.5 (Nigeria); 4.3 (South Korea); 1.95 (Japan) and demonstrate that the probability to miss an infectious person due to the lack of tests is much higher in Japan or South Korea than in Nigeria or India. During the severe pandemic wave in Japan in summer 2022, the daily numbers tests probably were not enough to confirm COVID-19 in patients with symptoms [29].
Therefore, the reason for the low number of registered cases per capita in Africa or in India should not be found in insufficient testing, but in large values of the visibility coefficients (14), which attribute to large numbers of asymptomatic infections. Since the severity of SARS-CoV-2 infection increases for older patients [16, 31] and almost half of the infected children can be asymptomatic [30], the regions with older population are expected to have much higher accumulated numbers of cases per capita, [31]. It was shown that one-year increment in the median age yields 12,000-18,000 increase in CC values [21]. Rows 1 and 5 in Table 4 and blue lines in Fig. 1 illustrate the same trend for DCC values. One-year increment in the median age increased DCC values by 39.8 in 2022 and by 5.8 in 2023.
The stronger correlations and same trends were obtained for the averaged daily numbers of deaths per capita DDC versus median age of population A (see rows 8 and 12 in Table 4 and blue lines in Fig. 2). One-year increment in the median age increases the DDC values by 0.0799 in 2022 and by 0.0263 in 2023. The characteristics calculated for large regions (EU, continents and the world) are very close to the best fitting blue lines (see large markers in Figs. 1 and 2). We can conclude that the young age of Africa (A27 =18, see Table 1) is the main reason of very low numbers of cases and death per capita registered on this continent.
Opposite and much weaker age trends we can see for the case fatality risks (lines 15 and 19 in Table 4). The decrease of CFR values with increase of the age (supported only by the 2022 dataset) looks unexpected (especially taking into account the fact that in 2020 younger populations had less clinical cases per capita [31]). Probably, the reasons are higher vaccination levels and better medical treatment in the reach countries with high median age.
The numbers of cases and deaths per capita increase with increasing the percentages of fully vaccinated people and boosters (see rows 2, 3, 6, 7, 9, 10, 13, and 14 in Table 4 and green and magenta best fitting lines in Figs. 1 and 2). Re-infections in vaccinated persons are common [52, 53], but a very clear uprising trend with increasing VC and BC values is unexpected despite the similar result for smoothed daily numbers of cases reported in [17] (JHU datasets with 7-days-smoothing corresponding to September 1, 2021 and February 1, 2023 were used for statistical analysis). Obtained trends could be a result of age influence; since the most vaccinated countries have higher Ai values (see Tables 1 and 2). We will discuss this correlation in the next Section. Another reason could be the introduction of special passports that removed restrictions for vaccinated persons. Many vaccinated people in countries with high VC and BC values started to visit crowded places, travel despite they can spread the infection. In many countries (in particular, in Ukraine) the vaccination procedure was associated with overcrowding in hospitals, which could contribute to the spread of the infection too.
As expected, the case fatality risks decrease with increasing the percentages of fully vaccinated people and boosters (see rows 16, 17, 20, and 21 in Table 4 and green and magenta best fitting lines in Fig. 3). Similar result was obtained in [17] with the use of JHU datasets for European and some other countries. In 2023, the decreasing trend was not supported by Fisher test. Probably, this is due to the more chaotic data. In particular, the different days correspond to and values listed in Table 2; no CFR value can be calculated for Turkey.
Discussion
The explanatory variables A, DTC, VC(1), VC(2), BC(1), and BC(2), used in our analysis can be also dependent on each other. We have used the linear regression (6) and Fisher test to find correlations between DTC, VC(1), VC(2), BC(1), and BC(2) values and explanatory variable A. The results of calculations are listed in Table 5 and displayed in
Fig. 4. There are strong correlations between VC(1), VC(2), BC(1), and BC(2) versus median age of population A (see rows 2-5 in Table 5; green and magenta best fitting lines in Fig. 4). The correlation between A and DTC is supported at the confidence level 0.05 (see the first row in Table 5 and the black best fitting line in Fig. 4). The growth of the median age leads to the increase of testing level and the percentages of vaccinations and boosters. These correlations can be a result of higher incomes in aged countries and more vaccinations and boosters in older people.
Now we can explain why the numbers of cases and deaths per capita can increase with increasing the percentages of fully vaccinated people and boosters (see rows 2, 3, 6, 7, 9, 10, 13, and 14 in Table 4 and green and magenta best fitting lines in Figs. 1 and 2)? Values VC(1), VC(2), BC(1), and BC(2) are not independent and definitely increase with the age. On the other hand, DCC and DDC values also increase with growth of Ai (see rows 1, 5, 8, and 12 in Table 4 and blue best fitting lines in Figs. 1 and 2). To remove the influence of age in correlations between vaccinations and DCC and DDC values, let us consider the “purified” variations of VC(1) and BC(1) (we limited ourselves only to 2022 with more reliable statistical data): To obtain the “purified” variations the percentages of vaccinations VPi and boosters BPi, we have excluded from variations VC(1) and BC(1) the values predicted by the by the best fitted lines listed in Table 5 (rows 2 and 4).
We have used the linear regression (6) and Fisher test to find correlations between DCC, DDC and CFR values and explanatory variables VP and BP. The results of calculations are listed in Table 5 (lines 6-11). No correlations were revealed at the confidence level 0.01. Thus, the vaccinations and booster themselves do not increase the numbers of cases and death per capita. As expected, the case fatality risks reduce for higher values of VP and BP (see lines 8, 11 in Table 5), but at lower confidence level than versus VC(1) and BC(1) (see lines 16 and 17 in Table 4).
Conclusions
The averaged daily numbers of cases DCC and death DDC per million, case fatality risks DDC/DCC were calculated for 34 countries and regions with the use of John Hopkins University (JHU) datasets for numbers per capita accumulated in 2022 and 2023. Linear and non-linear approaches were used to find correlations with the averaged daily numbers of tests per thousand DTC, median age of population A, and percentages of vaccinations VC and boosters BC.
One-year increment in the median age yielded 39.8 increase in DCC values and 0.0799 in DDC figures in 2022 (in 2023 these increments are 5.8 and 0.0263, respectively). Decreasing testing level DTC can drastically increase the case fatality risk. DCC and DDC values grow with increasing the percentages of fully vaccinated people and boosters. Since VC and BC values definitely increase with at higher A, the influence of age was removed and no correlations between corrected variations of VC and BC and DCC and DDC values were obtained.
The presented analysis demonstrates that age is a pivot factor in visible (registered) part of the COVID-19 pandemic dynamics. Much younger Africa has registered less numbers of cases and death per capita due to many unregistered asymptomatic patients. Of great concern is the fact that COVID-19 mortality in 2023 in the UK is still at least 4 times higher than the global value caused by seasonal flu.
Clarification point
No humans or human data was used during this study
Data availability
All data generated or analyzed during this study are included in this text.
Competing interests
The author declares no competing interests connected with this paper.
Acknowledgement
The study was supported by the Solidarity Satellite Programme of Isaac Newton Institute for Mathematical Sciences, Cambridge, UK. The author is grateful to Professor Robin Thompson, Professor Matt Keeling, and Oleksii Rodionov for their help in providing very useful information.