Retrospective Methodology to Estimate Daily Infections from Deaths (REMEDID) in COVID-19: the Spain case study

David García-García; María Isabel Vigo; Eva S. Fonfría; Zaida Herrador; Miriam Navarro; Cesar Bordehore

doi:10.1101/2020.06.22.20136960

Abstract

The number of new daily infections is one of the main parameters to understand the dynamics of an epidemic. During the COVID-19 pandemic in 2020, however, such information has been underestimated. Here, we propose a retrospective methodology to estimate daily infections from daily deaths, because those are usually more accurately documented. This methodology is applied to Spain and its 19 administrative regions. Our results showed that probable infections during the first wave were between 35 and 42 times more than those officially documented on 14 March, when the national government decreed a national lockdown and 9 times more than those documented by the updated version of the official data. The national lockdown had a strong effect on the growth rate of virus transmission, which began to decrease immediately. Finally, the first inferred infection in Spain is about 43 days before it was officially reported during the first wave. The current official data show delays of 15-30 days in the first infection relative to the inferred infections in 63% of the regions. In summary, we propose a methodology that allows reinterpretation of official daily infections, improving data accuracy in infection magnitude and dates because it assimilates valuable information from the National Seroprevalence Studies.

1. Introduction

The key parameter to understand and model the evolution of the COVID-19 pandemic is the number of daily infections. Despite its importance, however, it is difficult to estimate reliable data due to the bias in official figures¹. For symptomatic patients, there are delays of 5.79 days from infection to symptom onset and 5.82 days from symptom onset to diagnosis². In addition, the presence of asymptomatic people and those with mild symptoms hinder their detection. For example, 86% of infections were undetected in Wuhan, China, prior to 23 January 2020, when travel restrictions were imposed³. Additionally, not all patients with COVID-19 compatible symptoms are tested, especially at the beginning of the pandemic. So, it is widely assumed that the reported infections are just a fraction of the actual ones. In Spain, a National Seroprevalence Study based on ∼55,000 random participants estimated that only 10.7% (CI 95%: 10.1%-11.3%) of the actual infections had been detected during the first wave (before 22 June 2020) in Spain^{4, 5}.

Another problem is the inconsistency of daily infection time series and the variability of the case definitions. For example, on 13 January 2020, China changed the parameters to confirm new cases and ∼15,000 new infections were counted in a single day, while daily infections during the previous week were less than 3,500⁶. In Spain, the method of counting official numbers of infections has been modified several times since the pandemic was decreed. On 22 April 2020, a total of 213,024 cases were reported from positive PCR and antibody tests⁷; however, total cases decreased to 202,990 the next day and thereafter, because only positive PCRs were included in the total cases⁸. Accumulated cases on a given day are obtained by adding the new cases that day to the accumulated cases of the previous day. That has not always been the method, however, because some Spanish regions reviewed and modified the cases in previous days^{9, 10}. This inconsistency of time series hampers any kind of accurate analysis. Additionally, data are continuously being updated. In Spain, official daily infection data available during the first wave reported 1,832 new infected on 14 March 2020, but current official data (as downloaded in February 2021) reported 7,478 new cases for the same date.

High fidelity time series of each parameter of an epidemic are crucial to run reliable epidemiological models. The number of new infections per unit time is one of the essential parameters. The official data do not reflect the actual day of infection nor the real number of infections. To overcome this limitation, we propose a retrospective methodology to infer daily infections from daily deaths, believing the number of deaths to be a more reliable parameter than the official number of daily infections¹. These estimated infections are closer to the date of infection, avoiding delays from the incubation period and symptom onset to diagnosis. Then, the time series of daily infections would help to better understand the pandemic dynamics and to quantify the effectiveness of the different control strategies, avoiding the bias of the official data. In addition, those numbers can be used to feed epidemiological models with more realistic data. Finally, we will apply this methodology in Spain as a whole and each of its 19 administrative regions (17 autonomous communities and 2 autonomous cities).

2. Methods

2.1 REMEDID

We named this methodology REMEDID, REtrospective Methodology to Estimate Daily Infections from Deaths and it is described as follows. Date of infection (DI) of an individual may be estimated from the death date (DD) by subtracting the incubation period (IP) and illness onset to death (IOD) periods, hence Therefore, DI can be estimated from DD as far as IP and DD are known; however, neither IP nor IOD are fixed values. On the contrary, they are random variables that can be approximated by probability distributions. From dozens of cases in Wuhan, Linton et al.¹¹ approximated IP for COVID-19 with a lognormal distribution, X_IP, with a mean of 5.6 days and median of 5 days, and IOD with a lognormal distribution, X_IOD, with a mean of 14.5 days and median of 13.2 days. Therefore, the infection to death period is a random variable, X_IP+IOD, that follows the distribution X_IP + X_IOD.

Although the addition of two lognormal distributions does not follow any commonly used probability distribution, its probability density function (PDF) can be estimated convolving the PDF of the two variables¹². If g(t) and h(t) are the PDF of IP and IOD, respectively, then their convolution defines the PDF of X_IP+IOD, where t₀ is a positive real number representing days from infection. Then, f(t) is the PDF of the time from infection to death, with a mean of 20.1 days and a median of 18.8 days. Figure 1 shows g(t), h(t), and f(t), as well as a lognormal PDF with the same mean and median as f(t) for comparison. Note that the probability that X_IP+IOD is less than or equal to 33 days is 0.95.

Figure 1.

Probability density functions (PDF) of incubation period (IP, blue line) and illness onset to death period (IOD, red line) from Linton et al. (2020). Green line, f(t), is the convolution of both and represents the PDF of infection to death. Black dotted line is a lognormal PDF with the same mean and median as f(t).

Given a time series of deaths produced by the illness, x(t), we estimate the infection time series that produced such deaths, y(t). If we assume a case fatality ratio (CFR) of 100%, y(t) would represent all daily infections. To calculate y(t), we use the following likelihood-based estimation procedure.

Because the relative likelihood that a given infection will produce a death t days later is f(t), the infections at a given time t₀ can be estimated as In practice, x(t) is a discrete time series and could be written as x(n), where n is an integer representing entire days. Let F(n) be a discrete approximation to f(t) as follows: Then, A similar analysis was used previously to estimate a time-variable effective reproduction number for the severe acute respiratory syndrome (SARS) in 2003¹³. For a different CFR, the associated/estimated infections are estimated from y(n) in Equation 1 as Equations 1 and 2 constitute the kernel of the REMEDID algorithm. Validation of the capacity of REMEDID to reproduce the number of new daily infections is in Appendix A.

2.2. Data

2.2.a. Official numbers of COVID-19 infections and deaths

COVID-19 daily infections and deaths for Spain are reported by the Centro de Coordinación de Alertas y Emergencias Sanitarias (CCAES), which depends on the Ministry of Health, Social Services and Equality. Time series for the 19 regions of Spain can be downloaded from the national organization that compiles and publishes the data from all regions, the Instituto de Salud Carlos III (ISCIII; https://covid19.isciii.es/). For consistency, time series for Spain were estimated by adding the time series of the 19 regions. The final access to infections from official data was on 16 February 2021 (hereafter referred as IO₂₁). Those data have been greatly improved with respect to the daily situation reports published by CCAES during the first wave in the first semester of 2020 (hereafter referred as IO₂₀; https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/situacionActual.htm). For example, IO₂₀ cases were given the date of diagnosis, whereas IO₂₁ cases were given the date at symptoms onset. When the latter was not available, it was estimated as follows: infections prior to 10 May 2020 were given the diagnosis date minus 6 days, whereas since 11 May 2020 only 3 days were subtracted from the date of diagnosis. Asymptomatic cases were given the diagnosis date. These corrections brought reported cases closer to the real infection dates, although they were still delayed by the length of the incubation period. In addition, re-evaluation of the suspicious case of the first infectee on 20 February 2020 in IO₂₀ gave it the date of 1 January 2020 in IO₂₁. The first official death was reported on 13 February 2020. However, no deaths were reported until 3 March 2020 and afterwards deaths were continuously reported. Then, time spans we considered for IO₂₁ is from 1 January 2020 to 29 November 2020 and for official daily deaths from 3 March 2020 to 1 January 2021, specifically with 33 extra days as required to apply the REMEDID methodology.

When applying REMEDID the use of a reliable CFR is essential. Thus, to determine a realistic CFR in Spain, we used the total infections data from the longitudinal seroprevalence study conducted by ISCIII in 4 phases. The first three phases were completed from 27 April to 22 June and the results were published on 6 July 2020^{4, 5}. In this longitudinal cohort study, ∼68,000 people were randomly selected and ∼55,000 of them agreed to be interviewed and tested in the three phases. The study found that 5.2% (CI 95%: 4.9%-5.5%) of the participants presented SARS-Cov2 IgG antibodies.

Extrapolation of this result to the ∼47 million of the population in Spain yielded 2.45 million infections until 22 June, which, with 29,674 accumulated COVID-19 deaths on 22 June, would give a CFR of 1.21% (CI 95%: 1.15%-1.29%). We assumed this value in our analysis for Spain until 22 June. The fourth phase of the study was completed from 16 to 29 November 2020 and the results published on 15 December 2020¹⁴. In that study, ∼51,000 people agreed to participate. The study found that 9.9% (CI 95%: 9.4%-10.4%) of the participants presented SARS-Cov2 IgG antibodies at any of the four phases. Then, there were 4.75 millions of accumulated infections in Spain, meaning 2.30 million new infections from 23 June to 29 November. Because there were 47,113 accumulated COVID-19 deaths on 29 November (17,439 new deaths from 23 June), it would give a CFR of 0.76% (CI 95%: 0.71%-0.81%). Thus, we assumed different CFRs for the different periods, in order to smooth the discontinuity effects, we propose a CFR time series segmentation as follows: i) a constant value of 1.21% from 1 January to 22 June 2020, ii) a constant value of 0.76% from 30 June to 29 November 2020, and iii) a linear function from 23 June to 28 June that would correspond to the interpolation of the CFR values for 22 June and 29 June. The CFRs of each region were estimated analogously (summarized in Tables 1 and 2). Prior to 22 June, Asturias, Castilla – La Mancha, Castilla y León, Extremadura, País Vasco, and La Rioja had CFRs greater than 1.5%. In contrast, Andalucía, Canarias, Melilla, and Murcia had CFRs less than 0.7%. After 22 June, the CFR for Spain diminished in 0.45%, probably due to improved medical treatments and experience gained during the first wave. In this period, the number of regions with a CFR below 0.7% increased to 10 and those with a CFR greater than 1.5% decreased to 3. Most of the regions revealed certain improvement, particularly remarkable was La Rioja, with a CFR decrease from 3.13% to 1.12%.

2.2.b. Excess of expected deaths from MoMo

Expected deaths from any cause, with an error band representing a 99% confidence interval, are modelled by the Mortality Monitoring (MoMo) surveillance system for 22 countries in Europe (www.euromomo.eu). In Spain, MoMo depends on the ISCIII¹⁵ and data are available at https://momo.isciii.es/public/momo/dashboard/momo_dashboard.html#nacional. MoMo also collects daily deaths from any cause from the Spanish Instituto Nacional de Estadística (INE) and the notaries and civil registries. Daily deaths within the error estimates of the expected deaths are considered usual. Any deviation allows identification of unusual mortality patterns, as for the COVID-19 pandemic. The useful variable here is the excess deaths (ED), estimated as the difference between actual and expected deaths. The error band for ED can be estimated from the error band of the expected deaths by subtracting the latter from the actual deaths. Note that ED can show negative values (Figure 2). By cautiously assuming that all significant ED is caused by COVID-19, ED can be used to estimate non-reported COVID-19 deaths. In this case, negative values for ED make no sense and are set to zero. The MoMo time series was downloaded on 16 February 2021 and we considered the period from 3 March 2020 (first non-isolated official COVID-19 death) to 1 January 2021.

Given that MoMo ED in Spain was 45,119 (CI 99%: 37,593-55,098) deaths from 3 March to 22 June, and 23,612 (CI 99%: 11,356-39,690) from 23 June to 29 November, the CFR for the country estimated similarly as in Section 2.1 was 1.85% (CI 95%: 1.74%-1.96%) prior to 22 June, and 1.02% (CI 95%: 0.97%-1.09%) after 23 June. For each region, the CFRs were estimated similarly and are provided in Tables 1 and 2. In all cases, we used a piece wise CFR time series segmented as described in Section 2.2a.

3. Results

3.1. COVID-19 vs. MoMo

The COVID-19 official deaths and MoMo ED time series overlaped for the period from 3 March 2020 to 1 January 2021 for Spain and its 19 regions (Figure 2). In general, there was good agreement between both datasets, meaning that most of MoMo ED were related to COVID-19 deaths. During the first wave, the most important differences were observed in Spain, Madrid, Cataluña, Castilla-La Mancha, and Castilla y León. Before 22 June in Spain, MoMo ED showed 15,445 accumulated deaths more than the official COVID-19 deaths, which is beyond the error band. That difference comes basically from the four regions with the largest numbers of deaths (Madrid, Cataluña, Castilla-La Mancha, and Castilla y León). Table 1 shows the accumulated values before 22 June, which were used to estimate the CFR for Spain and its 19 regions according to the third phase of the National Seroprevalence Study^{4, 5}. For all regions, the CFR estimated from MoMo ED was larger than the CFR estimated from COVID-19 deaths. In particular, Asturias, Canarias, and Murcia were twice as large. Ceuta and Melilla dramatically increased their CFR from MoMo ED, although that may be biased due to their small populations and numbers of deaths.

Similarly, the same variables for the period from 23 June to 29 November 2020 are reported in Table 2. In Spain, MoMo ED showed 6,173 accumulated deaths more than the official COVID-19 deaths. This difference is a third of the difference observed prior to 22 June; because this is within the error band, there was a significant improvement in the detection of COVID-19 deaths in this period. Figure 2 also shows a general agreement between MoMo ED and official COVID-19 deaths time series after the first wave, with the exception of late July and early August. These differences were due to two heat waves that were responsible for at least 25% of the MoMo ED¹⁶.

View this table:

Table 1.

Accumulated COVID-19 deaths and MoMo Excess Deaths (ED) for Spain and its 19 regions. Values from 3 March to 22 June 2020. Population data from INE. *: differences beyond the error estimation.

View this table:

Table 2.

Accumulated COVID-19 deaths and MoMo Excess Deaths (ED) for Spain and its 19 regions. Values from 23 June to 29 November 2020. The percentages for this period were estimated as differences between the accumulated percentages on 29 November and on 22 June.

Figure 2.

COVID-19 deaths and MoMo Excess Deaths (ED). a) Spain; b-t) Spanish regions. Red lines are official COVID-19 deaths, solid black lines are MoMo deaths, and thin black lines delimit MoMo death error bands (99% confidence interval).

3.2. Infections estimated from COVID-19 deaths

To illustrate the delay between official daily infections data and REMEDID estimated daily infections, we applied REMEDID from COVID-19 deaths assuming CFR= 100%. Figure 3 shows the current IO₂₁ and the infections associated with COVID-19 deaths for the first wave. The latter in Spain reached a maximum on 13 March 2020 (Table 3), the day before the national government decreed a state of emergency and national lockdown. Thus, the adopted measures had an immediate effect, which was observed in the official data IO₂₁ 7 days later (20 March). This delay is similar to the incubation period (mean 5.78 days²), which could be explained because official infections were reported when symptoms appeared. This delay reached 16 days when we compared with earlier version IO₂₀ (not shown), which highlights the usefulness of the methodology to reinterpret official data from very early stages of the pandemic. On the other hand, the maximum number of deaths was reached on 1 April, which was 19 days after the inferred infection maximum, bringing this delay close to the 20 days expected between infection and death (Figures 1 and 3).

Figure 3.

Official COVID-19 infections and deaths, and estimated infections with case fatality ratio (CFR) of 100% in Spain during the first wave. Left y-axis: COVID-19 daily infections IO₂₁ (blue curve). Right y-axis: COVID-19 deaths (orange curve) and its REMEDID-estimated infections with CFR=100% (red curve). All curves are for Spain. Thin blue and orange curves are daily data, and thick curves are smoothed by 14- day running mean. Arrows show delays between the maximum of inferred infections and maxima from COVID-19 deaths (orange arrows) and COVID-19 infections (blue arrows). Solid arrows are expected delays, dotted while arrows are observed delays.

We applied REMEDID to the official COVID-19 death with the corresponding estimated CFRs (see section 2.2) to obtain the time series of estimated daily infections, hereafter referred to as IR_O. Figure 4 shows IR_O and the accumulated infections for Spain and its 19 regions. Note that in Spain, IR_O are amplified versions of inferred infections in Figure 3. In Spain, the first infection, according to IR_O, is on 8 January 2020 (Table 3), 43 days before the first infection was officially reported on 20 February 2020 according to IO₂₀. By contrast, IO₂₁ places the first infection on 1 January 2020. Spain reached the maximum number of IR_O on 13 March, a day before the state of emergency and lockdown were enforced (Table 4). On 14 March, IO₂₀ = 1,832, and IO₂₁ = 7,478; however, IR_O = 63,727 (CI 95%: 60,050-67,403), 35 and 9 times IO₂₀ and IO₂₁, respectively (Table 5). This implies that on that day, IO₂₀ and IO₂₁ only reported 2.9% (CI 95%: 2.7%-3.1%) and 11.7% (CI 95%: 11.1%-12.5%) of new infections, respectively. Although detection of infections clearly improved from IO₂₀ to IO₂₁, almost 90% of the infections are still not documented in the peak of the first wave. The situation is similar for the accumulated infections before 22 June 2020, as reported by the National Seroprevalence Study^{4, 5}.

View this table:

Table 4.

Date of the most prominent relative maxima, for Spain and the 19 regions, of the REMEDID estimated daily infections from COVID-19 deaths (IR_O) and from MoMo Excess Deaths (ED) (IR_M), and official COVID-19 daily infections (IO₂₁). Maxima of IO₂₁ were estimated from the 14-day running mean time series.

View this table:

Table 5.

REMEDID estimated daily infections from COVID-19 deaths (IR_O) and from MoMo Excess Deaths (ED) (IR_M) on 14 March, and for official COVID-19 daily infections released on June 2020 (IO₂₀) and released on February 2021 (IO₂₁). Infections are inferred from COVID-19 deaths and MoMo ED with their corresponding CFR and 95% confidence interval detailed in Table 1. Percentages are official COVID-19 infections as proportions of the estimated infections.

In almost all regions, IO₂₀ showed a delay of 1 month or more between the first infection and IR_O (Table 3). No delay in IO₂₁ occurred in Islas Baleares, Castilla-León, and Galicia, while in three regions (Cataluña, Madrid, and La Rioja), the first case occurred earlier than the first case of IR_O. However, 6 regions had delays of 15 days and other 6 regions had delays of 1 month. According to IR_O, all regions except Ceuta and Melilla had some infections in January, but in IO₂₁ only 6 regions had infections in that month. In all scenarios, the first infections were in Madrid and Cataluña.

During the first wave, according to IR_O most of the regions had maximum daily infections around 14 March. In Madrid, the maximum was reached on 11 March, coinciding with the educational centres closing and an official warning by the regional government (Table 4). Asturias was the last region to reach peak infections (25-26 March). The maximum percentage of documented cases (12.6%,CI 95%: 9.2%-18.4%) occurred in Asturias on 14 March, but in the other regions, only between 1.2 to 8 % of the infections were documented (Table 5).

Figure 4 shows how the IO₂₁ and IR_O curves of Spain and the 19 different regions fluctuated following the same pattern until the middle of June 2020, but thereafter, they showed different patterns. This reflects the fact that the Spanish government had decreed the control measures for the whole nation until June, but thereafter, each regional government implemented its own control measures. For example, some regions (e.g., Aragón, Islas Baleares, Cantabria, Comunidad Valenciana, Extremadura, Galicia, Murcia, País Vasco, and La Rioja) had two peaks, but others had only one. An apparent maximum on 22 June in Islas Baleares is an artifact produced by the interpolation for transition from the two CFRs. Although beyond the scope of this work, it would be very interesting to investigate the effects of the different control measures implemented on the corresponding IR_O for the 19 regions.

The Spanish COVID-19 second wave reached a maximum of daily infections on 22 October from IR_O and on 26 October from IO₂₁. The delay of 4 days is similar to the mean incubation period (5.78 days²). The estimated number of new infections is still larger than the documented cases, but the shapes of the two curves are more similar in the second wave than in the first wave (Figure 1a). The same is true for the 19 regions, most of which had the largest peak around 22-26 October, with the exceptions of Canarias and Madrid, which reached maxima in late August and early September, respectively.

View this table:

Table 3.

Date of first infection for REMEDID estimated daily infections from COVID- 19 deaths (IR_O) and from MoMo Excess Deaths (ED) (IR_M), and for official COVID-19 daily infections released on June 2020 (IO₂₀) and on February 2021 (IO₂₁). All dates refer to year 2020.

Figure 4.

Daily and accumulated infections for official COVID-19 daily infections (IO₂₁), and daily infections estimated from COVID-19 deaths (IR_O). Lines are daily infections and refer to the y-axis on the right; bars are accumulated infections and refer to the y-axis on the left. Red lines and cyan bars are official COVID-19 data; orange lines and blue bars are inferred infections with case fatality ratio (CFR) in Table 1. Thin orange lines correspond to the CFR confidence interval.

3.3. Infections from MoMo Excess Deaths

Assuming that MoMo ED accounts for both recorded and non-recorded COVID-19 deaths, negative deaths are meaningless, and they were set to zero. Then, the associated daily infections can be estimated, as in Section 3.2, with a CFR of 100% from MoMo ED for Spain (Figure 5). Note two main differences between this time series and that estimated from official COVID-19 deaths: (1) MoMo data present an error band that was inherited by the estimated infections; (2) MoMo ED estimated infections reached a maximum of 1,443 (CI 99%: 1,329-1,547), doubling the 776 inferred daily infections from official COVID-19 deaths in Figure 3. This is because maximum MoMo ED was 1,584 (CI 99%: 1,468-1,686%) and maximum COVID-19 official deaths was 828, both estimated from the 14-day running mean time series. The maximum of inferred infections was reached on 13 March, just one day prior to the state of emergency and lockdown. The expected and observed delays with respect to official infections and MoMo ED were similar to those observed for estimated infections from official COVID-19 deaths. Error bounds of the estimated infections in Figure 5 were computed from the MoMo ED error bounds. However, it should be highlighted that the combination of the error bounds from MoMo ED and the estimated CFRs might lead to unrealistic error estimates. To avoid this, the error estimates in Figure 6 were estimated from the MoMo ED time series (no error bounds) and the error bounds of the estimated CFRs.

Figure 5.

Official COVID-19 infections, MoMo Excess Deaths (ED), and estimated infections with case fatality ratio (CFR) of 100% in Spain during the first wave. Left y- axis: COVID-19 daily infections IO₂₁ (blue curve). Right y-axis: MoMo ED (orange curve) and its REMEDID-estimated infections with CFR=100% (red curve). All curves are for Spain. Thin blue and orange curves are daily data, and thick curves are smoothed by 14-days running mean. Dashed curves represent the error estimate of MoMo ED (orange) and inferred infections (red). Arrows show delays between the maximum of inferred infections and maxima from MoMo ED (orange arrows) and COVID-19 infections (blue arrows). Solid arrows are expected delays, dotted while arrows are observed delays.

The REMEDID was applied to the MoMo ED with the corresponding CFRs (see section 2.2) to obtain the estimated daily infections, which will be referred hereafter as IR_M. The IR_M were calculated for Spain and its 19 regions and are depicted in Figure 6, as well as the accumulated IR_M. In Spain, the first infection shown by IR_M happened on 9 January, with an error estimate from 9 to 10 January, 41 to 42 days before the first documented infection of IO₂₀ on 20 February 2020 (Table 3). The maximum IR_M was 77,855 (CI 95%: 73,364 - 82,347) reached on 13 March. On 14 March, IR_M showed 14,128 infections more than IR_O (Table 5). Notice that the CFR used with MoMo ED data was larger than the one used with official COVID-19 deaths data, which makes this difference even more remarkable, because the larger the CFR the lower the estimated infections. Therefore, if the true CFRs, which are unknown, were used in both cases, IR_M would double IR_O on 14 March, as happened when a CFR of 100% was used (Figures 3 and 5). Notice that with the CFRs used, the IR_M and IR_O resulted in the same accumulated infections on 22 June and 29 November, matching the results of the seroprevalence study. Nevertheless, IR_M showed 42 times more cases than IO₂₀ and 10 times more than IO₂₁ on 14 March, detection of official cases of only 2.4% (2.2%-2.5%) and 9.6% (9.1%-10.2%), respectively.

Table 3 shows the estimated date of first infection for Spain and by region. Note that the first cases of IR_M in Spain were on 9 January and in Aragón, Canarias, and Navarra on 8 January, which is possible because significant excess deaths in a region may not become significant for the whole country. In general, the maxima of daily infections were closer to those on 14 March in IR_M than in IR_O. During the first wave, all regions showed a single maximum, except for Ceuta, Melilla, and Murcia, which showed two maxima (Figure 6). In general, the IR_M time series in all regions were similar during that period. The official data clearly under-detected infections during the first wave. On 14 March, IR_M were comparable to IR_O, overlapping CI in all regions, but not in Spain as a whole (Table 5). During the second wave, there was improved detection of cases with differences among regions.

Figure 6.

Daily and accumulated infections for official COVID-19 daily infections (IO₂₁), and daily infections estimated from MoMo Excess Deaths (ED) (IR_M). Lines are daily infections and refer to the y-axis on the right; and bars are accumulated infections and refer to the y-axis on the left. Red lines and cyan bars are for official COVID-19 data; and orange lines and blue bars are for inferred infections with case fatality ratio (CFR) in Table 1. Thin orange lines represent the error estimate of inferred infections.

4. Discussion

Infection dynamics is key to understand and model the COVID-19 pandemic. Nevertheless, documented infections in Spain (and nearly worldwide) are of limited quality qualitatively and quantitatively. Specifically, reported infections were underestimated and delayed 16 days compared to the estimated date of infection during the first months of the pandemic (according to IO₂₀) and 6 days in the current version of the data (IO₂₁). According to the National Seroprevalence Study, only 10.7% of infections were documented in Spain prior to 22 June 2020^{4, 5} and 64.3% from 23 June to 29 November 2020¹⁴. The huge underreporting during the first wave is mainly due to: (1) the lack of testing for asymptomatic and mildly symptomatic people³, which could have been detected with a “test, track and trace” strategy, as done in South Korea, China, and Singapore¹⁷; (2) deaths outside of the hospitals, with many cases from nursing homes for the elderly, where 6,664 deaths with COVID-19 and 7,082 with similar symptoms, but not officially diagnosed with COVID-19, occurred from January to May 2020¹⁸. This poor quality of data hinders any options to study the infection dynamics based on official data.

To overcome these difficulties, we propose the REMEDID algorithm, which allows calculation of time series for daily infections from daily death time series, a more reliable source, which can be applied in early stages of any pandemic with only 33 days delay. The REMEDID algorithm needs only three inputs:

(1) PDF of the period from date of infection to date of death. In this study, that was calculated by combining the PDF of the Incubation Period and the Illness Onset to Death period, as estimated by Linton et al.11 for cases in Wuhan, China. The incubation period can be assumed to be similar in China and elsewhere, because it is a characteristic of the virus, but the illness onset to death period could be affected by the type of health care and, thus, by each national health system2 or even ethnicity19. No such information is available for Spain to date, but it would be desirable to use a local distribution of Illness Onset to Death when available for Spain. The calculations could be redone easily by applying the MATLAB code available in the Supplementary Material.
(2) Time series of daily deaths. We used two sources, the official COVID-19 deaths and the MoMo ED, because official COVID-19 deaths have been underestimated, especially during the first wave in the regions of Castilla–La Mancha, Castilla y León, Cataluña, and Madrid, because non-hospital deaths and cases during the early pandemic were somewhat undertested (Table 1 and Figure 2). We expect that undocumented COVID- 19 deaths appeared in the excess of expected deaths from any cause calculated from MoMo15. Interpretation of those results must be made carefully because ED is a statistical variable calculated from what happened in previous years and some expected deaths, such as those produced by traffic and work accidents, may have been reduced due to the national lockdown that began on 14 March 2020 and gradually has been eased. Nonetheless, expected fatalities were very low (traffic and work accidents less than 5 deaths daily) compared to the daily death toll from COVID-19 (300 deaths per day average from March to May). Some non-COVID-19 deaths may have been influenced by other pandemic-related factors22, such as breakdown in medical follow-up (especially chronic conditions and among older patients), delays in attending to the healthcare appointments and/or being attended due to the collapse of the Health System, transplant delay and reduction in donors’ numbers from 7.2 to 1.2 per day23, and the decrease of the interventional cardiology activity between 40% and 81%24. Thus, MoMo ED do not exactly correspond to COVID-19 deaths and must be regarded with caution. A detailed study of the changes in other sources of mortality is recommended. Nevertheless, MoMo ED is a valuable source of data that must be explored considering the enormous bias in official data of daily infections.
(3) CFR estimate. CFR was defined as the ratio of deaths among total infections. Thus, the lower the documented infections, the larger the CFR and under-detection of infections leads to overestimation of the CFR. For example, in Spain on 22 June, official data showed 261,111 accumulated infections and 29,676 deaths, a CFR of 11.37%, which was clearly overestimated because the number of infections was underestimated. The National Seroprevalence Study in Spain with random testing showed that 5.2% of the population (∼47 million) had already been infected4,5. Because CFR also depends on death records, we calculated a CFR of 1.21% (CI 95%: 1.15%- 1.29%) from official COVID-19 deaths and a CFR of 1.85% (CI 95%: 1.74%-1.96%) from MoMo ED. However, the CFR is not constant in time nor in space. Different regions showed different CFRs and the fourth phase of the seroprevalence study showed a general decrease in the CFRs for the period 23 June – 29 November. In Spain, the CFR decreased from 1.21% to 0.76% (CI 95%: 0.71%-0.81%) from official COVID-19 deaths and from 1.85% to 1.02% (CI 95%: 0.97%-1.09%) from MoMo ED. This decline is probably due to the improvement of COVID-19 treatments and experience gained since the beginning of the pandemic25. Similar seroprevalence studies are desirable in other countries, because CFRs from official data are so variable. For example, on 22 June 2020, CFRs of 0.6, 5.2, and 14.5% were reported for Iceland, the world, and Italy, respectively26. In addition, we assumed a constant CFR prior to 22 June and from 23 June to 29 November 2020, although introducing time dependence on the CFR would enhance the infected estimates. A constant CFR does not consider who is affected by saturation of the health systems due to limited human and material resources. However, to estimate a time-variable CFR requires accurate infection records among other variables, which are not currently available. Moreover, CFRs among severe cases (mostly detected) are much greater than those among asymptomatic and mild cases (mostly undetected). Consequently, the CFR calculation has a high degree of uncertainty. Nevertheless, because information on the CFRs is expected to improve, calculations in this study can be updated easily using the MATLAB code for REMEDID algorithm provided as Supplementary Material.

We ran the REMEDID algorithm to provide the estimated time series of infections for Spain and for its 19 regions for the period from 8 January to 29 November 2020. These time series provided valuable information to understand the time evolution of the pandemic. Our main findings for Spain are:

(1) On 14 March, when the national government decreed the state of emergency and national lockdown, estimated infections were between 35 and 42 times larger than officially documented ones at that time, and 9-10 times more than current official data.
(2) The national lockdown had a strong effect on the transmission of the virus, as shown by a rapid slope decrease around day 13 March.
(3) The first infection was better determined from inferred infections than from official records during the first months of the pandemic.

The REMEDID algorithm has strengths and limitations. First, it uses the number of deaths, which are more accurately recorded than infections. Second, it allows elucidation of the date of the infections estimated during the seroprevalence studies, which only determines the total number of infections. Thus, the REMEDID algorithm complements the seroprevalence studies. Third, the estimated daily infections place the probable date of infection more accurately than the official numbers. Note that official infections are delayed by the incubation period (from infection to illness onset).

Therefore, although the maxima of infections theoretically should have coincided with the state of emergency and the national lockdown on 14 March, official infections in Spain were maximum 6 days later. Infections estimated from death data do not show such delay, however. Determination of the actual day of infection is very important for evaluation of the immediate effect of the implemented countermeasures. In Madrid the estimated infections reached their maximum on 11 March when regional government warned the population to stay at home and schools and universities were closed, forcing 1.2 million students to stay at home. Moreover, overall recommendations on disease control and social distancing were given by the Ministry of Health to the general public since de beginning of March (see https://www.mscbs.gob.es/en/profesionales/saludPublica/ccayes/alertasActual/nCov-China/ciudadania.htm).

Four, the REMEDID algorithm can be applied to any region, regardless of population size, but being cautious with small populations. For example, in this study we applied REMEDID methodology to a small city (Ceuta with 84,777 inhabitants), a medium size region (Madrid with more than 6.6 million inhabitants), and a country (Spain with 47 million inhabitants). This versatility allows study of different epidemic dynamics as reflected in the variety of shapes and slopes in the curves depicting death and infection records, that show the heterogeneity of the epidemic at each region with different population size and density, social behaviour and transmission pattern, especially during the second wave, showing unique dynamics that must be addressed individually. Knowledge of the real epidemic dynamics of different population nodes is key to succeed in modelling attempts, because we could calculate the sole effect of group size²⁷. This spatially-explicit information, in combination with population size per node and mobility would allow us to use a metapopulation approach in future models^28–30.

Among the methodology limitations, the most important is that it can only be implemented retrospectively. Thus, these estimates cannot be used to control the pandemic in real time. But considering that different regions or countries are at different stages at the same time, results of this methodology for the first communities could be applied elsewhere. Furthermore, it is useful to enhance models and improve our knowledge of the pandemic dynamics, including the effectiveness of the different measures adopted to flatten the curve and design safe post-lockdown measures. In addition, more realistic and accurate models could be obtained by use of the more realistic daily number of infections provided by REMEDID, which, in turn, would improve the models outcome and enhance comparisons of different lockdown and post- lockdown measures³¹.

We believe that REMEDID methodology applied to daily COVID-19 deaths (if accurately reported) or to MoMo ED could be useful to analyse the dynamics of the pandemic retrospectively and more accurately quantify the real daily infections with respect to the official numbers. We only need the CFR and, for greater precision, the PDF of Infections to Death or, alternatively, the PDF of Incubation Period and Illness Onset to Death. This approach could be implemented anywhere, improving our knowledge and understanding of the dynamics of the pandemic and the effectiveness of the confinement measures. This is of high importance to develop successful strategies to face and reduce the effects of future epidemic episodes.

Data Availability

All data used is available in the manuscript or in the references. The MATLAB code is available in the Supplementary Material.

Appendix A. Validation test

In order to validate REMEDID, it would be desirable to have an accurate time series of daily infections and deaths. Then, the REMEDID infections inferred from deaths could be compared to the recorded infections. However, such datasets are not available because many infections in any disease or plague are usually undetected. On the other hand, REMEDID-estimated infections cannot be compared to those of dynamical models, which also need accurate data to be tuned. In fact, the main motivation for developing REMEDID has been to provide (more) reliable data to feed dynamical models.

As an alternative of validation, we propose the following experiment as a proof of concept. In the official COVID-19 time series, IO₂₁, infections are given the date of symptoms onset. Then, REMEDID applied to official deaths with the PDF of illness onset to death, h(t), instead of the PDF of infection to death, f(t), should produce a time series comparable to IO₂₁. Official accumulated cases and deaths on 22 June 2020 were 261,111 and 29,676, respectively, which gives a CFR of 11.37%. This CFR is overestimated due to an under detection of cases in that period. On the other hand, official accumulated cases and deaths from 23 June to 29 November 2020 were 1,482,293 and 17,439, which gives a CFR of 1.18%. Although closer to reality, this value is still overestimated. In any case, if we use these CFR values to construct the segmented timeseries of CFRs, as in section 2.2, when applying REMEDID the resulting estimated daily infections should be comparable to IO₂₁. The results are shown in Figure A.1 where the REMEDID estimated daily infections is in very good agreement with the smoothed IO₂₁ (14-day running mean), with a correlation between the two series of 0.98 % (p-value<0.001). Because smoothed time series present a high degree of serial correlation, the correlation and the significance level was estimated according to a specific Monte Carlo analysis based on the randomization of phases in the frequency domain³². Notice that (i) application of the 14-day running mean to the IO₂₁ series removes the weekly variability due to the weekend infections misreport; (ii) the agreement of the two series would improve with an improved estimate of the CFR time series. Thus, the good match between the estimated and the original daily infections timeseries proves the theoretical concept that supports REMEDID.

Figure A.1.

REMEDID validation test. Thin blue line is official infections and thick blue line is its 14-day-running mean; green line is official deaths; and red line is REMEDID applied with the PDF of illness onset to death, h(t).

Note the differences with the results presented in Section 3.2, where the PDF includes the incubation period and a more realistic CFR, as estimated from the National Seroprevalence Study data.

Footnotes

The most relevant changes in the manuscript are: (1) As suggested by Reviewer 1, (Lewin 1986; Perez, Monsalve, and Marquez 2015) we have updated our analysis. In the original version of the manuscript, time series of inferred infection ended on 28 April 2020, when the first phase of the National Seroprevalence Study was concluded. Now, we have updated them to 29 November 2020, when the last phase of the same study was finished. That is an extension from 4 to 11 months. REMEDID depends on seroprevalence studies since they are needed to estimate the Case Fatality Ratio (CFR). The extension of time series had a great impact on the results of the study, since new information related to the second wave arose and had to be analyzed. That is the reason for so much changes in the data and result sections. However, the methodology and the discussion are quite similar to the original version. (2) As suggested by the Editorial Board Member, we have included a validation test for REMEDID as a new Appendix. Validation of REMEDID results presented in the manuscript is quite complicated due to the lack of accurate data to compare with. Neither dynamical models could be used since they also need accurate data to be tuned. Thus, we propose a proof of concept based in a modification of REMEDID to make it comparable with available official data. The results are positive.

References

1.↵
Holmdahl, S. M. I. & Buckee, D. P. C. Wrong but Useful — What Covid-19 epidemiologic models can and cannot tell us. N Eng J Med. DOI: 10.1056/NEJMp2016822 (2020)
OpenUrl CrossRef PubMed
2.↵
Fonfría, E.S. et al. Essential epidemiological parameters of COVID-19 for clinical and mathematical modeling purposes: a rapid review and meta-analysis. medRxiv 2020.06.17.20133587; doi: https://doi.org/10.1101/2020.06.17.20133587 (2020)
3.↵
Li, R. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368, Issue 6490, 489–493, DOI: 10.1126/science.abb3221 (2020).
OpenUrl Abstract/FREE Full Text
4.↵
Instituto de Salud Carlos III. Estudio ENE-COVID19: Estudio Nacional de sero- Epidemiología de la infección por SARS-COV-2 en España. Informe final, 6 July 2020. Available online at https://www.mscbs.gob.es/ciudadanos/ene-covid/docs/ESTUDIO_ENE-COVID19_INFORME_FINAL.pdf
5.↵
Pollán, M., et al. Prevalence of SARS-CoV-2 in Spain (ENE-COVID): a nationwide, population-based seroepidemiological study. The Lancet 396, Issue 10250, 535–544, https://doi.org/10.1016/S0140-6736(20)31483-5 (2020).
OpenUrl
6.↵
World Health Organization. Coronavirus disease 2019 (COVID-19) Situation Report – 24, published online on February 2020, https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200213-sitrep-24-covid-19.pdf?sfvrsn=9a7406a4_4 (WHO, 2020)
7.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 84. Enfermedad por el coronavirus (COVID-19), published online on 23 April 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_84_COVID-19.pdf (CCAES, 2020a).
8.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 85. Enfermedad por el coronavirus (COVID-19), published online on 24 April 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_85_COVID-19.pdf (CCAES, 2020b).
9.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 111. Enfermedad por el coronavirus (COVID-19), published online on 20 May 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_111_COVID-19.pdf (CCAES, 2020c).
10.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 112. Enfermedad por el coronavirus (COVID-19), published online on 21 May 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_112_COVID-19.pdf (CCAES, 2020d).
11.↵
Linton et al. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. Journal of Clinical Medicine 9 (2), 538, DOI: 10.3390/jcm9020538 (2020).
OpenUrl CrossRef PubMed
12.↵
Uspensky, J. V. Introduction to mathematical probability. Ed. McGraw-Hill Book Company, New York and London (1937).
13.↵
Wallinga, J. & Teunis, P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology 160, 509–516 (2004).
OpenUrl CrossRef PubMed Web of Science
14.↵
Instituto de Salud Carlos III. Estudio ENE-COVID19: Cuarta ronda. Estudio Nacional de sero-Epidemiología de la infección por SARS-COV-2 en España. Cuarta ronda, 6 July 2020. Available online at https://portalcne.isciii.es/enecovid19/informes/informe_cuarta_ronda.pdf
15.↵
León-Gómez, I. et al. Exceso de mortalidad relacionado con la gripe en España en el invierno de 2012, Gac Sanit 29, 258–65 (2015).
OpenUrl CrossRef PubMed
16.↵
YY. Gil Bellosta, C. J., Luz Frías, Ludovica Verrieri, Concha Delgado, Inmaculada León y Amparo Larrauri. Informe MO Calor, Estimaciones de la mortalidad atribuible a excesos de temperatura en España, 1 de junio a 15 de septiembre de 2020. Centro Nacional de Epidemiología. Ciber de Epidemiología y Salud Pública (CIBERESP). Instituto de Salud Carlos III. Octubre 2020. https://www.isciii.es/QueHacemos/Servicios/VigilanciaSaludPublicaRENAVE/EnfermedadesTransmisibles/MoMo/Paginas/MOMOcalor.aspx [Last access: 16-02-2021]
17.↵
Fang Yong, S. E., et al., Connecting clusters of COVID-19: an epidemiological and serological investigation. The Lancet, published online April 21, 2020 https://doi.org/10.1016/S1473-3099(20)30273-5 (2020).
18.↵
Defunciones según la Causa de Muerte - Avance enero-mayo de 2019 y de 2020. Notas de prensa del Instituto Nacional de Estadística. 10 diciembre 2020. https://www.ine.es/dyngs/INEbase/es/operacion.htm?c=Estadistica_C&cid=1254736176780&menu=ultiDatos&idp=1254735573175 [Last access on 16 February 2021].
19.
Risk for COVID-19 Infection, Hospitalization, and Death By Race/Ethnicity. https://www.cdc.gov/coronavirus/2019-ncov/covid-data/investigations-discovery/hospitalization-death-by-race-ethnicity.html [last access on 5 March 2021]
20.
20. Dirección General de Tráfico, Ministry of Interior. Cuadro comparativo de accidentes mortales y fallecidos a 24 h en vías interurbanas, published online on 13 May 2020, http://www.dgt.es/Galerias/seguridad-vial/estadisticas-e-indicadores/accidentes-24-horas/2020/Mayo/CC_-2020_05_13.pdf (DGT, 2020).
21.
Ministerio de Trabajo y Economía Social. Estadística de Accidentes de Trabajo, Avance enero-marzo 2020, published online on 18 May 2020, http://www.mitramiss.gob.es/estadisticas/eat/eat20_03/ATR_03_2020_Resumen.pdf (MTES, 2020).
22.
Vandoros, S. Excess Mortality during the Covid-19 pandemic: Early evidence from England and Wales. medRxiv 2020.04.14.20065706; doi: https://doi.org/10.1101/2020.04.14.20065706 (2020).
23.
Domínguez Gil, B., et al. COVID-19 in Spain: Transplantation in the Midst of the Pandemic. Am J Transplant. Accepted Author Manuscript. doi:10.1111/ajt.15983 (2020).
OpenUrl CrossRef
24.
Rodríguez-Leor, O. et al. Impacto de la pandemia de COVID-19 sobre la actividad asistencial en cardiología intervencionista en España. REC Interv Cardiol 2 (2), 82–89 (2020).
OpenUrl CrossRef
25.
Boudourakis L, Uppal A. Decreased COVID-19 Mortality—A Cause for Optimism. JAMA Intern Med. Published online December 22, 2020. doi:10.1001/jamainternmed.2020.8438
OpenUrl CrossRef
26.
Roser, M., Ritchie, H., Ortiz-Ospina, E. & Hasell, J. Mortality Risk of COVID-19. Published online in Our World in Data: https://ourworldindata.org/mortality-risk-covid#the-case-fatality-rate (accessed on 23rd May 2020).
27.↵
Caillaud, D., Craft, M.E. & Meyers, A. Epidemiological effects of group size variation in social species. J. R. Soc. Interface 10: 20130206, DOI: 10.1098/rsif.2013.0206 (2013).
OpenUrl CrossRef PubMed
28.↵
Levins, R. Some demographic and genetic consequences of environmental heterogeneity for biological control. Bull. Entomol. Soc. Am. Am. 15 (3), 237–240 (1969).
OpenUrl
29.
Hanski, I. Metapopulation Ecology. Oxford University Press (1999).
30.↵
Ball, F. et al. Seven challenges for metapopulation models of epidemics, including households models. Epidemics 10, 63–67. http://dx.doi.org/10.1016/j.epidem.2014.08.001 (2015)
OpenUrl
31.↵
Bordehore, C. et al. Understanding COVID-19 spreading through simulation modeling and scenarios comparison: preliminary results. medRxiv 2020.03.30.20047043, doi: https://doi.org/10.1101/2020.03.30.20047043 (2020)
32.↵
Ebisuzaki, W. A method to estimate the statistical significance of a correlation when the data are serially correlated, J. Clim., 10, 2147–2153, (1997).
OpenUrl

View the discussion thread.

Posted March 23, 2021.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Epidemiology

Subject Areas

All Articles

Addiction Medicine (393)
Allergy and Immunology (705)
Anesthesia (197)
Cardiovascular Medicine (2882)
Dentistry and Oral Medicine (327)
Dermatology (247)
Emergency Medicine (432)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (1019)
Epidemiology (12622)
Forensic Medicine (10)
Gastroenterology (813)
Genetic and Genomic Medicine (4483)
Geriatric Medicine (409)
Health Economics (717)
Health Informatics (2868)
Health Policy (1059)
Health Systems and Quality Improvement (1057)
Hematology (379)
HIV/AIDS (911)
Infectious Diseases (except HIV/AIDS) (14031)
Intensive Care and Critical Care Medicine (836)
Medical Education (418)
Medical Ethics (115)
Nephrology (466)
Neurology (4251)
Nursing (228)
Nutrition (622)
Obstetrics and Gynecology (796)
Occupational and Environmental Health (727)
Oncology (2226)
Ophthalmology (634)
Orthopedics (255)
Otolaryngology (323)
Pain Medicine (270)
Palliative Medicine (83)
Pathology (491)
Pediatrics (1182)
Pharmacology and Therapeutics (492)
Primary Care Research (488)
Psychiatry and Clinical Psychology (3696)
Public and Global Health (6829)
Radiology and Imaging (1503)
Rehabilitation Medicine and Physical Therapy (878)
Respiratory Medicine (909)
Rheumatology (431)
Sexual and Reproductive Health (434)
Sports Medicine (376)
Surgery (476)
Toxicology (60)
Transplantation (206)
Urology (176)

[1] 1.↵
Holmdahl, S. M. I. & Buckee, D. P. C. Wrong but Useful — What Covid-19 epidemiologic models can and cannot tell us. N Eng J Med. DOI: 10.1056/NEJMp2016822 (2020)
OpenUrl CrossRef PubMed

[2] 2.↵
Fonfría, E.S. et al. Essential epidemiological parameters of COVID-19 for clinical and mathematical modeling purposes: a rapid review and meta-analysis. medRxiv 2020.06.17.20133587; doi: https://doi.org/10.1101/2020.06.17.20133587 (2020)

[3] 3.↵
Li, R. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368, Issue 6490, 489–493, DOI: 10.1126/science.abb3221 (2020).
OpenUrl Abstract/FREE Full Text

[4] 4.↵
Instituto de Salud Carlos III. Estudio ENE-COVID19: Estudio Nacional de sero- Epidemiología de la infección por SARS-COV-2 en España. Informe final, 6 July 2020. Available online at https://www.mscbs.gob.es/ciudadanos/ene-covid/docs/ESTUDIO_ENE-COVID19_INFORME_FINAL.pdf

[5] 5.↵
Pollán, M., et al. Prevalence of SARS-CoV-2 in Spain (ENE-COVID): a nationwide, population-based seroepidemiological study. The Lancet 396, Issue 10250, 535–544, https://doi.org/10.1016/S0140-6736(20)31483-5 (2020).
OpenUrl

[6] 6.↵
World Health Organization. Coronavirus disease 2019 (COVID-19) Situation Report – 24, published online on February 2020, https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200213-sitrep-24-covid-19.pdf?sfvrsn=9a7406a4_4 (WHO, 2020)

[7] 7.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 84. Enfermedad por el coronavirus (COVID-19), published online on 23 April 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_84_COVID-19.pdf (CCAES, 2020a).

[8] 8.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 85. Enfermedad por el coronavirus (COVID-19), published online on 24 April 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_85_COVID-19.pdf (CCAES, 2020b).

[9] 9.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 111. Enfermedad por el coronavirus (COVID-19), published online on 20 May 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_111_COVID-19.pdf (CCAES, 2020c).

[10] 10.↵
Centro de Coordinación de Alertas y Emergencias Sanitarias. Actualización n° 112. Enfermedad por el coronavirus (COVID-19), published online on 21 May 2020, https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_112_COVID-19.pdf (CCAES, 2020d).

[11] 11.↵
Linton et al. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. Journal of Clinical Medicine 9 (2), 538, DOI: 10.3390/jcm9020538 (2020).
OpenUrl CrossRef PubMed

[12] 12.↵
Uspensky, J. V. Introduction to mathematical probability. Ed. McGraw-Hill Book Company, New York and London (1937).

[13] 13.↵
Wallinga, J. & Teunis, P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology 160, 509–516 (2004).
OpenUrl CrossRef PubMed Web of Science

[14] 14.↵
Instituto de Salud Carlos III. Estudio ENE-COVID19: Cuarta ronda. Estudio Nacional de sero-Epidemiología de la infección por SARS-COV-2 en España. Cuarta ronda, 6 July 2020. Available online at https://portalcne.isciii.es/enecovid19/informes/informe_cuarta_ronda.pdf

[15] 15.↵
León-Gómez, I. et al. Exceso de mortalidad relacionado con la gripe en España en el invierno de 2012, Gac Sanit 29, 258–65 (2015).
OpenUrl CrossRef PubMed

[16] 16.↵
YY. Gil Bellosta, C. J., Luz Frías, Ludovica Verrieri, Concha Delgado, Inmaculada León y Amparo Larrauri. Informe MO Calor, Estimaciones de la mortalidad atribuible a excesos de temperatura en España, 1 de junio a 15 de septiembre de 2020. Centro Nacional de Epidemiología. Ciber de Epidemiología y Salud Pública (CIBERESP). Instituto de Salud Carlos III. Octubre 2020. https://www.isciii.es/QueHacemos/Servicios/VigilanciaSaludPublicaRENAVE/EnfermedadesTransmisibles/MoMo/Paginas/MOMOcalor.aspx [Last access: 16-02-2021]

[17] 17.↵
Fang Yong, S. E., et al., Connecting clusters of COVID-19: an epidemiological and serological investigation. The Lancet, published online April 21, 2020 https://doi.org/10.1016/S1473-3099(20)30273-5 (2020).

[18] 18.↵
Defunciones según la Causa de Muerte - Avance enero-mayo de 2019 y de 2020. Notas de prensa del Instituto Nacional de Estadística. 10 diciembre 2020. https://www.ine.es/dyngs/INEbase/es/operacion.htm?c=Estadistica_C&cid=1254736176780&menu=ultiDatos&idp=1254735573175 [Last access on 16 February 2021].

[19] 19.
Risk for COVID-19 Infection, Hospitalization, and Death By Race/Ethnicity. https://www.cdc.gov/coronavirus/2019-ncov/covid-data/investigations-discovery/hospitalization-death-by-race-ethnicity.html [last access on 5 March 2021]

[20] 20.
20. Dirección General de Tráfico, Ministry of Interior. Cuadro comparativo de accidentes mortales y fallecidos a 24 h en vías interurbanas, published online on 13 May 2020, http://www.dgt.es/Galerias/seguridad-vial/estadisticas-e-indicadores/accidentes-24-horas/2020/Mayo/CC_-2020_05_13.pdf (DGT, 2020).

[21] 21.
Ministerio de Trabajo y Economía Social. Estadística de Accidentes de Trabajo, Avance enero-marzo 2020, published online on 18 May 2020, http://www.mitramiss.gob.es/estadisticas/eat/eat20_03/ATR_03_2020_Resumen.pdf (MTES, 2020).

[22] 22.
Vandoros, S. Excess Mortality during the Covid-19 pandemic: Early evidence from England and Wales. medRxiv 2020.04.14.20065706; doi: https://doi.org/10.1101/2020.04.14.20065706 (2020).

[23] 23.
Domínguez Gil, B., et al. COVID-19 in Spain: Transplantation in the Midst of the Pandemic. Am J Transplant. Accepted Author Manuscript. doi:10.1111/ajt.15983 (2020).
OpenUrl CrossRef

[24] 24.
Rodríguez-Leor, O. et al. Impacto de la pandemia de COVID-19 sobre la actividad asistencial en cardiología intervencionista en España. REC Interv Cardiol 2 (2), 82–89 (2020).
OpenUrl CrossRef

[25] 25.
Boudourakis L, Uppal A. Decreased COVID-19 Mortality—A Cause for Optimism. JAMA Intern Med. Published online December 22, 2020. doi:10.1001/jamainternmed.2020.8438
OpenUrl CrossRef

[26] 26.
Roser, M., Ritchie, H., Ortiz-Ospina, E. & Hasell, J. Mortality Risk of COVID-19. Published online in Our World in Data: https://ourworldindata.org/mortality-risk-covid#the-case-fatality-rate (accessed on 23rd May 2020).

[27] 27.↵
Caillaud, D., Craft, M.E. & Meyers, A. Epidemiological effects of group size variation in social species. J. R. Soc. Interface 10: 20130206, DOI: 10.1098/rsif.2013.0206 (2013).
OpenUrl CrossRef PubMed

[28] 28.↵
Levins, R. Some demographic and genetic consequences of environmental heterogeneity for biological control. Bull. Entomol. Soc. Am. Am. 15 (3), 237–240 (1969).
OpenUrl

[29] 29.
Hanski, I. Metapopulation Ecology. Oxford University Press (1999).

[30] 30.↵
Ball, F. et al. Seven challenges for metapopulation models of epidemics, including households models. Epidemics 10, 63–67. http://dx.doi.org/10.1016/j.epidem.2014.08.001 (2015)
OpenUrl

[31] 31.↵
Bordehore, C. et al. Understanding COVID-19 spreading through simulation modeling and scenarios comparison: preliminary results. medRxiv 2020.03.30.20047043, doi: https://doi.org/10.1101/2020.03.30.20047043 (2020)

[32] 32.↵
Ebisuzaki, W. A method to estimate the statistical significance of a correlation when the data are serially correlated, J. Clim., 10, 2147–2153, (1997).
OpenUrl

Retrospective Methodology to Estimate Daily Infections from Deaths (REMEDID) in COVID-19: the Spain case study

Abstract

1. Introduction

2. Methods

2.1 REMEDID

2.2. Data

2.2.a. Official numbers of COVID-19 infections and deaths

2.2.b. Excess of expected deaths from MoMo

3. Results

3.1. COVID-19 vs. MoMo

3.2. Infections estimated from COVID-19 deaths

3.3. Infections from MoMo Excess Deaths

4. Discussion

Data Availability

Appendix A. Validation test

Footnotes

References

Citation Manager Formats

Subject Area