Abstract
The COVID-19 pandemy has created a radically new situation where most countries provide raw measurements of their daily incidence and disclose them in real time. This enables new machine learning forecast strategies where the prediction might no longer be based just on the past values of the current incidence curve, but could take advantage of observations in many countries. We present such a simple global machine learning procedure using all past daily incidence trend curves. Each of the 27,418 COVID-19 incidence trend curves in our database contains the values of 56 consecutive days extracted from observed incidence curves across 61 world regions and countries. Given a current incidence trend curve observed over the past four weeks, its forecast in the next four weeks is computed by matching it with the first four weeks of all samples, and ranking them by their similarity to the query curve. Then the 28 days forecast is obtained by a statistical estimation combining the values of the 28 last observed days in those similar samples. Using comparison performed by the European Covid-19 Forecast Hub with the current state of the art forecast methods, we verify that the proposed global learning method, EpiLearn, compares favorably to methods forecasting from a single past curve.
Author summary Forecasting the short time evolution of the COVID-19 daily incidence is a key issue in the epidemic decision making policy. We propose a machine learning method which forecasts the future values of the daily incidence trend based on the evolution of other incidence trend curves that were similar to the current one in the past. Using comparison performed by the European Covid-19 Forecast Hub with the current state of the art forecast methods, we verify that the proposed global learning method, EpiLearn compares favorably to methods that forecast from a single past curve.
Introduction
The COVID-19 epidemic has provided us with information on the evolution of the daily incidence in many different countries and epidemic scenarios. Given the enormous global impact of COVID-19, a large number of researchers have studied the problem of predicting the incidence curve. For example, the European Covid-19 Forecast Hub [1] gathers a variety of prediction models based on many different techniques. These methods observe the past of daily incidence in a given country and forecast its future evolution in the weeks to come. The prediction is generally made for the next four weeks. Most of these methods base their forecast on the observation of only the past values of the current incidence curve, that is, the one that they want to extend towards the future.
The main objective of this paper is to introduce a prediction method that learns the future of a given incidence trend curve from the past evolution of other many incidence trend curves. Our method can be seen as an extension of the “method of analogues”, inspired from meteorology and first introduced for epidemiologic forecasting in [2] in predicting influenza activity. This method uses vectors selected from historical influenza time series that match current activity. The authors applied it to forecasting the incidences of influenza in France and in the country’s 21 administrative regions, using a series of data for 938 consecutive weeks of surveillance between 1984 and 2002, and compared the results with those for autoregressive models. They reported that for 1- to 10-week-ahead predictions, the correlation coefficients between the observed and forecasted regional incidences was significantly superior with the method of analogues than for autoregressive models. The method compares fixed incidence intervals to a query interval by their Euclidean distance, and obtains a prediction as a weighted mean of the incidences that follow the nearest neighbors. Nevertheless, a major difference of their method with ours is that they restrict their comparison to the past history of each incidence curve. Hence, their learning set is considerably smaller than the one that uses many regions or countries: It assumes the observation of a several years period and takes advantage of the periodicity of influenza.
The sophisticated extension of the method of analogues proposed in [3] also uses historical data (up to 20 years) to obtain predictive distributions for incidence in individual weeks using a kernel conditional density estimation (KCDE). Then these individual distributions are tied into joint distributions using copulas, to predict the timing of and incidence in the peak week of the season. Like in [2], the method is applied to a single time series and therefore requires a much longer observation period that the one that could be used for COVID-19 so far. Arguably the closest method to our proposed one is the neural method of [4]. The authors introduce a new neural forecasting model called Attention Crossing Time Series, that makes forecasts via comparing patterns across time series obtained from multiple regions. It interprets the attention mechanism [5] as an application of the “method of analogues”. The model is demonstrated to outperform many recent SEIR models.
In a nutshell, our proposed learning method uses all past incidence trend curves that are similar on 28 consecutive days to the last 28 last days of the trend incidence curve that is to be extended towards the future. To demonstrate the method, we use as learning database a collection of 27,418 COVID-19 past incidence trend curves across 61 world regions and countries. These trend curves are computed by the EpiInvert method [6] from the original raw incidence curves communicated by the governments. A raw incidence curve is not the adequate input for forecasting because of its high noise and weekly oscillation. The weekly seasonality depends on each country, thus hindering comparison between raw incidence curves. Trend curves instead, being freed from seasonality and noise, are much more suitable to forecasting. Nevertheless, as we will show later, a daily forecast of the raw incidence can be deduced from its forecasted trend using the estimated seasonality.
Let us denote by s = (s1, s2, …, s28), the last 28 values of the current incidence trend that we want to extrapolate, and by the forecast for the next 28 days proposed in this work. Each of the 27,418 incidence trend curves in our database contains the values of 56 consecutive days extracted from observed past incidence curves. We predict the evolution of the current incidence trend curve from the median of the 28 last days of the 27,418 database curves, where the median is computed on the 121 most similar curves. The similarity to the query of these candidate curves is measured on its first 28 days, which are matched to the 28 last observed days of the query curve s that we want to extrapolate. In summary, the 28 future samples sf of the current curve s are obtained as the median of the corresponding days 29 to 56 of the most similar past curves1.
We also compute empirical confidence intervals for the incidence trend forecast by applying the proposed method to the incidence curves of our database and obtaining a distribution of the forecast error as a function of the number of days passed from the current day (the last day of the used incidence curve). In Fig 1 we illustrate the results of the proposed method for four countries, using their incidence curves up to May 5, 2022. This figure displays in black the raw input incidence curves, which show a strong weekly periodic bias. In the case of France for example, there is a strong deficit on week-ends compensated by a peak on Mondays. For our prediction, we therefore use a smooth incidence trend curve (in red), that is easier to extend and forecast than the original raw incidence. The usual way to compute an incidence trend curve is to apply a 7 or 14 days sliding average to the original raw incidence, which reduces the weekly effects [7]. In our method, we use the more sophisticated EpiInvert method introduced in [6, 8] and available as a CRAN R package [9]. This method is summarily described in the Material and Methods section. Fig 1 shows in blue the forecast curve, that can be compared to the magenta ground truth that became later available. In light blue, the figure also displays the predicted raw incidence curve where the weekly bias learned by EpiInvert in the immediate past is also applied. In these relatively favorable examples, picked from large countries with large incidence and at a time of regular daily measurements, the error between ground truth and prediction seems acceptable. Nevertheless, the error on the fourth week can exceed 25%. This is not surprising, given the high variability of the possible futures illustrated for the same countries and times in Fig 3. In this introduction we do not present the many alternative forecasting methods. Instead, we review them in detail in the discussion section. The methods that were publicly available through the European Covid-19 Forecast Hub are quantitatively compared with our method through the unbiased metrics of the hub. Our learning technique is different in structure from most previous methods introduced in the literature. We involve no parametric model for the incidence curve. Our method produces a daily forecast of the future, whereas most COVID-19 incidence analysis methods [7, 10] aim to forecast the 7-day sliding average of the daily incidence.
The particular significance of this study lies in the novelty of our machine learning approach that provides a daily forecast of the current incidence curve based on its similarity with many different incidence curves in the past. The unbiased comparison with other methods in the context of the European Covid-19 Forecast Hub confirms the good performance of the proposed method.
Material and Methods
Data sources
The incidence trend database has been built using the daily incidence data, up to May 5, 2022, provided by Our World in Data in [11] for the following countries and regions: Argentina, Austria, Bangladesh, Belgium, Brazil, Canada, Chile, Colombia, Cuba, Czech
Republic, Denmark, Germany, France, Greece, Hungary, India, Iraq, Iran, Ireland, Israel, Italy, Japan, Jordan, Kazakhstan, Malaysia, Mexico, Nepal, Netherlands, Peru, Philippines, Poland, Romania, Russia, Serbia, Slovakia, South Africa, South Korea, Spain, Sweden, Switzerland, Thailand, Tunisia, Turkey, Ukraine, United Arab Emirates, United Kingdom, USA, Vietnam, Africa, South America, North America, Asia, Europe, European Union, Oceania, and the world.
The database provided by Our World in Data in [11] includes COVID-19 information about confirmed cases, deaths, vaccinations, testing and government responses. Confirmed cases and deaths are collected by Johns Hopkins University by date of report, rather than date of test/death. Therefore, the number they report on a given day does not necessarily represent the actual number on that date, because of the long reporting chain that exists between a new case/death and its inclusion in statistics. This also means that time series can show sudden changes (negative or positive) when a country corrects historical data, because it had previously under-or -over estimated the number of cases/deaths. The comparative results with other methods, presented in the comparative results part of the Results section, have been obtained by using the raw evaluation scores published by the European Hub, [1], in the file https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/tree/main/evaluation/scores.csv. These results cannot be manipulated and they use the version of the data that have been available in real time when producing forecasts.
Data preprocessing method
The infectiousness of individuals at time t is characterized by the reproduction number Rt, defined as the average number of cases generated by an infected person at time t, and by the (observable) serial interval Φs which represents the time distribution of the delay of the onset of symptoms between primary and secondary cases. For Covid 19, this serial interval was measured accurately in [12] on nearly 1000 verified transmission pairs. We use this distribution in the EpiInvert method.
Our forecast model uses the EpiInvert method, which aim it is to invert the fundamental renewal equation [13, 14] linking Rt, Φ and the incidence it of new daily cases, where tc is the current time. The EpiInvert method introduced in [8] and extended in [6, 9] is a deconvolution + denoising procedure to solve the functional equation (1). EpiInvert estimates both Rt and a restored it corrected for the weekend bias. To remove the weekend effect, it computes a 7-day quasi-periodic multiplicative factor qt. From the observed incidence curve and the serial interval, Rt and qt are jointly estimated by minimizing where median(t−τ,t](i0) is the median of in the interval (t − τ, t] used to normalize the energy with respect to the size of it (the value of τ is fixed to 21 (3 weeks) in the experiments). The total number of cases is preserved by adding to (2) the constraint on qt : where T is a period of analysis empirically fixed to T = 56 days. The minimization of the above energy yields estimates of Rt, and a quasi-periodic bias correction factor qt, as the third term in the functional forces the values qt − qt−7 to be small. The parameters wR and wq are regularization weights with default values wR = wq = 5. Their values were proven in [8] to be nearly optimal for Covid-19 incidence curves.
By minimizing this energy (2) under the constraint (3), we obtain the reproduction number Rt and the seasonality bias correction coefficients qt. An incidence corrected of the weekly bias is obtained as . The final restored incidence it that we use for forecasting is obtained by applying the renewal equation (1) to the bias corrected incidence , namely
Software
EpiLearn, the forecasting model presented in this work and including the preprocessing step described above, is implemented in the publicly available EpiInvert CRAN R package [9]. In this package, EpiLearn is executed using the EpiInvertForecast R function. In the vignette https://ctim.ulpgc.es/covid19/EpiInvertForecast.html a description, with examples, of EpiInvertForecast usage is presented.
Forecasting method
Next, we present the proposed method, let us, first, to introduce the following notation to manage the incidence curves and their forecast:
: the current raw incidence curve in the last 28 days.
: the forecast of the current raw incidence curve for the next 28 days in the future.
s = (s1, …, s28) : the current trend curve in the last 28 days.
: the forecast of the incidence trend curve for the next 28 days in the future.
: the collection of incidence trend curves in the database.
: the first 28 days of the database incidence trend curves that we use for comparison purposes with the current curve s.
: the forecast of the database curves for the last 28 days using as current curve the first 28 days.
: empirical distribution of the relative forecast error for the database curves in the forecast day d = 1, …, 28.
Incidence trend curves database construction using EpiInvert
Our proposed method, EpiLearn, uses a world-wide database of raw incidence curves from 61 countries and regions up to May 5, 2022. For each country or region, and for each day, starting 150 days after the beginning of the epidemic, we take the raw incidence data up to that day. Then, the resulting curve is further processed by applying the EpiInvert incidence decomposition algorithm [6] (see the Material and Methods section) and we keep the last 56 values of the estimated incidence trend curve. To add a curve of this type to the database, we impose that the mean of the 56 values of the sequence must be larger than 1000. Taking into account that we normalize all database curves, the magnitude of the curves therefore has no influence in the forecast estimation. This amounts to making the assumption that the incidence curve evolution has the same behavior in large countries than in small countries. We impose this minimum 1000 cases average condition because for very small averages the registered incidence curves are often very noisy and unreliable. Indeed, small averages often correspond to non-threatening or neglected stages of the epidemic.
Normalization of the database incidence curves
EpiInvert is magnitude-invariant, that is, multiplying the raw incidence values by a scalar factor multiplies the estimated EpinInvert incidence trend values by the same scalar factor. Our forecast method preserves this magnitude-invariance by normalizing the magnitude of the incidence trend curves.
Let N be the number of incidence trend curves stored in the database (in our case N =27,418). For corresponds to the last 56 days of the incidence trend curve computed by EpiInvert and stored in the database. Each ik has been normalized by multiplying it by a scale factor so that the average of the first 28 values be equal to 1:
Computing the distance between curves
We denote by ŝ the present-day incidence trend curve for the country being predicted, that has been normalized in the same way, so that
We compare the normalized vectors ŝ and sk (the first 28 values of ik) through the following magnitude-invariant distance average, where the parameter μ ≥ 0 governs the exponentially weighted moving average. The larger the value of μ, the lower this weight for the more remote days, as is classical in control theory [15] and in epidemiological forecasting [16]. As shown below, by minimizing the forecast error, we obtain the optimal value μ = 0.0475. Fig 2 shows the function f (x) = e−0.0475x which determines the weight assigned to each day in the past in the distance estimation.
Forecasting using a median of the closest database curves
First, we select in the database the Nmedian, the curves that are closest to the current one, using the similarity criterion (6). Nmedian is a parameter of the method. The median forecast of for the next 28 days is defined by
As EpiInvert also computes multiplicative weekly seasonality correction factors, qt, we additionally compute a forecast, so,f, of the raw incidence curve, so, by dividing the forecasted incidence trend curve by the corresponding seasonality factors, where % is the modulus operator. By using q22+d%7 as future seasonality factors we are simply making a 7-periodic extrapolation of the last seasonality factors estimated by EpiInvert.
Fig 3 illustrates the proposed learning procedure. For four countries, it shows the current incidence trend by EpinInvert, its 5 closest curves in the database for their first four weeks, and the forecast, computed as the median of the 121 closest curves in their last four weeks. For France, the UK and the USA, we can observe that among the most similar curves there are curves with a strong growth. These curves correspond to the first wave of the omicron variant in Romania, Hungary and Italy that occurred by the end of 2021. These examples show that very close curves in the past can evolve very differently in the future. In particular, the methods studied in this paper, which forecast the evolution of the incidence only using past incidence data, may be subject to large errors in forecasting.
Choice of the method parameters
We have to choose the parameters of the method, Nmedian and μ. For each curve in the database, we use as current incidence curve the first 28 days of ik, that is, . The forecast of sk is given by where are the Nmedian closest curves, in the database, using the distance (6), to sk (removing from the choice a neighborhood of k in the database). For each forecast day d = 1, .., 28, the relative forecast error is given by We define the method’s median forecast error by
By minimizing this median error, we obtained the optimal values Nmedian = 121 and μ = 0.0475. We optimized the parameters Nmedian and μ using the first 14 forecast days because the expected error in the next 14 days is so large that we prefer to focus on the optimization for the first 14 days. We could also optimize the above parameters for the whole 28 forecast days. In that case, we obtain as optimal values Nmedian = 128 and μ = 0.1075 which are slightly different from the ones obtained for the first 14 days.
Empirical confidence intervals
For each forecast day d, we compute empirical confidence intervals using the distribution of relative errors for the database given by , using the estimated optimal values for the parameters Nmedian and μ.
Assuming that the distribution of the relative forecast error for the current incidence trend curve s is similar to the one obtained for the database and determined by Ed, we can empirically approximate the percentiles of the forecast distribution, Fd, of the current curve, using the percentiles of Ed. Indeed, let us denote by Pp(X) the p-th percentile of a distribution X, then where is the forecast estimated by the proposed method. A 95% central confidence interval for the incidence trend value is for example given by (P0.025(Fd), P0.975(Fd)). In Fig 4 we display the confidence intervals of Ed for the proposed forecast method. As expected, the size of the confidence intervals increases with the forecast day d and is quite large after 28 days. Notice that the mean and the median (P0.50(Ed)) of Ed are very different due to the asymmetry of the distribution Ed. The mean is closer to the upper end of the forecast interval than the median. The fact that the median of the error is very close to zero confirms the consistency of the method.
Results
Comparative results in the context of the European Covid-19 Forecast Hub
The question arises of how to compare all methods, in theory and in practice. For a practical comparison, we take advantage of the fact that a wide variety of forecasts are submitted to the European COVID-19 Forecast Hub [17] and to the COVID-19 Forecast Hub [18]. A study on the methodology to evaluate and compare forecast has been proposed in [19], using the data of this Hub. As developed in [1], the European Covid-19 Forecast Hub provides short-term forecasts of Covid-19 cases and deaths across Europe. It is supported by teams working on pandemic modeling and sharing their forecast of the weekly accumulated incidence with horizons of 1 to 4 weeks. Each week starts on Sunday and ends on Saturday. At the time of writing, many countries do not provide data during the week-end, and some countries only provide a weekly estimate. This fact has no influence for method preprocessing the data by a 7 day sliding average. Nevertheless, since we use daily estimates, a single weekly estimate has a negative impact on the quality of our forecast. To address this issue, when a country provides data on a day, but not on the previous days, we distribute equally the last accumulated value over the previous uninformed days before applying EpiInvert.
Since EpiLearn forecasts the daily incidence, the weekly forecast is obtained by summing the forecasted raw daily incidence given by (8). The quantiles of the associated weekly distributions are computed on the registered database of incidence curves by extending the procedure of the previous section which computes the confidence intervals of the forecasted incidence curve. In this case, we aggregate to the weekly scale first and then compute quantiles.
The European Hub encourages teams to provide, for each model, m, each horizon week, h = 1, 2, 3, 4, and each forecast target, n, the prediction of the weekly incidence, fm,h,n, and 23 quantiles of the associated distribution. These quantiles correspond to the predictive median, M, and eleven (1 − αk) × 100% central prediction intervals , with αk = 0.02, 0.05, 0.1, 0.2, …, 0.9, where and are (respectively) the αk/2 and (1 − αk/2) quantiles of F . The following weighted interval score, WISm,h,n (see [20]), is proposed to evaluate the distribution accuracy: where oh,n is the observed outcome, (.)+ is defined as (x)+ = x if x > 0 and 0 otherwise. The lower the value of WISm,h,n, the better the score associated to the forecast distribution determined by the quantiles of F .
The prediction accuracy of a model is measured using two indicators: the first one is |fm,h,n − oh,n|, that is, the absolute value of the difference between the observed value oh,n and the prediction fm,h,n. The second indicator measures the quality of the confidence intervals and is given by WISm,h,n. To compare the prediction accuracy of different models, we have to take into account that, in general, each team provides a different number of forecast targets. We started for example submitting forecast to the European Hub by August 2022, but other teams started submitting up to 2 years earlier. Furthermore, not all teams provide a forecast for all horizons and for all countries. Thus, defining a fair comparison of models requires some caution. To address this issue, the European Hub uses the following procedure (we explain the procedure for the comparison of |fm,h,n − oh,n|, but the comparison of WISm,h,n is equivalent). Consider two models m and m, a week horizon h ∈ {1, 2, 3, 4} and where Nm,m′,h is the number of forecast targets that have been handled by both models. The pairwise comparison of both models is then defined by the ratio which is smaller than 1 if m′ is more accurate than m, and larger than 1 otherwise. Subsequently, we compute for each model m the geometric mean of the results achieved for all different pairwise comparisons, where M ′ is the number of models, m′ ≠ m, which have forecast targets in common with model m. It follows that θm,h is a measure of the relative skill of model m with respect to the set of all other models in the week horizon h. The relative performance of model m is computed with respect to θb,h, the score of the baseline model, as where the baseline model b is nothing but the constant prediction extending the last observed weekly value [21].
The ratio is called the relative MAE, rel_ae, of model m in the week horizon h. A score of 0 < rel_ae < 1 means that model m is better than the baseline; a score of rel_ae > 1 means that the baseline is better. In the case of WISm,h,n, we use the same procedure (replacing |fm,h,n − oh,n| by WISm,h,n) and call rel_wis the associated indicator. Every week, the Hub publishes, in the file scores.csv of the evaluation repository, information about the accuracy of the predictions. In particular, it publishes, for each team m, horizon h and forecast target n, the values of |fm,h,n − oh,n|, WISm,h,n, and the 50% and 95% prediction intervals coverage. We used this information to compare EpiLearn with the other methods. To do a fair comparison, for all teams, we used as comparison population the horizons and targets used by EpiLearn to provide forecast between August 6, 2022 and March 6, 2023. In this way we used for all teams the same target population when computing the performance scores. Moreover, we only considered models that provided forecasts for at least 50% of the target population. In table 1 we present, for each model, the values of rel_ae, rel_wis, the 50% and 95% interval coverage and the number of targets in common between EpiLearn and the model used to compute the indicators. In Fig. 5 we show the actual observed weekly disease incidence by country during the time span evaluated in the comparative results from the European COVID-19 Forecast Hub, this period showed challenging behaviors in disease incidence trends. The baseline value of rel_ae and rel_wis equal to 1 is the constant prediction, which is expected to be beaten by all more sophisticated methods. The lower the values of these indicators, the better the model performance. The result of the ensemble model is usually considered as the best option as argued in [20]. In the Hub, the proposed EpiLearn technique corresponds to the named AMM-EpiInvert team. In table 1 we observe that, except for the horizon of 4 weeks, EpiLearn obtains the best results for rel_ae and rel_wis, specially for the horizons of 1 and 2 weeks. The value of the 50% confidence interval coverage obtained by EpiLearn is not so good, but this indicator does not take into account how far the estimation stands from the confidence interval. The low value of rel_wis indicates that, globally, the estimation obtained by EpiLearn are quite close to the confidence intervals. In Fig. 6 we plot the values of rel_ae and rel_wis presented in table 1.
The EpiInvert method performs a decomposition of the past incidence curve into a trend and a noise component, after correction of the weekly bias. In the above error estimations, we only used the trend curves, because what is forecast also is a trend. We did not compute the additional error between prediction and ground truth that is caused by the noise component. Our above error prediction therefore only addresses the method’s bias, namely the observed variability of the future trends following a given past trend interval. A further refinement of the method should take into account the noise residual computed by EpiInvert for a given incidence curve, estimate its model, and deduce a noise variance for the prediction. This noise variance should be added to the method bias variance.
A great advantage of using the scores published by the European Hub is that such scores cannot be manipulated. They represent a fair quality comparison framework for the models performance.
Discussion
In this section we review and discuss the properties and assumptions of the most relevant forecasting methods, and link them when possible to methods and results published weekly in the European Covid-19 Forecast Hub [17].
ARIMA
The ARMA (AutoRegressive Moving Average) and ARIMA (AutoRegressive Integrated Moving Average) models are the backbone of many forecasting methods and are implemented through the popular R package [22]. In the European Hub Forecast initiative, the epiforecasts-weeklygrowth team [23] uses a Bayesian ARIMA model on weekly incidence data. ARIMA is arguably the most popular forecasting model for COVID-19, and has been applied with a country-specific optimization of parameters. For example the MUNI-ARIMA [24] team participating to the European Hub Forecast initiative uses an “ARIMA model with outlier detection fitted to transformed weekly aggregated series”. This method is one of the best performing methods as illustrated in Figure 5.
An extension of ARIMA, SARIMA (seasonal ARIMA) is a combination of two ARIMA models. This method was tested for forecasting the global COVID-19 incidence in [25, 26]. It has been used and compared to ARIMA for COVID-19 forecast in [27].
Compartmental epidemiological models (SIR, SEIR, SIRD, SEIARD and SUIHTER)
Compartmental models are in silico simulation models that consider the population as a collection of compartments, for example in the case of SEIARD : S susceptible, E exposed, I infected, A asymptomatic, R recovered and D dead. Initially designed for epidemic modeling, the SIR model and its variants have since been adapted to forecasting the future evolution of the pandemic from an estimated starting point. The model’s parameters are estimated from the past incidence, and the model is then applied forward to simulate the future. This method has been developed for SIR [28, 29], SEIR [30, 31], SIRD [32], SEIARD [33] and SUIHTER [34, 35].
Regression models
The Richards model [36] is a 2-parameter simple logistic growth model including a scaling parameter. This model is used in [37] as a parametric regression model for the modeling of incidence indicators. The incidence distribution is modeled by an appropriate Poisson or Negative Binomial. It is also used in [38] for estimating the regional propagation of COVID-19 in Italy and in [39] for recurrent forecasting in Europe. The Gompertz model was originally proposed to explain human mortality curves and has been further employed in the description of growth processes. Modeling the cumulative cases of Covid-19, it is used for COVID-19 forecast in [40] and [41]. This model was implemented in the European Hub Forecast initiative through the BIOCOMSC-Gompertz method [40] . The composite logistic growth model (CLM) [42] is another regression model, a variant of which is used by the RobertWalraven-ESG [43] team participating to the European Hub Forecast initiative. Its results are illustrated in Figure 5. The sub-epidemic model is the most flexible extension of the previous models used for forecasting [42] This sub-epidemic wave model supports complex epidemic trajectories shaped by multiple underlying sub-epidemics modeled by the GLM.
Short term prediction by the renewal equation, linear extrapolation
The approach proposed in [44] to forecasting future COVID-19 cases involves 1) modeling the incidence using a Poisson distribution for the daily incidence number, and a gamma distribution for the series interval; 2) estimating the effective reproduction number assuming its value stays constant during a short time interval (by the EpiEstim method [7]); and 3) using the renewal equation, drawing future incidence cases from their posterior distributions, assuming that the current transmission rate will stay the same, or change by a certain degree. A similar forecast method is involved in [45] which compares human and machine forecasts in Germany and Poland. The authors use a Bayesian model from the EpiNow2 R package (version 1.3.3) to predict reported cases. Epinow [10] estimates the effective reproduction number Rt. The future infections are computed by the Fraser renewal equation as a weighted sum of past infection multiplied by Rt. In the comparison, Rt is assumed to stay constant beyond the forecast date. The conclusion of this paper is that an average of human experts’ forecasts performs better. Similarly, the USC-SIkJalpha [16] and ILM-EKF [46] teams participating to the European Hub Forecast initiative use the renewal equation (the second mentioned group also involves a Kalman filter in its prediction). Its results are illustrated in Figure 6. Lastly, the SDSC ISG-TrendModel [47] team, also participating to the European Hub Forecast initiative, is a trend extrapolation which starts by decomposing the incidence curve into three components: the trend, a seasonal component and noise. Then the model predicts daily cases using linear extrapolation on the linear or log scale of the underlying trend estimated by a robust LOESS seasonal-trend decomposition model. Its results are illustrated in Figure 6, where only the results for the first two weeks are available.
Aggregation of estimators (ensemble methods)
The idea of agregation methods, sometimes also called ensemble methods is to build a prediction model by combining the strengths of a collection of simpler base models called weak learners [48]. In [49] the use of ensemble models was evaluated for influenza seasons and it was concluded that the ensemble methods average performance is similar to the best of the component models, but offers more consistent performance across seasons than the component models. The European Covid-19 Forecast Hub [17] also proposes “an ensemble, or model average, of submitted forecasts to the European COVID-19 Forecast Hub”, described in [20]. In it, the teams submit weekly forecasts for COVID-19 cases and deaths in up to 32 countries for the next week and the three following weeks. The teams also submit standardized quantiles of their predictive distribution. In the ensemble forecast, each predictive quantile is calculated as the equally-weighted median of all individual models’ predictive quantiles. The performance of each model is evaluated with the relative Weighted Interval Score (WIS), comparing a models’ forecast accuracy relative to all other models (see section for the formula of WIS). In [20], the authors report that the ensemble performed better on relative WIS than 84% of participating models’ forecasts of incident cases (with a total N=862), and 92% of participating models’ forecasts of deaths (N=746). In view of this, we shall pay a special attention to the comparison of the model proposed here with the ensemble model, as illustrated on 6.
Global learning
The idea of Global learning is to predict jointly an ensemble of time series with similar characteristics [50]. Each time series is time-delay embedded and stacked together before fitting a single linear autoregressive model. The dimension of the embedding is tuned by temporal validation. The same method is used in [51], which proposes to estimate a time lag between two countries after finding an optimal dynamic time warping between their incidence curves. This procedure allows an elastic adjustment of the time axis to find similar but phase-shifted sequences. Then the incidence curve of the leading country is used to extend toward future the incidence curve of the other. This group of methods can be seen as a direct antecedent of the method proposed here. Indeed, our method (implicitly) estimates time lags between past incidence curves of different countries and the one that we want to extend before exploiting the “future” samples of these time shifted incidence curves to predict the future of our target incidence.
Conclusion
Given the large number of factors that can influence a future evolution, forecasting the evolution of the incidence curve is clearly difficult. We saw in the discussion section that most standard approaches estimate the parameters of an evolution model (ARIMA, SIR, a logistic curve). In this work, we proposed EpiLearn, a method following a more empirical approach that estimates the forecast by a learning procedure using many samples of past incidences evolution in many countries. Using EpiInvert, an incidence decomposition method, we removed first the strong administrative weekly bias from the original raw incidence to estimate a smooth incidence trend curve. Using a large database of incidence trends, the forecast is computed as the median of the closest curves, in the past, to the current incidence trend curve. We observed that the size of the estimated empiric confidence interval grows quickly with the number of forecast days. For a 28 days forecast the size of the confidence interval becomes very large, and this is confirmed weekly by our results in the European hub [17]. These results place EpiInvert among the very best methods in the period and regions analyzed. We observed that the prediction of all methods may miss the forecast target by a large margin in the three and four weeks horizon. Nevertheless, they seem to be reliable and useful to predict the pandemic in a two-week horizon.
The proposed method might be improved in several ways by taking into account additional relevant factors before comparing time sequences. Indeed, our obtained confidence intervals were based on a global distribution of relative errors. However, the size of relative errors might vary depending on the trend and magnitude of the query curve. It might be interesting to explore this by adding a distance of the average incidences as an additional term in the distance between incidences given in equation (6). In addition, the forecasting could benefit from additional knowledge about implementation or changes of the social distancing policy and evolution of the virus’ contagiousness (as was observed with the emergence of Omicron). Digging into these aspects requires a far ranging overhaul of our experimental protocol which belongs to our future plans.
Data Availability
All data produced are available online at OurWorldInData.org
https://github.com/owid/covid-19-data/tree/master/public/data
Acknowledgments
Publicly available datasets of the incidence curve were analyzed in this study. These data can be found in [52]. The method evaluation performance uses the weekly evaluation reports published in the European CoviD-19 Hub [1]. Work supported by Kayrros, Inc.