Abstract
Background Collaborative comparisons and combinations of epidemic models are used as policy-relevant evidence during epidemic outbreaks. In the process of collecting multiple model projections, such collaborations may gain or lose relevant information. Typically, modellers contribute a probabilistic summary at each time-step. We compared this to directly collecting simulated trajectories. We aimed to explore what each approach conveys about key epidemic quantities, ensemble uncertainty, and performance against data, and to investigate the potential to keep gaining information from a single cross-sectional collection of model results.
Methods We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Five modelling teams projected incidence in Belgium, the Netherlands, and Spain. We compared projections by incidence, peaks, and cumulative totals. We created a probabilistic ensemble drawn from all trajectories, and compared it to ensembles formed from a median across each model’s quantiles or from a linear opinion pool. We measured the predictive accuracy of individual trajectories against observations, and used this accuracy to weight trajectories in an ensemble. We repeated this sequentially against increasing weeks of observed data, and evaluated the resulting ensembles to see how performance varied with the amount of observed data.
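The three ensemble constructions named above can be illustrated with a minimal sketch. This is not the authors' code: the simulated lognormal trajectories, the quantile levels, the equal model weights, and all function names are assumptions made for illustration, and the linear opinion pool here uses a crude piecewise-linear CDF built from each model's quantiles.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
LEVELS = np.linspace(0.05, 0.95, 19)  # illustrative quantile levels (assumption)

# Hypothetical stand-in for submitted projections: one array per model,
# shape (n_trajectories, n_weeks), with right-skewed (lognormal) incidence.
models = [
    rng.lognormal(mean=5.0 + 0.4 * m, sigma=0.5, size=(100, 8))
    for m in range(3)
]


def trajectory_ensemble(models, levels):
    """Weekly quantiles across all trajectories pooled from all models."""
    pooled = np.concatenate(models, axis=0)
    return np.quantile(pooled, levels, axis=0)  # (n_levels, n_weeks)


def quantile_median_ensemble(models, levels):
    """Median, across models, of each model's own weekly quantiles."""
    per_model = np.stack([np.quantile(m, levels, axis=0) for m in models])
    return np.median(per_model, axis=0)


def linear_opinion_pool(models, levels, grid_size=500):
    """Equal-weight mixture of model CDFs approximated from their quantiles."""
    per_model = np.stack([np.quantile(m, levels, axis=0) for m in models])
    n_weeks = per_model.shape[2]
    out = np.empty((len(levels), n_weeks))
    for t in range(n_weeks):
        q = per_model[:, :, t]  # (n_models, n_levels)
        grid = np.linspace(q.min(), q.max(), grid_size)
        # Piecewise-linear CDF per model; 0 below and 1 above its quantile range.
        cdfs = [np.interp(grid, q_m, levels, left=0.0, right=1.0) for q_m in q]
        mix = np.mean(cdfs, axis=0)  # average the CDFs, not the quantiles
        # Invert the mixture CDF: first grid point where it reaches each level.
        idx = np.minimum(np.searchsorted(mix, levels, side="left"), grid_size - 1)
        out[:, t] = grid[idx]
    return out


traj_ens = trajectory_ensemble(models, LEVELS)
median_ens = quantile_median_ensemble(models, LEVELS)
lop_ens = linear_opinion_pool(models, LEVELS)

# Compare the upper bound of each ensemble in the final projected week.
for name, ens in [("trajectories", traj_ens), ("quantile median", median_ens),
                  ("linear opinion pool", lop_ens)]:
    print(f"{name}: 95% quantile in final week = {ens[-1, -1]:.0f}")
```

The design difference is where averaging happens: the quantile median averages across models at a fixed probability level, whereas the trajectory ensemble and the linear opinion pool average probability at a fixed incidence value, so they keep the tails contributed by the more extreme models.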
Results By collecting modelled trajectories, we showed policy-relevant epidemic characteristics. Trajectories contained a right-skewed distribution well represented by an ensemble of trajectories or a linear opinion pool, but not models’ quantile intervals. Ensembles weighted by performance typically retained the range of plausible incidence over time, and in some cases narrowed this by excluding some epidemic shapes.
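As a rough illustration of weighting trajectories by performance and updating sequentially, the sketch below scores each trajectory against the weeks observed so far and recomputes weighted quantiles each time a week is added. The choice of inverse mean absolute error as the weight, the toy trajectories and observations, and all function names are assumptions for illustration, not the method confirmed by the source.

```python
import numpy as np


def inverse_mae_weights(trajectories, observed):
    """Weight each trajectory by 1 / MAE over the observed weeks (assumed scheme)."""
    n_obs = len(observed)
    mae = np.mean(np.abs(trajectories[:, :n_obs] - observed), axis=1)
    inv = 1.0 / np.maximum(mae, 1e-9)  # guard against division by zero
    return inv / inv.sum()


def weighted_quantile(values, weights, levels):
    """Smallest value whose cumulative normalised weight reaches each level."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    cum = cum / cum[-1]
    idx = np.minimum(np.searchsorted(cum, levels, side="left"), len(values) - 1)
    return values[order][idx]


def weighted_ensemble(trajectories, observed, levels):
    """Weekly weighted quantiles, shape (n_levels, n_weeks)."""
    w = inverse_mae_weights(trajectories, observed)
    n_weeks = trajectories.shape[1]
    return np.stack(
        [weighted_quantile(trajectories[:, t], w, levels) for t in range(n_weeks)],
        axis=1,
    )


# Toy data (hypothetical): four trajectories over four weeks, two weeks observed.
trajectories = np.array([
    [10.0, 12.0, 14.0, 16.0],
    [10.0, 20.0, 40.0, 80.0],   # rapid growth, far from the observations
    [10.0, 9.0, 8.0, 7.0],
    [10.0, 11.0, 12.0, 13.0],
])
observed = np.array([10.5, 11.5])
levels = np.array([0.05, 0.5, 0.95])

# Sequential updating: re-weight the same collection as each week is observed.
ensembles = {
    n: weighted_ensemble(trajectories, observed[:n], levels)
    for n in range(1, len(observed) + 1)
}
weights = inverse_mae_weights(trajectories, observed)
print(weights.round(3))
```

Because the weights are recomputed from the same stored trajectories, no new model submission is needed when further observations arrive; a trajectory whose shape diverges from the data is down-weighted rather than discarded.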
Conclusions We observed several information gains from collecting modelled trajectories rather than quantile distributions, including potential for continuously updated information from a single model collection. The value of information gains and losses may vary with each collaborative effort’s aims, depending on the needs of projection users. Understanding the differing information potential of methods to collect model projections can support the accuracy, sustainability, and communication of collaborative infectious disease modelling efforts.
Data availability All code and data are available on GitHub: https://github.com/covid19-forecast-hub-europe/aggregation-info-loss
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
KS, SF funded by ECDC and Wellcome (210758/Z/18/Z). AS funded by National Science Foundation Awards 2135784, 2223933. KA funded by Netherlands Ministry of Health, Welfare and Sport, and European Union Horizon 2020 research and innovation programme, project EpiPose (grant agreement number 101003688). DES, AC, MM, JC, ACG funded by U3CM, Instituto de Salud Carlos III, Gobierno de España, European Commission. NF, LW, SA, CF, PB, NH funded by European Union Horizon 2020 research and innovation programme (grant number 101003688, EpiPose project). SM, BC, RE, SP, CR, JR, TC, CS, KN funded by Ministry of research and education (BMBF) Germany (grant numbers 031L0300D, 031L0302A). RG, RN, BP, FS funded by ECDC.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study used openly available data originally available at: https://github.com/CSSEGISandData/COVID-19
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
We have made two substantial changes to the work. First, we have added a quantitative evaluation of an ensemble using weighted trajectories. This was not requested, but we believe it is necessary to support the work's conclusions recommending weighting trajectories by past performance. It now forms a larger portion of the overall work, as the key novel aspect of this piece. Second, we have added a linear opinion pool ensemble to reflect this important method of model combination. At the same time, we have rebalanced the article to focus away from aggregation and combination and towards the method of collecting model output. We believe this addresses the main concern that the paper conflated aggregation with methods of collection.
Data Availability
All code and data are available on GitHub: https://github.com/epiforecasts/aggregation-info-loss