Abstract
In the early stages of the COVID-19 pandemic, it became clear that pandemic waves and population responses were locked in a mutual feedback loop. The initial lull following strict interventions in the first wave often led to a second wave as restrictions were relaxed. We test the ability of new hybrid machine learning techniques, namely universal differential equations (UDEs) with learning biases, to make predictions in such a dynamic behavior-disease setting. We develop a UDE model for COVID-19 and test it both with and without learning biases describing simple assumptions about disease transmission and population response. Our results show that UDEs, particularly when supplied with learning biases, are capable of learning coupled behavior-disease dynamics and predicting second waves in a variety of populations. The biased model predicts a second wave of infections 55% of the time across all populations, having been trained only on the first wave. The predicted second wave is larger than the first. Without learning biases, model predictions are hampered: the unbiased model predicts a second wave only 25% of the time, typically smaller than the first. The biased model consistently predicts the expected increase in the transmission rate with rising mobility, whereas the unbiased model predicts a decrease in mobility as often as a continued increase. The biased model also achieves better accuracy on its training data thanks to fewer and less severely divergent trajectories. These results indicate that biologically informed machine learning can generate qualitatively correct mid- to long-term predictions of COVID-19 pandemic waves.
Significance statement
Universal differential equations are a relatively new modelling technique in which neural networks use data to learn unknown components of a dynamical system. We demonstrate for the first time that this technique is able to extract valuable information from data on a coupled behaviour-disease system. Our model was able to learn the interplay between COVID-19 infections and time spent travelling to retail and recreation locations in order to predict a second wave of cases, having been trained only on the first wave. We also demonstrate that adding terms to the universal differential equation's loss function that penalize implausible solutions improves training time and leads to better predictions.
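As a rough illustration of what a loss function with a penalty term for implausible solutions can look like, the sketch below combines a data-misfit term with a penalty that grows whenever the learned transmission rate falls as mobility rises. It is not the model used in this study: the one-layer `beta` function standing in for a neural network, the simplified susceptible-infected dynamics, the Euler integration, the parameter values, and every function name are assumptions made only for illustration.

```python
import math


def beta(mobility, params):
    """Hypothetical stand-in for the neural network: transmission rate vs. mobility."""
    w, b = params
    # Softplus keeps the transmission rate non-negative.
    return math.log1p(math.exp(w * mobility + b))


def simulate(params, s0, i0, mobility_series, gamma=0.1, dt=1.0):
    """Euler-integrate a simplified susceptible-infected model (assumed dynamics)."""
    s, i = s0, i0
    infected = []
    for m in mobility_series:
        new_infections = beta(m, params) * s * i
        s, i = s - dt * new_infections, i + dt * (new_infections - gamma * i)
        infected.append(i)
    return infected


def data_loss(params, observed, s0, i0, mobility_series):
    """Mean squared error between simulated and observed infected fractions."""
    predicted = simulate(params, s0, i0, mobility_series)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def monotonicity_penalty(params, mobility_grid):
    """Learning bias (assumed form): penalize beta decreasing as mobility increases."""
    total = 0.0
    for low, high in zip(mobility_grid, mobility_grid[1:]):
        drop = beta(low, params) - beta(high, params)
        total += max(0.0, drop) ** 2
    return total


def biased_loss(params, observed, s0, i0, mobility_series, mobility_grid, weight=10.0):
    """Data misfit plus the weighted plausibility penalty."""
    misfit = data_loss(params, observed, s0, i0, mobility_series)
    return misfit + weight * monotonicity_penalty(params, mobility_grid)
```

In this sketch, parameters for which the transmission rate rises with mobility incur no penalty, so the biased loss reduces to the plain data misfit; parameters implying the opposite relationship are pushed away during training in proportion to `weight`.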
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study uses only publicly available data. COVID-19 case data: https://github.com/CSSEGISandData/COVID-19; Google Community Mobility Report: https://www.google.com/covid19/mobility/
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.