Abstract
Accurately estimating the prevalence and transmissibility of an infectious disease is a critical part of genetic infectious disease epidemiology. However, generating accurate estimates of these quantities, informed by both time series and sequencing data, is challenging. Birth-death processes and coalescent-based models are popular methods for modelling the transmission of infectious diseases, but they struggle with estimating the prevalence of infection.
We extended our approximation of the likelihood for a point process of viral genomes and a time series of case counts so that it can estimate historical prevalence, and we implemented this in a BEAST2 package called Timtam. In a simulation study, the approximation recovered the parameters from simulated data, even when we aggregated the point process data into a time series of daily case counts.
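As an illustration of the aggregation step mentioned above, the following minimal sketch bins point-process event times into daily counts. It is not Timtam's implementation: the function name `aggregate_daily`, the convention that times are measured in days since the start of observation, and the example times are all assumptions made for illustration.

```python
import math
from collections import Counter


def aggregate_daily(event_times, num_days):
    """Bin event times (in days since the start of observation) into daily counts.

    Hypothetical helper for illustration; events outside [0, num_days) are dropped.
    """
    counts = Counter(math.floor(t) for t in event_times if 0 <= t < num_days)
    return [counts.get(day, 0) for day in range(num_days)]


# Made-up confirmation times in days, not data from the study.
times = [0.2, 0.9, 1.5, 3.1, 3.4, 3.8]
daily_counts = aggregate_daily(times, num_days=5)
print(daily_counts)  # [2, 1, 0, 3, 0]
```

Aggregating in this way discards the exact timing of each event within a day, which is the information loss the simulation study examines.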
To demonstrate how Timtam can be applied to real datasets, we estimated the reproduction number and the prevalence of infection through time during the SARS-CoV-2 outbreak onboard the Diamond Princess cruise ship using a time series of confirmed cases and sequence data. We found a greater prevalence than previously estimated and discuss how differences in the algorithms used could explain this.
Competing Interest Statement
The authors have declared no competing interest.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.