Abstract
Background Metagenomic sequencing of wastewater (W-MGS) offers broad, pathogen-agnostic monitoring of infectious diseases. We quantify the sensitivity and cost of W-MGS for viral pathogen detection by jointly analysing W-MGS and epidemiological data for a range of human-infecting viruses.
Methods Sequencing data from four studies were analysed to estimate the relative abundance (RA) of 11 human-infecting viruses. Corresponding prevalence and incidence estimates were obtained or calculated from academic and public-health reports. These estimates were combined using a hierarchical Bayesian model to predict RA at set prevalence or incidence values, allowing comparison across studies and viruses. These predictions were then used to estimate the sequencing depth and concomitant cost required for pathogen detection using W-MGS with or without use of a hybridization-capture enrichment panel.
Findings After controlling for variation in local infection rates, relative abundance varied by orders of magnitude across studies for a given virus. For instance, a local SARS-CoV-2 weekly incidence of 1% corresponds to predicted SARS-CoV-2 relative abundance ranging from 3·8 × 10⁻¹⁰ to 2·4 × 10⁻⁷ across studies, translating to orders-of-magnitude variation in the cost of operating a system able to detect a SARS-CoV-2-like pathogen at a given sensitivity. Use of a respiratory virus enrichment panel in two studies dramatically increased predicted relative abundance of SARS-CoV-2, lowering yearly costs by 24- to 29-fold for a system able to detect a SARS-CoV-2-like pathogen before reaching 0·01% cumulative incidence.
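To make the link between relative abundance and cost concrete, the following is a minimal sketch and not the hierarchical Bayesian model used in the study. It assumes that detection requires a fixed expected number of pathogen reads and that cost scales linearly with sequencing depth. The read threshold (`TARGET_READS`) and price (`COST_PER_BILLION_READS`) are hypothetical placeholders chosen for illustration; only the two relative abundance values come from the findings above.

```python
import math

# Predicted SARS-CoV-2 relative abundance at 1% weekly incidence
# (lowest and highest study-level predictions quoted in the Findings).
RA_LOW = 3.8e-10
RA_HIGH = 2.4e-7

# Hypothetical assumptions, not taken from the study:
TARGET_READS = 100                # expected pathogen reads needed for a detection call
COST_PER_BILLION_READS = 2000.0   # sequencing price in arbitrary currency units


def reads_required(relative_abundance: float, target_reads: int) -> int:
    """Total reads needed so the expected pathogen read count reaches target_reads."""
    if not 0 < relative_abundance <= 1:
        raise ValueError("relative_abundance must be in (0, 1]")
    return math.ceil(target_reads / relative_abundance)


def sequencing_cost(total_reads: float, cost_per_billion_reads: float) -> float:
    """Cost of sequencing total_reads at a flat price per billion reads."""
    return total_reads / 1e9 * cost_per_billion_reads


for label, ra in (("lowest RA", RA_LOW), ("highest RA", RA_HIGH)):
    depth = reads_required(ra, TARGET_READS)
    cost = sequencing_cost(depth, COST_PER_BILLION_READS)
    print(f"{label}: {depth:.3e} reads, cost {cost:,.0f}")

# Under this simple model, cost is inversely proportional to relative abundance,
# so the fold difference in cost equals the fold difference in RA.
fold_difference = RA_HIGH / RA_LOW
print(f"fold difference in required depth and cost: {fold_difference:.0f}")
```

Under these assumptions the required depth, and hence the cost, differs between the two studies by the same factor as the relative abundance, whatever threshold and price are chosen.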
Interpretation The large variation in viral relative abundance after controlling for epidemiological factors indicates that other sources of inter-study variation, such as differences in sewershed hydrology and lab protocols, have a substantial impact on the sensitivity and cost of W-MGS. Well-chosen hybridization capture panels can dramatically increase sensitivity and reduce cost for viruses in the panel, but may reduce sensitivity to unknown or unexpected pathogens.
Funding Wellcome Trust; Open Philanthropy; Musk Foundation
Evidence before this study Numerous studies have performed wastewater metagenomic sequencing (W-MGS), with a range of objectives. However, few have explicitly examined the performance of W-MGS as a monitoring tool. We searched PubMed from database inception to September 2024, using the search terms “MGS OR Metagenomic sequencing OR Metagenomics OR Shotgun sequencing” AND “Performance OR Precision OR Sensitivity OR Cost-effectiveness OR Effectiveness” AND “Virus OR Viral OR Virome” AND “Wastewater OR Sewage”. Among the 88 resulting studies, 17 focused specifically on viruses in wastewater. A 2023 UK study by Child and colleagues assessed untargeted and hybridization-capture sequencing of wastewater for genomic epidemiology, concluding that the latter but not the former provided sufficient coverage for effective variant tracking. However, they did find untargeted sequencing sufficient for presence/absence calls of human pathogens in wastewater, a finding supported by numerous other W-MGS studies. While several studies examined the effect of different W-MGS protocols on viral abundance and composition, none accounted for epidemiological or study effects, and none explicitly quantified the sensitivity and cost of W-MGS for viral detection.
Added value of this study To our knowledge, this study provides the first quantitative assessment of the sensitivity and cost of untargeted and hybridization-capture W-MGS for pathogen surveillance. Linking a large corpus of public wastewater metagenomic sequencing with epidemiological data in a Bayesian model, we predict pathogen relative abundance in W-MGS data at set infection prevalence or incidence, and estimate concomitant read-depth and cost requirements for effective detection across different studies and viruses. Our flexible modelling framework provides a valuable tool for evaluation of sequencing-based surveillance in other contexts.
Implications of all the available evidence The sensitivity of untargeted W-MGS varies greatly with pathogen and study design, and large gaps in our understanding remain for pathogens not present in our data. As untargeted W-MGS protocols undergo further improvements, our Bayesian modelling framework is an effective tool for evaluating the sensitivity of new protocols under different epidemiological conditions. While less pathogen-agnostic, hybridization capture can dramatically increase the sensitivity of W-MGS-based pathogen monitoring, and our findings support piloting it as a tool for biosurveillance of known viruses.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
S.L.G., J.T.K., D.P.R., W.J.B., and M.R.M. were funded for this research project by gifts from Open Philanthropy (to SecureBio) and the Musk Foundation (to MIT). S.L.G. was additionally supported by a grant from the Swiss Scholarship Foundation. C.W. was supported by a Sir Henry Wellcome Postdoctoral Fellowship, reference 224190/Z/21/Z.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Manuscript revised for submission to Lancet Microbe. Supplement and analyses of hybridization-capture sequencing data added; figures and text revised.
Data Availability
All data produced are available online at https://github.com/naobservatory/mgs-workflow/tree/2.1.0 (metagenomic data analysis pipeline) and https://github.com/naobservatory/p2ra-manuscript (epidemiological analysis, statistical models, figure generation).