Abstract
Wastewater-based epidemiology has emerged as a critical tool for public health surveillance, building on decades of environmental surveillance work for pathogens such as poliovirus. Work to date has been limited to monitoring a single pathogen or small numbers of pathogens in targeted studies; however, simultaneous analysis of a wide variety of pathogens would greatly increase the utility of wastewater surveillance. We developed a novel quantitative multi-pathogen surveillance approach (33 pathogen targets including bacteria, viruses, protozoa, and helminths) using TaqMan Array Cards (RT-qPCR) and applied the method on concentrated wastewater samples collected at four wastewater treatment plants in Atlanta, GA from February to October of 2020. From sewersheds serving approximately 2 million people, we detected a wide range of targets including many we expected to find in wastewater (e.g., enterotoxigenic E. coli and Giardia in 97% of 29 samples at stable concentrations) as well as unexpected targets including Strongyloides stercolaris (i.e., human threadworm, a neglected tropical disease rarely observed in clinical settings in the USA). Other notable detections included SARS-CoV-2, but also several pathogen targets that are not commonly included in wastewater surveillance like Acanthamoeba spp., Balantidium coli, Entamoeba histolytica, astrovirus, norovirus, and sapovirus. Our data suggest broad utility in expanding the scope of enteric pathogen surveillance in wastewaters, with potential for application in a variety of settings where pathogen quantification in fecal waste streams can inform public health surveillance and selection of control measures to limit infections.
Introduction
Wastewater-based epidemiology (WBE) incorporates a range of tools intended to complement traditional public health surveillance, optimally providing timely and actionable data on pathogens circulating in populations of interest. Historically, wastewater monitoring has been used as a surveillance tool for individual pathogens including poliovirus[1,2], hepatitis A[3], Vibrio cholerae[4], Salmonella enterica serotype Typhi [5] as well as for chemical analytes (e.g., drug use) [6]. This strategy has gained global prominence in detection and quantification of SARS-CoV-2 RNA in wastewater[7–9], specifically focusing on community prevalence[7][10][11], apparent trends in infections over time and space[12], and emerging variants[13,14]. Advantages and limitations of wastewater as a surveillance matrix have been widely discussed since 2020[15–17].
Expanding wastewater monitoring efforts to include and screen multiple pathogens or variants could maximize the potential of using wastewater as a valuable tool. A tool to better understand the possibility of emerging pathogens or circulating strains in a particular population. The need to expand wastewater monitoring to screen multiple pathogens or variants is a valuable tool to better understand the possibility of emerging pathogens or circulating strains in a particular population. A tool for the simultaneous detection of a broad range of pathogens quickly and accurately could be used for screening followed by confirmatory testing with more sensitive methods. Emerging and re-emerging infectious diseases[18] – including those with pandemic potential[19] – represent ongoing risks to society, and wastewater surveillance can fill critical gaps in data to inform public health responses[20].
Based on the demonstrated potential for WBE to complement traditional diagnostic public health surveillance for a diverse array of pathogens, we developed a customized multi-parallel molecular surveillance tool for simultaneous detection and quantification of 33 common pathogenic bacteria, viruses, protozoa, and helminths in wastewater. Such approaches can expand the existing WBE platform by screening for many more pathogens – including rare or emerging microbes of interest – enhancing monitoring to inform public health response. We demonstrate the utility of this method in an analysis of primary influent samples from four wastewater treatment plants in metro Atlanta, Georgia, USA.
Materials & Methods
Sample Collection
We collected one-liter primary influent grab samples (n=30) in HDPE plastic bottles from four wastewater treatment plants (anonymized as WWTP A, B, C, D) in Atlanta, GA between March 20th, 2020 - November 5th, 2020 between 9:30 AM—11 AM. Flow values from the WWTPs ranged from 14 – 80.2 million gallons per day. All samples were transferred to the laboratory on ice and stored at −80°C until further processing was completed. Initial sample processing began on November 8th, 2021. Frozen samples were thawed in a 5L bucket of water located in a 4°C walk-in fridge for up to 3 days or until thawed. Samples were then recorded for temperature and pH, and a 50 mL aliquot was taken for total suspended solids measurements (Table S1). Each 1L sample was spiked with 10 µL of Calf-Guard (Zoetis) resuspended vaccine, containing attenuated bovine coronavirus (BCoV), and 10 µL of MS2 (105/µL), which served as the process recovery controls. A 1:100 ratio of 5% Tween 20 solution was added to the sample bottle. A graduated 1L bottle was used as a reference for the total volume in each sample bottle. Samples were mixed by inverting the bottle 3-4 times. A subset of samples (n=4) were processed using three different methods to establish a reasonable workflow for the larger scale demonstration: (1) direct extraction, (2) InnovaPrep Concentrating Pipette (CP) Select, and (3) skim milk flocculation (SMF).
Sample Processing
Direct Extraction
We directly extracted 200 µL of wastewater influent into the DNeasy PowerSoil Pro Manual extraction kit (Qiagen, Hilden, Germay).
InnovaPrep Concentrating Pipette
150 mL from the wastewater influent sample was transferred into a 500 mL conical centrifuge tube. Samples were centrifuged for 20 minutes at 4800 x g. The 500 mL conical tube was placed under the CP Select, and the fluidics head lowered into the sample. The sample was filtered using a 0.05 µm unirradiated hollow-fiber CP tip and eluted using the InnovaPrep FluidPrep Tris elution canister. Processing times and eluted volumes were recorded. For each day samples were run, one negative control consisting of 100 mL of DI water was also filtered and processed.
Skim Milk Flocculation
With the remaining wastewater sample, we proceeded to use the SMF method[22]. We combined 1 mL of a 5% skimmed milk solution per 100 mL of wastewater sample and adjusted the pH of the skimmed-milk-wastewater solution between 3.0 – 4.0 using 1M HCl. Samples were placed on a shaker plate at room temperature (20-25°C) at 200 RPM for two hours. After shaking, samples were centrifuged at 3500 x g at 4°C for 30 minutes. The supernatant was discarded and the pellet was archived at −80°C until batch extractions were complete within one week.
A subset of 4 samples were directly extracted and the TAC results from CP, SMF, and the direct extractions were compared to determine an optimal concentration method prior to full scale downstream processing. Additional details can be found in Table S2. In the methods trial, SMF resulted in greater number of pathogen detections and was therefore used for the subsequent full-scale analyses. In the SMF workflow, skim milk pellets were processed for RNA using the Qiagen DNeasy PowerSoil Pro manual extraction kit. One extraction blank was run using nuclease-free water for each batch of sample extractions. Extracts were placed in the −80°C freezer until RT-qPCR or dPCR processing followed within one week. Skim milk pellets were run on TAC with 7% in duplicate. All CP eluants were extracted for RNA using Qiagen AllPrep PowerViral manual kits following manufacturer instructions to be further processed using digital PCR (dPCR). CP and dPCR were used for process controls and fecal indicators in the full-scale analyses.
Molecular Analysis
Two PCR platforms were used to process extracts, the first was an RT-qPCR QuantStudio (QS) 7 Flex (ThermoFisher Scientific, Waltham, MA) and the second a dPCR QIAcuity Four (Qiagen, Hilden, Germany). All skim milk pellets were analyzed using the QS7 Flex. The QS7 works in conjunction with a custom TaqMan Array Card (TAC), which is prespecified with lyophilized primers and probes for 33 enteric pathogen targets (see Table S3). Cq values < 40 were considered positive for the target and confirmed through clear amplification signals in the amplification and multicomponent plots. Additional MIQE details are found in Table S12. All CP eluant samples were analyzed using the digital PCR QIAcuity Four platform (Qiagen, Hilden, Germany). On the dPCR platform previously designed and optimized multiplex assays were used for bovine coronavirus (BCoV), pepper mild mottle virus (PMMoV), and human mitochondrial DNA (mtDNA)[23] (see Table S4 and Text S1). Gene copy concentration results for PMMoV and mtDNA were used as normalization markers for the TAC pathogen data.
Data Analysis
All project data are available here: https://osf.io/rg36f/. When multiple gene targets for a single microbial taxon was detected, we used the highest concentration gene target to calculate summary statistics and supported figures. We used R Studio version 4.2.1 to complete all data cleaning, analyses and generate graphs. Effective volumes (EV) were calculated using the following equation: The 95% LODs were calculated for each assay using probit models[24]. We translated these 95% analytical LODs (aLODs) into a 95% matrix LOD (mLOD) using the following equation and the previously calculated effective volumes for SMF:
Results
TAC results were generated using skim milk pellets extracted by the PowerSoil Pro Manual kit to process the influent samples. The average SMF pellet was 2.2 mL and the average wastewater influent processed for SMF was 688 mL. Supplemental data on any other method performed (direct extraction or InnovaPrep CP pellet) is provided in Table S2 and Figure S2.
Enteric Pathogen Measurement by Skim Milk Flocculation
The log10-transformed gene copy concentrations by pathogen class and specific enteric pathogen (Figure 1) demonstrates the wide range of pathogens detected in Atlanta wastewater influent (n=30). Enteric bacteria, specifically enterotoxigenic E. coli (ETEC), were detected most frequently and at higher gene copy concentrations compared to helminths and viruses. Notable protozoan detections were Acanthamoeba spp. (28/30), Balantidium coli (29/30), Entamoeba spp. (29/30), and Giardia spp. (29/30). While virus detections were relatively lower than protozoan detections, astrovirus (26/30), norovirus GI/GII (28/30), and sapovirus (7/30) were detected in the processed samples. Additional comparison of prevalence of pathogens by wastewater treatment plant are detailed in Table 2 with Plant C representing the most samples processed (n=21). Figure S3 demonstrates the log10 gene copies per liter of wastewater influent stratified by gene targets. Interestingly, with the CP samples we detected Strongyloides stercolaris in one wastewater sample (Figure S2 and Table S5).
Of the SMF samples, the bacterial targets of highest concentration were ETEC and enteropathogenic E. coli (EPEC - atypical), whereas viral targets were mainly astrovirus and norovirus GI/GII. Somewhat unexpected protozoan targets detected were Cyclospora cayetanensi (3/30) and Entamoeba histolytica (6/30). Both Cryptosporidium spp. and Giardia spp. were detected at means of 5.0 log10 and 6.5 log10, respectively. Of the total samples, we detected SARS-CoV-2 RNA in 50% of samples (n=15) at concentrations between 3.0 log10—6.0 log10 gene copies per liter of wastewater influent.
dPCR for Concentrating Pipette and normalization markers
A total of n=30 CP samples were processed for PMMoV, mtDNA, and BCoV. Figure 2 demonstrates the log10 gene copies per liter of wastewater influent and indicates PMMoV concentrations exceed mtDNA concentrations. The average concentrations for BCoV dPCR reactions was 43.3 gc/μL, PMMoV was 1602 gc/μL, and mtDNA was 4.33 gc/μL. The average concentrations of log10 gene copies/liter per reaction of wastewater was 5.2 x 104 for mtDNA and 1.9 x 107 for PMMoV. All positive controls and non-template controls performed without suspicion and additional details on control performance is included in Text S2 and in the dMIQE checklist (Table S11).
Pathogen concentrations normalized by mtDNA and PMMoV
Quantitative log10 gene copies per liter of wastewater influent before (Table S6) and after normalization (Tables S7-8), with mtDNA normalization resulting in overall higher log10 ratios. Between Figures 3 and 4, we note a considerably smaller ratio when using PMMoV normalization over mtDNA. These concentrations are caused by increased PMMoV concentrations in wastewater influent compared to mtDNA concentrations.
TAC Performance Interpretation
Standard Curves
The standard curves for this custom TAC included two assays (Adenovirus 40/41 and Hepatitis A) with poor standard curve performances and therefore were excluded from all analyses. Of the remaining 40 enteric targets, the DNA control was phocine herpes virus and RNA control was MS2. For performance metrics (Table S9), reasonable linearity was detected for all included assays with an average R2 value of 0.997 across all assays with the lowest R2 of 0.967 for STEC (stx2) and the highest R2 of 1 for Acanthamoeba spp., Balantidium coli, E. coli O157:H7, Giardia spp., Plesiomonas shigelloides, Salmonella spp., and STEC (stx1). The lowest efficiency assay was Astrovirus at 87% while the highest was Entamoeba spp. at 104%.
Effective Volume
The effective volume, which does not account for recovery efficiency, is calculated as the proportion of original wastewater sample assayed in a single qPCR reaction. The effective analyzed wastewater volume for InnovaPrep CP was 0.155 mL (SD 0.0605) per reaction and SMF was 0.410 mL of wastewater per reaction (SD 0.121).
Limit of Detection and matrix LOD
The 95% analytical limit of detection (aLOD) was calculated for each assay in Table S9, reported as gene copies per reaction. The lowest detectable target as Cryptosporidium spp. at 0.6 gene copies per reaction and the highest as 291 gene copies per reaction for ETEC (LT), followed by 96 gene copies per reaction for STEC (stx2).
A comprehensive mLOD table for each assay indicates the gene copy per mL of sewage is found in Table S10 and includes the minimum, maximum, mean, standard deviations, standard error, and confidence intervals. These results indicate average gene copies per mL of wastewater influent as low as 1.591 for Cryptosporidium spp. and 16S marker or as high as 264.668 gc/mL for ETEC (LT & ST). SARS-CoV-2 mLOD was 16.44 gc/mL influent.
Inhibition
We used MS2 as the extraction control and the average Cq for negative extraction controls (n=7) was 17.8 gene copies per reaction [confidence interval 0.821], whereas all SMP samples (n=30) had an average Cq of 19.3 gene copies per reaction [CI 2.04]. With a Cq difference of 1.5, we can reasonably conclude inhibition was not a major issue with our sample matrix since samples and controls had Cq difference less than 2.
Discussion
Wastewater surveillance sampling, processing, storage, and analysis methods have advanced rapidly since the emergence of SARS-CoV-2. Most studies have examined primary influent[25,26] and solids[27,28]. Sampling methods have also varied from grab, composite, and more recently passive techniques[29]. In addition to testing different matrices, many laboratories have implemented various methods to concentrate SARS-CoV-2 in wastewater using ultracentrifugation, polyethylene glycol precipitation, electronegative membrane filtration, and ultrafiltration[22,30], but few have considered a concentration step followed by a simultaneous, multi-parallel quantitative assay or multiple pathogen detection assays. The possibility of high-plex, high throughput platforms are of particular interest to stakeholders looking to expand wastewater monitoring nationally in the US and abroad. For example, the CDC has expanded upon the previously single-plex N1 assay for SARS-CoV-2 to include influenza A and/or B for increased testing capacity.[31]
TAC vs qPCR comparisons
We compared our traditional metrics such as R2 trends of standard curves and found that our TAC results are within a reasonable R2 range for almost all assays (R2>0.96), except for two explicitly excluded due to poor standard curve performance. Our 95% LODs calculated also indicate a broad range of analytical sensitivities across all pathogen targets. With the lowest detections at 0.6 gene copies per reaction, we also have targets on the higher end of 291 gene copies per reaction for ETEC. While other studies indicate a loss of sensitivity when using TAC, there was still an 89% detection rate compared to single-plex assays run.[32]
Prevalence of bacteria, protozoan, and viral targets
Our qPCR data indicated 104-106 gene copies per liter for SARS-CoV-2 prior to normalization efforts, which is comparable to other studies [33]. Researchers had previously detected Giardia duodenale., Cryptosporidium spp., and Enterocytozoon bieneusi at 82.6%, 56.2% and 87.6%, in combined sewer overflows (CSO) around China.[34] These molecular surveillance findings were also similar to ours at 97% (n=29/30) for Giardia spp., not specifically Giardia duodenale, and 27% (n=8/30) for Cryptosporidium spp., and 53% (n=16/30) for E. bieneusi. While we detected Strongyloides stercolaris, we cannot interpret viability concerns; however, since the preferred host is humans it is possible this detection was a real and rare example of detecting the unexpected. Recent evidence does suggest that Strongyloides stercolaris antibodies remain endemic in some rural communities.[35] For the CP samples, we also had a rare detection of Helicobacter pylori (n=1/28) and Hymelopis nana (n=3/28) from the same WWTP.
Groundwater and runoff can intrude into wastewater collection systems through inflow and infiltration (I&I), which may be relevant for fungi and a possibility for other microbial species to mix with wastewater flows.[36] Other potential explanations of sources into wastewater may include animal waste, commercial and/or industrial waste. These influent flows and their sources are difficult to determine, but routine surveillance – including with the addition of source-tracking – may provide additional insight into influent pathogens, their possible origins, and their utility in understanding infection transmission and control in the sewershed.
Value of multiple detections on TAC
Multi-parallel detection of pathogens of interest using TAC can be helpful in long-term surveillance or monitoring of pathogens, including in rapid screening programs or where numerous pathogens may be of interest. Apart from known, emerging, or suspected pathogens, antimicrobial resistance genes or other PCR-detectable targets of public health relevance can be included in TAC design. One key premise of wastewater-based epidemiology and monitoring is the potential value of using the method as an early detection for the onset of a potential outbreak, yet most detection methods have a needle in a haystack approach versus a wider screening that could be especially applicable to state health departments or in routine monitoring.
The customizability of TAC has proven useful in other applications such as surveillance of respiratory illness[37][38], acute-febrile illness for outbreak or surveillance purposes[39], and to improve etiological detection of difficult neonatal infectious diseases for low-resource clinical settings.[40] Some studies have focused on applications of combining nucleic acid detection with quantitative microbial risk assessments[41], but none have considered such a broad set of applications to wastewater monitoring and surveillance, although some have applied these methods qualitatively on fecal sludge samples[21,42]. It is possible to create a multiplex assay for digital PCR, the leading technology for wastewater monitoring, for up to five different genes, but no other platform provides quantitative data on up to 48 gene targets during a single experimental run.
TAC methods can fill a critical gap in existing molecular monitoring tools. As a method yielding quantitative estimates of potentially dozens of targets, it offers complementary advantages over emerging digital PCR platforms (greater sensitivity and lower limits of quantification, but fewer targets) and sequencing methods (many more targets, but high limits of detection and generally not quantitative). TAC should be considered where targets are present in high numbers – like in wastewaters and fecal sludges – and where many pathogens are of interest.
The application of improved methods for the detection and quantification of enteric pathogens in wastewater, in addition to other enteric pathogens of interest, can then be translated into relevant intervention efforts. As SARS-CoV-2 surveillance in wastewater reaches scale [7,25,43], detection and quantification of other pathogens has been proposed. Researchers have expanded on wastewater monitoring to focus increased surveillance on other respiratory viruses such as human influenza and rhinovirus[44], norovirus[45], or as an outbreak detection tool for influenza,[46] and are also considering other emerging infections such as monkeypox.[47]
Value of sensitivity of dPCR
The current and suggested methodology to process wastewater samples using a molecular platform is digital PCR due to its low limit of detection and quantification. While these efforts make sense to consider when focused on one particular pathogen, it is not as feasible and consumes several resources if considering a truly practical monitoring system for wastewater. Time, technical staff labor and resources are always a challenge for laboratories and especially public health laboratories that have been tasked with monitoring wastewater for SARS-CoV-2. We can expect enteric targets to be present in wastewater, but to further identify which enteric pathogens are present and their concentrations with respect to each other would be a useful application towards building a wastewater monitoring system.
While SARS-CoV-2 was detected through TAC, we were also interested in detecting additionally relevant targets, including BCoV, PMMoV, and mtDNA, which were not previously included on the TAC. The normalization of pathogen concentrations using mtDNA consistently lowered concentrations across samples and may be useful as a normalization variable instead of or in addition to PMMoV. While PMMoV has been widely used for normalization of wastewater data[48,49], we found the normalization efforts did not drastically reduce the noise-to-signal ratio. While several studies have used PMMoV as a normalization marker for SARS-CoV-2[12,50][51], fewer studies have considered human mitochondrial DNA markers and those who have found the marker to have strong correlations to clinical case counts[52]. Additional studies have also considered the use of crAssphage[12][49], HF183[53][54], and Bacteroides ribosomal RNA (rRNA) and human 18S rRNA as other normalization markers to explore using for wastewater fecal concentration data[12].
Shedding rates/Limitations
Wastewater sample recovery for SARS-CoV-2 has been successful when using fresh samples, but for many WWTP and their partners it may be unrealistic to complete same-day processing for logistical reasons[55]. This work demonstrated the recovery of pathogen targets using archived grab samples, which makes this approach open to a broader range of applications such as retrospective analyses where clinical data is available or can be linked to these environmental surveillance results. However, more research is needed to understand which recovery methods work best and can be performed efficiently for archived samples.
A major limitation to understanding this work is limited data on fecal shedding rates and their incorporation into predictive models. Researchers have gained interest in calculating community-specific or dorm-specific fecal shedding rates specifically for SARS-CoV-2[56][57], but there was no specific information on the fecal shedding rates for this particular population to consider a modeling approach to relate pathogen concentration and clinical case data for asymptomatic individuals. Additionally, sewersheds of different sizes may have specific challenges in determining accurate shedding rates. Robust data on enteric shedding rates is not widely available for high-income countries, but efforts to estimate these variables and their uncertainties have been attempted.[58]
Another major limitation with TAC methods is the limit of gene target detections one can consider. With the option of detecting many pathogens comes with a need for determining the most relevant genes of interest. While TAC can run up to 48 unique targets, the total amount of template that enters each individual well is ∼ 0.6 μL. This low template volume, compared to a 2-5 μL of template included in other molecular assays can affect the overall limits of detection for this platform. While singleplex assays may have lower limits of detection, the likelihood of optimizing a multiplex for up to 46 or more agents is unrealistic; therefore, giving TAC a considerable advantage as a high parallel, multiple detection platform.[32] Additionally, these targets and the QA/QC involved require dedicated time and effort to include relevant targets that may change based on future applications. The need for additional replicates run to produce robust analytical limits of quantification are encouraged for future work. Using this multiple pathogen detection tool does not account for variant changes and may not be suitable for all applications. Our findings indicate TAC offers a multi-parallel platform for screening wastewater for a diverse array of enteric pathogens of interest to public health with strong potential for screening other targets of interest including respiratory viruses and antibiotic resistance genes.
Data Availability
All data from this manuscript have been posted in a publicly-accessible database and are available at this link: https://osf.io/rg36f/
Acknowledgements
The authors would like to acknowledge the wastewater treatment plants that permitted sample collection and the National Science Foundation (NSF 2027752-Collaborative Research-RAPID: Wastewater Informed Epidemiological Monitoring of SARS-CoV-2) for financial support in completing this work. DC was supported in part by T32ES007018, Biostatistics for Research in Environmental Health, funded by the National Institute of Environmental Health Sciences to the University of North Carolina at Chapel Hill. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Specific author contributions were: original draft and data analysis (GR, DC); conceptualization (JB, AB); sample collection and processing (KZ, AK, YL, RC, AL, JK, GR); reviewing and editing (all authors).