Abstract
The serovars of Salmonella enterica display dramatic differences in pathogenesis and host preferences. We developed a process (patent pending) for grouping Salmonella isolates and serovars by their public health risk to provide better Salmonella control targets along the food chain. We collated a curated set of 12,337 S. enterica isolate genomes from human, beef, and bovine sources in the US. After annotating a virulence gene catalog for each isolate, we used unsupervised random forest methods to estimate the proximity (similarity) between isolates based upon the genomic presentation of putative virulence traits. We then grouped isolates (virulence clusters) using hierarchical clustering (Ward’s method), used non-parametric bootstrapping to assess cluster stability, and externally validated the virulence clusters against epidemiological virulence measures from FoodNet, the National Outbreak Reporting System (NORS), and US federal sampling of beef products. We identified five stable virulence clusters of S. enterica serovars. Cluster 1 (higher virulence) serovars yielded an annual incidence rate of domestically acquired sporadic cases roughly one and a half times higher than the other four clusters combined (Clusters 2-5, lower virulence). Compared to other clusters, Cluster 1 also had a higher proportion of infections leading to hospitalization and was implicated in more foodborne and beef-associated outbreaks, despite being isolated at a similar frequency from beef products as other clusters. We also identified subpopulations within 11 serovars. Remarkably, we found S. Infantis and S. Typhimurium subpopulations that significantly differed in genome length and clinical case presentation. Further, we found that the presence of the pESI plasmid accounted for the genome length differences between the S. Infantis subpopulations. Our results demonstrate that S. enterica strains with the highest incidence of human infections share a common virulence repertoire.
This work could be used in combination with foodborne surveillance information to best target serovars of public health concern.
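The clustering steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a binary isolate-by-gene presence/absence matrix, uses Breiman's synthetic-data construction for the unsupervised random forest, converts proximity to a distance as sqrt(1 - proximity), and relies on scikit-learn and SciPy. The function names `rf_proximity` and `virulence_clusters` are hypothetical. The bootstrap stability assessment and the external validation are not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=200, seed=0):
    """Pairwise proximity from an unsupervised random forest (assumed setup)."""
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    # Synthetic class: permute each column independently. This keeps each
    # gene's marginal frequency but breaks co-occurrence between genes.
    X_synth = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X), dtype=int), np.zeros(len(X_synth), dtype=int)]
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)
    # Leaf index of every real isolate in every tree: (n_isolates, n_trees).
    leaves = forest.apply(X)
    # Proximity = fraction of trees in which two isolates share a leaf.
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)


def virulence_clusters(X, n_clusters, n_trees=200, seed=0):
    """Cut a Ward hierarchy built on RF-derived distances into n_clusters."""
    prox = rf_proximity(X, n_trees=n_trees, seed=seed)
    # Assumption: sqrt(1 - proximity) is used as the dissimilarity.
    dist = np.sqrt(1.0 - prox)
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

The all-pairs leaf comparison needs memory proportional to the number of isolates squared times the number of trees, so a collection of thousands of genomes would require computing the proximity matrix in chunks.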
Competing Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. EpiX Analytics LLC has received funds from the United States Department of Agriculture, the Beef Checkoff/National Cattlemen's Beef Association, and MatPrat Norway. F.Z. is a member of the USA National Advisory Committee on Microbiological Criteria for Foods (NACMCF). All authors are named inventors on a patent application for the methodology used to identify higher virulence pathogens.
Funding Statement
This research was partly funded by a grant from the Beef Checkoff administered by the Foundation for Meat and Poultry Research and Education (http://meatpoultryfoundation.org/) awarded to FZ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
We used NCBI's Pathogen Detection network (https://www.ncbi.nlm.nih.gov/pathogens/). We compiled S. enterica assemblies from four primary sources:
1. BioProject PRJNA242847 (FSIS HACCP samples, accessed 7/13/2021)
2. BioProject PRJNA292666 (FSIS NARMS isolates, accessed 7/13/2021)
3. BioProject PRJNA292661 (FDA NARMS isolates, accessed 8/25/2021)
4. Human clinical cases from BioProject PRJNA230403 (CDC PulseNet, accessed 9/13/2021)
The study also used publicly available data from the National Outbreak Reporting System (NORS) (https://www.cdc.gov/nors/index.html) and the CDC FoodNet (https://www.cdc.gov/foodnet/index.html).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Table 2 has been replaced with Figure 5 to increase clarity, and the text in the results and discussion section has been updated accordingly. Supplemental material has been reorganized for efficiency and transparency.
Data Availability
All data produced in the present study will be available in a public repository upon acceptance of this manuscript to a peer-reviewed journal.