RT Journal Article
SR Electronic
T1 Applying Prospective Tree-Temporal Scan Statistics to Genomic Surveillance Data to Detect Emerging SARS-CoV-2 Variants and Salmonellosis Clusters in New York City
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.08.28.24312512
DO 10.1101/2024.08.28.24312512
A1 Greene, Sharon K.
A1 Latash, Julia
A1 Peterson, Eric R.
A1 Levin-Rector, Alison
A1 Luoma, Elizabeth
A1 Wang, Jade C.
A1 Bernard, Kevin
A1 Olsen, Aaron
A1 Li, Lan
A1 Waechter, HaeNa
A1 Mattias, Aria
A1 Rohrer, Rebecca
A1 Kulldorff, Martin
YR 2024
UL http://medrxiv.org/content/early/2024/08/29/2024.08.28.24312512.abstract
AB Genomic surveillance data are used to detect communicable disease clusters, typically by applying rule-based signaling criteria, which can be arbitrary. We applied the prospective tree-temporal scan statistic (TreeScan) to genomic data with a hierarchical nomenclature to search for recent case increases at any granularity, from large phylogenetic branches to small groups of indistinguishable isolates. Using COVID-19 and salmonellosis cases diagnosed among New York City (NYC) residents and reported to the NYC Health Department, we conducted weekly analyses to detect emerging SARS-CoV-2 variants based on Pango lineages and clusters of Salmonella isolates based on allele codes. The SARS-CoV-2 Omicron subvariant EG.5.1 first signaled as locally emerging on June 22, 2023, seven weeks before the World Health Organization designated it as a variant of interest. During one year of salmonellosis analyses, TreeScan detected fifteen credible clusters worth investigating for common exposures and two data quality issues for correction. A challenge was maintaining timely and specific lineage assignments, and a limitation was that genetic distances between tree nodes were not considered.
By automatically sifting through genomic data and generating ranked shortlists of nodes with statistically unusual recent case increases, TreeScan assisted in detecting emerging communicable disease clusters and in prioritizing them for investigation.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by the U.S. Centers for Disease Control and Prevention (NU90TP922035-05, NU50CK000517-01-09, NU50CK000517-05-00).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The Institutional Review Board of the NYC Health Department gave ethical approval for this work. The IRB determined this activity meets the definition of public health surveillance as set forth under 45 CFR 46.102(l)(2).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
Data, software, and code are available as follows:
(1) SARS-CoV-2 variant data for New York City residents: https://github.com/nychealth/coronavirus-data/tree/master/variants
(2) Allele codes for Salmonella isolates (available to CDC partners via SEDRIC): https://www.cdc.gov/foodborne-outbreaks/php/foodsafety/tools/
(3) SAS code for generating TreeScan input files: https://github.com/CityOfNewYork/communicable-disease-surveillance-nycdohmh
(4) TreeScan software for free download: https://www.treescan.org/
(5) TreeScan source code: https://github.com/scanstatistics/treescan
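
The abstract's idea of searching a hierarchical nomenclature "at any granularity" can be illustrated with a minimal sketch. It is not the authors' SAS code and not TreeScan itself: it only shows how dotted lineage labels imply a tree, and how leaf-level case counts roll up to every ancestor node so that each node could later be evaluated for a recent increase. The function names, the lineage labels, and the counts are hypothetical, and Pango aliases are assumed to have been expanded to their full dotted form beforehand.

```python
from collections import Counter


def ancestors(lineage: str) -> list[str]:
    """Return the lineage followed by each dotted-prefix ancestor, leaf first."""
    parts = lineage.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def roll_up(case_counts: dict[str, int]) -> Counter:
    """Add each lineage's case count to itself and to every ancestor node."""
    totals: Counter = Counter()
    for lineage, n in case_counts.items():
        for node in ancestors(lineage):
            totals[node] += n
    return totals


# Hypothetical case counts for the most recent week (illustrative only).
recent = {"XBB.1.9.2": 4, "XBB.1.16": 3, "XBB.1.9.2.5": 6}
totals = roll_up(recent)

# Print nodes from the root down, so branch-level and leaf-level counts sit together.
for node in sorted(totals, key=lambda s: (s.count("."), s)):
    print(node, totals[node])
```

A scan statistic would then compare each node's recent count with its expected count and rank nodes by how unusual the increase is; that step is deliberately left out here because it depends on the probability model and settings chosen in TreeScan.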