Abstract
Background Viral genomes contain records of geographic movements and cross-scale transmission dynamics. However, the impact of regional heterogeneity, particularly among rural and urban centers, on viral spread and epidemic trajectory has been less explored due to limited data availability. Intensive and widespread efforts to collect and sequence SARS-CoV-2 viral samples have enabled the development of comparative genomic approaches to reconstruct spatial transmission history and understand viral transmission across different scales.
Methods We proposed a novel spatial transmission count statistic that efficiently summarizes the geographic transmission patterns imprinted in viral phylogenies. Guided by a time-scaled tree with ancestral trait states, we identified spatial transmission linkages and categorize them as imports, local transmissions, and exports. These linkages were then summarized to represent the epidemic profile of the focal area.
Results We demonstrated the utility of this approach for near real-time outbreak analysis using over 12,000 full genomes and linked epidemiological data to investigate the spread of the SARS-CoV-2 in Texas. Our study showed (1) highly populated urban centers were the main sources of the epidemic in Texas; (2) the outbreaks in urban centers were connected to the global epidemic; and (3) outbreaks in urban centers were locally maintained, while epidemics in rural areas were driven by repeated introductions.
Conclusions In this study, we introduce the Source Sink Score, which allows us to determine whether a localized outbreak may be the source or sink to other regions, and the Local Import Score, which assesses whether the outbreak has transitioned to local transmission rather than being maintained by continued introductions. These epidemiological statistics provide actionable information for developing public health interventions tailored to the needs of affected areas.
Plain Language Summary This study examined how COVID-19 spread through urban and rural areas in Texas by analyzing the virus’s genomes from over 12,000 samples. Our goal was to understand how the virus travels and impacts different regions. Our findings revealed that densely populated urban centers were the primary sources of the virus in Texas. In contrast, outbreaks in rural areas were often fueled by new introductions of the virus from external sources. To conduct this analysis, we employed new computational methods that track where the virus originates and where it spreads. These methods provide detailed information crucial for public health officials, particularly in regions where the virus has more severe impacts or exhibits unique spread patterns.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work has been funded in part from the National Institute of Allergy and Infectious Diseases, a component of the NIH, Department of Health and Human Services, under contract no. 75N93021C00018 (NIAID Centers of Excellence for Influenza Research and Response, CEIRR) and Centers for Disease Control and Prevention, Department of Health and Human Services, under contracts 75D30121C10133 and NU50CK000626. We acknowledge the GISAID contributors (acknowledgment table of genomes used is provided on our GitHub repository) for sharing genomic data.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
The primary updates to the manuscript include: (1) reformatting the paper to meet the guidelines required by Communications Medicine, (2) incorporating the sensitivity analysis, along with corresponding methods and results, and (3) expanding the discussion section to provide a more in-depth exploration of our findings and their implications.
Data Availability
All data produced in the present study are available upon reasonable request to the authors