Abstract
Viral genomes contain records of geographic movements and cross-scale transmission dynamics. However, the impact of population heterogeneity, particularly between rural and urban areas, on viral spread and epidemic trajectory has been less explored due to limited data availability. Intensive and widespread efforts to collect and sequence SARS-CoV-2 viral samples have enabled the development of comparative genomic approaches to reconstruct spatial transmission history and understand viral transmission across different scales. Large genomic datasets with few mutations present challenges for traditional phylodynamic approaches. To address this issue, we propose a novel spatial transmission count statistic that efficiently summarizes the geographic transmission patterns imprinted on viral phylogenies. Our analysis pipeline reconstructs a time-scaled phylogeny with ancestral trait states and identifies spatial transmission linkages, categorized as imports, local transmission, and exports. These linkages are summarized to represent the epidemic profile of the focal area. We demonstrate the utility of this approach for near real-time outbreak analysis using over 12,000 full genomes and linked epidemiological data to investigate the spread of the SARS-CoV-2 Delta variant in Texas. Our goal is to trace the origin and timing of the Delta variant and to understand the role of urban and rural areas in the spatial diffusion patterns observed in Texas. Our study shows that (1) highly populated urban centers were the main sources of the epidemic in Texas; (2) the outbreaks in urban centers were connected to the global epidemic; and (3) outbreaks in urban centers were locally maintained, while epidemics in rural areas were driven by repeated introductions.
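As a rough illustration of the linkage categories named above, the following Python sketch counts imports, local transmission, and exports for a focal area from a list of phylogeny branches whose parent and child nodes already carry inferred location states. It is not the authors' pipeline: the function name `count_transmissions`, the representation of branches as `(parent_location, child_location)` pairs, and the location labels are all assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple


def count_transmissions(
    edges: Iterable[Tuple[str, str]], focal: str
) -> Dict[str, int]:
    """Classify each (parent_location, child_location) branch relative to `focal`.

    Hypothetical helper: assumes ancestral location states have already been
    assigned to every internal node of a time-scaled tree.
    """
    counts = {"import": 0, "local": 0, "export": 0}
    for parent_loc, child_loc in edges:
        if parent_loc == focal and child_loc == focal:
            counts["local"] += 1      # branch stays inside the focal area
        elif child_loc == focal:
            counts["import"] += 1     # lineage enters the focal area
        elif parent_loc == focal:
            counts["export"] += 1     # lineage leaves the focal area
        # branches that never touch the focal area are ignored
    return counts


# Toy branches with made-up location labels (not data from the study).
toy_edges = [
    ("Global", "Houston"),
    ("Houston", "Houston"),
    ("Houston", "Houston"),
    ("Houston", "Rural"),
    ("Global", "Rural"),
]
profile = count_transmissions(toy_edges, focal="Houston")
```

Under these assumptions, the three counts together give a simple epidemic profile for the focal area: a high share of local branches suggests a locally maintained outbreak, while a high share of imports suggests one driven by repeated introductions.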
Significance Statement
We developed a novel phylogeographic approach that analyzes transmission patterns at low computational cost. This method not only facilitates the inference of spatial scales of transmission but also enables exploration of how specific demographic characteristics influence transmission patterns among heterogeneous populations. The rural population in the US, comprising approximately 60 million individuals, has been significantly impacted by COVID-19. Applying our new method, we examined the variations in epidemic patterns between urban centers (e.g., Houston) and rural areas in Texas. We found that urban centers are the primary source of SARS-CoV-2 in rural areas. This analysis lays the groundwork for designing effective public health interventions specifically tailored to the needs of affected areas.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work has been funded in part by the National Institute of Allergy and Infectious Diseases, a component of the NIH, Department of Health and Human Services, under contract no. 75N93021C00018 (NIAID Centers of Excellence for Influenza Research and Response, CEIRR) and by the Centers for Disease Control and Prevention, Department of Health and Human Services, under contracts 75D30121C10133 and NU50CK000626. We acknowledge the GISAID contributors for sharing genomic data; an acknowledgment table of the genomes used is provided in our GitHub repository.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.