ABSTRACT
The successful implementation of pathogen genomic surveillance demands rapid, low-cost genotyping solutions for tracking infections. Here we demonstrate the capacity of single nucleotide polymorphism (SNP) barcodes to generate practical information for malaria surveillance and control. The study was conducted in Papua New Guinea (PNG), a country with a wide range of malaria transmission intensities. A panel of 191 candidate SNPs was selected from 5786 SNPs with minor allele frequency greater than 0.1, identified amongst 91 Plasmodium falciparum genomes from three provinces of PNG. We then genotyped 772 P. falciparum isolates from a 2008 nationwide malaria indicator survey and 31 clinical infections from an outbreak of unknown origin. We assessed the performance of SNP panels with different allele frequency characteristics, and measured population diversity, structure and connectivity using both whole genome data and the SNP barcode. The full SNP barcode captured patterns of population structure similar to those evident with the 5786 ‘whole genome’ SNPs. Geographically informative SNPs (iSNPs, FST > 0.05) showed increased population clustering compared to the full barcode, whilst randomly selected SNPs (rSNPs) and SNPs with similar allele frequencies (FST < 0.05) amongst different countries (universal, uSNPs) or local PNG populations (balanced, bSNPs) indicated little clustering. Applied to samples from all endemic areas of PNG, this barcode identified variable transmission dynamics and eight major populations. Genetic diversity was high throughout most areas; however, in the southern region, isolates were either closely related, suggesting highly inbred or near-clonal populations, or shared ancestry with other parasite populations, consistent with importation. Applied to outbreak samples, only the full barcode, the iSNPs and the bSNPs distinguished between locally acquired and imported infections.
The full barcode contains more than 100 SNPs prevalent in other endemic regions, allowing the transfer of this tool to other settings. SNP barcodes must be validated in local settings to ensure they capture the diversity and population structure of the target population. Subsets of geographically informative SNPs will be essential for predicting geographic origins but may bias analyses of population structure and gene flow if used alone.
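The allele frequency criteria named in the abstract (minor allele frequency greater than 0.1, and an FST cut-off of 0.05 separating informative from balanced SNPs) can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which FST estimator was used, so the simple Nei-style estimator below, the 0/1 coding of one dominant allele per isolate, and all function names are assumptions made for illustration only.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-SNP minor allele frequency.

    genotypes: float array (n_isolates, n_snps), alleles coded 0/1,
    np.nan for missing calls (assumed coding, one allele per isolate).
    """
    alt_freq = np.nanmean(genotypes, axis=0)
    return np.minimum(alt_freq, 1.0 - alt_freq)


def fst_per_snp(genotypes, populations):
    """Nei-style FST = (HT - HS) / HT per SNP, populations weighted equally.

    This estimator is an assumption; the source does not specify one.
    """
    labels = np.unique(populations)
    # Alternate-allele frequency of each SNP within each population.
    freqs = np.array(
        [np.nanmean(genotypes[populations == lab], axis=0) for lab in labels]
    )
    p_bar = freqs.mean(axis=0)
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * freqs * (1.0 - freqs)).mean(axis=0)
    # Monomorphic SNPs have HT = 0; report FST as 0 for them.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(h_total > 0, (h_total - h_within) / h_total, 0.0)


def classify_snps(genotypes, populations, maf_min=0.1, fst_cut=0.05):
    """Return boolean masks: candidate, informative and balanced SNPs."""
    maf = minor_allele_frequency(genotypes)
    fst = fst_per_snp(genotypes, populations)
    candidate = maf > maf_min
    informative = candidate & (fst > fst_cut)
    balanced = candidate & (fst < fst_cut)
    return candidate, informative, balanced


# Toy data: 8 isolates from two hypothetical populations, 3 SNPs.
toy_populations = np.array(["A"] * 4 + ["B"] * 4)
toy_genotypes = np.array(
    [
        [1, 1, 0],
        [1, 1, 0],
        [1, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
        [0, 0, 0],
    ],
    dtype=float,
)
candidate, informative, balanced = classify_snps(toy_genotypes, toy_populations)
print(candidate, informative, balanced)
```

In the toy data, the first SNP is fixed for different alleles in the two populations and is flagged as informative, the second has identical frequencies in both and is flagged as balanced, and the third is monomorphic and fails the frequency filter. A real analysis would more likely rely on an established population genetics library and an estimator that corrects for sample size.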
AUTHOR SUMMARY
Pathogen genomic surveillance is a sensitive approach for mapping pathogen transmission dynamics to support decisions about how to prevent and control infectious diseases. High throughput genotyping tools known as single nucleotide polymorphism (SNP) barcodes are used to measure relationships between individuals and connectivity between populations; however, the barcode design may influence these results. We used whole genome sequences from the malaria parasite Plasmodium falciparum to design a barcode that captures the diversity both within and between parasite populations of Papua New Guinea (PNG), where transmission is variable amongst provinces. By investigating the performance of different panels of SNPs, we show that validation for use in the target population is crucial to correctly identifying population genetic structure. Applying the validated SNP barcode to hundreds of samples from all endemic provinces of PNG, we observed high levels of variability in local transmission dynamics and regions of population subdivision. Some geographic areas show evidence of interrupted transmission, and the substantial genetic differentiation between the northern, eastern and island populations presents an opportunity to design targeted, subnational control efforts. Application of the barcode to outbreak samples classified cases into imported and locally acquired infections, with substantial local transmission indicating that control efforts were not sufficient to prevent the spread of infections. SNP barcodes are useful tools that can supplement existing malaria surveillance tools; however, careful validation of their effectiveness in different settings is recommended.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Sample collections, DNA extractions and molecular diagnosis were funded by the Global Fund to Fight AIDS, Tuberculosis and Malaria (GFATM). Genetic and genomic data collection for this study was funded through a National Health and Medical Research Council (NHMRC) of Australia Project Grant (GNT1027108). IM and MB are supported by NHMRC Research Fellowships (GNT1155075, GNT1102971). The authors acknowledge the Victorian State Government Operational Infrastructure Support and Australian Government NHMRC Independent Research Institute Infrastructure Support Scheme (IRIISS).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics and Informed Consent. All samples were collected as part of ongoing studies in PNG investigating the impact of intensified malaria control efforts since 2004. All samples were collected with written informed consent from individuals or, for children, from their parents or guardians. Ethical approval for the study was obtained from the Papua New Guinea Institute of Medical Research Institutional Review Board (IRB 11/21, 12/29), the PNG Medical Research Advisory Council (MRAC 12/03, 13/08), the Walter and Eliza Hall Institute Human Research Ethics Committee (HREC 12/06, 13/14) and the Deakin Human Research Ethics Committee (2020-282, 2020-283).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
† deceased 27 April 2023
Data Availability
All raw whole genome sequence data are available from the European Nucleotide Archive under the accession numbers indicated in Table S1. The final SNP barcode dataset for 727 isolates and for 32 outbreak samples will be made available in a public repository upon publication and, in the interim, will be available by contacting the corresponding author.