Abstract
The Global Typhoid Genomics Consortium was established to bring together the typhoid research community to aggregate and analyse Salmonella enterica serovar Typhi (Typhi) genomic data to inform public health action. This analysis, which marks twenty-one years since the publication of the first Typhi genome, represents the largest Typhi genome sequence collection to date (n=13,000), and provides a detailed overview of global genotype and antimicrobial resistance (AMR) distribution and temporal trends, generated using open analysis platforms (GenoTyphi and Pathogenwatch). Compared with previous global snapshots, the data highlight that genotype 4.3.1 (H58) has not spread beyond Asia and Eastern/Southern Africa; in other regions, distinct genotypes dominate and have independently evolved AMR. Data gaps remain in many parts of the world, and we show the potential of travel-associated data to provide informal “sentinel” surveillance for such locations. The data indicate ciprofloxacin non-susceptibility (≥1 resistance determinant) is widespread across geographies and genotypes, with high-level resistance (≥3 determinants) reaching 20% prevalence in South Asia. Extensively drug-resistant (XDR) typhoid has become dominant in Pakistan (70% in 2020), but has not yet become established elsewhere. Ceftriaxone resistance has emerged in eight non-XDR genotypes, including a ciprofloxacin-resistant lineage (4.3.1.2.1) in India. Azithromycin resistance mutations were detected at low prevalence in South Asia, including in two common ciprofloxacin-resistant genotypes. The Consortium’s aim is to encourage continued data sharing and collaboration to monitor the emergence and global spread of AMR Typhi, and to inform decision-making around the introduction of typhoid conjugate vaccines (TCVs) and other prevention and control strategies.
Introduction
Salmonella enterica serovar Typhi (Typhi) causes typhoid fever, a predominantly acute bloodstream infection associated with fever, headache, malaise, and other constitutional symptoms. If not treated appropriately, typhoid fever can be fatal; mortality rates are estimated <1% today, but in the pre-antibiotic era ranged from 10-20% (Andrews et al., 2018; Stuart and Pullen, 1946). Historically, the disease was responsible for large-scale epidemics, triggered by the unsanitary conditions created during rapid urbanisation. Typhoid fever has since been largely controlled in many parts of the world due to large-scale improvements in water, sanitation, and hygiene (WASH) (Cutler and Miller, 2005), but was still responsible for an estimated 10.9 million illnesses and 116,800 deaths worldwide in 2017, largely in parts of the world where WASH is suboptimal (GBD 2017 Typhoid and Paratyphoid Collaborators, 2019). Antimicrobial therapy has been the mainstay of typhoid control, but multidrug resistance (MDR, defined as combined resistance to ampicillin, chloramphenicol and co-trimoxazole) emerged in the 1970s, and resistance to newer drugs including fluoroquinolones, third-generation cephalosporins, and azithromycin has been accumulating over the last few decades (Marchello et al., 2020).
In 2001, the first completed whole genome sequence of Typhi was published (Parkhill et al., 2001). The sequenced isolate was CT18, an MDR isolate cultured from a typhoid fever patient in the Mekong Delta region of Vietnam in 1993. The genome was the result of two years of work piecing together plasmid-cloned paired-end sequence reads generated by Sanger capillary sequencing. Together with other early bacterial pathogen genomes, including a second Typhi genome (Ty2) published two years later in 2003 (Deng et al., 2003), the CT18 genome was heralded as a major turning point in the potential for disease control, treatment, and diagnostics, providing new tools for epidemiology, molecular microbiology and bioinformatics. It formed the basis for new insights into comparative and functional genomics (Boyd et al., 2003; Faucher et al., 2006), and facilitated early genotyping efforts (Baker et al., 2008; Roumagnac et al., 2006). When high-throughput sequencing technologies such as 454 and Solexa (subsequently Illumina) emerged, Typhi was an obvious first target for in-depth characterisation of a single pathogen population (Holt et al., 2008), and genomics has been increasingly exploited to describe the true population structure and global expansion of this highly clonal pathogen (Wong et al., 2015). Now, whole genome sequencing (WGS) is becoming a more routine component of typhoid surveillance. Salmonellae were among the first pathogens to transition to routine sequencing by public health laboratories in high-income countries (Chattaway et al., 2019; Stevens et al., 2022), and these systems often capture Typhi isolated from travel-associated typhoid infections, providing an informal mechanism for sentinel genomic surveillance of pathogen populations in typhoid endemic countries (Ingle et al., 2019). 
More recently, WGS has been adopted for typhoid surveillance by national reference laboratories in endemic countries including the Philippines (Lagrada et al., 2022), Nigeria (Okeke et al., 2022) and South Africa, and PulseNet International is gradually transitioning to WGS (Davedow et al., 2022; Nadon et al., 2017). Following the first global genomic snapshot study, which included nearly 2000 genomes of Typhi isolated from numerous typhoid prevalence and incidence studies conducted across Asia and Africa (Wong et al., 2015), WGS has become the standard tool for characterising clinical isolates. Given the very high concordance between antimicrobial susceptibility to clinically relevant drugs and known genetic determinants of antimicrobial resistance (AMR) in Typhi (Argimon et al., 2021; Chattaway et al., 2021; da Silva et al., 2022), WGS is also increasingly used to infer resistance patterns.
The adoption of WGS for surveillance relies on the definition of a genetic framework with linked standardised nomenclature, often supplied by multilocus sequence typing (MLST) and core genome multilocus sequence typing (cgMLST) for clonal pathogens. Typhi evolves on the order of 0.5 substitutions per year, much more slowly than host-generalist Salmonellae, such as S. enterica serovars Kentucky and Agona (5 substitutions per year) (Achtman et al., 2021; Duchene et al., 2016). As a result, the cgMLST approach, which utilises 3,002 core genes (Zhou et al., 2020) (two-thirds of the genome) and is popular with public health laboratories for analysis of non-typhoidal S. enterica, has limited utility for Typhi. Instead, most analyses rely on identifying single nucleotide variants (SNVs) and using these to generate phylogenies. This approach allows for fine-scale analysis of transmission dynamics (although not resolving individual transmission events, due to the slow mutation rate (Campbell et al., 2018)) and tracking the emergence and dissemination of AMR lineages (Klemm et al., 2018; da Silva et al., 2022; Wong et al., 2015). In the absence of a nomenclature system such as that provided by cgMLST, an alternative strategy was needed for identifying and naming lineages. To address this challenge, a genotyping framework (‘GenoTyphi’) was developed that uses marker SNVs to assign Typhi genomes to phylogenetic clades and subclades (Wong et al., 2016), similar to the strategy that has been widely adopted for Mycobacterium tuberculosis (Coll et al., 2014). The GenoTyphi scheme was initially developed based on an analysis of almost 2,000 Typhi isolates from 63 countries (Wong et al., 2016). This dataset was used to define a global population framework based on 68 marker SNVs, which were used to define four primary clades, 15 clades, and 49 subclades organised into a pseudo-hierarchical framework. 
This analysis demonstrated that most of the global Typhi population was highly structured and included many subclades that were geographically restricted, with the exception of Haplotype 58, or H58 (so named by Roumagnac et al., 2006, and designated as genotype 4.3.1 under the GenoTyphi scheme). H58 (genotype 4.3.1) was strongly associated with AMR and was found throughout Asia as well as Eastern and Southern Africa (Wong et al., 2016). The GenoTyphi framework has evolved and expanded to reflect changes in global population structure and the emergence of additional AMR-associated lineages (Dyson and Holt, 2021), and has been widely adopted by the research and public health communities for the reporting of Typhi WGS data (Chattaway et al., 2021; Ingle et al., 2021; da Silva et al., 2022). The genotyping framework, together with functionality for identifying AMR determinants and plasmid replicons, and generating clustering-based trees, is available within the online genomic epidemiology platform Typhi Pathogenwatch (Argimón et al., 2021). This system is designed to facilitate genomic surveillance and outbreak analysis for Typhi, including contextualisation with global public data, by public health and research laboratories (Argimón et al., 2021; Ikhimiukor et al., 2022a; Lagrada et al., 2022) without requiring major investment in computational infrastructure or specialist bioinformatics training.
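To illustrate how the dotted genotype labels can be used in practice, the following minimal Python sketch tests whether a genotype belongs to H58, i.e. 4.3.1 or any label derived from it. The function names are hypothetical and are not part of the GenoTyphi code; the sketch assumes that a dotted prefix implies ancestry, which holds for H58 labels but should be checked against the scheme definition for other lineages, since the framework is only pseudo-hierarchical.

```python
def genotype_lineage(genotype):
    """Return the dotted prefixes of a genotype label, least specific first.

    Example: "4.3.1.1.P1" -> ["4", "4.3", "4.3.1", "4.3.1.1", "4.3.1.1.P1"]
    """
    parts = genotype.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def is_h58(genotype):
    """True if the label is 4.3.1 or is derived from it (assumes prefix = ancestry)."""
    return "4.3.1" in genotype_lineage(genotype)


if __name__ == "__main__":
    for label in ["4.3.1", "4.3.1.1.P1", "4.3.1.2.1", "2.3.2"]:
        print(label, is_h58(label))
```

Splitting on the dots, rather than using a plain string prefix test, avoids treating a hypothetical label such as 4.3.10 as a descendant of 4.3.1.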
The increasing prevalence of AMR poses a major threat to effective typhoid fever control. The introduction of new antimicrobials to treat typhoid fever has been closely followed by the development of resistance, beginning with widespread chloramphenicol resistance in the early 1970s (Anderson, 1975; Andrews et al., 2018). By the late 1980s, MDR typhoid had become common. The genetic basis for multidrug resistance was a conjugative (i.e., self-transmissible) plasmid of incompatibility type IncHI1 (Anderson, 1975), which was first sequenced as part of the Typhi str. CT18 genome in 2001 (Parkhill et al., 2001). This plasmid accumulated genes (blaTEM-1, cat, dfr and sul) encoding resistance to all three first-line drugs, mobilised by nested transposons (Tn6029 in Tn21, in Tn9) (Holt et al., 2011b; Wong et al., 2015). The earliest known H58 isolates were MDR, and it has been proposed that selection for multidrug resistance drove the emergence and dissemination of H58 (Holt et al., 2011b), which is estimated to have originated in South Asia in the mid-1980s (Carey et al., 2022; da Silva et al., 2022; Wong et al., 2015) before spreading throughout South East Asia (Holt et al., 2011a; Thanh et al., 2016b) and into Eastern and Southern Africa (Feasey et al., 2015; Kariuki et al., 2010; Wong et al., 2015).
The MDR transposon has subsequently migrated to the Typhi chromosome on several independent occasions (Ashton et al., 2015; Wong et al., 2015), allowing for loss of the plasmid and fixation of the MDR phenotype in various lineages. Other MDR plasmids do occur in Typhi but are comparatively rare (Argimon et al., 2021; Ingle et al., 2019; Rahman et al., 2020; Tanmoy et al., 2018; Wong et al., 2015).
The emergence of MDR Typhi led to widespread use of fluoroquinolones (mainly ciprofloxacin) as first-line therapy in typhoid fever treatment. Ciprofloxacin non-susceptibility (CipNS, defined by minimum inhibitory concentration [MIC] ≥0.06 mg/L) soon emerged and became common, particularly in South and South East Asia (Chau et al., 2007; Dyson et al., 2019). The genetic basis for this is mainly substitutions in the quinolone resistance determining region (QRDR) of core chromosomal genes gyrA and parC, which directly impact fluoroquinolone binding. These substitutions have arisen in diverse Typhi strain backgrounds (estimated >80 independent emergences) (da Silva et al., 2022) but appear to be particularly common in H58 (4.3.1) subtypes (Roumagnac et al., 2006; da Silva et al., 2022; Wong et al., 2015). The most common genetic pattern is a single QRDR mutation (typically at gyrA codon 83 or 87), which results in a moderate increase in ciprofloxacin MIC to 0.06-0.25 mg/L (Day et al., 2018) and is associated with prolonged fever clearance times and increased chance of clinical failure when treating with fluoroquinolones (Thanh et al., 2016a; Wain et al., 1997). An accumulation of three QRDR mutations raises ciprofloxacin MIC to 8-32 mg/L and is associated with higher occurrence of clinical failure (Thanh et al., 2016a). Triple mutants appear to be rare, with the exception of a subclade of 4.3.1.2 bearing GyrA-S83F, GyrA-D87N and ParC-S80I (designated genotype 4.3.1.2.1 (Ingle et al., 2022)), which emerged in India in the mid-1990s and has since been introduced into Pakistan, Nepal, Bangladesh and Chile (Britto et al., 2020; Maes et al., 2020; da Silva et al., 2022; Thanh et al., 2016a).
The challenge of fluoroquinolone non-susceptible typhoid was met with increased therapeutic use of third-generation cephalosporins (such as ceftriaxone and cefixime) or azithromycin (for non-severe disease) (Balasegaram et al., 2012; Basnyat et al., 2021; Rai et al., 2012). Reports of ceftriaxone treatment failure in late 2016 in Hyderabad, Pakistan led to the discovery of an extensively drug-resistant (XDR, defined as MDR plus resistance to fluoroquinolones and third-generation cephalosporins) clone of Typhi (genotype 4.3.1.1.P1, a subtype of H58), which subsequently spread throughout Pakistan (Klemm et al., 2018; Rasheed et al., 2020; Yousafzai et al., 2019). This XDR clone harbours a common combination of chromosomal AMR determinants (integrated MDR transposon plus single QRDR mutation, GyrA-83) but has also acquired an IncY-type plasmid carrying resistance genes, including qnrS (which, combined with GyrA-83, results in a ciprofloxacin-resistant phenotype with MIC >1 mg/L) and the extended-spectrum beta-lactamase (ESBL) encoded by blaCTX-M-15 (Klemm et al., 2018). The ESBL gene has subsequently migrated from plasmid to chromosome in some 4.3.1.1.P1 isolates (Nair et al., 2021). Other ESBL-producing, ceftriaxone-resistant (CefR) Typhi strains have been identified in India (Argimón et al., 2021; Jacob et al., 2021a; Nair et al., 2021; Rodrigues et al., 2017a; Sah et al., 2019), via both local ‘in-country’ surveillance and travel-associated infections. The only oral therapy available to treat non-severe XDR Typhi infection is azithromycin (Levine and Simon, 2018), which, although effective, shows prolonged bacteremia and fever clearance times in the human challenge model and is not recommended for treatment of complicated typhoid fever (Jin et al., 2019).
Azithromycin resistant (AziR) Typhi, which is associated with mutations in the chromosomal gene acrB, has now been reported across South Asia (Carey et al., 2021; Duy et al., 2020; Iqbal et al., 2020; Sajib et al., 2021) and has been linked to treatment failure in Nepal (Duy et al., 2020); however, the prevalence so far remains low (Hooda et al., 2019; da Silva et al., 2022). Imported infections caused by XDR Typhi 4.3.1.1.P1 have been identified in Australia (Ingle et al., 2021), Europe (Herdman et al., 2021; Nair et al., 2021), and North America (Eshaghi et al., 2020; Watkins et al., 2020); imported AziR Typhi infections are rarer but have been reported in Singapore (Octavia et al., 2021).
The accumulation of resistance to almost all therapeutic options means that there is an urgent need to track the emergence and spread of AMR Typhi, both to guide empiric therapy to prevent treatment failure (Nabarro et al., 2022), and to direct the deployment of preventative interventions like typhoid conjugate vaccines (TCVs) and WASH infrastructure. Given the wealth of existing and emerging WGS data for Typhi, we aimed to create a system to enhance visibility and accessibility of genomic data to inform current and future disease control strategies, including identifying where empiric therapy may need review, and monitoring the impact of TCVs on AMR and vaccine escape. In forming the Global Typhoid Genomics Consortium (GTGC), we aim to engage with the wider typhoid research community to aggregate Typhi genomic data and standardised metadata to facilitate the extraction of relevant insights to inform public health policy through inclusive, reproducible analysis using freely available and accessible pipelines and intuitive data visualisation. Here, we present a large, geographically representative dataset of thirteen thousand Typhi genomes, and provide a contemporary snapshot of the global genetic diversity in Typhi and its spectrum of AMR determinants. The establishment of the GTGC marks twenty-one years of typhoid genomics and provides a platform for future typhoid genomics activities, which we hope will inform more sophisticated disease control.
Methods
Ethical approvals
Each contributing study or surveillance programme obtained local ethical and governance approvals, as reported in the primary publication for each dataset. For this study, inclusion of data that were not yet in the public domain by August 2021 was approved by the Observational / Interventions Research Ethics Committee of the London School of Hygiene and Tropical Medicine (ref #26408), on the basis of details provided on the local ethical approvals for sample and data collection (Table S1).
Sequence data aggregation
Attempts were made to include all Typhi sequence data generated in the 20 years since the first genome was sequenced, through August 2021. Genome data and the corresponding data owners were identified from literature searches and sequence database searches (European Nucleotide Archive (ENA); NCBI Sequence Read Archive (SRA) and GenBank; Enterobase).
Unpublished data, including those from ongoing surveillance studies and routine public health laboratory sequencing, were identified through professional networks, published study protocols (Carey et al., 2020), and an open call for participation in the GTGC. All data generators thus identified were invited to join the GTGC and to provide or verify corresponding source information, with year and location of isolation being required fields (‘metadata’, see below). Nearly all those contacted responded, and are included as Consortium authors on this study. The exceptions, where authors did not respond to email inquiries, were: (i) one genome reported from Malaysia (Ahmad et al., 2017) and n=133 draft genomes reported from India (Katiyar et al., 2020), which were excluded as sequence reads were not available in NCBI; and (ii) n=39 genomes reported in studies of travel-associated or local outbreaks (Burnsed et al., 2019; Hao et al., 2020; Shin et al., 2021), which were included as raw sequence data and sufficient metadata were publicly available. A further n=850 genomes sequenced by US Centers for Disease Control and Prevention and available in NCBI were excluded from analysis because travel history was unknown and most US cases are travel-associated. Table 1 summarises all studies and unpublished public health laboratory datasets from which sequence data were sourced.
Whole genome sequence data, in the form of Illumina fastq files, were sourced from the European Nucleotide Archive (ENA) or Sequence Read Archive (SRA) or were provided directly by the data contributors in the case of data that were unpublished in August 2021. Run, BioSample, and BioProject accessions are provided in Table S2, together with contributed metadata and PubMed or preprint identifiers.
Sequence analysis
Primary sequence analysis was conducted on the Wellcome Sanger Institute compute cluster. Genotypes, as defined under the GenoTyphi scheme (Dyson and Holt, 2021; Wong et al., 2016) were called directly from Illumina reads using Mykrobe v0.12.1 with Typhi typing panel v20221207, and collated using the Python code available at https://github.com/katholt/genotyphi (v2.0) (Ingle et al., 2022).
Illumina reads were assembled using the Centre for Genomic Pathogen Surveillance (CGPS) assembly pipeline v2.1.0 (https://gitlab.com/cgps/ghru/pipelines/dsl2/pipelines/assembly/) (Underwood, 2020), which utilises the SPAdes assembler (v3.12.0) (Bankevich et al., 2012; Prjibelski et al., 2020). One readset failed assembly and was excluded. Assemblies were uploaded to Pathogenwatch to confirm species and serovar, and to identify AMR determinants and plasmid replicons (Argimon et al., 2021). Eight assemblies were excluded as they were identified as non-Typhi: either other serovars of S. enterica (2 Paratyphi B, 2 Enteritidis, 1 Montevideo, 1 Newport, 1 Durban) or other species (1 Klebsiella pneumoniae). Assemblies >5.5 Mbp or <4.5 Mbp in size were also excluded from further analysis (n=35 excluded, see size distributions in Figure S1). The full set of 13,000 confirmed Typhi assemblies, including the 35 excluded on size, is available in Figshare, doi: 10.26180/21431883.
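The assembly size criterion can be expressed as a simple filter. The sketch below is illustrative only and is not taken from the study code; the function names and the example genome identifiers are hypothetical, and it assumes 1 Mbp = 1,000,000 bp with assemblies of exactly 4.5 or 5.5 Mbp retained.

```python
MIN_ASSEMBLY_BP = 4_500_000  # assemblies <4.5 Mbp are excluded
MAX_ASSEMBLY_BP = 5_500_000  # assemblies >5.5 Mbp are excluded


def passes_size_filter(assembly_bp):
    """True if total assembly length lies within the plausible range (inclusive)."""
    return MIN_ASSEMBLY_BP <= assembly_bp <= MAX_ASSEMBLY_BP


def split_by_size(assembly_sizes):
    """Split a {genome_id: length_bp} mapping into (kept, excluded) mappings."""
    kept = {g: bp for g, bp in assembly_sizes.items() if passes_size_filter(bp)}
    excluded = {g: bp for g, bp in assembly_sizes.items() if not passes_size_filter(bp)}
    return kept, excluded


if __name__ == "__main__":
    # Hypothetical assembly lengths, for illustration only.
    sizes = {"genomeA": 4_780_000, "genomeB": 4_310_000, "genomeC": 5_920_000}
    kept, excluded = split_by_size(sizes)
    print(sorted(kept), sorted(excluded))
```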
Phylogenetic trees were generated using Pathogenwatch, which estimates pairwise genetic distances between genomes (based on counting SNVs across 3,284 core genes) and infers a neighbour-joining tree from the resulting distance matrix (Argimon et al., 2021). The Pathogenwatch collections used to generate the tree files are available at https://bit.ly/Typhi4311P1 (tree showing position of Rwp1-PK1, in context with other genomes from Pakistan) and https://bit.ly/Typhi232 (tree for genotype 2.3.2 genomes).
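The distance step can be illustrated with a toy example. The sketch below counts pairwise SNV differences between pre-aligned sequences of equal length and returns a symmetric distance matrix; it is not the Pathogenwatch implementation, which operates on core gene sequences, and the function names and the choice to skip ambiguous bases are assumptions made here. The neighbour-joining step that would consume the matrix is not shown.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snv_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def distance_matrix(aligned):
    """Build a symmetric {name: {name: distance}} matrix from {name: sequence}."""
    names = sorted(aligned)
    dist = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = snv_distance(aligned[a], aligned[b])
        dist[a][b] = d
        dist[b][a] = d
    return dist


if __name__ == "__main__":
    # Toy aligned sequences; "N" marks an ambiguous call and is ignored.
    toy = {"g1": "ACGTACGT", "g2": "ACGTACGA", "g3": "TCGTACNA"}
    for name, row in distance_matrix(toy).items():
        print(name, row)
```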
Metadata curation and variable definitions
Owners of the contributing studies were asked to provide or update source information relating to their genome data, using a standardised template (http://bit.ly/typhiMeta). Repeat isolates were defined as those that represent the same occurrence of typhoid infection (acute disease or asymptomatic carriage) as one that is already included in the data set. In such instances, data owners were asked to indicate the ‘primary’ isolate (either the first, or the best quality, genome for each unique case) to use in the analysis. Repeat isolates were then excluded from the data set entirely (excluded from Table S2).
Data provided on the source of isolates (specimen type and patient health status) are shown in Table S3. This information was used to identify isolates that were associated with acute typhoid fever. In total, n=6,462 genomes were recorded as isolated from symptomatic individuals. A further n=119 were recorded as isolated from asymptomatic carriers. The remaining genomes had no health status recorded (i.e., symptomatic vs asymptomatic carrier); of these, the majority were isolated from blood (n=3,365) or the specimen type was not recorded (n=2,522). Since most studies and surveillance programmes are set up to capture acute infections rather than asymptomatic carriers, we defined ‘Assumed acute illness’ genomes as those not recorded explicitly as asymptomatic carriers (n=119) or coming from gallbladder (n=1) or environmental (n=14) samples; this resulted in a total of 12,831 genomes that were assumed to represent acute illness.
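The ‘Assumed acute illness’ rule can be written as a short predicate. This is a sketch under stated assumptions: the field values ("asymptomatic carrier", "gallbladder", "environmental") are hypothetical encodings of the metadata categories described above, not the actual template values, and the function name is ours.

```python
def assumed_acute_illness(health_status, specimen_type):
    """Apply the exclusion rule: carriers, gallbladder and environmental samples are not acute."""
    status = (health_status or "").strip().lower()
    specimen = (specimen_type or "").strip().lower()
    if status == "asymptomatic carrier":
        return False
    if specimen in {"gallbladder", "environmental"}:
        return False
    # Everything else, including records with missing status, is assumed acute.
    return True


if __name__ == "__main__":
    print(assumed_acute_illness("symptomatic", "blood"))        # assumed acute
    print(assumed_acute_illness(None, None))                    # missing data: assumed acute
    print(assumed_acute_illness("asymptomatic carrier", "stool"))
    # Consistency check on the counts reported in the text.
    print(12_965 - 119 - 1 - 14)
```

The final line reproduces the reported total: 12,965 high quality genomes minus 119 carriers, 1 gallbladder and 14 environmental samples gives 12,831.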
We defined ‘country of origin’ as the country of isolation; or for travel-associated infections, the country recorded as the presumed country of infection based on travel history (Centers for Disease Control and Prevention (CDC), 2011; Ingle et al., 2021, 2019; Matono et al., 2017).
Countries were assigned to geographical regions using the United Nations Statistics Division standard M49 (see https://unstats.un.org/unsd/methodology/m49/overview/); we used the intermediate region label where assigned, and subregion otherwise. To identify isolate collections that were suitably representative of local pathogen populations, for the purpose of calculating genotype and AMR prevalences for a given setting, data owners were asked to indicate the purpose of sampling for each study or dataset. Options available were either ‘Non Targeted’ (surveillance study, routine diagnostics, reference lab, other; n=11,086), ‘Targeted’ (cluster investigation, AMR focused, other; n=1,862) or ‘Not Provided’ (n=17).
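The ‘country of origin’ and region rules can be sketched as follows. The lookup table contains only three example rows, entered by hand from the M49 standard, and the function and field names are hypothetical; a real analysis would load the full M49 table.

```python
# Hand-entered example rows from the UN M49 standard (illustrative subset only).
M49_EXAMPLE = {
    "Kenya": {"subregion": "Sub-Saharan Africa", "intermediate": "Eastern Africa"},
    "India": {"subregion": "Southern Asia", "intermediate": None},
    "Chile": {"subregion": "Latin America and the Caribbean", "intermediate": "South America"},
}


def country_of_origin(country_isolated, travel_associated, travel_country=None):
    """Use the presumed country of infection for travel-associated cases, else country of isolation."""
    if travel_associated and travel_country:
        return travel_country
    return country_isolated


def analysis_region(country, table=M49_EXAMPLE):
    """Return the intermediate region where assigned, and the subregion otherwise."""
    entry = table[country]
    return entry["intermediate"] or entry["subregion"]


if __name__ == "__main__":
    origin = country_of_origin("United Kingdom", True, "India")
    print(origin, "->", analysis_region(origin))
    print("Kenya ->", analysis_region("Kenya"))
```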
Antimicrobial resistance (AMR) determinants and definitions
AMR determinants identified in the genome assemblies using Pathogenwatch were used to define AMR genotype as follows. Multidrug resistant (MDR): resistance determinants for chloramphenicol (catA1 or cmlA), ampicillin (blaTEM-1D or blaOXA-7), and co-trimoxazole (at least one dfrA gene and at least one sul gene). Ciprofloxacin non-susceptible (CipNS): one or more of the quinolone resistance determining region (QRDR) mutations at GyrA-83, GyrA-87, ParC-80, ParC-84, GyrB-464 or presence of a plasmid-mediated quinolone resistance (PMQR) gene (qnrB, qnrD, qnrS); note this typically corresponds to MIC ≥0.06 mg/L (Day et al., 2018).
Ciprofloxacin resistant (CipR): QRDR triple mutant (GyrA-83 and GyrA-87, together with either ParC-80 or ParC-84), or PMQR gene together with GyrA-83, GyrA-87 and/or GyrB-464. This typically corresponds to MIC ≥1 mg/L, and CipR is a subset of CipNS. Ceftriaxone resistant (CefR): presence of an ESBL (blaCTX-M-12, blaCTX-M-15, blaCTX-M-23, blaCTX-M-55, blaSHV-12). Extensively drug resistant (XDR): MDR plus CipR plus CefR. Azithromycin resistance (AziR): mutation at AcrB-717. The above lists all those AMR determinants that were found here in ≥1 genome and used to define AMR profiles and prevalences; additional AMR genes sought by Typhi Pathogenwatch but not detected are listed in Supplementary Table 2 of (Argimon et al., 2021).
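The definitions above translate directly into set logic. The following sketch is our illustration and not the Pathogenwatch or study code: the determinant strings (for example "GyrA-83", which names only the codon position) are a hypothetical encoding, and the ampicillin rule is read as requiring either blaTEM-1D or blaOXA-7.

```python
CHLORAMPHENICOL = {"catA1", "cmlA"}
AMPICILLIN = {"blaTEM-1D", "blaOXA-7"}
QRDR = {"GyrA-83", "GyrA-87", "ParC-80", "ParC-84", "GyrB-464"}
PMQR = {"qnrB", "qnrD", "qnrS"}
ESBL = {"blaCTX-M-12", "blaCTX-M-15", "blaCTX-M-23", "blaCTX-M-55", "blaSHV-12"}
AZITHROMYCIN = {"AcrB-717"}


def amr_profile(determinants):
    """Classify one genome from its set of AMR determinant names."""
    d = set(determinants)
    co_trimoxazole = any(x.startswith("dfrA") for x in d) and any(
        x.startswith("sul") for x in d
    )
    mdr = bool(d & CHLORAMPHENICOL) and bool(d & AMPICILLIN) and co_trimoxazole
    cip_ns = bool(d & QRDR) or bool(d & PMQR)
    # CipR route 1: QRDR triple mutant (GyrA-83 + GyrA-87 + ParC-80 or ParC-84).
    triple = {"GyrA-83", "GyrA-87"} <= d and bool(d & {"ParC-80", "ParC-84"})
    # CipR route 2: PMQR gene plus at least one of GyrA-83, GyrA-87, GyrB-464.
    pmqr_plus_qrdr = bool(d & PMQR) and bool(d & {"GyrA-83", "GyrA-87", "GyrB-464"})
    cip_r = triple or pmqr_plus_qrdr
    cef_r = bool(d & ESBL)
    return {
        "MDR": mdr,
        "CipNS": cip_ns,
        "CipR": cip_r,
        "CefR": cef_r,
        "XDR": mdr and cip_r and cef_r,
        "AziR": bool(d & AZITHROMYCIN),
    }


if __name__ == "__main__":
    # Determinant set resembling the XDR clone described in the Introduction;
    # the specific dfrA and sul alleles are illustrative.
    xdr_like = {"catA1", "blaTEM-1D", "dfrA7", "sul1", "GyrA-83", "qnrS", "blaCTX-M-15"}
    print(amr_profile(xdr_like))
```

Because both CipR routes require at least one QRDR mutation, every CipR genome is also CipNS, consistent with CipR being a subset of CipNS.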
Genotype and AMR prevalence estimates and statistical analysis
All statistical analyses were conducted in R v4.1.2; code is available in R markdown format at https://github.com/katholt/TyphoidGenomicsConsortiumWG1 (v1.0, doi: 10.5281/zenodo.7487862). Genotype and AMR frequencies were calculated at the level of country and UN world region (based on ‘country of origin’) as defined above. Inclusion criteria for these estimates were: known ‘country of origin’, known year of isolation, non-targeted sampling, assumed acute illness (see definitions of these variables above). A total of 10,726 genomes met these criteria; the subset of 9,478 isolated from 2010 onwards were the focus of the majority of analyses and visualisations. The prevalence estimates reported in text and figures are simple proportions; 95% confidence intervals for proportions are given in text and supplementary tables where relevant. Annual prevalences were estimated for countries that had N≥50 representative genomes and ≥3 years with ≥10 representative genomes. Association between MDR prevalence and prevalence of IncHI1 plasmids amongst MDR genomes was assessed for countries with ≥5% MDR prevalence between 2000 and 2020. The significance of increases or decreases in prevalence was assessed using a Chi-squared test for trend in proportions (using the prop.trend.test function in R). There are no established thresholds for the prevalence of resistance that should trigger changes in empirical therapy recommendations for enteric fever; hence we defined our own categories of resistance prevalence for visualisation purposes, to reflect escalating levels of concern for empirical antimicrobial use: (i) 0, no resistance detected; (ii) >0 and ≤2%, resistance present but rare; (iii) 2-10%, emerging resistance; (iv) 10-50%, resistance common; (v) >50%, established resistance.
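For readers working outside R, the sketch below illustrates three of the rules described above in Python: the eligibility rule for annual estimates, the Chi-squared test for trend in proportions (written in its closed form, with one degree of freedom and equally spaced scores by default), and the prevalence categories. It is not the study code; the function names are hypothetical, and assigning exactly 10% to the ‘emerging’ category is an assumption, since the text leaves that boundary open.

```python
import math


def eligible_for_annual_estimates(genomes_per_year):
    """N>=50 representative genomes in total and >=3 years with >=10 genomes each."""
    total = sum(genomes_per_year.values())
    well_sampled_years = sum(1 for n in genomes_per_year.values() if n >= 10)
    return total >= 50 and well_sampled_years >= 3


def chi2_trend_in_proportions(resistant, totals, scores=None):
    """Chi-squared test for linear trend in proportions; returns (statistic, p_value)."""
    if len(resistant) != len(totals) or len(resistant) < 2:
        raise ValueError("need matching counts for at least two groups")
    if scores is None:
        scores = list(range(1, len(resistant) + 1))  # equally spaced groups
    n_total = sum(totals)
    pooled = sum(resistant) / n_total
    if pooled in (0.0, 1.0):
        raise ValueError("trend is undefined when the pooled proportion is 0 or 1")
    mean_score = sum(n * s for n, s in zip(totals, scores)) / n_total
    numerator = sum(x * (s - mean_score) for x, s in zip(resistant, scores)) ** 2
    denominator = pooled * (1 - pooled) * sum(
        n * (s - mean_score) ** 2 for n, s in zip(totals, scores)
    )
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-squared, df = 1
    return chi2, p_value


def resistance_category(prevalence_pct):
    """Map a prevalence in percent to the visualisation categories (10% boundary assumed)."""
    if prevalence_pct == 0:
        return "no resistance detected"
    if prevalence_pct <= 2:
        return "resistance present but rare"
    if prevalence_pct <= 10:
        return "emerging resistance"
    if prevalence_pct <= 50:
        return "resistance common"
    return "established resistance"


if __name__ == "__main__":
    # Hypothetical counts: 10, 20 and 30 resistant genomes out of 100 in three years.
    print(chi2_trend_in_proportions([10, 20, 30], [100, 100, 100]))
    print(resistance_category(70.0))
```

With the hypothetical counts shown, the statistic is 12.5, indicating a significant upward trend.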
Robustness of prevalence estimates was assessed informally, by comparing overlap of 95% confidence intervals computed for different laboratories from the same country (for genomes isolated 2010-2020, and laboratories with N≥20 genomes [Southern Asia] or N≥10 [Nigeria] meeting the inclusion criteria during this period).
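The informal robustness check can be sketched as a comparison of two confidence intervals. The text does not state which interval method was used, so the Wilson score interval below is an assumption chosen for illustration, and the function names and example counts are hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a proportion (95% by default)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


if __name__ == "__main__":
    # Hypothetical counts of resistant genomes from two laboratories in one country.
    lab_a = wilson_ci(18, 40)
    lab_b = wilson_ci(25, 60)
    print(lab_a, lab_b, intervals_overlap(lab_a, lab_b))
```

Overlapping intervals are taken here only as an informal sign that the two laboratories give compatible estimates; this is not a formal test of equality.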
Data visualisations
All analyses and plots were generated using R v4.1.2; code is available in R markdown format at https://github.com/katholt/TyphoidGenomicsConsortiumWG1 (v1.0, doi: 10.5281/zenodo.7487862). Data processing was done using the R packages tidyverse v1.3.1, dplyr v1.0.7, reshape2 v1.4.4 and janitor v2.1.0; figures were generated using packages ggplot2 v3.3.5, ggExtra v0.9, patchwork v1.1.1, RColorBrewer v1.1-2 and pals v1.7; maps were generated using packages sf v1.0-5, rvest v1.0.2, maps v3.4.0, scatterpie v0.1.7, ggnewscale v0.4.5; trees were plotted using treeio v1.18.1 and ggtree v3.2.1.
Data availability statement
All data analysed during this study are publicly accessible. Raw Illumina sequence reads have been submitted to the European Nucleotide Archive (ENA), and individual sequence accession numbers are listed in Table S2. The full set of n=13,000 genome assemblies generated for this study is available for download from FigShare: doi 10.26180/21431883. All assemblies of suitable quality (n=12,849) are included in the online platform Pathogenwatch (https://pathogen.watch/organisms/styphi), where they can be interactively explored and included in user-driven comparative analyses. All underlying code developed for data analysis is freely available at https://github.com/katholt/TyphoidGenomicsConsortiumWG1 (v1.0, doi: 10.5281/zenodo.7487862).
Results
Overview of available data
A total of 13,000 confirmed Typhi genomes were collated from 65 studies and five unpublished public health laboratory datasets (see Tables 1, S2). Of these, n=35 genomes had assembly sizes outside of the plausible range (4.5-5.5 Mbp, see Figure S1), leaving n=12,965 high quality genomes originating from 111 countries. The distribution of samples by world region (as defined by the UN Statistics Division M49 standard) is shown in Table 2, with country breakdown in Table S4. The majority originated from Southern Asia (n=8,231), mainly India (n=2,705), Bangladesh (n=2,268), Pakistan (n=1,810) and Nepal (n=1,436). A total of n=1,140 originated from South-eastern Asia, with >100 each from Cambodia (n=279), Vietnam (n=224), the Philippines (n=209), Indonesia (n=145), and Laos (n=139). Overall, 1,106 genomes originated from Eastern Africa, including >100 each from Malawi (n=569), Kenya (n=254), and Zimbabwe (n=110). Other regions of Africa were less well represented, with n=384 from Western Africa, n=317 from Southern Africa, n=59 from Middle Africa (so-named in the M49 region definitions, although more commonly referred to as Central Africa), and n=41 from Northern Africa (see Tables 2 and S4 for details).
Overall, there were 36 countries with ≥20 genomes (total n=12,409 genomes, 95.7%) and 21 countries with ≥100 genomes (n=11,761 genomes, 90.7%) (see Table S4). Countries with the most genomes available (n≥100 each) were mainly those where local surveillance studies have utilised WGS for isolate characterisation (India (Britto et al., 2020; da Silva et al., 2022), Bangladesh (Rahman et al., 2020; da Silva et al., 2022), Nepal (Britto et al., 2018; da Silva et al., 2022; Thanh et al., 2016a), Pakistan (da Silva et al., 2022), Cambodia (Kuijpers et al., 2017; Thanh et al., 2016b), Laos (Wong et al., 2015), Vietnam (Holt et al., 2011a), Kenya (Kariuki et al., 2021, 2010), Malawi (Feasey et al., 2015), Zimbabwe (Mashe et al., 2020; Thilliez et al., 2022), Ghana (Park et al., 2018), Nigeria (Ikhimiukor et al., 2022a; International Typhoid Consortium et al., 2016), Chile (Maes et al., 2022), Samoa (Sikorski et al., 2022)); plus South Africa, the Philippines (Lagrada et al., 2022), United Kingdom and United States, where Typhi isolates are sequenced as part of national surveillance programmes.
The genome collection included n=3,381 isolates that were recorded as travel-associated (see Tables 2 and S4), contributed mainly by public health reference laboratories in England (n=1,740), USA (n=749), Australia (n=490), New Zealand (n=144), France (n=116) and Japan (n=104). The most common countries of origin for travel-associated isolates were India (n=1,241), Pakistan (n=783), Bangladesh (n=264), Fiji (n=102), Samoa (n=87), Mexico (n=60), Chile (n=49), Papua New Guinea (n=45), Nigeria (n=42), and Nepal (n=39). For some typhoid-endemic countries, the majority of genome data originated from travel-associated infections captured in other countries; those in this category with total n≥10 genomes are Guatemala (n=22/22), El Salvador (n=19/19), Mexico (n=60/61), Peru (n=14/14), Haiti (n=12/12), Morocco (n=12/13), Iraq (n=19/19), Malaysia (n=35/35), Fiji (n=102/144) and Papua New Guinea (n=45/86) (full data in Table S4).
In total, n=10,726 genomes were assumed to represent acute typhoid fever and recorded as derived from ‘non-targeted’ sampling frames, i.e. local population-based surveillance studies or reference laboratory-based national surveillance programmes that could be considered representative of a given time (year of isolation) and geography (country and region of origin) (see Methods for definitions). The majority of these isolates (n=9,478, 88.4%) originate from 2010 onwards, hence we focus our reporting of genotype and AMR prevalences on this period. Most come from local typhoid surveillance studies (n=5,574) or routine diagnostics/reference laboratory referrals capturing locally acquired (n=1,543) or travel-associated (n=2,284) cases. All prevalence estimates reported in this study derive from this data subset, unless otherwise stated.
Geographic distribution of genotypes
The breakdown of genotype prevalence by world region, for genomes isolated from 2010 onwards, is shown in Figure 1a (denominators in Table 2, full data in Table S5). Annual breakdown of regional genotype prevalences is given in Figure S2 (raw data, proportions, and 95% confidence intervals in Table S5). Notably, while our data confirm that H58 genotypes (4.3.1 and derived) dominate in Asia, Eastern Africa, and Southern Africa, they were virtually absent from other parts of Africa, from South and Central America, as well as from Polynesia and Melanesia (Figure 1). Instead, each of these regions was dominated by its own local genotypes. Typhoid fever is no longer endemic in Northern America, Europe, or Australia/New Zealand. The genotype distributions shown for these regions were estimated from Typhi that were isolated locally but not recorded as being travel-associated; nevertheless, these genomes can be assumed to result from limited local transmission of travel-associated infections, and thus to reflect the diversity of travel destinations for individuals living in those regions. Annual national genotype prevalences for well-sampled countries with endemic typhoid are shown in Figure 1b (full data in Table S6 and Figure S3). Below, we summarise notable features of the global genotype distribution, by world region (as defined by the United Nations Statistics Division, see Methods).
Southern Asia
Southern Asia was the most represented region, with 6,623 genomes suitable for prevalence analysis. The genotype distribution confirms the widely-reported finding that the H58 lineage (4.3.1 and derived genotypes) is the dominant form of Typhi in Southern Asia, where it is thought to have originated (Carey et al., 2022; Roumagnac et al., 2006; da Silva et al., 2022; Wirth, 2015; Wong et al., 2015) (overall prevalence, 70.4% [95% CI, 69.3-71.5%]; n=4,662/6,623). Notably though, the distribution of H58 genotypes was different between countries in the region (see Figure 1b), and in Bangladesh it was associated with a minority of genomes (42% [n=670/1,591], compared with 73% in India [n=1,655/2,267], 74% in Nepal [n=941/1,275], and 94% in Pakistan [n=1,390/1,484]). India and Nepal were dominated by sublineage 2 (genotype 4.3.1.2 and derived genotypes; 54% [n=1,214/2,267] and 58% [n=736/1,275], respectively), which was rare in Bangladesh (0.6%; n=9/1,591) and Pakistan (3.2%; n=47/1,484). In India, H58 lineage 1 (4.3.1.1) was also present at appreciable frequency (12%; n=268/2,267) as was 4.3.1 (i.e. H58 that does not belong to any of the defined sublineages 4.3.1.1-3; 7.4% [n=168/2,267]). In Nepal, 4.3.1 was present at 12% frequency (n=152/1,275) and 4.3.1.1 at just 4.9% (n=63/1,275).
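The prevalence estimates and 95% confidence intervals quoted in this section can be reproduced from the stated counts. The sketch below assumes a normal-approximation (Wald) interval, which matches the Southern Asia H58 figures given above; the procedure actually used is defined in Methods, and the function name `wald_ci` is illustrative only.

```python
import math


def wald_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval is used here for illustration; the
    study's Methods section defines the interval actually reported.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to the valid range for a proportion.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# H58 genomes in Southern Asia: counts taken from the text (n=4,662/6,623).
p, lo, hi = wald_ci(4662, 6623)
print(f"{100 * p:.1f}% [95% CI, {100 * lo:.1f}-{100 * hi:.1f}%]")
```

For small denominators (for example n=4/16) a Wald interval can behave poorly, so an exact or Wilson interval may be preferable in practice.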
In Pakistan, lineage 1 (genotype 4.3.1.1 and derived genotypes) was most common (73%; 1,089/1,484), with the XDR sublineage (genotype 4.3.1.1.P1) appearing in 2016 (Gul et al., 2017; Klemm et al., 2018; Rasheed et al., 2020) and rapidly rising to dominance (87% in 2020 [n=27/31]; see Figure 1b). Pakistan also had prevalent 4.3.1 (17%; n=254/1,484). H58 lineage 1 (4.3.1.1) was the single most common genotype in Bangladesh, but made up only one-third of the Typhi population (34%; n=546/1,591). Bangladesh has its own H58 lineage 3 (4.3.1.3) (Rahman et al., 2020; Tanmoy et al., 2021), whose prevalence was 7.1% (n=113/1,591); only two 4.3.1 isolates and nine 4.3.1.2 isolates were detected. Non-H58 genotypes were also evident in the region, with the greatest diversity in Bangladesh (see Figure 1b). Those exceeding 5% in any one country were: 3.3.2 (5.8% in Bangladesh [n=93/1,591], 12.9% in Nepal [n=164/1,275]), 2.5 in India (8.4%; n=190/2,267), 3.3 in India (6.6%; n=150/2,267), 2.3.3 in Bangladesh (17.2%; n=274/1,591), and 3.2.2 in Bangladesh (16.6%; n=264/1,591). Annual prevalence estimates were fairly stable over the past decade, with the exception of 4.3.1.1.P1 in Pakistan, which emerged in 2016 and became dominant shortly thereafter (see Figure 1b).
South-eastern and Western Asia
In South-eastern Asia, H58 accounted for 47.3% [95% CI, 43.2-51.3%; 276/584] of isolates in aggregate (mostly 4.3.1.1, 43.0% of total genomes; 251/584). However, the population structures varied between individual countries in the region (see Figures 1b and S3), with H58 accounting for nearly all isolates in Cambodia (98%, n=216/221, all lineage 1), Myanmar (94%, n=46/49, mixed lineages), and Singapore (n=4/4, mixed lineages), but largely absent from Indonesia (3%, n=2/65), Laos (4%, n=1/27) and the Philippines (0.5%, n=1/206). These latter countries showed distinct populations with multiple genotypes exceeding 5% frequency: 4.1 (26%, n=17/65), 3 (18%, n=12/65), 2.1 (15%, n=10/65) and 3.1.2 (12%, n=8/65) in Indonesia; 3.4 (44%, n=12/27), 3.5.2 (15%, n=4/27), 2.3.4 (11%, n=3/27), 3.2.1 (11%, n=3/27) and 4.1 (7%, n=2/27) in Laos; 3 (79%, n=163/206), 3.2.1 (11%, n=23/206) and 4.1 (8%, n=16/206) in the Philippines (Lagrada et al., 2022).
Data from Western Asia were limited to a small number of travel-associated infections (total n=21, from Iraq, Lebanon, Qatar, Saudi Arabia, Syria, United Arab Emirates), most of which were H58 (71%; n=15/21); with 38% 4.3.1.1 (n=8/21) and 19% 4.3.1.2 (n=4/21).
Africa
Only 1,410 (15%) of the 9,478 genomes from non-targeted sampling frames in 2010-2020 were isolated from residents in or travellers to Africa. This continent, which has high endemicity and varying epidemiology across subregions, is therefore substantially underrepresented. Our aggregated data confirmed that H58 was the dominant cause of typhoid in Eastern Africa during the study period (93.3% H58 [95% CI, 91.5-95.0%]; n=774/830; see Figure 1a). It was recently shown that H58 in Kenya was derived from three separate introductions of H58 into the region, which are now assigned their own genotypes (Kariuki et al., 2021) (4.3.1.1.EA1, 4.3.1.2.EA2, 4.3.1.2.EA3). Here, we found that at the region level, 4.3.1.1.EA1 dominated (78% [95% CI, 75.1-80.8%]; n=647/830; see Figure 1a). However, there were country-level differences, with 4.3.1.1.EA1 dominating in Malawi (94%; n=524/558), Tanzania (83%; n=15/18), Zimbabwe (80%; n=20/25) and earlier years in Kenya (59%, n=86/145 in 2012-2016), and 4.3.1.2.EA3 dominating in Rwanda (85%, n=23/27) and Uganda (97%, n=35/36) (Figures 1b and S3).
Although the specific periods of sampling differ for these countries, the prevalence of H58 was consistently high across the available time frames for all countries, with no change in dominant genotypes (see Figures 1b and S3; note the apparent shift to 4.3.1.2.EA3 in Kenya is based on n=4 isolates only so requires confirmation).
The majority of Typhi from Southern Africa were isolated in South Africa between 2017-2020 (92%; n=262/285), via routine sequencing at the National Institute for Communicable Diseases reference laboratory. H58 prevalence in South Africa was high (69.5%, [95% CI, 63.9-75.1%]; n=182/262) during this time period (mostly 4.3.1.1.EA1, 64%; n=168/262), but was much lower (25% [95% CI, 4-46%]) among the smaller sampling of earlier years (n=4/16 for 2010-2012) (see Figure 1b).
In Western Africa, the common genotypes were 3.1.1 (64.4%, [95% CI, 58.7-70.2%]; n=172/266) and 2.3.2 (13.9%, [95% CI, 9.7-18.0%] n=37/266) (Figure 1a). Most of these data come from the Typhoid Fever Surveillance in Africa Programme (TSAP) genomics report (Park et al., 2018) and a study of typhoid in Abuja and Kano in Nigeria (International Typhoid Consortium et al., 2016), which showed that in the period 2010-2013, 3.1.1 dominated in Nigeria and nearby Ghana and Burkina Faso, whereas 2.3.2 dominated in The Gambia and neighbouring Senegal and Guinea Bissau (Park et al., 2018). Here, we find that additional data from travel cases and recent Nigerian national surveillance (Ikhimiukor et al., 2022a) suggest that these patterns reflect long-established and persisting populations in the Western African region (see Figures S1b and S3): 3.1.1 was detected from Benin (2002-2009; n=4/4), Burkina Faso (2006-2013; n=11/17), Cote d’Ivoire (2006-2008; n=4/4), The Gambia (2015; n=2/28), Ghana (2007-2017; n=93/109), Guinea (2009; n=1/2), Mali (2008; n=1/5), Mauritania (2009; n=1/2), Nigeria (2008-2019; n=122/192), Sierra Leone (2015-2017; n=2/2) and Togo (2004-2006; n=2/3); and 2.3.2 from Burkina Faso (2012-2013; n=2/17), The Gambia (2008-2014; n=25/28), Ghana (2010-2018; n=9/109), Guinea Bissau (2012-2013; n=2/3), Mali (1999-2018; n=3/5), Niger (1990-1999; n=2/4), Nigeria (1984-2002; n=4/192), Senegal (2012; n=6/10), and Togo (2001; n=1/3).
Very limited genome data were available from the Middle Africa region (n=19; Table 2). Genomes from Democratic Republic of the Congo (DRC) comprised 16 genotype 2.5.1 isolates (15 isolated locally, plus one from USA CDC) and a single 4.3.1.2.EA3 isolate (from the UK reference lab). Two genomes each were available from Angola (both 4.1.1, via UK) and Chad (both 2.1, via France). Northern Africa was similarly poorly represented, with one isolate from Egypt (0.1, via UK), two from Morocco (0.1, via UK and 1.1, via USA), two from Sudan (genotype 4, via UK) and one from Tunisia (3.3, from UK).
The Americas
Strikingly, Central American isolates were dominated by 2.3.2 (55% [95% CI, 45.2-64.8%]; n=55/100), which was also common in Western Africa (13.9% [95% CI, 9.7-18.0%]; n=37/266) (Figure 1a). Little has been reported about Typhi populations from this region previously, and the genomes collated here were almost exclusively novel ones contributed via the US CDC and isolated between 2016 and 2019. The available genomes for the period 2010-2020 mainly originated from El Salvador (n=19, 2012-2019, 89% 2.3.2), Guatemala (n=22, 2016-2019, 41% 2.3.2) and Mexico (n=58, 2011-2019, 50% 2.3.2). Prior to 2010, genotype 2.3.2 was also identified in isolates from Mexico referred to the French reference lab in 1972 (representing a large national outbreak (Baine et al., 1977)) and 1998. The distance-based phylogeny for 2.3.2 included several discrete clades from different geographical regions in West Africa and the Americas (see Figure S4), consistent with occasional continental transfers between these regions followed by local clonal expansions. These comprised three clades dominated by West African isolates (one with isolates from West Coast countries, and two smaller clades from Nigeria and neighbouring countries); two clades of South American isolates (from Chile, Argentina and Peru); one small clade of Caribbean (mainly Haiti) and USA isolates; and one large clade of Central American isolates (from Mexico, Guatemala and El Salvador) (see Figure S4). Other common genotypes identified in Central America were 2.0.2 (overall prevalence 24% [95% CI, 16-32%], n=24/100; 32% in Guatemala [n=7/22], 26% in Mexico [n=15/58], 11% in El Salvador [n=2/19]) and 4.1 (17% [95% CI, 9.6-24%], n=17/100; 23% in Guatemala [n=5/22], 21% in Mexico [n=12/58], not detected from El Salvador).
There were 105 genomes available from South America, of which 92% (n=97) were from a recent national surveillance study in Chile (Maes et al., 2022). South American Typhi were genetically diverse, with no dominant genotype accounting for the majority of cases in the 2010-2020 period (Figure 1a). Genotypes with ≥5% prevalence in the region were 3.5 (27%; n=28/105), 1.1 (18%; n=19/105), 2 (18%; n=19/105), 1.2.1 (5.7%; n=6/105) and 2.0.2 (5.7%; n=6/105). WGS data recently reported by Colombia’s Instituto Nacional de Salud (Guevara et al., 2021) were not included in the regional prevalence estimates as they covered only a subset (5%) of surveillance isolates that were selected to maximise diversity, rather than to be representative. However, only four genotypes were detected in the Colombia study (1.1, 2, 2.5, 3.5), and two-thirds of isolates sequenced were genotype 2.5 (66%; n=51/77); 3.5 was also common, at 26% (n=20/77) (Guevara et al., 2021). Similarly, all five isolates from French Guiana (sequenced via the French reference laboratory) were genotype 2.5, consistent with limited diversity and a preponderance of genotype 2.5 organisms in the north of the continent.
Pacific Islands
In Melanesia and Polynesia, each island nation has its own dominant genotype (Figure 1a): 2.1.7 and its derivatives in Papua New Guinea (n=5/5 in post-2010 genomes, consistent with the longer-term trend) (Dyson et al., 2022), 3.5.3 and 3.5.4 in Samoa (96%; n=249/259, consistent with a recent report) (Sikorski et al., 2022), and 4.2 and its derivatives in Fiji (97%; n=31/32, consistent with recent data that were not yet available at the time of this analysis) (Davies et al., 2022).
Global distribution of AMR
We estimated the regional and national prevalence of clinically relevant AMR profiles in Typhi for the period 2010-2020, inferred from WGS data from non-targeted sampling frames for which country of origin could be determined (as per genotype prevalences, see Methods). In order to understand the potential implications of these AMR prevalences for local empirical therapy, we categorised them according to a traffic light-style system (see Methods), whereby amber colours signal emerging resistance of potential concern (<10%), and red colours signal levels of antimicrobial resistance that may warrant reconsideration of empirical antimicrobial use (>10%; see Figures 2 and S5). The regional view (Figure S5, Table S7) highlights that ciprofloxacin non-susceptibility (CipNS) is widespread, whereas ciprofloxacin resistance (CipR), azithromycin resistance (AziR), and XDR have been mostly restricted to Southern Asia. MDR was most prevalent in African regions, and to a lesser degree in Asia. Full country-level data is mapped in Figure S6 and detailed in Table S8. National estimates for countries with sufficient data where typhoid is endemic (≥50 representative genomes available for the period 2010-2020, see Figure 2) indicate that MDR remains common across all well-sampled African countries (39% in Nigeria, 61% in South Africa, 66% in Ghana, 78% in Kenya, 93% in Malawi), but is much more variable in Asia (3% in India [n=67/2,267] and Nepal [n=36/1,275], 25% in Bangladesh [n=393/1,591], 68% in Pakistan [n=1,004/1,484], 76% in Cambodia [n=167/221]) and essentially absent from Indonesia (n=0), the Philippines (n=0), Samoa (n=0), Mexico (n=1, 1.7%) and Chile (n=0). The underlying genotypes are shown in Figure S7, and highlight that MDR in Asia, Eastern Africa and Southern Africa has been mostly associated with H58 (i.e. 4.3.1 and derived genotypes) but in Western Africa is associated with the dominant genotype in that region, 3.1.1. 
In contrast, ciprofloxacin non-susceptibility was associated with more diverse Typhi genotypes in each country, including essentially all common genotypes in Southern Asian countries (Figure S7). National annual prevalence data suggest AMR profiles were mostly quite stable over the last decade (with the notable exception of the emergence and rapid spread of XDR Typhi in Pakistan) but reveal some interesting differences between settings in terms of AMR trends and the underlying genotypes (see Figures 3, S7-9).
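The traffic light-style categorisation described above can be sketched as a simple threshold rule. This is a minimal illustration rather than the study's implementation: the 10% cut-off comes from the text, but the handling of exactly 10%, the separate "none detected" label for zero prevalence, and the function name are assumptions (the definitive bands are given in Methods).

```python
def traffic_light(prevalence_pct: float, red_threshold: float = 10.0) -> str:
    """Map an AMR prevalence (in percent) to an illustrative colour band.

    Assumptions (not confirmed by the text): a prevalence of exactly
    `red_threshold` is treated as red, and zero prevalence gets its own
    hypothetical label instead of amber.
    """
    if not 0.0 <= prevalence_pct <= 100.0:
        raise ValueError("prevalence must be a percentage between 0 and 100")
    if prevalence_pct == 0.0:
        return "none detected"
    if prevalence_pct < red_threshold:
        return "amber"  # emerging resistance of potential concern
    return "red"        # may warrant reconsidering empirical therapy


# National MDR counts quoted in the text (resistant genomes, total genomes).
mdr_counts = {
    "India": (67, 2267),
    "Bangladesh": (393, 1591),
    "Pakistan": (1004, 1484),
    "Chile": (0, 97),  # n=0 MDR; the denominator of 97 is illustrative only
}

for country, (resistant, total) in mdr_counts.items():
    pct = 100.0 * resistant / total
    print(f"{country}: {pct:.1f}% MDR -> {traffic_light(pct)}")
```

Keeping the threshold as a parameter makes it straightforward to test how sensitive the colour assignments are to the chosen cut-off.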
Ciprofloxacin non-susceptible (CipNS)
Ciprofloxacin non-susceptibility (CipNS) was near-ubiquitous (exceeding 95% prevalence) in India and Bangladesh throughout the period 2010-2020 (Figures 3, S8). This was associated mainly with GyrA-S83F (79% prevalence in Bangladesh, 70% in India) and GyrA-S83Y mutations (9.2% prevalence in Bangladesh, 26% in India), which were detected across diverse genotype backgrounds (Figures S7, S9); in total, CipNS variants were present in 30 genotype backgrounds in India (out of n=34 genotypes, 88%) and 17 in Bangladesh (out of n=21 genotypes, 81%). In neighbouring Nepal, CipNS prevalence has stabilised in the 85-95% range since 2011 (70% GyrA-S83F, 12% GyrA-S83Y; CipNS in 12 genotype backgrounds) (see Figures 3 and S7). The persistence of ciprofloxacin-susceptible Typhi in Nepal was largely associated with genotype 3.3.2, which maintained annual prevalence of 3-10% (mean 5.8%) throughout 2010-2018, rising to 39% in 2019. In Pakistan, CipNS has exceeded 95% since 2012 (Figure 3), across n=14/17 genotypes (Figure S7). Sustained high prevalence of CipNS was also evident in Cambodia (4.3.1.1 with GyrA-S83F). In contrast, CipNS has been relatively rare in African countries, but has been increasing in recent years, especially in Kenya (from 20% in 2012 to 65% in 2016, p=3×10⁻⁹ using proportion trend test) and Nigeria (from 8% in 2013 to 80% in 2019, p=7×10⁻⁶; see Figure 3). CipNS in these settings was associated with QRDR mutations in the locally dominant genotypes, specifically GyrA-S83F (15% of 4.3.1.1.EA1), GyrA-S83Y (100% of 4.3.1.2.EA3) and GyrB-S464F (100% of 4.3.1.2.EA2) in Kenya, and GyrA-S83Y (27% of 3.1.1) in Nigeria (see Figures S7 and S9).
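The proportion trend tests quoted above ask whether annual prevalence changes monotonically over time. As a rough illustration, the sketch below implements a chi-squared test for linear trend in proportions (the Cochran-Armitage form, 1 degree of freedom). That this is the exact variant used in the study is an assumption, and the annual counts are hypothetical values chosen only to demonstrate the calculation.

```python
import math


def proportion_trend_test(events, totals, scores=None):
    """Chi-squared test for linear trend in proportions (1 d.f.).

    events[i] / totals[i] is the observed proportion in group i, and
    scores[i] is the group's position on the trend axis (default 0, 1, 2, ...).
    Returns (chi_squared, p_value).
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need matching event/total lists with at least 2 groups")
    if scores is None:
        scores = list(range(len(events)))
    n_all = sum(totals)
    p_bar = sum(events) / n_all
    # Score-weighted deviation of observed from expected event counts.
    t = sum(s * (x - n * p_bar) for s, x, n in zip(scores, events, totals))
    s_n = sum(s * n for s, n in zip(scores, totals))
    s2_n = sum(s * s * n for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2_n - s_n ** 2 / n_all)
    if variance <= 0:
        raise ValueError("trend is undefined when all proportions are 0 or 1")
    chi2 = t * t / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical annual counts (NOT study data): CipNS genomes / total genomes.
cipns = [4, 8, 12, 16, 26]
total = [20, 25, 30, 30, 40]
chi2, p = proportion_trend_test(cipns, total)
print(f"chi-squared = {chi2:.2f}, p = {p:.2g}")
```

Passing the isolation years as `scores` gives the same statistic as the default when the years are evenly spaced, and handles gaps in sampling when they are not.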
Ciprofloxacin resistant (CipR)
Ciprofloxacin resistance emerges in a stepwise manner in Typhi, through acquisition of additional QRDR mutations and/or PMQR genes in strains already carrying a QRDR mutation. CipR genomes were common (≥10%) in Pakistan, India, and Nepal, and emerging (3-6%) in Bangladesh, South Africa, Chile and Mexico (Figure 2). A total of 26 distinct CipR genotypes (comprising unique combinations of Typhi genotype, QRDR mutations and/or PMQR genes) were identified, of which five were found in appreciable numbers (>5 genomes each, see Figure S10). The XDR strain 4.3.1.1.P1 (carrying GyrA-S83F + qnrS) was first identified in Pakistan in 2016 (Klemm et al., 2018; Rasheed et al., 2020), and here accounted for 87% of Typhi genomes from Pakistan in 2020 and a dramatic rise in CipR prevalence (Figure 3). This genotype was only detected three times without a known origin in Pakistan (one isolate each in India, Mexico, and USA, see Figure S10). The CipR strain 4.3.1.3.Bdq (carrying GyrA-S83F and qnrS) emerged in Bangladesh in ∼1989 (da Silva et al., 2022) and here accounted for 95% of CipR genomes in this country. 4.3.1.3.Bdq genomes were also detected in India (n=4), Singapore (n=1) and South Africa (n=1). The other major CipR genotypes were the QRDR triple-mutant 4.3.1.2.1, its derivative 4.3.1.2.1.1 (which also carries plasmid-borne qnrB), and a QRDR triple-mutant sublineage of 3.3. These three CipR variants were most common in India, where we estimated consistently high CipR prevalence (19-27% per year) from 2014 onwards (Figure 3), associated with 15 unique CipR genotypes (Figure S10). Most Indian CipR genomes belong to 4.3.1.2.1 (92.3%).
CipR 4.3.1.2.1 was also found in 12 other countries, most notably Nepal (accounting for 95% of CipR genomes), where it has been shown to have been introduced from India and result in treatment failure (Thanh et al., 2016a); Pakistan (accounting for 6.6% of CipR genomes); Myanmar (accounting for n=17/17 CipR genomes); and Chile (accounting for n=5/5 CipR genomes) (see Figure S10). The 3.3 QRDR triple-mutant accounted for 3.8% of CipR genomes in India, and was also found in neighbouring Nepal (n=4, 3% of CipR). CipR genomes were identified from Zimbabwe (4.3.1.1.EA1 with gyrA S83F + qnrS, associated with recent CipR outbreaks (Thilliez et al., 2022)) and South Africa (five different genotypes, totalling 3.5%; see Figure S10), but were otherwise absent from African Typhi genomes.
Multidrug resistant (MDR)
Prevalence of MDR (co-resistance to ampicillin, chloramphenicol and co-trimoxazole) has declined in India (p=2×10⁻⁹ using proportion trend test) to 2% (0-3% per year, 2016-2020), and is similarly rare in Nepal (mean 5% in 2011-2019) (see Figure 3). MDR prevalence has also declined in Bangladesh (p=2×10⁻⁴ using proportion trend test), but remains high enough to discourage deployment of older first-line drugs, with prevalence exceeding 20% in most years (see Figure 3). In Pakistan, the emergence of the XDR strain 4.3.1.1.P1 has driven up MDR prevalence dramatically (p=4×10⁻¹¹ using proportion trend test), to 87% in 2020 (see Figures 3 and S7b). MDR prevalence has remained high in Kenya and Malawi since the first arrival of MDR H58 strains (estimated early 1990s in Kenya (Kariuki et al., 2021); 2009 in Malawi (Feasey et al., 2015)), but has declined steadily in Nigeria, from 72% in 2009 to 10% in 2017 (p=3×10⁻⁴ using proportion trend test; see Figure 3). All MDR isolates in Nigeria were genotype 3.1.1 and carried large IncHI1 MDR plasmids, which are associated with a fitness cost (Doyle et al., 2007). Chromosomal integration of the MDR transposon, which accounted for 100% of MDR in Malawi and 19% in Kenya (all in H58 genotype backgrounds), is associated with comparably lower fitness cost; and this difference in fitness cost may explain why MDR has remained at high prevalence in some settings (where resistance is chromosomally integrated) while declining in other settings (where resistance is plasmid-borne).
Figure S11 shows prevalence of MDR overlaid with prevalence of IncHI1 plasmid carriage amongst MDR strains. Two countries showed a significant rise in MDR prevalence (Pakistan, p=4×10⁻¹¹; South Africa, p=9×10⁻⁸); in both countries, this rise coincided with loss of IncHI1 plasmids (see Figure S11) and assumed migration of MDR to the chromosome (as has been clearly shown in XDR 4.3.1.1.P1 strains in Pakistan) (Klemm et al., 2018). A decline in the prevalence of MDR over time was observed in Cambodia as in Nigeria, whereby all MDR strains belonged to the same genotype (4.3.1.1 in Cambodia, 3.1.1 in Nigeria) and carried the IncHI1 plasmid (see Figure S11). As noted above, MDR was maintained at high levels in Kenya and Malawi, where the IncHI1 plasmid frequency was either in decline (Kenya) or entirely absent (Malawi; see Figure S11). Notably, a significant decline in total MDR prevalence was observed in Bangladesh (p=2×10⁻⁴), and in MDR prevalence within the dominant genotype 4.3.1.1 (p=0.049), despite the majority of MDR (and all MDR within 4.3.1.1) being chromosomal rather than plasmid-associated (Rahman et al., 2020; da Silva et al., 2022). However, as noted above, MDR did persist in Bangladesh (exceeding 20% in most years). This is consistent with the hypothesis that the MDR plasmid is associated with a fitness cost that is removed when the MDR transposon becomes chromosomally-integrated.
Extensively drug resistant (XDR)
The XDR 4.3.1.1.P1 sublineage (that is, MDR with additional resistance to fluoroquinolones and third-generation cephalosporins including ceftriaxone) was recognised as emerging in late 2016 in Sindh Province, where it caused an outbreak of XDR typhoid that has since spread throughout Pakistan (Klemm et al., 2018; Nair et al., 2021; Rasheed et al., 2020). Here, we identified the genome of strain Rwp1-PK1 (assembly accession NIFP01000000), isolated from Rawalpindi in July 2015, as genotype 4.3.1.1.P1. Rwp1-PK1 was isolated from a 17-year-old male with symptomatic typhoid whose infection did not resolve following ceftriaxone treatment and was found to be phenotypically XDR (resistant to ampicillin, co-trimoxazole, chloramphenicol, ciprofloxacin, ceftriaxone) (Munir et al., 2016). The isolate was later sequenced and reported as carrying blaCTX-M-15, blaTEM-1, qnrS1 and GyrA-S83F (Gul et al., 2017), but was neither genotyped nor included in comparative genomics analyses investigating the emergence of XDR in Pakistan, so has not previously been recognised as belonging to the 4.3.1.1.P1 XDR sublineage. We found that the Rwp1-PK1 genome carries the 4.3.1.1.P1 marker SNV, clusters with the 4.3.1.1.P1 sublineage in a core-genome tree (Figure 4), and shares the full set of AMR determinants typical of 4.3.1.1.P1, indicating that this XDR strain was present in northern Pakistan for at least a full year before it was reported as causing outbreaks in the southern province of Sindh.
Ceftriaxone resistant (CefR)
There was no evidence for establishment of 4.3.1.1.P1 nor other XDR lineages outside Pakistan. However, ESBL genes were identified in n=32 non-4.3.1.1.P1 genomes, belonging to eight other genotypes (Table 3). Several carried blaCTX-M-15; these include instances with no other acquired AMR genes (genotype 3 in the Philippines (Hendriksen et al., 2015b; Lagrada et al., 2022); genotype 4.3.1.2 in Iraq (Nair et al., 2021)); one instance with chromosomally integrated AMR genes plus IncY plasmid-borne blaCTX-M-15 (genotype 2.5.1 in DRC (Phoba et al., 2017)); and instances with a 4.3.1.1.P1-like profile carrying qnrS in the IncY plasmid and the MDR locus in the chromosome (n=4 4.3.1, India and Pakistan; n=1 4.3.1.1, Pakistan; see Table 3). However, overall, blaCTX-M-15 IncY plasmids were rare (n=1 to 4 genomes) in all genotype backgrounds except 4.3.1.1.P1 (total n=655), suggesting that the IncY blaCTX-M-15 plasmid has not been stably maintained in other Typhi lineages (see Table 3). IncY plasmids were also identified in a single genotype 2.3.3 organism isolated in the UK in 1989 associated with travel to Pakistan (carrying catA1, tetA(B)); and in a sublineage of IncHI1-negative 3.1.1 genomes from Nigeria carrying blaTEM-1D, dfrA14, sul2, tetA(A), as has been recently reported (Ikhimiukor et al., 2022a; International Typhoid Consortium et al., 2016). Other examples of ESBL carriage in Typhi genomes appear to represent isolated events (1 or 2 genomes per ESBL/plasmid or ESBL/genotype combination, see Table 3), except for a sublineage of 4.3.1.2.1 from India carrying blaSHV-12 in an IncX3 plasmid backbone.
Concerningly, the plasmid also carries qnrB and is present in the well-established 4.3.1.2.1 QRDR triple-mutant strain background, resulting in a combination of resistance to ciprofloxacin, third-generation cephalosporins and ampicillin (Argimón et al., 2021; Chattaway et al., 2021; Ingle et al., 2021; Jacob et al., 2021b) (although lacking resistance determinants for chloramphenicol, co-trimoxazole, and azithromycin). This group comprised 15 isolates from Mumbai (Argimón et al., 2021; Jacob et al., 2021b) (across two studies, 2015-2018), plus three additional isolates from travellers returning to England, Australia, and the USA from India (Chattaway et al., 2021; Ingle et al., 2021) (2018-2020). This strain therefore appears to have originated in Mumbai and persisted there for at least six years (2015-2020), but our data do not indicate onward spread out of Maharashtra or India.
Azithromycin resistant (AziR)
Azithromycin resistance (AziR) associated mutations in acrB were identified in 74 genomes. The majority of acrB mutants were from Bangladesh (n=55, 73%), followed by India (n=11, 15%) (see Figure 5a), although the overall prevalence of resistance was very low even in these locations (2.6% in Bangladesh, 0.5% in India). Thirteen distinct combinations of genotype and acrB mutation were identified, implying at least thirteen independent events of AziR emergence; six were singleton isolates, and four were represented by 2-3 isolates each (Figure 5b). The three more common AziR variants all carried R717Q, in 4.3.1.1 (n=38, mainly from Bangladesh), 3.2.2 (n=12, from Bangladesh) or 4.3.1.2 (n=7, from India). Notably, half (n=7/13) of all acrB/genotype combinations were identified in Bangladesh (see Figure 5b). All acrB mutants also carried QRDR mutations, and eight were CipR: n=6 belong to the CipR 4.3.1.2.1 lineage in India (all carried R717Q and were isolated in 2017 in Chandigarh) and n=2 belong to the CipR 4.3.1.3.Bdq lineage (both carried R717L and were isolated in 2019, one in Singapore and one in Bangladesh).
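The lower bound on independent AziR emergence events follows from counting distinct genotype/mutation pairs: the same mutation arising in two different genotype backgrounds must reflect at least two separate events. A minimal sketch of that tally is shown below, using a small hypothetical set of records (the genotype and mutation labels appear in the text, but the per-record counts are invented for illustration and are not the study data).

```python
from collections import Counter

# Hypothetical (genotype, acrB mutation) records, one per AziR genome.
records = [
    ("4.3.1.1", "R717Q"),
    ("4.3.1.1", "R717Q"),
    ("4.3.1.1", "R717Q"),
    ("3.2.2", "R717Q"),
    ("3.2.2", "R717Q"),
    ("4.3.1.2.1", "R717Q"),
    ("4.3.1.3.Bdq", "R717L"),
]

# Each distinct pair implies at least one independent emergence event.
combo_counts = Counter(records)
min_independent_events = len(combo_counts)
singletons = [combo for combo, n in combo_counts.items() if n == 1]

print(f"distinct genotype/mutation combinations: {min_independent_events}")
print(f"singleton combinations: {len(singletons)}")
for (genotype, mutation), n in combo_counts.most_common():
    print(f"  {genotype} + {mutation}: n={n}")
```

This count is a lower bound only: the same mutation could have arisen more than once within a single genotype, which would require phylogenetic analysis to detect.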
Robustness of national estimates across studies
The estimates of genotype and AMR prevalence represented here reflect post hoc analyses of data that were generated for a variety of different primary purposes in different settings, by different groups using varied criteria for sample collection, including in-country surveillance and travel-associated cases recorded in other countries. To explore the robustness of these national-level estimates, we compared prevalence estimates for the same country from different studies/sources, where sufficient data existed to do so.
Southern Asian countries were each represented by multiple in-country data sources plus travel-associated data collected in three or four other countries. Figure S12 shows genotype prevalence estimates derived from these different sources (for laboratories contributing ≥20 isolates each) and Figure 6a shows the annual genotype frequency distributions (for years with ≥20 isolates). In most cases (67% of genotype-source combinations), genotype prevalences estimated from individual source laboratories yielded 95% confidence intervals (CIs) that overlapped with those of the pooled national estimates (see Figure S12). The main exception was for genotype 4.3.1.2 in India; for most source laboratories (many contributing via the Surveillance for Enteric Fever in India (SEFI) network (Carey et al., 2020; da Silva et al., 2022)), this was the most prevalent genotype, but the point estimates ranged from 16% to 82%, compared with the pooled estimate of 53.4% (95% CI, 51.4-55.5%), and 95% CIs were frequently non-overlapping (see Figure S12). High prevalence of 4.3.1.2 was estimated from contributing laboratories in urban Vellore (82% [95% CI, 78-87%]), Chennai (67% [56-77%]), Bengaluru (70% [62-78%]), and Mumbai (two laboratories, estimates 74% [65-83%] and 63% [46-79%]); with lower prevalence in northern India, New Delhi (three laboratories, estimates 48% [28-68%], 40% [31-49%], 39% [22-56%]) and Chandigarh (39% [33-45%]). Two Indian laboratories were clear outliers, with little or no 4.3.1.2 but very high prevalence of a different genotype: 4.3.1.1 in rural Bathalapalli (81% [67-95%]) and 2.5 in the northern city of Ludhiana (77% [66-88%]). The relative prevalence of 4.3.1.1.P1 (XDR lineage) in Pakistan versus its parent lineage 4.3.1.1 also varied between sources, which could be explained by differences in the sampling periods and locations relative to the emergence of 4.3.1.1.P1 (see Figure 6a).
AMR prevalence estimates were also highly concordant across data sources (see Figure S13), and showed strikingly similar temporal trends (Figure 6b).
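The concordance check used in this section, whether a single-laboratory 95% CI overlaps the pooled national CI, reduces to a simple interval comparison. The sketch below applies it to a few of the 4.3.1.2 estimates for India quoted above; the helper name and the choice of which laboratories to include are illustrative assumptions, and "New Delhi (first listed)" simply refers to the first of the three New Delhi intervals in the text.

```python
def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Return True if closed intervals a and b share at least one point."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("interval bounds must be ordered (low, high)")
    return a_lo <= b_hi and b_lo <= a_hi


# Pooled national estimate for genotype 4.3.1.2 in India (95% CI, percent).
pooled_ci = (51.4, 55.5)

# Single-laboratory 95% CIs quoted in the text (percent).
lab_cis = {
    "Vellore": (78.0, 87.0),
    "Chennai": (56.0, 77.0),
    "Bengaluru": (62.0, 78.0),
    "New Delhi (first listed)": (28.0, 68.0),
    "Chandigarh": (33.0, 45.0),
}

overlap = {lab: intervals_overlap(ci, pooled_ci) for lab, ci in lab_cis.items()}
for lab, ok in overlap.items():
    print(f"{lab}: {'overlaps' if ok else 'does not overlap'} pooled CI")
```

Overlap of confidence intervals is a conservative, informal indicator of agreement rather than a formal test of difference, and wide intervals from small laboratories will overlap almost any pooled estimate.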
The only other country represented by ≥10 sequenced isolates each from multiple laboratories was Nigeria; these were located in Abuja (Zankli Medical Center, n=105, 2010-2013) and Ibadan (University of Ibadan, n=14, 2017-2018), and reference laboratories in England (n=15, 2015-2019) and the USA (n=10, 2016-2019) (see Figure 7). Genotype prevalence estimates were concordant across different sources, with single-laboratory 95% CIs overlapping with one another and with the pooled point estimate, for all five common genotypes (see Figure 7). The exception was that genotype 3.1.1 accounted for all n=14/14 isolates sequenced from Ibadan, but ranged from 53-70% prevalence at other laboratories and yielded a pooled national prevalence estimate of 67% [95% CI, 60-75%] (see Figure 7a, c). AMR prevalence estimates for Nigeria were more variable across laboratories (see Figure 7b), but this could be explained by their non-overlapping sampling times: Abuja data from early years (2010-2013) showed high MDR (49%) and low CipNS (4%); whereas Ibadan data from later years (2017-2018) showed comparatively lower MDR (21%) and higher CipNS (79%), in agreement with contemporaneous travel data (12% MDR, 60% CipNS, from total n=25 isolated 2015-2019).
Discussion
Strengths and limitations
This study presents the most comprehensive genomic snapshot of Typhi to date, with 12,965 high-quality genomes originating from 111 countries in 21 world regions. The consortium model provides improved consistency and completeness of source data aggregated from 77 laboratories and 66 unique studies. Our dataset also includes 1,290 novel genomes sequenced by public health laboratories that would not otherwise have been published, including travel data from countries not previously represented in published Typhi genomics studies (e.g. El Salvador, Guatemala, Haiti, Mexico and Peru). However, it is a post hoc analysis of isolates that were cultured in different contexts (including routine diagnostics, as well as study settings where culture would not normally be undertaken) and sequenced for different reasons (including retrospective studies, outbreak investigations, and routine surveillance). The study therefore has important limitations, most notably the scarcity of genomic data from many countries and world regions where typhoid is believed to be endemic (GBD 2019 Antimicrobial Resistance Collaborators et al., 2022), including Northern and Middle Africa, Western Asia, as well as Central and South America (Figs 1-3, S2-8). These genomic data gaps reflect an underlying lack of routine blood culture or sustained blood-culture surveillance, and limited resources and expertise in many settings (Ikhimiukor et al., 2022b; Iskandar et al., 2021). In addition, public health authorities may be disincentivised to generate, analyse, and publish genomic data; we hope that this analysis strengthens the case for data generation and sharing for public good.
Substantial investments have been made in recent years to improve and expand microbiological surveillance capacity in some low- and middle-income countries, but major regional surveillance gaps remain. It is therefore important to maximise information recovery from available data sources, especially WGS, which provides data on the emergence and spread of AMR variants. While the inference of AMR phenotype from WGS is currently highly reproducible and accurate for Typhi (Argimon et al., 2021; Chattaway et al., 2021), continued phenotypic antimicrobial susceptibility testing remains crucial to monitor for emerging mechanisms and to guide changes in empiric therapy.
For now, routine sequencing of travel-associated Typhi infections diagnosed in high-income countries helps to fill some molecular surveillance gaps for some regions, assuming that accurate travel history is available and the sequence and metadata (including country of origin) are shared (Ingle et al., 2019). For example, our study included >3,000 genomes shared by public health reference laboratories in England, Australia, New Zealand, France, Japan, and the USA. These infections mostly originate in other countries, and can in principle provide informative, if informal, sentinel surveillance for pathogen populations in countries with strong travel and/or immigration links to those with routine sequencing (Ingle et al., 2019). Indeed, for some countries and regions, travel data represented most or all of the available genome data (see Table 2, S4). In this study, where multiple data sources were available for the same country, we found that national genotype and AMR prevalence estimates for the period 2010-2020 were largely concordant between local surveillance studies and travel-associated cases captured elsewhere (Figs 6-7, S12-13), particularly when comparing contemporaneous annual prevalence estimates (Figs 6, 7c). This shows that travel-associated Typhi isolated in low-burden countries can be informative for surveillance of some high-burden countries, which should serve as an incentive for public health reference laboratories to share their data to the fullest extent they are able to under local regulations.
Another key limitation stemming from the post hoc nature of this study is that it is hard to assess how representative the prevalence estimates are for a given region/country and timeframe. The GTGC has developed new source/metadata standards for Typhi (see Methods) that include information on the purpose of sampling, which were completed by the original owners of each dataset (data available in Table S2). Such ‘purpose-of-sampling’ fields are currently lacking from metadata templates used for submission of bacterial genomes to the public sequencing archives (e.g. NCBI, ENA), and our approach was modelled on that established for sharing of SARS-CoV-2 sequence data, designed by the PHA4GE consortium (Griffiths et al., 2022). In this study, the purpose-of-sampling information was used to identify the subset of genome data that could be reasonably considered to be representative of national annual trends in genotype and AMR prevalence for public health surveillance purposes (n=9,478 genomes post 2010; Figs 1-3). These originate mainly from local typhoid surveillance studies (59%), or routine diagnostics/surveillance capturing locally acquired (19%) and travel-associated (24%) infections. The comparisons of estimates for a given country based on different sources of genomes (Figs 6-7, S12-13) are reassuring that the general scale and trends of AMR prevalence are reliable. The genome-based estimates are also in broad agreement with available phenotypic prevalence data on AMR in Typhi (Browne et al., 2020; Kariuki et al., 2015), although systematic aggregation of susceptibility data is limited. Notably, the genome data add an additional layer of information on resistance mechanisms and the emergence and spread of lineages or variants.
Importantly, our study shows clearly that, whilst much attention has been given to the emergence and spread of drug-resistant H58 Typhi, other clones predominate outside of Southern Asia and Eastern Africa (Figure 1) and can be associated with non-susceptibility to ciprofloxacin (Figures S7, S9), azithromycin (Figure 6) or ceftriaxone (Table 3), which are included in the World Health Organization Essential Medicines List as first choice treatment for enteric fever (World Health Organization, 2019).
AMR
Our data demonstrate that CipNS is emerging or established in all regions except Melanesia (here represented by n=35 genomes from Fiji and Papua New Guinea, mainly from 2010, although more recent reports support a lack of CipNS in Fiji (Davies et al., 2022; Strobel et al., 2019)) (see Figure S5). For countries with sufficient data to assess (≥50 genomes), CipNS was emerging or established in all countries except Ghana (Figures 2, S8), with no evidence of declining prevalence (Figures 3, S8). A diverse range of genotypes and QRDR mutations are involved (Figures S7, S9), likely reflecting the lack of fitness cost associated with these mutations (Baker et al., 2013). That QRDR mutations are so widespread is highly concerning, as infections with CipNS strains can take longer to resolve, and full clinical resistance can emerge relatively easily against this background, through acquisition of either a mobile qnr gene (as occurred in 4.3.1.1.P1 in Pakistan) or additional QRDR mutations (as occurred in 4.3.1.2.1 in India). Notably, the data suggest CipR typhoid is now a well-established problem across Southern Asia and is emergent in Chile, Mexico and South Africa (Figures 2, 3, S5, S8). A recent study estimating national annual antibiotic consumption highlighted differences in rates of fluoroquinolone usage between regions and countries, which could potentially drive these differences in resistance prevalence (Browne et al., 2021). The highest rates of fluoroquinolone consumption were estimated in South Asian countries, rising from 1.67 defined daily doses (DDD) per 1,000 per day in 2000 to 2.81 DDD/1,000/day in 2010 and 2.94 DDD/1,000/day in 2018 (see https://www.tropicalmedicine.ox.ac.uk/research/oxford/microbe/gram-project/antibiotic-usage-and-consumption).
Fluoroquinolone consumption was also estimated to increase substantially in Latin America, rising from 0.64 DDD/1,000/day in 2000 to 1.85 DDD/1,000/day in 2010 and 2.26 DDD/1,000/day in 2018. Our data show the highest CipR burden is associated with four main variants (Figure S10). In Pakistan, India and Bangladesh, it is associated with locally emerged variants; however, the relatively high burden in Nepal is associated with variants acquired from India (Britto et al., 2018; Thanh et al., 2016a). In other regions, CipR burden is low and so far linked mainly to the spread of 4.3.1.2.1 out of India (Britto et al., 2020; da Silva et al., 2022), plus occasional de novo emergence of resistant variants, which show no evidence of geographical spread (Figure S10). However, the high rates of CipNS in Kenya (53%) and Nigeria (40%) are concerning, especially given the increasing usage of fluoroquinolones in these countries (estimated 2.1 DDD/1,000/day in 2018 in Kenya and 2.76 DDD/1,000/day in Nigeria) (Browne et al., 2021), which could potentially drive local emergence and spread of CipR.
While resistance to azithromycin and ceftriaxone has been detected (Table 3, Figures 4, 5, 6b, S5, S6, S8), its prevalence remains low and, with the exception of XDR 4.3.1.1.P1, clonal expansion of resistant variants has not been observed. To our knowledge, there are no data reported on the fitness cost of acrB mutations or CefR plasmids in Typhi; however, the genomic evidence suggests a higher fitness cost compared with QRDR mutations, providing further support for the use of ceftriaxone or azithromycin over ciprofloxacin as we work to introduce preventative measures. Most instances of ESBL-gene carriage in Typhi (conferring CefR phenotype) have been short-lived (Table 3), suggesting selection against the acquisition of new ESBL genes or plasmids. The expansion and dominance of the XDR 4.3.1.1.P1 genotype in Pakistan is concerning (Figures 4, 6, S7a, S8); however, despite circulating at high prevalence in Pakistan for more than five years, the strain remains azithromycin-susceptible.
There is also limited evidence of local transmission of 4.3.1.1.P1 in other countries; however, most countries near Pakistan have limited data available. A short local outbreak of XDR 4.3.1.1.P1 was reported in China, linked to contamination of an apartment block’s water (Wang et al., 2022) and non-travel associated cases have been reported in the USA (Hughes et al., 2021). Notably, a CefR+CipR lineage of 4.3.1.2.1 that appears to be well-established in Mumbai, India, has been isolated only occasionally since 2015 (Argimón et al., 2021; Chattaway et al., 2021; Ingle et al., 2021; Jacob et al., 2021b) (Table 3); however, this is the only example of persistence of a CefR strain besides 4.3.1.1.P1, and there is no evidence it has yet spread outside Mumbai. We hypothesise that the lack of widespread dissemination of 4.3.1.1.P1 and ESBL-positive 4.3.1.2.1 so far may be due to the fitness cost imposed by the associated plasmids (∼85 Kbp IncY plasmid in 4.3.1.1.P1 (Klemm et al., 2018); ∼43 Kbp IncX3 plasmid in 4.3.1.2.1 (Argimón et al., 2021)). The temporal trend data on MDR prevalence and IncHI1 plasmids (Figure S11) suggest that migration of the MDR locus from the plasmid to the chromosome may have mitigated the fitness cost associated with plasmid-borne MDR. The same may be true for ESBL genes, that is, the movement of the ESBL locus from the plasmid to the chromosome (as has recently been reported in 4.3.1.1.P1 (Nair et al., 2021)) may result in a fitter CefR or XDR variant that can spread more easily. Our data show acrB mutations are occurring spontaneously and independently in multiple locations across a variety of genetic backgrounds (Figure 5). While they are still not prevalent, increased use of azithromycin through public health programmes (e.g. trachoma elimination) as well as widespread misuse of azithromycin to treat SARS-CoV-2 infections and use of azithromycin as first-line therapy for typhoid-like illness may lead to increased selection pressure. 
It will therefore be important to maintain and expand genomic surveillance, particularly in typhoid endemic countries where azithromycin is used widely. It is also notable that, while they are rare overall, acrB mutations have already arisen in two of the most common CipR lineages (4.3.1.2 and 4.3.1.3.Bdq); this relatively frequent co-occurrence warrants continued monitoring and investigation. While we did not detect the mobile azithromycin resistance gene mphA, it is circulating in other S. enterica serovars (Nair et al., 2016; Tack et al., 2022) and other enteric bacteria that share plasmids with Typhi (including the human-specific Shigella (Baker et al., 2018)), providing another potential mechanism for emergence of azithromycin resistance in Typhi.
Applications of genomic surveillance for typhoid fever control
We are at a pivotal stage in the history of typhoid control. Wider access to clean water and improved sanitation have led to a major reduction in global incidence of typhoid fever, which has also been reflected in declining incidence of other enteric diseases (Steele et al., 2016). This should continue but will require sustained investment from national and local governments and thus remains a long-term objective. In the short to medium term, widespread use of typhoid conjugate vaccines (TCVs) can help to further reduce global incidence of typhoid fever. The WHO has prequalified two TCVs and recommended their use in endemic countries, as well as settings where a high prevalence of AMR Typhi has been reported (World Health Organization, 2018). Gavi, the Vaccine Alliance, has committed funds to support the procurement and distribution of TCVs in typhoid endemic countries (Gavi: The Vaccine Alliance, n.d.). Four countries have undertaken Gavi-supported national introductions (Pakistan, Liberia, Zimbabwe, Nepal) and one country has self-financed a national introduction (Samoa) (Neuzil, 2020; Sikorski, 2020). In Pakistan and Zimbabwe, TCV introduction was stimulated by the occurrence of AMR Typhi outbreaks in major urban centres, highlighting that the case for prevention can be stronger when curative therapy is less available. Additional support is likely required to inform TCV decision-making in other typhoid endemic countries, particularly where burden and AMR data are scarce.
With increasingly limited treatment options, vaccines are an even more important tool to mitigate the public health burden of AMR Typhi, both through the prevention of drug-resistant infections and through broader, indirect effects, like reduction of empiric antimicrobial use leading to reduced selection pressure. While TCVs have been shown to be highly effective against drug-resistant Typhi (Batool et al., 2021; Yousafzai et al., 2021), public health policymakers have to weigh the value of TCVs against other competing immunisation priorities. While TCV introduction is scaled up globally, antimicrobial stewardship should also be prioritised.
Aggregated, representative data showing distribution and temporal trends in AMR can inform local treatment guidelines to extend the useful lifespan of antimicrobials licensed to treat typhoid fever, potentially including reverting to former last-line drugs in some settings. The traffic light system presented in this analysis (see Figures 2 and S5) provides a framework for monitoring trends in AMR and adjusting empiric therapy guidelines accordingly. Genomic surveillance has a particularly important role to play in monitoring for changes in clinically important resistances in Typhi, as a shift in resistance mechanism or early evidence of clonal spread, which can only be identified definitively using WGS, could provide early warning of a likely increase in prevalence. This study provides an analytical framework for Typhi genomic analysis, based on an open, robust, reproducible data flow and analysis framework leveraging open-access online data analysis platforms (Typhi Mykrobe for read-based genotyping (Ingle et al., 2022); the GHRU pipeline for genome assembly (Underwood, 2020), and Typhi Pathogenwatch for assembly-based genotyping and tree-building (Argimon et al., 2021)). We have made available all data processing and statistical analysis code, and underlying sequence and metadata, via GitHub and FigShare (see Methods). Together, these provide (i) a comprehensive data and code resource for the research and public health communities interested in typhoid surveillance data; (ii) a model for the inclusion of WGS in project-based or routine surveillance studies of typhoid that can be readily replicated and adapted; and (iii) a sustainable model for aggregated analysis of typhoid genomic surveillance data that can readily incorporate new data and extract features (genotypes, AMR determinants, plasmid replicons) of importance to clinical and public health audiences. 
Notably, this Consortium-driven effort shows that new insights can be gained from aggregated analysis of published data, which were not evident from the individual contributing studies, for example (i) the XDR strain 4.3.1.1.P1 existed in Pakistan in 2015, a year earlier than previously reported (Figure 4); (ii) the CefR+CipR strain reported in Mumbai (Argimón et al., 2021; Jacob et al., 2021b) has persisted from at least 2015 to 2020 and is now more easily identified as 4.3.1.2.1 with blaSHV-12; (iii) persistence of MDR in certain settings is correlated with migration of MDR from plasmid to chromosome (Figure S11), which has implications for the future persistence and potentially spread of ESBL strains.
This dataset provides clear, actionable information about the distribution and temporal trends in AMR across multiple countries and regions. Where data gaps exist, the potential of travel-associated data to serve as “sentinel” surveillance has been demonstrated previously (Ingle et al., 2019) and is supported by additional data included in this analysis. These data can and should inform prioritisation of TCV introduction and improvements to water, sanitation, and hygiene (WASH) infrastructure. Sustaining and expanding genomic surveillance can also facilitate measuring the impact of TCV introduction on local bacterial populations, as has been done for previous vaccines like pneumococcal conjugate vaccines. In addition, monitoring for potential “strain replacement” with other Salmonella serovars following TCV introduction can and should inform the prioritisation of the development and deployment of future combination Salmonella vaccines.
The SARS-CoV-2 pandemic illustrated the power of open, continuous data sharing and crowdsourced analysis, and the importance of ensuring that genomic surveillance leads to local benefits. The scale of this analysis, which was made possible through the efforts of an extensive network of collaborators, enables the extraction of key insights of public health relevance. The authors hope that this Consortium effort serves as a starting point for continued data generation and sharing and collective analysis, with additional participation from an expanded group of stakeholders. In particular, we hope that researchers and public health authorities from areas with little publicly available data see the value of reporting and sharing genomic data for collective public health benefit. In addition, we hope that the current momentum for donor and government support of molecular surveillance is sustained, so that additional groups are able to generate their own data and fill regional data gaps to inform local public health action.
Data Availability
All data analysed during this study are publicly accessible. Raw Illumina sequence reads have been submitted to the European Nucleotide Archive (ENA), and individual sequence accession numbers are listed in Table S2. The full set of n=13,000 genome assemblies generated for this study is available for download from FigShare: doi 10.26180/21431883. All assemblies of suitable quality (n=12,849) are included in the online platform Pathogenwatch (https://pathogen.watch/organisms/styphi), where they can be interactively explored and included in user-driven comparative analyses. All underlying code developed for data analysis is freely available at https://github.com/katholt/TyphoidGenomicsConsortiumWG1.
Author contributions
Conceptualisation: MEC, ZAD, SB, KEH;
Methodology: DMA, SA, ZAD, KEH;
Formal analysis: MEC, ZAD, DJI, KEH, AA, MA, MAC, KLC, JAC, FAF, BPH, KHK, MM, CMP, SVP, HEW;
Investigation: DMA, AOA, APA, JRA, SA, SB, BB, AB, IB, MEC, MAC, JDC, KdS, AD, JdL, PGD, CD, GD, SD, ZAD, NAF, DOG, GG, MAG, ARG, CG, MG, RH, RSH, KEH, YH, JCH, BH, BPH, OK, DJI, SI, JJ, CJ, AK, AK, SK, AK, RK, RMK, ACL, MML, SPL, GAM, MM, TAM, CM, AM, GN, SN, SN, TKN, SNB, ENN, INO, SPBP, AJP, AKP, FQ, FNQ, SIAR, SDR, DAR, KLR, PR, RRB, TRC, JPR, SS, SS, KS, MSIS, JCS, JS, VS, JS, RS, SS, MS, MJS, AS, AMS, KAT, DT, AMT, MT, MST, RT, NRT, ST, KV, MV, SVP, BV, JW, LFW, HEW, FXW, JW;
Resources: NRT;
Data curation: SA, MEC, ZAD, KEH, DJI, JAK;
Writing – original draft preparation: MEC, ZAD, KEH, DJI;
Writing – review & editing: all authors, particularly FXW, JAC, JRA, SA, MM, NAF, KT, PMA, SPL, CD, RJ, and CAM;
Visualisation: MEC, ZAD, DJI, KEH, AA, MA, MAC, KLC, JAC, FAF, BPH, KHK, MM, CMP, SVP, HEW;
Project administration: DMA, SB, MEC, ZAD, KEH.
Competing interests
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: NAF chairs the Wellcome Surveillance and Epidemiology of Drug Resistant Infections (SEDRIC) group, which has a focus on antimicrobial resistance, and has no financial interests to declare. AJP is chair of the UK Department of Health and Social Care’s (DHSC) Joint Committee on Vaccination and Immunisation (JCVI) but does not take part in the JCVI COVID-19 committee. He was a member of WHO SAGE until 2022. AJP’s employer, Oxford University, has entered into a partnership with AstraZeneca for development of a COVID-19 vaccine. AJP has provided advice to Shionogi & Co., Ltd on development of a COVID-19 vaccine. KMN receives grant support from the Bill and Melinda Gates Foundation for TyVAC – the Typhoid Vaccine Acceleration Consortium. IB has consulted for BlueDot, a social benefit corporation that tracks the spread of emerging infectious diseases. All other authors had no competing interests to declare.
Global Typhoid Genomics Consortium Group Authorship
Peter Aaby1, Ali Abbas2, Niyaz Ahmed3, Saadia Andleeb4, Abraham Aseffa5, Kate S. Baker6, Adwoa Bentsi-Enchill7, Robert F. Breiman8, Carl Britto9, Josefina Campos10, Chih-Jun Chen11, Chien-Shun Chiou12, Viengmon Davong13, Marthie Ehlers14, Abul Faiz15, Danish Gul4, Rumina Hasan16, Mochammad Hatta17, Aamer Ikram18, Lupeoletalalelei Isaia19, Jan Jacobs20,21, Simon Kariuki22, Fahad Khokhar23, Elizabeth Klemm24, Laura M. F. Kuijpers25, Gemma Langridge26, Kruy Lim27, Octavie Lunguya28,29, Francisco Luquero30, Calman A. MacLennan31, Florian Marks32,33,34,35, Masatomo Morita36, Mutinta Muchimba37, James C.L. Mwansa38, Kapambwe Mwape37,39,40, Jason M. Mwenda41, John Nash42, Kathleen M. Neuzil43, Paul Newton13,44,45, Stephen Obaro46,47, Sophie Octavia48, Makoto Ohnishi36, Michael Owusu49, Ellis Owusu-Dabo50, Se Eun Park32,51, Julian Parkhill23, Duy Thanh Pham52, Marie-France Phoba28, Derek J. Pickard53, Pilar Ramon-Pardo54, Farhan Rasheed55, Assaf Rokney56, Priscilla Rupali57, Ranjit Sah58, Sadia Shakoor59,60, Michelo Simuyandi37, Arvinda Sooka61, Jeffrey D. Stanaway62, A. Duncan Steele63, Bieke Tack20, Adama Tall64, Neelam Taneja65, Mekonnen Teferi66, Sofonias Tessema67, Gaetan Thilliez26, Paul Turner68, James E. Ussher69, Annavi Marie Villanueva70, Bart Weimer71, Vanessa K. Wong72, Raspail Carrel Founo Zangue73
Affiliations
1 Bandim Health Project, Guinea-Bissau
2 Department of Microbiology, Faculty of Veterinary Medicine, University of Kufa, Najaf, Iraq
3 University of Hyderabad, Hyderabad, India
4 Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology (NUST), Islamabad, Pakistan
5 Armauer Hansen Research Institute, Addis Ababa, Ethiopia
6 University of Liverpool, Liverpool, United Kingdom
7 World Health Organization, Geneva, Switzerland
8 Emory University, Atlanta, USA
9 Boston Children’s Hospital, Boston, USA
10 INEI-ANLIS “Dr Carlos G. Malbrán”, Buenos Aires, Argentina
11 Chang Gung Memorial Hospital, Taoyuan, Taiwan
12 Centers for Disease Control, Taipei, Taiwan
13 Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, Laos
14 University of Pretoria, Pretoria, South Africa
15 Dev Care Foundation, Dhaka, Bangladesh
16 Aga Khan University, Karachi, Pakistan
17 Department of Molecular Biology and Immunology, Faculty of Medicine, Hasanuddin University, Makassar, Indonesia
18 National Institute of Health, Islamabad, Pakistan
19 Ministry of Health, Government of Samoa, Apia, Samoa
20 Department of Clinical Sciences, Institute of Tropical Medicine, Antwerp, Belgium
21 Department of Microbiology, Immunology and Transplantation, KU Leuven, Leuven, Belgium
22 Malaria Branch, Kenya Medical Research Institute (KEMRI) Centre for Global Health Research, Kisumu, Kenya
23 Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
24 Novo Nordisk Foundation, Hellerup, Denmark
25 Department of Infectious Diseases, Leiden University Medical Center, Leiden, The Netherlands
26 Quadram Institute Bioscience, Norwich, UK
27 Sihanouk Hospital Center of HOPE, Phnom Penh, Cambodia
28 Department of Microbiology, Institut National de Recherche Biomédicale, Kinshasa, Democratic Republic of the Congo
29 Department of Medical Biology, University Teaching Hospital of Kinshasa, Kinshasa, Democratic Republic of the Congo
30 Global Alliance for Vaccines and Immunization (GAVI), Geneva, Switzerland
31 Jenner Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
32 International Vaccine Institute, Seoul, Republic of Korea
33 Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, UK
34 Madagascar Institute for Vaccine Research, University of Antananarivo, Antananarivo, Madagascar
35 Heidelberg Institute of Global Health, University of Heidelberg, Heidelberg, Germany
36 National Institute of Infectious Diseases, Tokyo, Japan
37 Center of Infectious Disease Research in Zambia, Lusaka, Zambia
38 Lusaka Apex Medical University, Lusaka, Zambia
39 Water and Health Research Center, Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa
40 Department of Basic Medical Sciences, Michael Chilufya Sata School of Medicine, Copperbelt University, Ndola, Zambia
41 World Health Organization (WHO) Regional Office for Africa, Immunization and Vaccines Development, Brazzaville, Republic of Congo
42 National Microbiology Laboratory, Public Health Agency of Canada, Toronto, ON, Canada
43 University of Maryland School of Medicine Center for Vaccine Development and Global Health, Baltimore, MD, USA
44 Infectious Diseases Data Observatory (IDDO), Oxford, UK
45 Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
46 University of Nebraska Medical Center, Omaha, NE, USA
47 International Foundation Against Infectious Diseases in Nigeria (IFAIN), Abuja, Nigeria
48 Environmental Health Institute, National Environment Agency, Singapore
49 Department of Medical Diagnostics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
50 School of Public Health, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
51 Yonsei University Graduate School of Public Health, Seoul, Republic of Korea
52 Oxford University Clinical Research Unit, Vietnam
53 Department of Medicine, University of Cambridge, Cambridge, UK
54 Pan American Health Organization, AMR Special Program, Washington DC, USA
55 Allama Iqbal Medical College, Lahore, Pakistan
56 Ministry of Health, Jerusalem, Israel
57 Department of Infectious Diseases, Christian Medical College, Vellore, India
58 Tribhuvan University Teaching Hospital, Institute of Medicine, Kathmandu, Nepal
59 Pediatrics and Child Health, Aga Khan University, Karachi, Pakistan
60 London School of Hygiene & Tropical Medicine, London, UK
61 National Institute for Communicable Diseases, Johannesburg, South Africa
62 Institute for Health Metrics and Evaluation, University of Washington, Seattle, Washington, USA
63 Enteric and Diarrheal Diseases, Bill & Melinda Gates Foundation, Seattle, WA, USA
64 Institut Pasteur, Dakar, Senegal
65 Department of Medical Microbiology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
66 Armauer Hansen Research Institute, Addis Ababa, Ethiopia
67 Africa Centres for Disease Prevention and Control, Addis Ababa, Ethiopia
68 Cambodia Oxford Medical Research Unit, Angkor Hospital for Children, Siem Reap, Cambodia
69 University of Otago, Dunedin, New Zealand
70 National Reference Laboratory for HIV/AIDS, Hepatitis, and Other Sexually-Transmitted Infections, San Lazaro Hospital, Manila, Philippines
71 Department of Population Health and Reproduction, 100K Pathogen Genome Consortium, School of Veterinary Medicine, UC Davis, Davis, California, USA
72 Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, UK
73 University of Dschang, Dschang, Cameroon
Acknowledgements
ZAD received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 845681. MAC is affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Genomics and Enabling Data at University of Warwick in partnership with the UK Health Security Agency (UKHSA), in collaboration with University of Cambridge and Oxford. The views expressed are those of the author(s) and not necessarily those of the NIHR, the Department of Health and Social Care, the UKHSA, or the U.S. Centers for Disease Control and Prevention.
RAK, TM and GT were supported by the Bill & Melinda Gates Foundation (BMGF) grant OPP1217121 and the BBSRC Institute Strategic Programme BB/R012504/1 and its constituent project BBS/E/F/000PR10348. SB was supported by a Wellcome Trust Senior Fellowship. MML was supported by the Bill and Melinda Gates Foundation grants OPP1194582, INV-029806, and OPP1161058. MJS was supported by National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID) (grant F30AI156973) and BMGF (OPP1194582/INV-000049). JRA, IB, SS, DOG, FNQ, SS, AMT, KV, JI, SPL, DT, JS, JCS, SI, RS received support from BMGF (grant INV-008335/OPP1113007). KMN receives support from BMGF (grant OPP1151153). IB reports funding from the Canadian Institutes of Health Research. SS notes support from BMGF (grant INV-042340) and Child Health Research Foundation. PT has received funding from the Wellcome Trust (grants 222156 and 220211).
NAF is a National Institute for Health and Care Research (NIHR) Professor of Global Health. AJP acknowledges funding support from the WHO and Gavi, the Vaccine Alliance. DMA notes support from the NIHR Global Health Research Unit on Genomic Surveillance of AMR. RSH is a NIHR Senior Investigator. AMS is supported by the SEQAFRICA project, funded by the Department of Health and Social Care’s Fleming Fund using UK aid. INO acknowledges support from the UK NIHR Global Health Research Unit on Genomic Surveillance of Antimicrobial Resistance Consortium Award NIHR (project #16/136/111), the UK Medical Research Council/ Department for International Development African Research Leaders Award: (MR/L00464X/), and a Calestous Juma Fellowship from the Bill and Melinda Gates Foundation (INV-036234).
DAR has received support from NIH, NIAID, and the United States Department of Health and Human Services (grant U19AI110820). AK has been supported by ICMR, India. CMP has received funding through a Joint Global Health Trials Scheme (MR/TOO5033/1), and acknowledges funding from the Department for Health and Social Care, the Department for International Development/Global Challenges Research Fund, the UK Medical Research Council, and the Wellcome Trust. JC acknowledges support from the ANLIS Malbran and the Ministry of Health, Argentina. FXW acknowledges funding support from the Institut Pasteur and Santé Publique France. EN notes support from Institut Pasteur and Santé Publique France.
GAM acknowledges support from BMGF (grant OPP1020327). RKL and VS recognise funding from NIHR (grant 16_136_111) and the Wellcome Trust (grant 206194). SK recognises support from the NIH (grant R01AI099525). RRB, SJL, JS, and ARG note funding support from BMGF. RFB has received support from BMGF and the NIH. GT and RK are supported by BMGF (grant OPP1217121) and the BBSRC Institute Strategic Programme BB/R012504/1 and its constituent project BBS/E/F/000PR10348. FM reports funding from BMGF (OPPGH5231 and OPP1127988). ST was supported by BMGF (grant INV-018979). DTP was supported by a Wellcome International Training Fellowship (grant 222983/Z/21/Z). KS and DJP note support from the Wellcome Trust. BW has received support from the US Food and Drug Administration. JAC received support from US National Institutes of Health grants U01AI062563, R01TW009237, and R01AI121378, and Bill & Melinda Gates Foundation grants OPPGH5231, OPP1558210, and OPP1151153.