ABSTRACT
Background Surveillance and prediction of antibiotic resistance in Escherichia coli rely on curated databases of genes and mutations. Such databases currently lack quantitative estimates of the effect on MIC of acquiring any given element for a particular antibiotic-species combination.
Methods Using a collection of 2875 E. coli isolates with linked whole genome sequencing and MIC data, we fitted multivariable interval regression models, with and without adjustment for population structure, to estimate the change in MIC for specific antibiotics associated with the acquisition of genes and mutations in the AMRFinder database. We then tested the ability of these models to predict MIC and binary resistance/susceptibility using leave-one-out cross validation.
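The interval-censored likelihood behind a model of this kind can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that an MIC reading of m on a doubling-dilution series means the true MIC lies in (m/2, m], that errors are normally distributed on the log2 scale, and that gene effects are additive. The gene name `blaX` and all function names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mic_to_interval(mic):
    """Assumption: a reading of `mic` means the true MIC lies in (mic/2, mic].

    Returns the (lower, upper) bounds on the log2 scale.
    """
    upper = math.log2(mic)
    return (upper - 1.0, upper)

def interval_loglik(intercept, effects, sigma, isolates):
    """Log-likelihood of interval-censored log2 MICs under a normal model.

    intercept: expected log2 MIC of an isolate carrying none of the elements
    effects:   dict mapping element name -> additive change in log2 MIC
    sigma:     residual standard deviation on the log2 scale
    isolates:  list of (set of elements present, (lower, upper) log2 bounds)
    """
    total = 0.0
    for genes, (lower, upper) in isolates:
        mu = intercept + sum(effects.get(g, 0.0) for g in genes)
        # Probability mass the model assigns to the observed interval.
        prob = norm_cdf((upper - mu) / sigma) - norm_cdf((lower - mu) / sigma)
        total += math.log(max(prob, 1e-300))  # guard against log(0)
    return total

# Toy data (hypothetical): isolates without the element read 2 mg/L,
# isolates carrying it read 16 mg/L.
isolates = [
    (set(), mic_to_interval(2)),
    (set(), mic_to_interval(2)),
    ({"blaX"}, mic_to_interval(16)),
    ({"blaX"}, mic_to_interval(16)),
]

with_effect = interval_loglik(0.5, {"blaX": 3.0}, 0.5, isolates)
no_effect = interval_loglik(0.5, {"blaX": 0.0}, 0.5, isolates)
print(with_effect > no_effect)  # a 3-dilution effect fits the toy data better
```

In practice the intercept, effects and sigma would be chosen to maximise this log-likelihood with a numerical optimiser; an effect of 3.0 on the log2 scale corresponds to an eight-fold increase in MIC.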
Findings We provide quantitative estimates (with confidence intervals) of the change in MIC associated with the acquisition of genes/mutations in the NCBI-AMRFinder database. Whilst the majority of genes and mutations (89/111, 80.2%) were associated with an increased MIC, a much smaller number (27/111, 24.3%) were found to be putatively independently resistance conferring (i.e. associated with an MIC above the EUCAST breakpoint) when acquired in isolation. We found evidence of differential effects of acquired genes and mutations between different generations of cephalosporin antibiotics and demonstrated that sub-breakpoint variation in MIC can be linked to genetic mechanisms of resistance. Overall, 20,697/24,858 MICs (83.3%, range 52.9-97.7% across all antibiotics) were predicted exactly and 23,677/24,858 (95.2%, range 87.3-97.7%) were predicted to within +/-1 doubling dilution.
Interpretation Quantitative estimates of the independent effect on MIC of the acquisition of antibiotic resistance genes add to the interpretability and utility of existing databases. Predicting antibiotic resistance phenotype from these estimates gives performance comparable to or better than approaches utilising machine learning models, while crucially being more readily interpretable. The methods outlined here could be readily applied to other antibiotic/pathogen combinations.
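The two agreement measures reported above (exact agreement and agreement to within +/-1 doubling dilution) can be computed as in the sketch below. It assumes that observed and predicted MICs are both expressed in mg/L on a doubling-dilution series; the function name and the toy values are hypothetical.

```python
import math

def dilution_agreement(observed, predicted):
    """Return (exact, within_one) agreement as fractions of all MICs.

    The difference between each pair is measured in doubling dilutions,
    i.e. as the absolute difference of log2 MICs rounded to an integer.
    """
    diffs = [
        abs(round(math.log2(p) - math.log2(o)))
        for o, p in zip(observed, predicted)
    ]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    within_one = sum(d <= 1 for d in diffs) / n
    return exact, within_one

# Toy example (hypothetical values, mg/L).
observed = [1, 2, 4, 8]
predicted = [1, 4, 4, 32]
exact, within_one = dilution_agreement(observed, predicted)
print(exact, within_one)  # 0.5 0.75
```

Here two of four predictions match exactly, one is a single dilution out and one is two dilutions out.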
Funding This work was funded by the NIHR and the MRC.
Evidence before this study We searched PubMed from inception to 05/04/2024 using the terms ((Escherichia coli OR E. coli) AND ((MIC) OR (minimum inhibitory concentration))) AND (predict*) AND (whole genome sequencing). Of the 56 articles identified by these search terms, eight were of direct relevance to this study. These studies generally focused on single antibiotics (3 studies), had relatively small datasets (6 studies with <1000 isolates) or used machine learning approaches on pan-genomes to predict binary (i.e. susceptible/resistant) phenotypes (2 studies). Only one study attempted to predict ciprofloxacin MICs in 704 E. coli isolates using a machine learning approach with known resistance conferring genes/mutations as features. To our knowledge, there are no studies estimating the independent effect (as opposed to the total effect of all elements present) of the acquisition of specific antibiotic resistance genes (ARGs) or resistance-associated mutations on MICs of different antibiotics in E. coli more generally.
What this study adds In this study we estimate the change in MIC for particular antibiotics associated with the acquisition of specific ARGs or resistance-associated mutations, adjusting for the presence of other relevant genes and population structure. In doing so we provide an approach to greatly enhance the information provided by existing ARG databases and by approaches based on predicting binary susceptible/resistant phenotypes, for example by demonstrating differential effects of ARGs on resistance to antibiotics of the same class. This enriches our understanding of the relationship between genotype and phenotype in a way that is easily interpretable. Using more “parsimonious” models for prediction, we demonstrate high overall accuracy comparable to or better than that of recent machine learning models, with results that are crucially more readily interpretable. We also demonstrate a genetic basis behind sub-breakpoint variation in MIC for some antibiotics, demonstrating the value of non-dichotomised phenotypes for identifying wildtype isolates (i.e. those carrying no ARGs) with greater confidence.
Implications of all available evidence Whole genome sequencing data can be used to predict MICs for most commonly used antibiotics for managing E. coli infections with accuracy approaching that of conventional phenotyping techniques, though very major error rates remain too high for deployment in routine clinical practice. Further studies focusing on genotypes with high phenotypic heterogeneity should investigate the phenotypic replicability, genetic heritability and clinical outcomes associated with these isolates.
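For readers less familiar with categorical error rates, the sketch below shows one common way of computing them. It is an illustration under stated assumptions, not the study's code: an isolate is called resistant when its MIC exceeds a single breakpoint, a very major error is a resistant isolate predicted susceptible (denominator: phenotypically resistant isolates), and a major error is a susceptible isolate predicted resistant (denominator: phenotypically susceptible isolates). The function name, breakpoint and toy values are hypothetical.

```python
def categorical_errors(observed, predicted, r_breakpoint):
    """Very major and major error rates from observed and predicted MICs.

    Assumption: MIC > r_breakpoint means resistant, otherwise susceptible.
    """
    obs_r = [m > r_breakpoint for m in observed]
    pred_r = [m > r_breakpoint for m in predicted]
    n_res = sum(obs_r)
    n_sus = len(obs_r) - n_res
    # Resistant phenotype predicted susceptible.
    vme = sum(o and not p for o, p in zip(obs_r, pred_r))
    # Susceptible phenotype predicted resistant.
    me = sum(p and not o for o, p in zip(obs_r, pred_r))
    return {
        "very_major_error_rate": vme / n_res if n_res else float("nan"),
        "major_error_rate": me / n_sus if n_sus else float("nan"),
    }

# Toy example (hypothetical MICs in mg/L, hypothetical breakpoint of 8 mg/L).
observed = [16, 16, 32, 2, 2, 4, 1, 64]
predicted = [16, 4, 32, 2, 16, 4, 1, 64]
rates = categorical_errors(observed, predicted, r_breakpoint=8)
print(rates)
```

In this toy set one of four resistant isolates is predicted susceptible and one of four susceptible isolates is predicted resistant, so both rates are 0.25.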
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The computational aspects of this research were funded from the NIHR Oxford BRC with additional support from the Wellcome Trust Core Award Grant Number 203141/Z/16/Z. SL was funded by an MRC Clinical Research Training Fellowship MR/T001151/1. ASW and TEAP are also supported by the NIHR Oxford Biomedical Research Centre. ASW is an NIHR Senior Investigator. NS is an NIHR Oxford BRC Senior Fellow. This research is supported by the National Institute for Health Research (NIHR) Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance (NIHR200915), a partnership between the UK Health Security Agency (UKHSA) and the University of Oxford. The views expressed are those of the author(s) and not necessarily those of the NIHR, UKHSA or the Department of Health and Social Care. This research was supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Routinely collected healthcare data, including microbiological data, were acquired via pseudonymised linkage in the Infections in Oxfordshire Research Database (IORD). IORD has generic Research Ethics Committee, Health Research Authority and Confidentiality Advisory Group approvals (19/SC/0403, 19/CAG/0144) as a de-identified electronic research database. The use of bacterial isolates obtained from clinical infections for the development of methods for genomic antimicrobial resistance prediction was covered by a separate approval (London - Queen Square Research Ethics Committee; REC ref: 17/LO/1420).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
** Joint senior authors
Data Availability
Raw reads for all isolates used in the study are available in NCBI under project accession numbers PRJNA604975 and PRJNA1007570.