Abstract
High-throughput sequencing measurements of the vaginal microbiome have suggested intriguing potential relationships between the vaginal microbiome and preterm birth (PTB; live birth prior to 37 weeks of gestation). However, results across studies have been inconsistent. Here we perform an integrated analysis of previously published datasets from 12 cohorts of pregnant women whose vaginal microbiomes were measured by 16S rRNA gene sequencing. Of 1926 women included in our analysis, 568 went on to deliver prematurely. The datasets varied substantially in their definition of preterm birth, the characteristics of the study populations, and sequencing methodology. Nevertheless, a small group of taxa comprised the vast majority of the measured microbiome in all cohorts. We trained machine learning (ML) models to predict PTB from the composition of the vaginal microbiome, finding low to modest predictive accuracy (0.28-0.79). Predictive accuracy was typically lower when ML models trained on one dataset predicted PTB in another dataset. Earlier preterm birth (<32 weeks, <34 weeks) was more predictable from the vaginal microbiome than late preterm birth (34-37 weeks), both within and across datasets. Integrated differential abundance analysis revealed a highly significant negative association between L. crispatus and PTB that was consistent across almost all studies. The presence of most genera (18 out of 25) was associated with a higher risk of PTB, with L. iners, Prevotella, and Gardnerella showing particularly consistent and significant associations. Some discrepancies between studies could be attributed to specific methodological differences, but most study-to-study variation in the relationship between the vaginal microbiome and preterm birth could not.
We believe future studies of the vaginal microbiome and PTB will benefit from a focus on earlier preterm births and from improved reporting of specific patient metadata shown to influence the vaginal microbiome and/or birth outcomes.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES (Award number: R35GM133745)
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics committee/IRB of North Carolina State University gave ethical approval for this work (Protocol Number 23575).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Revised the citations of the supplementary files. In the previous version, citations of the supplementary figures/tables appeared as question marks due to a LaTeX code issue; this has been fixed in the revision.
Data Availability
Data are available from ENA/SRA or dbGaP, with reference numbers summarized in Supplementary Table S2. The processed data are available from the first author's GitHub on request.