Abstract
Background Vaginal progesterone, low-dose aspirin, and care management (comprising increased outreach and education) have been shown to reduce the rate of prematurity in select populations, but identifying at-risk pregnancies has been problematic.
Objective(s) Test the hypothesis that screening singleton, non-anomalous pregnancies lacking traditional clinical risk factors with a validated blood test for preterm birth risk prediction, then targeting those with elevated risk for preventive treatment, would improve neonatal outcomes as compared to a large historical population.
Study Design The AVERT PRETERM trial took place from June 2018 to September 2020 at ChristianaCare Hospital (Newark, DE). Singleton non-anomalous pregnancies with no history of preterm birth were enrolled in a prospective study arm and followed through neonatal hospital discharge. Participants were screened using a serum proteomic test for spontaneous preterm birth risk during a gestational age window spanning 19 1/7 to 20 6/7 weeks. Pregnancies identified by the test to be at elevated risk for preterm birth (≥16.0%, approximately twice the U.S. population risk) were offered aspirin 81 mg daily, open-label vaginal progesterone 200 mg daily and care management. We compared outcomes for women who screened low-risk, or who screened higher-risk and accepted treatment, with those in a historical study arm of 10,000 pregnancies. Our co-primary outcomes were neonatal hospital length of stay and an ordinal neonatal morbidity index score based on the occurrence of grade III/IV intraventricular hemorrhage, necrotizing enterocolitis (Bell stage II/III), respiratory distress syndrome, bronchopulmonary dysplasia, retinopathy of prematurity (stage III), sepsis, and neonatal death or neonatal intensive care unit length of stay. Cox proportional hazards survival analysis and ordinal logistic regression were used to evaluate outcomes and control for population differences.
Results A total of 1463 women were screened and tested before research operations ceased because of the COVID-19 pandemic; three women were subsequently deemed ineligible after screening. Of the remaining 1460 women, 34.72% (507/1460) were deemed high-risk, with 56.41% (286/507) accepting intervention and 43.59% (221/507) forgoing intervention. The remaining 65.3% (953/1460) were designated as low-risk. Women in the prospective arm were older, more obese, more likely to have hypertension and to smoke, and less likely to use opioids compared to women in the historical arm. The primary analyses found that neonates in the prospective arm were discharged from the hospital earlier (P=0.01; hazard ratio, 1.35; 95% confidence interval, 1.08-1.70) and had lower neonatal morbidity index scores (P=0.031; odds ratio, 0.81; 95% confidence interval, 0.67-0.98). Average neonatal hospital length of stay decreased by 21%, and severe neonatal morbidity (neonatal morbidity index ≥3) was reduced on average by 18%.
Conclusions Identifying singleton, non-anomalous pregnancies lacking traditional risk factors with a validated proteomic blood test for preterm birth risk and providing treatment to those at risk resulted in improved neonatal outcomes compared to controls in a racially diverse cohort. This test-and-treatment strategy shows promise for ameliorating the impacts of premature birth among individuals in this previously unidentified patient population.
Introduction
Preterm birth (PTB) remains the leading cause of perinatal mortality,1,2 and children born prematurely are at greater risk for chronic medical conditions3,4 and developmental delays.5 These risks are inversely proportional to the neonate’s gestational age (GA) at birth.6 Improvements in survival are largely attributable to the use of improved neonatal care7 and antenatal corticosteroids.8 Strategies targeting at-risk women, such as focused care management9 (comprising increased outreach and education) along with low-dose aspirin (LDASA),10 and vaginal progesterone11,12 have demonstrated reductions in PTB, but their impact has been limited by poor precision in identifying at-risk pregnancies. Shortened cervical length measured by transvaginal sonography in the second trimester has been shown to predict the risk of preterm delivery.13 With vaginal progesterone treatment, the PTB rate <34 weeks’ gestation is decreased by approximately 45%, but the rate of PTB <37 weeks’ gestation is unchanged.14 The utility of this strategy is blunted by the limited ability of cervical sonography to identify women who will deliver preterm, with most women delivering prematurely not having a short cervix at the time of routine sonography.15,16 Similarly, LDASA has been shown to decrease PTB incidence and severity17,18 but generally has been recommended for pregnancies at risk for preeclampsia.19 Finally, care management of higher-risk pregnancies has been shown to decrease PTB; historically, however, its use has targeted pregnancies with known PTB risk factors,9,20 which are frequently absent in those who deliver prematurely.
Recent discoveries in proteomics have identified and validated proteins that are differentially expressed in pregnancies that deliver prematurely compared to term births. One validated spontaneous PTB (sPTB) predictor, which evaluates the ratio of serum insulin-like growth factor binding protein 4 (IGFBP4) to sex hormone-binding globulin (SHBG) in the window of 19 1/7 to 20 6/7 weeks’ gestation, stratified risk in U.S. women with an area under the receiver operating characteristic curve (AUC) of 0.80.21 A risk score threshold corresponding to twice the U.S. population PTB risk was subsequently validated22 and shown to significantly stratify higher- and lower-risk subjects for an extended blood draw window of 18 0/7 to 20 6/7 weeks’ gestation.
Nonetheless, the question of whether targeting treatment can improve neonatal outcomes for women indicated by the test to be at increased risk for preterm delivery but who lack traditional clinical risk factors remains largely uninvestigated. The purpose of the AVERT PRETERM trial was to test the hypothesis that screening singleton non-anomalous pregnancies with a validated blood test for PTB risk prediction, and then treating higher-risk pregnancies with care management, LDASA, and vaginal progesterone, would improve neonatal outcomes compared to a large historical population.
Materials and Methods
The AVERT PRETERM trial was conducted from June 2018-September 2020 at ChristianaCare Hospital, a regional health care system based in Newark, DE. ChristianaCare serves a mixed urban and rural population across Delaware and Maryland.
Trial Oversight
The study protocol (NCT03151330) was approved by the ChristianaCare institutional review board prior to participant enrollment. An independent data safety monitoring board convened prior to study initiation, approved the protocol, and provided oversight of adverse events. All subjects provided written informed consent to participate in the study. The listed authors accept responsibility for the accuracy and completeness of the data and for fidelity in the conduct of the trial.
In April 2020, all non-COVID research was halted at ChristianaCare Health System, and the decision was made to terminate the trial and reassess the analytic plan in a blinded fashion. In July 2020, an investigation23 showed that COVID-19 infection led to an increase in prematurity. To avoid bias in comparison of pre-pandemic controls to prospective subjects who reached term during the pandemic period, we modified the statistical analysis plan (SAP), while blinded, by limiting the primary analysis to subjects who had reached 37 weeks’ gestation before the local spread of SARS-CoV-2 and the associated research shutdown.
Screening and Recruitment
Two separate study arms were defined, both of which comprised singleton pregnancies without ultrasound evidence of mullerian or fetal anomalies, cervical shortening (<25 mm), genetic anomalies, history of a prior PTB, cervical cerclage or medical conditions with clear indication for PTB <37 weeks’ gestation. Pregnancies in the prospective arm were identified and screened for eligibility in ambulatory sites and/or at the time of routine imaging ultrasound. Due to the requirements of the proteomic blood test for prematurity (PreTRM), women in the prospective arm were excluded if they had a blood transfusion during the current pregnancy, known hyperbilirubinemia, or were taking traditional or low molecular weight heparin. Similarly, prospective patients were excluded if they had a known reaction or contraindication to progesterone or aspirin.
Procedures
Following consent, blood was obtained from women in the prospective arm during the window of 19 1/7 to 20 6/7 weeks’ gestation. Samples were processed, shipped, stored, and analyzed using the PreTRM test in a Clinical Laboratory Improvement Amendments- and College of American Pathologists-certified laboratory (Sera Prognostics, Inc., Salt Lake City, UT), as described previously.21,24 Test results were shared with the participant and her care provider. Women whose pregnancies had an sPTB risk score ≥16.0% (approximately twice the U.S. population risk) were then offered, and provided separate consent to receive, care management (consisting of twice-weekly nursing contacts to monitor medication compliance and symptom development), aspirin 81 mg daily, and progesterone 200 mg intravaginally daily. The remainder of care was determined by the treating clinician. Outcomes for both study arms were obtained through a validated obstetrical registry25 or, for patients in the prospective arm who delivered at another institution, through review of individual medical records. External data review was performed for all prospective patients in addition to 10% of the historical arm to ensure that eligibility requirements were met. An error rate of <3% was deemed acceptable. All PTB cases in both arms were reviewed further for accurate assessment of primary outcomes by a single investigator (MKH). Additionally, because secular changes in care might affect outcomes in a non-random fashion amongst women in the prospective arm, major changes in guidance or management protocols were documented on a quarterly basis.
Trial Outcomes
Two co-primary outcomes were selected, the first being neonatal hospital length of stay (NNLOS). Because of the competing risk of death resulting in shorter stays, we set the NNLOS for fetuses or newborns who expired to be one day longer than the longest neonatal stay observed among all neonates. The second co-primary outcome was based on a neonatal morbidity index (NMI)14 composed of the following components: grade III/IV intraventricular hemorrhage, necrotizing enterocolitis (Bell stage II/III), respiratory distress, bronchopulmonary dysplasia (need for oxygen at 28 days of life or 36 weeks post-conceptional age), retinopathy of prematurity (stage III), proven sepsis (positive blood cultures) and postnatal death. To provide severity estimates, NMI scoring was organized as follows: 0, no events; 1, one event or neonatal intensive care unit (NICU) admission <5 days with no perinatal mortality; 2, two events or five to 20 days of NICU admission with no perinatal mortality; 3, three events or >20 days in the NICU without perinatal mortality; and 4, perinatal mortality.
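The two outcome definitions above can be illustrated with a minimal Python sketch. It is not the trial's analysis code: the function names `nmi_score` and `adjusted_nnlos` are hypothetical, and the sketch assumes that the NMI level is the higher of the event-based and NICU-based levels, that three or more events map to level 3, and that the "longest stay" is taken over all recorded stays.

```python
from typing import List, Sequence


def nmi_score(n_events: int, nicu_days: float, perinatal_death: bool) -> int:
    """Ordinal neonatal morbidity index (0-4) following the scoring text above.

    Assumption: when the event count and the NICU stay point to different
    levels, the higher level is used.
    """
    if perinatal_death:
        return 4
    # Event-based level: 0, 1, 2, or 3 (three or more events assumed to be 3).
    event_level = min(n_events, 3)
    # NICU-based level: <5 days -> 1, 5 to 20 days -> 2, >20 days -> 3.
    if nicu_days > 20:
        nicu_level = 3
    elif nicu_days >= 5:
        nicu_level = 2
    elif nicu_days > 0:
        nicu_level = 1
    else:
        nicu_level = 0
    return max(event_level, nicu_level)


def adjusted_nnlos(stays: Sequence[float], died: Sequence[bool]) -> List[float]:
    """Replace the stay of each fetus or newborn who died with (longest stay + 1 day)."""
    longest = max(stays)
    return [longest + 1 if d else s for s, d in zip(stays, died)]


print(nmi_score(n_events=1, nicu_days=12, perinatal_death=False))
print(adjusted_nnlos([2, 10, 1], [False, False, True]))
```

Assigning deaths the longest stay plus one day keeps them at the unfavorable end of the length-of-stay distribution, so an early death is not counted as a short hospital stay.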
Power and Sample Size Analysis
At termination, 1873 eligible subjects had been enrolled in the prospective arm, with 1460 reaching 37 weeks’ gestation prior to the first reported COVID-19 case in Delaware. A total of 10000 consecutive historical controls who met the same eligibility criteria were selected from an approximately two-year period immediately prior to study initiation.
From institutional data on pregnancies at ChristianaCare, we estimated a historical PTB rate of 9.1%. Sample size estimation was based on simulations of the co-primary outcomes using a simulated GA distribution with a singleton PTB rate of 9.1% and an effect of intervention based on existing literature. α-level spending of 0.05 was shared between co-primaries using Holm’s method.
For the first co-primary outcome, the proportion of subjects with NMI scores ≥3 was assumed to be 2.0-2.3% in the prospective arm and near 3.6% in the control arm, based on a previous clinical utility study.26 Assuming these proportions and a one-sided Fisher’s Exact test, a sample size of approximately 1453 subjects with outcomes in the prospective arm with 55% compliance among higher-risk subjects, and approximately 10000 historical controls, would provide power of 0.7-0.9.27
For the second co-primary outcome, NNLOS from time of birth up to discharge, the hazard ratio (HR) based on simulations was expected to be 1.32-1.46, based on the previously referenced clinical utility study.26 Assuming these HRs, a sample size of approximately 1453 subjects with outcomes in the prospective arm with 55% compliance among higher-risk subjects and approximately 10000 historical controls, would provide power of at least 0.8.27
Statistical Analysis
Descriptive statistics were used to describe all data. Variables were summarized using means and standard deviations for continuous data, and percentages and frequencies for categorical data. Comparisons for baseline characteristics were performed; the Wilcoxon rank sum test was used to compare continuous variables between two groups, and contingency table analysis (chi-square) was used to compare categorical variables. In comparisons, differences with P<0.05 were considered significant.
The first co-primary hypothesis, reduction in severe composite NMI scores, was tested using ordinal logistic regression, adjusted by covariates. The second co-primary hypothesis, reduction in neonatal length of stay, and the two co-secondary hypotheses, reduction in NICU length of stay (NICULOS) and increase in GA at birth (GAB), were all tested using Cox proportional hazards regression, adjusted by covariates.
The co-primary and co-secondary analyses used a modified intent-to-treat population, defined as all subjects for whom both co-primary outcomes were known and who were either selected for the historical control arm, received a low-risk PreTRM test result, or consented to and initiated treatment with vaginal progesterone and LDASA <24 weeks’ gestation after receiving a higher-risk PreTRM test result.
As prespecified by a selection procedure defined in the SAP, subjects in the highest 8.5% quantile of each arm were included in the NICULOS and NNLOS analyses, while subjects in the lowest 8.5% quantile of each arm were included in the GAB analysis. Specifically, 8.5% was selected to be 1.2 times the PTB rate of the historical control arm (7.1%). This was done to examine outcomes at the extremes of PTB rather than capturing short NICU stays for conditions that tend to dominate NICU admissions, including transient tachypnea of the newborn, hypoglycemia and temperature instability.
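The quantile rule above can be sketched in a few lines of Python. This is an illustration only: the helper names `tail_fraction` and `extreme_subset` are hypothetical, and rounding the subset size up and breaking ties by sort order are assumptions, since the SAP's exact procedure is not described here.

```python
import math
from typing import List, Sequence


def tail_fraction(ptb_rate: float, multiplier: float = 1.2) -> float:
    """Quantile fraction defined as a multiple of the historical PTB rate."""
    return ptb_rate * multiplier


def extreme_subset(values: Sequence[float], fraction: float,
                   highest: bool = True) -> List[float]:
    """Return the highest (or lowest) `fraction` of values within one arm.

    Assumption: the subset size is rounded up to a whole subject.
    """
    k = math.ceil(len(values) * fraction)
    return sorted(values, reverse=highest)[:k]


# 1.2 x 7.1% gives the 8.5% quantile used in the text.
print(round(tail_fraction(0.071), 3))

# Toy gestational ages at birth (weeks); the lowest 20% is used only to keep the example small.
toy_gab = [40, 39, 25, 38, 30, 41, 37, 39, 40, 38]
print(extreme_subset(toy_gab, 0.2, highest=False))
```

In the trial, the highest quantile would apply to length-of-stay values (NICULOS, NNLOS) and the lowest quantile to GAB, each computed separately within an arm.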
As prespecified in the SAP, the covariates included in the models for the primary and secondary analyses were maternal age, parity and maternal substance use disorders, assessed as Neonatal Opioid Withdrawal Syndrome. Additional covariates were examined in sensitivity analyses.
Differences between the treatment arms in the proportion of each treatment group at each NMI level were calculated using the fitted ordinal logistic regression model with the covariates maternal age, parity and maternal substance use disorders. Percent differences between the treatment arms for time to event analyses were calculated using the hazard rate for each group as estimated by the fitted Cox proportional hazards model with the covariates maternal age, parity and maternal substance use disorders.
The proportional odds assumption for the effect of treatment arm in the ordinal logistic regression was examined and found not to be violated. The proportional hazards assumption for treatment arm in the Cox regression analyses was tested and found not to be violated.
Analyses were performed using R software version 4.2.2.32 Two-tailed P-values <0.05 were considered significant. Holm’s multiple comparisons correction was used for the co-primary hypotheses.
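For readers unfamiliar with Holm's procedure, the following minimal Python sketch shows the step-down rule for a family of hypotheses. The trial analyses were run in R; this function (`holm_reject`, a hypothetical name) is only an illustration of the general method, not the study code.

```python
from typing import List, Sequence


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure: returns a reject flag for each hypothesis.

    The smallest P-value is compared with alpha/m, the next with alpha/(m-1),
    and so on; testing stops at the first non-rejection.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger P-values are also not rejected
    return reject


# The two co-primary P-values reported in this article (NNLOS 0.01, NMI 0.031).
print(holm_reject([0.01, 0.031]))
```

With two co-primary hypotheses, the smaller P-value is compared with 0.025 and the larger with 0.05, so both reported values meet their thresholds.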
Results
At study termination, 1873 eligible subjects had been enrolled in the prospective arm, 1463 of whom aligned with pre-COVID-19 patient care conditions and were screened with the PreTRM test (Figure 1). Of these, 34.7% (507/1463) were deemed higher risk by the PreTRM test. Among the screened population, 83.3% (1218/1463) had clinical outcomes and qualified either as low-risk (77.1%; 939/1218) or as higher-risk accepting treatment (22.9%; 279/1218).
Baseline participant characteristics and delivery data are shown in Table 1. The prospective arm was noted to be significantly older, more obese, less likely to be nulliparous, and more likely to have hypertension and smoke than historical controls. Similarly, body mass index was higher in the prospective arm compared to historical controls – mostly due to higher weight, though the prospective arm was nominally taller than historical controls. The proportion of Black women in both arms was 26.5%, reflecting the racial diversity in the study site’s patient population. Three women reported stopping vaginal progesterone due to adverse effects, and one woman discontinued LDASA.
Results of hypothesis tests for the co-primary and co-secondary outcomes are presented in Table 2. NNLOS was significantly reduced in the prospective arm vs the historical arm (HR 1.35; 95% CI, 1.08-1.70; P=0.01). The Kaplan-Meier plot for NNLOS is shown in Figure 2, reflecting a reduction in mean NNLOS of 21% (Cox proportional hazard [PH] P=0.01). NMI scores were significantly reduced in the prospective arm vs the historical arm (OR 0.81; 95% CI, 0.67-0.98; P=0.03), indicating more favorable outcomes for the prospective arm (Table 2). Specifically, across a range of covariate values, the probability of NMI ≥1 (any impairment) was reduced on average by approximately 13%-17%, and the probability of NMI ≥3 (severe NMI) by approximately 18% (Supplemental Table 1).
After achieving statistical significance for both co-primary endpoints per the SAP, we proceeded to test the two co-secondary hypotheses. In Cox regression analysis, neonates tended to leave the NICU earlier (HR 1.2; 95% CI, 0.96-1.51), but this difference was not significant (P=0.12). No difference in overall GA at birth was noted (HR 0.962; P=0.584).
In a non-prespecified analysis, we noted that although there was no difference in overall GA at delivery, there was a difference in the number of pregnancies that delivered <32 weeks’ gestation (HR, 0.52; 95% CI, 0.28-0.94; Cox PH P=0.009) (Figure 3), corresponding to an extension of mean gestation for births <32 weeks of 2.5 weeks (mean of 27.46 and 29.93 weeks for historical and prospective arms, respectively). Figure 4 shows a Kaplan-Meier plot for neonatal hospital stay among babies born <32 weeks. Neonates left the hospital earlier in the prospective arm than in the historical arm (HR, 1.84; 95% CI, 1.01-3.36; Cox PH P=0.046), with a difference in mean stay of approximately 30% (mean of 97.23 and 68.47 days for historical and prospective arms, respectively). This suggests that an increase in pregnancy duration amongst births <32 weeks in the prospective arm was associated with shorter neonatal hospital stay.
Finally, in an exploratory analysis of all subjects (not only the highest 8.5% quantile specified in the primary analysis), NNLOS was reduced by 16% (mean of 5.1 and 4.28 days; HR, 1.15; 95% CI, 1.08-1.22; P<0.001) and 17% (mean of 3.95 and 3.27 days; HR, 1.15; 95% CI, 1.08-1.22; P<0.001), with and without the adjustment for neonatal death, respectively.
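The percent reductions quoted in these exploratory analyses follow from the reported arm means. A minimal Python check is sketched below; `percent_reduction` is a hypothetical helper, and for the all-subject comparisons it assumes the first mean in each pair is the historical arm, which the text states explicitly only for births <32 weeks.

```python
def percent_reduction(reference_mean: float, comparison_mean: float) -> float:
    """Relative reduction of the comparison mean versus the reference mean, in percent."""
    return 100.0 * (reference_mean - comparison_mean) / reference_mean


# (reference mean, comparison mean) in days, as reported in the text.
pairs = {
    "all subjects, with death adjustment": (5.1, 4.28),
    "all subjects, without death adjustment": (3.95, 3.27),
    "births <32 weeks": (97.23, 68.47),
}
for label, (ref, comp) in pairs.items():
    print(f"{label}: {percent_reduction(ref, comp):.1f}%")
```

These are unadjusted differences in means; the hazard ratios in the text come from covariate-adjusted Cox models and are not reproduced by this arithmetic.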
Comment
Principal Findings
The results of this trial demonstrate that women who were screened and stratified with a proteomic test for PTB risk, and were treated with care management, LDASA, and vaginal progesterone, had shorter NNLOS and less severe neonatal morbidities compared to a large historical arm, after controlling for population differences.
Results in the Context of What is Known
Historically, evidence has existed for the benefit of intervening on pregnant women stratified by clinical risk factors. These data suggest that proteomic biomarker-based stratification and focused intervention may improve outcomes in otherwise low-risk women.
Clinical Implications
These findings suggest a potential approach for universal screening and treatment to help ameliorate complications of PTB amongst women who lack traditional risk factors for preterm delivery. The results of this study resonate with an investigation33 that randomized 1,191 women to either knowing their screening test results and receiving treatment or not knowing their results. In that study, the NICU length of stay due to sPTB was significantly shorter amongst those screened and treated vs. those not receiving their test results (median 6.8 days vs. 45.5 days; P=0.005), though the study was limited by small numbers. We note that similar strategies, namely evaluating prior pregnancy history for PTB and/or routine cervical length measurement, have been widely accepted in many centers,28,29 but these strategies capture a limited number of PTBs, and pregnancy history is not relevant for the first-time mothers who represent a sizable portion of the population.
Given the notable increase in PTB rate among Black women (14.4%) as compared to the U.S. population as a whole (10.2%),2 it is important to note that Black participants were represented in the AVERT PRETERM prospective and historical arms with a proportion (26.5%) nearly double recent U.S. population estimates (13.6%).30 The results in this study population, along with those in two large and similarly diverse studies,21,31 indicate that the PreTRM test will be applicable across the diverse U.S. population.
Research Implications
Patient knowledge of preterm birth risk via validated proteomic testing may offer an opportunity to combine a novel risk assessment with strategies proven to be effective with clinical risk factors. Based on survey data, pregnant women are frequently resistant to taking medication due to perceived deleterious effects during pregnancy. Prospective trials randomizing women to being informed of results with clinical treatment vs testing with blinded results have begun and will provide more definitive evidence.
Strengths and Limitations
Strengths of the trial include a multimodal strategy incorporating interventions that have consistently shown benefit in reducing the impacts of prematurity, as well as a novel proteomic blood test that has been validated in multiple cohorts. Additionally, we note that the dataset used to obtain the historical controls has been well validated, and all cases of PTB were reviewed. Clinically, the impact appears to be greatest amongst pregnancies delivering <34 weeks’ gestation, which remains the primary driver of newborn and child morbidity and mortality.
Limitations of the study are inherent in the design of comparing a prospective arm that differed from the historical arm in several maternal demographic and medical conditions. These were addressed through multivariable modeling, but we note that this remains a potential source of bias. Even so, the significant demographic differences in the prospective arm vs the historical arm (older age, more hypertension and more smoking) likely bias the prospective arm toward increased PTB incidence, further underscoring the importance of our findings. Additionally, at least for the use of LDASA, guideline changes have expanded the number of women eligible for treatment, and a recent estimate suggests that most pregnancies should be counseled about LDASA.38 Finally, we note some overlap of our co-primary outcomes, as NICULOS was also part of the NMI.
Conclusions
In summary, (1) screening singleton, non-anomalous pregnancies lacking traditional clinical risk factors with a validated proteomic blood test for preterm birth prediction, then (2) targeting those with elevated risk for preventive treatment, improved neonatal outcomes as compared to a large historical population. This test-and-treatment strategy led to shorter hospital stays and improved NMI scores across a diverse population.
Acknowledgements
We wish to acknowledge the funding provided by Sera Prognostics, Inc., to conduct this independent investigation. Jennifer Logan, PhD, an employee of Sera Prognostics, Inc., provided medical writing support for the manuscript.
Footnotes
Conflicts of Interest: BS, JS, and MGW are paid consultants of Sera Prognostics, Inc. For the purposes of this study, JS and MGW reported to MKH and were paid consultants of ChristianaCare, and BS was supported by Sera Prognostics, Inc. MKH received an investigator-initiated grant from Sera Prognostics, Inc., to conduct this study.
Source of Funding: This study was performed as an investigator-initiated trial funded by Sera Prognostics, Inc. The funder provided testing and funds to conduct the trial. The study plan was mutually agreed upon by the investigators and the funder; however, the funder was blinded to study results until after data lock and analysis completion.