Abstract
Identifying pregnancies at risk for preterm birth, one of the leading causes of worldwide infant mortality, has the potential to improve prenatal care. However, we lack broadly applicable methods to accurately predict preterm birth risk. The dense longitudinal information present in electronic health records (EHRs) is enabling scalable and cost-efficient risk modeling of many diseases, but EHR resources have been largely untapped in the study of pregnancy. Here, we apply machine learning to diverse data from EHRs to predict singleton preterm birth. Leveraging a large cohort of 35,282 deliveries, we find that a prediction model based on billing codes alone can predict preterm birth at 28 weeks of gestation (ROC-AUC=0.75, PR-AUC=0.40) and outperforms a comparable model trained using known risk factors (ROC-AUC=0.59, PR-AUC=0.21). Our machine learning approach is also able to accurately predict preterm birth sub-types (spontaneous vs. indicated), mode of delivery, and recurrent preterm birth. We demonstrate the portability of our approach by showing that the prediction models maintain their accuracy on a large, independent cohort (5,978 deliveries) with only a modest decrease in performance. Interpreting the features identified by the model as most informative for risk stratification demonstrates that they capture non-linear combinations of known risk factors and patterns of care. The strong performance of our approach across multiple clinical contexts and an independent cohort highlights the potential of machine learning algorithms to improve medical care during pregnancy.
Introduction
Preterm birth, occurring before 37 weeks of completed gestation, affects approximately 10% of pregnancies globally (1–3) and is the leading cause of infant mortality worldwide (4, 5). The causes of preterm birth are likely multifactorial since different biological pathways and environmental exposures can trigger premature labor (6). Large epidemiological studies have identified many risk factors, including multiple gestations (1), cervical anatomic abnormalities (7), and maternal age (8). Notably, even though a history of preterm birth (9) is one of the strongest risk factors, the recurrence rate remains low at < 30% (10, 11). Additionally, maternal race influences risk for preterm birth, with black women having twice the prevalence compared to white women (1, 12). Preterm births have a heterogeneous clinical presentation and cluster based on maternal, fetal, or placental conditions (3). These obstetric and systemic comorbidities (e.g. pre-existing diabetes, cardiovascular disease) can also increase risk for preterm birth (13, 14).
Despite our understanding of numerous risk factors, there are no accurate methods to predict preterm birth. Some biomarkers associate with preterm birth, but their best performance is limited to a subset of all cases (15). Recently, analysis of maternal cell-free RNA has emerged as a promising approach (16), but initial results were based on a small pregnancy cohort and require further validation. In silico classifiers based on demographic and clinical risk factors have the advantage of not requiring serology or invasive testing. However, even in large cohorts (>1 million individuals), demography- and risk-factor-based models report poor to moderate performance for clinical application (17–21). To date, we lack effective screening tools and preventative strategies for prematurity (22).
Electronic health records (EHRs) are scalable, readily available, and cost-efficient for disease-risk modeling (23). EHRs capture longitudinal data across a broad set of phenotypes with high temporal resolution. EHR data can be combined with socio-demographic factors and family medical history to comprehensively model disease risk. EHRs are also increasingly being augmented by linking patient records to molecular data, such as DNA and laboratory test results. Since preterm birth has a substantial heritable risk (24), combining rich phenotypes with genetic risk may lead to better prediction.
Machine learning models have shown promise for accurate risk stratification across a variety of clinical domains (25–27). However, despite the rapid adoption of machine learning in translational research, a review of 107 risk prediction studies reported that most models used only a few variables, did not consider longitudinal data, and rarely evaluated model performance across multiple sites (28). Some medical domains have yet to incorporate machine learning methods. Pregnancy research is especially well poised to benefit from machine learning approaches (29). Per standard of care during pregnancy, women are carefully monitored with frequent prenatal visits, medical imaging, and clinical laboratory tests. Compared to other clinical contexts, pregnancy and the corresponding clinical surveillance occur in a defined timeframe based on gestational length. Thus, EHRs are well-suited for modeling pregnancy complications, especially when combined with the well-documented outcomes at the end of pregnancy.
In this study, we combine multiple sources of data from EHRs to predict preterm birth using machine learning. From Vanderbilt’s EHR database (≥3.2 million records) and linked genetic biobank (≥100,000 individuals), we identified a large cohort of women (n=35,282) with documented deliveries at Vanderbilt. We trained models (gradient boosted decision trees) that combine demographic factors, clinical history, laboratory tests, and genetic risk with billing codes (ICD-9 and CPT) to predict preterm birth. We find that models trained on all billing codes from the mother’s EHR distinguished preterm births from term and postterm births with high accuracy compared to other EHR features. We assess the clinical potential of these models by quantifying performance across different contexts. When restricting features to those available at different stages in pregnancy, billing-code-based models can accurately predict preterm birth at 28 weeks of gestation. Furthermore, this approach maintains high accuracy for predicting spontaneous preterm birth and preterm risk among mothers with a history of preterm birth. Finally, we demonstrate the generalizability of this approach by evaluating billing-code-based models on an external, independent cohort from University of California, San Francisco (UCSF, n=5,978). Prediction models trained at Vanderbilt maintain high accuracy in the external cohort with only a modest drop in performance. Our findings provide a proof-of-concept that machine learning on rich phenotypes in EHRs shows promise for portable, accurate, and non-invasive prediction of preterm birth. The strong predictive performance across clinical contexts and preterm birth subtypes argues that machine learning models have the potential to add value during the clinical management of pregnancy.
Results
Characteristics and phenotyping of delivery cohort from Vanderbilt EHRs
From the Vanderbilt EHR database (∼3.2 million patients), we identified a ‘delivery cohort’ of 35,282 women with at least one delivery in the Vanderbilt hospital system (Fig. 1A). In addition to ICD and CPT billing codes, we extracted demographic data, past medical histories, obstetric notes, clinical labs, and genome-wide genetic data for the delivery cohort when available. Because billing codes were the most prevalent data in this cohort (n=35,282), we quantified the pairwise overlap between billing codes and each other data type. The largest subset included women with billing codes paired with demographic data (n=33,570). The smallest subset was women with billing codes paired with genetic data (n=905; Fig. 1C). The mean maternal age of women at the first delivery in the delivery cohort was 27.3 years (Fig. 1D). The majority of women in the cohort self- or third-party reported as white (n=21,343), black (n=6,178), or Hispanic (n=3,979; Fig. 1E). The estimated gestational age (EGA) distribution had a mean of 38.5 weeks (38.0 to 40.3 weeks, 25th to 75th percentile; Fig. 1F). The rate of multiple gestations (e.g. twins, triplets) was 7.6% (n=1,353). Since multiple gestation pregnancies are more likely to deliver preterm, we developed prediction models using singleton pregnancies unless otherwise stated.
We used billing codes and EGA to ascertain the delivery date and type (preterm vs. not-preterm, Methods). In the delivery cohort, we identified 7,774 preterm births. To evaluate the accuracy of ascertaining preterm births, a domain expert blinded to the delivery type reviewed clinical notes from 104 EHRs selected at random from the delivery cohort. The ascertainment algorithm had precision/positive predictive value (PPV) of 96% and recall/sensitivity of 96% using the chart reviewed label as the gold standard (Fig. S1A).
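As a minimal sketch of how precision (PPV) and recall are computed when an ascertained label is compared against a chart-reviewed gold standard, consider the following. The function name and the toy labels are illustrative assumptions and are not the study's chart-review data.

```python
def precision_recall(predicted, reference, positive="preterm"):
    """Return (precision, recall) of predicted labels against reference labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(p == positive and r == positive for p, r in pairs)  # true positives
    fp = sum(p == positive and r != positive for p, r in pairs)  # false positives
    fn = sum(p != positive and r == positive for p, r in pairs)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy labels (hypothetical): algorithm output vs. chart-reviewed gold standard.
algorithm = ["preterm", "preterm", "not-preterm", "not-preterm", "preterm"]
chart = ["preterm", "not-preterm", "not-preterm", "preterm", "preterm"]
precision, recall = precision_recall(algorithm, chart)
```

In this toy example there are two true positives, one false positive, and one false negative, so both precision and recall equal 2/3.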
Boosted decision trees using billing codes identify preterm deliveries
Using this richly phenotyped delivery cohort, we evaluated how well the clinical phenome, defined as only billing codes (ICD-9 and CPT) before and after delivery, could identify preterm births. With counts of each billing code (excluding those used to ascertain delivery type), we trained gradient boosted decision trees (30) to classify each mother’s first delivery as preterm or not-preterm (Fig. 2A). Boosted decision trees are well-suited for EHR data because they require minimal transformation of the raw data, are robust to correlated features, and capture non-linear relationships (31). Moreover, boosted decision trees have been successfully applied to a variety of clinical tasks (32–34).
In all evaluations, we held out 20% of the cohort for testing and used the remaining 80% for training and validation (Fig. 2A). Boosted decision tree models trained on ICD-9 and CPT codes accurately identified preterm births (singletons and multiple gestations) with PR-AUC=0.86 (chance=0.22) and ROC-AUC=0.95 (Fig. S2 A and B). While the combined ICD-9 and CPT based model achieved the best performance, models trained on either ICD-9 or CPT individually also performed well (PR-AUC ≥0.82; chance=0.22, ROC-AUC ≥0.93). All three models demonstrated good calibration with low Brier scores (≤0.092; Fig. S2C). Thus, billing codes across an EHR show potential as a discriminatory feature for predicting preterm birth.
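The metrics reported above can be illustrated with a short, dependency-free sketch. It assumes binary labels (1 = preterm) and predicted probabilities; the helper names and toy values are hypothetical. Average precision is used here as one common step-wise summary of the precision-recall curve, which may differ slightly from the PR-AUC estimator used in the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive (no tied scores assumed)."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else float("nan")


# Toy data (hypothetical): 1 = preterm, 0 = not-preterm.
y_true = [1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1]

brier = brier_score(y_true, y_prob)
ap = average_precision(y_true, y_prob)
chance = sum(y_true) / len(y_true)  # PR baseline: prevalence of preterm cases
```

A lower Brier score indicates better-calibrated probabilities, and the chance level for a precision-recall summary is the prevalence of the positive class, which is why each PR-AUC in the text is reported alongside its own chance value.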
Accurate prediction of preterm birth at 28 weeks of gestation
To evaluate preterm birth prediction in a clinical context, we trained a boosted decision tree model (Fig. 2A) on billing codes present before each of the following timepoints: 0, 13, 28, and 35 weeks of gestation (Fig. 2B). We downsampled to achieve a comparable number of singleton deliveries across timepoints, which mitigates sample size as a potential confounder when comparing performance. We only considered active pregnancies at each timepoint; for example, a delivery at 29 weeks would not be included in the 35 week model, since the outcome would already be known. The ROC-AUC increased from conception (0 weeks; 0.63) to the highest performance at 35 weeks (0.75; Fig. 2C). The PR-AUC (Fig. 2D), which accounts for preterm birth prevalence, was highest at 28 weeks (0.33, chance=0.13). However, as we show in the next section, this is an underestimate of the ability to predict preterm delivery at 28 weeks due to our downsampling of the number of training examples. As expected, when we included multiple gestations, the model performed even better (PR-AUC=0.42 at 28 weeks, chance=0.14; Fig. S3). Results were similar when models were trained using billing codes available before different timepoints from the date of delivery (Fig. S4).
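The timepoint restriction described above can be sketched as follows. This is a minimal illustration that assumes the date of conception is approximated as the delivery date minus the estimated gestational age. The record layout, the function name, the placeholder code names (`ICD_A`, `ICD_B`, `CPT_C`), and the treatment of a delivery occurring exactly at the timepoint are assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def features_at_timepoint(pregnancy, weeks):
    """Return billing-code counts before `weeks` of gestation.

    Returns None when the delivery has already occurred by that timepoint,
    because the outcome would then already be known.
    """
    if pregnancy["ega_weeks"] <= weeks:  # boundary handling is an assumption
        return None
    conception = pregnancy["delivery_date"] - timedelta(weeks=pregnancy["ega_weeks"])
    cutoff = conception + timedelta(weeks=weeks)
    return Counter(code for code, when in pregnancy["codes"] if when < cutoff)


# Hypothetical pregnancy delivered at 29 weeks of gestation.
conception = date(2015, 1, 1)
pregnancy = {
    "delivery_date": conception + timedelta(weeks=29),
    "ega_weeks": 29,
    "codes": [
        ("ICD_A", conception + timedelta(weeks=10)),
        ("ICD_A", conception + timedelta(weeks=20)),
        ("ICD_B", conception + timedelta(weeks=27)),
        ("CPT_C", conception + timedelta(weeks=28, days=1)),
    ],
}

at_28 = features_at_timepoint(pregnancy, 28)  # counts of codes before 28 weeks
at_35 = features_at_timepoint(pregnancy, 35)  # None: delivered before 35 weeks
```

The resulting code counts would form one row of the feature matrix for the corresponding timepoint model, and pregnancies returning None would be dropped from that model's cohort.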
To confirm that the number of contacts with the health system was not driving performance, we trained a classifier based on the total number of codes in an individual’s EHR before delivery to predict preterm birth. This simple classifier failed to discriminate between delivery types with PR-AUC and ROC-AUC only slightly higher than chance (PR-AUC=0.19, chance=0.19; ROC-AUC=0.56, chance=0.5, Fig. S5). Therefore, cumulative disease burden or the number of contacts alone are not informative in predicting preterm birth.
Integrating other EHR features does not improve model performance
In addition to billing codes, EHRs capture aspects of an individual’s health through different types of structured and unstructured data. We tested whether incorporating additional features from EHRs could improve preterm birth prediction. Models were evaluated using data available at 28 weeks of gestation, given that it is sufficiently early for intervention and enabled accurate predictions using billing codes. From the EHRs, we extracted sets of features including demographic variables (age, race), clinical keywords from obstetric notes, clinical lab tests run during the pregnancy, and predicted genetic risk (polygenic risk score for preterm birth). To measure the performance gain for each feature set, we compared models trained using: the feature set only, billing codes only, and billing codes combined with the feature set (Fig. 3A). Within each feature set, the same pregnancies comprised the training and held-out sets for the three models. However, the number of deliveries (training + held-out sets) varied widely across feature sets (n=20,342 to 462) due to the differing availability of each feature type.
Models using only demographic factors, clinical keywords, and genetic risk had ROC-AUC and PR-AUC similar to chance (Fig. 3B). Clinical labs had moderate predictive power with ROC-AUC of 0.63 and PR-AUC of 0.24 (Fig. 3B). Compared to models using only billing codes, adding other feature sets did not substantially improve performance (Fig. 3B). We note that some feature sets, such as clinical labs and genetic risk, were evaluated on held-out sets with small numbers of deliveries (180 and 92, respectively). However, even after increasing the sample size by including EHR features present before and after delivery, we did not observe a consistent gain in performance compared to models trained using only billing codes (Fig. S6).
Models using billing codes outperform prediction from risk factors
Although there are well known risk factors for preterm birth, there exists no clinical risk calculator that is routinely implemented in clinical care. We sought to compare the performance of our EHR-based prediction models to known risk factors. Such risk factors would inform a physician’s gestalt for risk-stratifying a pregnancy. We compared the 28 week billing-code-based model to a model trained using a set of known clinical risk factors (17) that included: self- or third-party reported race (Black, Asian, or Hispanic), age at delivery (> 34 or <18 years old), diabetes status, sickle cell disease status, presence of fetal abnormalities, pre-pregnancy BMI >35, and pre-pregnancy hypertension (blood pressure > 120/80, Methods).
The billing-code-based model significantly outperformed a model trained with clinical risk factors at predicting preterm birth at 28 weeks of gestation (PR-AUC=0.40 vs. 0.21; Fig. 4B). The pattern was similar for ROC-AUC (risk factors=0.59, billing codes=0.75; Fig. S7). The stronger performance of the billing-code-based classifier was true for women across the spectrum of comorbidity burden. It had higher precision across individuals with different numbers of risk factors. Performance peaked for individuals with 0 (precision=0.39) and 4+ (precision=0.43) risk factors, but we did not observe a trend between model performance and increasing number of clinical risk factors (Fig. 4C). This suggests that combinations of billing codes have the potential to quantify preterm birth risk better than risk factors that are currently used to inform clinical judgement.
Machine learning models can predict spontaneous preterm births
The multifactorial etiologies of preterm birth lead to clinical presentations with different comorbidities and trajectories. Medically-indicated and idiopathic spontaneous preterm births are distinct in etiologies and outcomes. Identifying pregnancies that ultimately result in spontaneous preterm deliveries is particularly valuable, and we anticipated that spontaneous preterm birth would be more challenging to predict than preterm birth overall. To test this, we identified spontaneous preterm births in the held-out set (n=75) at 28 weeks of gestation by excluding women with medically induced labor, a cesarean section delivery, or PPROM (Methods). We intentionally used a conservative phenotyping strategy that aimed to minimize false positive spontaneous preterm births to evaluate the model’s ability to predict spontaneous preterm births. The prediction model trained using billing codes up to 28 weeks of gestation classified 48% (recall) of all spontaneous preterm births as preterm; this is significantly higher than the risk factor only model (recall = 33%; Fig. 4D).
Performance varies based on clinical context and delivery history
To further explore the sensitivity of the performance of our approach to clinical context and patient history, we evaluated how delivery type (vaginal vs. cesarean-section) and a previous preterm birth influence preterm birth prediction. We trained two classifiers using billing codes (ICD-9 and CPT) occurring before 28 weeks of gestation: one on a cohort of cesarean-section (n=5,475) singleton deliveries and one on vaginal deliveries (n=15,487). Preterm birth prediction accuracy was higher in the cesarean-section cohort (PR-AUC=0.47, chance=0.20) compared to the vaginal delivery cohort (PR-AUC=0.23, chance = 0.10; Fig. 5A). Cesarean-sections also had higher ROC-AUC compared to vaginal deliveries (0.75 vs. 0.68, Fig. S8). As expected, the preterm birth prevalence was higher in the cesarean-section cohort.
Women with a history of preterm birth are at significantly higher risk for a subsequent preterm birth than women without a previous history. Therefore, we tested if models trained on EHR data of women with a history of preterm birth could accurately predict the status of their next birth. We assembled 1,416 women with a preterm birth and a subsequent delivery in the cohort and split them into a training set (80%) and held-out set (20%) to evaluate the model performance (Methods). For these women, 53% of the second deliveries were preterm. Due to limited availability of estimated gestational age data for the recurrent preterm births, which is necessary to approximate the date of conception, we trained models using billing codes (ICD-9 and CPT) present before each of the following timepoints: 10, 30, and 60 days before the delivery. These models were all able to discriminate term from preterm deliveries better than chance (Fig. 5B; PR-AUCs≥0.75). The model predicting a second preterm birth at 10 days before delivery achieved the highest performance with PR-AUC=0.84 (Fig. 5B, chance=0.53) and ROC-AUC=0.82 (Fig. S9).
Models accurately predict preterm birth in an independent cohort
To evaluate whether preterm birth prediction models trained on the Vanderbilt cohort performed well on EHR data from other databases, we compared their performance on the held-out Vanderbilt cohort (n=4,215) and an independent cohort from UCSF (n=5,978). The UCSF cohort was ascertained using similar rules as the Vanderbilt cohort (Methods); age and distribution of race are provided in Table S1. However, we note that the UCSF cohort has a lower preterm birth prevalence (6%) compared to the Vanderbilt cohort (13%).
To facilitate the comparison, we trained models to predict preterm birth in the Vanderbilt cohort using only ICD-9 codes present before 28 weeks of gestation. We did not consider CPT codes in this analysis due to differences in the available billing code data between Vanderbilt and UCSF. As expected from the previous results, the model accurately predicted preterm birth in the held-out set from Vanderbilt (PR-AUC of 0.34, chance=0.12), but performance was slightly lower than using ICD and CPT codes (Fig. 4B).
The model trained at Vanderbilt also achieved strong performance in the UCSF cohort (PR-AUC of 0.31 vs 0.34 at Vanderbilt; Fig. 6A). The classifier had a higher ROC-AUC (0.80) in the UCSF cohort compared to the Vanderbilt cohort (0.72; Fig. S10). This is likely due to the lower prevalence of preterm birth in the UCSF cohort and the sensitivity of ROC-AUC to class imbalance (35). Overall, these models show striking reproducibility across two independent cohorts.
Similar features are predictive across the independent cohorts
The architecture of boosted decision trees enables straightforward identification of features (ICD-9 codes) with the largest impact on the model predictions. We used SHapley Additive exPlanation values (SHAP) (36, 37) to quantify the marginal additive contribution of each feature to the model predictions for each individual. For each feature in the ICD-9-based model, we calculated the mean absolute SHAP values across all women in the held-out set. The mean absolute SHAP value for each feature was highly correlated (Spearman R=0.93, p-value < 2.2E-308) between the held-out Vanderbilt set and the UCSF cohort (Fig. 6B). Ten of the top 15 features ranked based on the mean absolute SHAP value were shared across both cohorts. Examination of the codes driving prediction revealed many known risk factors such as fetal abnormalities, history of twin pregnancy, history of preterm birth, diabetes, and other comorbidities (Fig. 6C). The majority of the top features involved codes indicating screening, routine or otherwise, during pregnancy. Three top features found only in the UCSF dataset included codes for supervision of high-risk pregnancies (Fig. 6C).
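The cross-cohort comparison of feature importance can be sketched in a few lines. The example assumes that per-individual SHAP values have already been computed and are stored as one row per individual and one column per feature. The helper names and toy values are hypothetical, and the Spearman correlation is implemented directly (with average ranks for ties) rather than through a statistics library.

```python
def mean_abs_by_feature(shap_rows):
    """Mean absolute SHAP value per feature (column) across individuals (rows)."""
    n_rows, n_features = len(shap_rows), len(shap_rows[0])
    return [sum(abs(row[j]) for row in shap_rows) / n_rows for j in range(n_features)]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy SHAP matrices (hypothetical): 2 individuals x 3 features per cohort.
cohort_a = [[0.5, -0.1, 0.0], [-0.3, 0.2, 0.1]]
cohort_b = [[0.2, -0.3, 0.0], [0.4, 0.1, -0.02]]

importance_a = mean_abs_by_feature(cohort_a)
importance_b = mean_abs_by_feature(cohort_b)
rho = spearman(importance_a, importance_b)
```

In this toy example the features are ranked identically in both cohorts, so the correlation is 1. Taking the absolute value before averaging means a feature counts as important whether it pushes predictions toward or away from preterm birth.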
Discussion
Preterm birth is a major health challenge affecting 5-20% of pregnancies (1, 2, 12) and leading to significant morbidity and mortality (38, 39). Predicting preterm birth risk could inform clinical management, but no accurate classification strategies are routinely implemented (22). Here, we take a step toward addressing this need by demonstrating the potential for machine learning on dense phenotyping from EHRs to predict preterm birth. Our models predict preterm birth accurately across challenging clinical contexts (e.g., spontaneous and recurrent) at 28 weeks of gestation. Compared to other data types in the EHRs, models using billing codes alone had the highest prediction accuracy and outperformed those using clinical risk factors. Demonstrating the potential broad applicability of our approach, the model accuracy remained high in an external independent cohort. Combinations of many known risk factors and patterns of care drove prediction; this suggests that the algorithm builds on existing knowledge. Thus, we conclude that machine learning based on EHR data has the potential to predict preterm birth accurately across multiple healthcare systems.
Our models have several distinct advantages compared to published approaches. First, they have robust performance. Previous models using risk factors (diabetes, hypertension, sickle cell disease, history of preterm birth) to predict preterm birth, despite having cohorts up to two million women (17), have reported ROC-AUCs between 0.69 and 0.74 (18, 19, 21). Our models obtain a ROC-AUC of 0.75 and PR-AUC of 0.40 using data available at 28 weeks gestation. Furthermore, given the unbalanced classification problem (preterm births are less common than non-preterm), we report high PR-AUCs in addition to high ROC-AUCs. Compared to a recent deep learning model using word embeddings from EHRs for predicting extreme preterm birth (birth before 28 weeks of gestation, ROC-AUC of 0.83, 40), our models achieved similar accuracy using only billing codes to predict all preterm births. We did not stratify preterm births by severity since more than 85% of preterm births occur after 32 weeks of gestation (56). However, this is an interesting topic for further work.
Second, our models use readily available data throughout pregnancy that do not require invasive sampling. While some studies have also obtained high ROC-AUCs (e.g., 0.81-0.88), they used serum biomarkers across small cohorts (16) or acute obstetric changes within days of delivery (20). This can enable cost-effective and broad application as illustrated by our evaluation of the classifiers on EHR data from UCSF.
Third, the gradient boosted decision trees we implement are more interpretable than ‘black-box’ deep learning models that cannot easily identify features driving predictions. This ability could lead to better understanding of the risk factors and differences in risk factors in different regions of the country or the world. The ease of interpretation of our decision trees is a necessary factor for future deployment in clinical settings. Our models rediscovered several known risk factors for preterm birth, which establishes further confidence in our machine-learning-based risk prediction models.
Finally, our approach generalizes across hospital systems. We demonstrate that billing-code-based models trained at Vanderbilt achieve similar accuracy in an independent cohort from UCSF. The generalizability of machine learning models can be constrained by the sampling of the training data. Thus, the accurate prediction in an independent dataset from an external institution points to several inherent strengths of the model. First, successful replication indicates the models’ ability to learn predictive signals despite regional variation in assigning billing codes to an EHR. Second, the large cohorts used to train and evaluate models at Vanderbilt and UCSF guard against potential weaknesses of EHRs. Miscoding or omission of key data points is unavoidable in EHRs (41). The large cohort used to train our models mitigates these errors and enables the high accuracy in the UCSF dataset, even with its different demographics. Additionally, idiosyncratic patterns of patient care at the institution used to develop the algorithm, which would be present in the Vanderbilt training and held-out sets, are unlikely to be present in the external UCSF cohort and thus are unlikely to inflate the out-of-sample accuracy. Third, the top features driving model performance are shared across institutions and reflect combinations of known risk factors and patterns of care. This aids interpretability of the underlying algorithm and likely reflects underlying pathophysiology that is innate to preterm birth.
We see several avenues for further improving our algorithm. First, some of the top features reflected routine obstetric care for high-risk pregnancies. Thus, the learning problem could be engineered to force the algorithm to discover new unappreciated risk factors. Second, we were surprised that the addition of features beyond billing codes, such as lab values, concepts extracted from clinical notes, and genetic information, did not significantly improve performance. In some cases, any redundant information already captured by the billing codes would not improve the model’s accuracy; this is likely true for clinical notes. However, other sources, like currently available genetic data and polygenic risk scores, may not effectively capture underlying etiologies of preterm birth; thus, these sources may not add more discriminatory power. Indeed, the largest published genome-wide study for preterm birth only explains a very small fraction of the heritability (24), and a polygenic risk score derived from it was not predictive in our cohort. Further sub-phenotyping of preterm birth will not only aid in prediction, but also in understanding its multifactorial etiology and developing personalized treatment strategies. More explicit modeling of the temporal dependence between EHR features may further increase performance. Finally, while we evaluated the ability of our classifiers to discriminate preterm births, further studies evaluating the calibration of these models are necessary to better risk-stratify pregnancies.
The strong predictive performance of our models suggests that they have the potential to be clinically useful. Compared to a machine learning model trained using only known risk factors, the billing-code-based classifier incorporated a broad set of clinical features and predicted preterm birth with higher accuracy. Furthermore, the superior performance was not driven by the number of risk factors or the total burden of billing codes. These results indicate the algorithm is not simply identifying less healthy individuals or those with greater healthcare usage. The models also accurately predicted many preterm births in challenging and important clinical contexts such as spontaneous and recurrent preterm birth. Spontaneous preterm births are common (1, 12, 42), and unlike iatrogenic deliveries, they are more difficult to predict because they are driven by unknown multifactorial etiologies (12, 22). Similarly, since a prior history of preterm birth is one of the strongest risk factors (43), distinguishing pregnancies most at risk for recurrent preterm birth has potential to provide clinical value.
However, additional work is needed before this approach is ready for clinical application. Though it has strong performance, a more comprehensive evaluation of the algorithm against current clinical practice is needed to determine how early and how much improvement in standard of care this approach could provide (44). Furthermore, while our cohorts include diverse individuals and the algorithm generalizes well, the approach must be evaluated to ensure that it does not introduce or amplify biases against specific groups or types of preterm birth (45). In addition, we anticipate further gains in the clinical value of this approach as more modalities of data are incorporated in the EHR (46) and diverse populations become available. Addressing these questions and taking other necessary steps toward clinical utility will require the close collaboration of diverse experts from basic, clinical, social, and implementation sciences.
Our results provide a proof-of-concept that machine learning algorithms can use the dense phenotype information collected during pregnancy in EHRs to predict preterm birth. The significant prediction accuracy across clinical contexts and compared to existing risk factors suggests such modeling strategies can be clinically useful. We are optimistic that with the ever-growing number of EHRs, improvement in tools for extracting meaningful data from them, and integration of complementary molecular data, machine learning approaches can improve the clinical management of preterm birth.
Materials and Methods
Ascertaining delivery type and date for Vanderbilt cohort
We identified women with at least one delivery (n=35,282, ‘delivery-cohort’) at Vanderbilt Hospital based on the presence of delivery-specific billing codes (ICD-9/10 and CPT) or estimated gestational age (EGA) documented in the EHR. Combining delivery-specific ICD-9/10 (‘delivery-ICDs’), CPT (‘delivery-CPTs’), and EGA data, we developed an algorithm to label each delivery as preterm or not preterm. Women with multiple gestations (e.g. twins, triplets) were identified using ICD and CPT codes and excluded from singleton-based analyses. See Supplementary Materials and Methods for exact codes.
We demarcate multiple deliveries by grouping delivery-ICDs in intervals of 37 weeks starting with the most recent delivery-ICD. This step is repeated until all delivery-ICDs in a patient’s EHR are assigned to a pregnancy. We chose 37-week intervals to maximally discriminate between pregnancies. For each delivery, we assign a list of labels (preterm, term, or postterm) ascertained using the delivery-ICDs. EGA values were mapped to multiple pregnancies using the same procedure. The most recent EGA value determined the time interval to group preceding EGA values. Based on the most recent EGA value for each pregnancy, we assigned labels to each delivery (EGA <37 weeks: preterm; ≥37 and <42 weeks: term, ≥42 weeks: postterm). After pooling delivery labels based on delivery-ICDs and EGA, we assigned a consensus delivery label by selecting the oldest gestational age based classification (i.e. postterm > term > preterm).
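The grouping and labeling procedure above can be sketched as follows. This is a simplified illustration under stated assumptions: timestamps are plain dates, a timestamp belongs to a pregnancy if it falls less than 37 weeks before that pregnancy's most recent timestamp, and the function names and toy dates are hypothetical. The exact delivery codes and edge-case handling used in the study are described in its Supplementary Materials and are not reproduced here.

```python
from datetime import date, timedelta

WINDOW = timedelta(weeks=37)
LABEL_ORDER = {"preterm": 0, "term": 1, "postterm": 2}


def group_into_pregnancies(dates):
    """Group timestamps into pregnancies, walking back from the most recent one."""
    remaining = sorted(dates, reverse=True)
    pregnancies = []
    while remaining:
        anchor = remaining[0]  # most recent unassigned timestamp
        group = [d for d in remaining if anchor - d < WINDOW]
        pregnancies.append(group)
        remaining = remaining[len(group):]  # group is a prefix of the sorted list
    return pregnancies


def ega_label(ega_weeks):
    """Label a delivery from its estimated gestational age in weeks."""
    if ega_weeks < 37:
        return "preterm"
    if ega_weeks < 42:
        return "term"
    return "postterm"


def consensus_label(labels):
    """Select the oldest gestational-age class (postterm > term > preterm)."""
    return max(labels, key=LABEL_ORDER.__getitem__)


# Toy timestamps (hypothetical) for one woman with two deliveries.
stamps = [date(2012, 3, 1), date(2012, 3, 3), date(2014, 6, 10), date(2014, 6, 12)]
pregnancies = group_into_pregnancies(stamps)
label = consensus_label(["preterm", ega_label(38.0)])
```

Here the four timestamps resolve into two pregnancies, and pooled labels of preterm and term resolve to term under the stated ordering.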
Since CPT codes do not encode delivery type, we combined the delivery-CPTs with timestamps of delivery-ICDs and EGAs to approximate the date of delivery. Delivery-CPTs were grouped into multiple pregnancies as described above. The most recent timestamp from delivery-CPTs, delivery-ICDs, and EGA values was used as the approximate delivery date for a given pregnancy.
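The grouping and consensus steps described above can be illustrated with a minimal sketch. This is not the code used in the study; the function names, the strict less-than boundary on the 37-week window, and the example dates are assumptions made for illustration.

```python
from datetime import date, timedelta

WINDOW = timedelta(weeks=37)
# Higher rank means older gestational age; the consensus keeps the highest rank.
RANK = {"preterm": 0, "term": 1, "postterm": 2}


def group_into_pregnancies(events):
    """Group (date, label) events into pregnancies, most recent first.

    Starting from the most recent event, every event that falls within the
    preceding 37 weeks joins the same pregnancy; the step repeats on the rest.
    """
    remaining = sorted(events, key=lambda e: e[0], reverse=True)
    pregnancies = []
    while remaining:
        anchor = remaining[0][0]
        current = [e for e in remaining if anchor - e[0] < WINDOW]
        remaining = [e for e in remaining if anchor - e[0] >= WINDOW]
        pregnancies.append(current)
    return pregnancies


def ega_label(weeks):
    """Map an estimated gestational age in weeks to a delivery label."""
    if weeks < 37:
        return "preterm"
    return "term" if weeks < 42 else "postterm"


def consensus_label(labels):
    """Pick the label with the oldest gestational age."""
    return max(labels, key=RANK.__getitem__)


# Hypothetical record: two events from one pregnancy, one from an earlier one.
events = [
    (date(2015, 6, 1), "preterm"),
    (date(2015, 5, 20), "term"),
    (date(2012, 1, 10), "term"),
]
groups = group_into_pregnancies(events)
approx_delivery_date = max(day for day, _ in groups[0])
```

In this sketch the approximate delivery date of a pregnancy is simply the most recent timestamp in its group, mirroring the rule stated above.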
Validating delivery type based on chart review
To validate the delivery type ascertained from billing codes and EGA, we used chart-reviewed labels as the gold standard. For 104 randomly selected EHRs from the delivery cohort, we extracted the date and gestational age at delivery from clinical notes. For the earliest delivery recorded in the EHR, we assigned a chart-review-based label according to the gestational age at delivery (<37 weeks: preterm; ≥37 and <42 weeks: term; ≥42 weeks: postterm). The precision/positive predictive value (PPV) for the ascertained delivery type as a binary variable (‘preterm’ or ‘not-preterm’) was calculated using the chart-reviewed label as the gold standard. To compare the ascertainment strategy to a simpler phenotyping algorithm, we compared the concordance of the label derived from delivery-ICDs to one based on the gestational age within three days of delivery. This simpler phenotyping approach resulted in a lower PPV (85%) and recall (93%; Fig. S1B) compared to the billing-code-based ascertainment strategy.
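As an illustration of the validation metrics, the following sketch computes PPV and recall for the binary ‘preterm’ label against chart-reviewed labels. The function name and the toy labels are hypothetical and are not taken from the study data.

```python
def ppv_and_recall(predicted, truth, positive="preterm"):
    """Return (PPV, recall) of predicted labels against gold-standard labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, recall


# Toy example: algorithm labels versus chart-review labels.
ascertained = ["preterm", "preterm", "not-preterm", "not-preterm"]
chart_review = ["preterm", "not-preterm", "preterm", "not-preterm"]
ppv, recall = ppv_and_recall(ascertained, chart_review)
```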
Training and evaluating boosted decision trees to predict preterm birth
All models for predicting preterm birth used boosted decision trees as implemented in XGBoost v0.82 (30). Unless stated otherwise, we trained models to predict the earliest delivery in a woman’s EHR as preterm or not-preterm. The delivery cohort was randomly split into training (80%) and held-out (20%) sets with equal proportions of preterm cases. For all models we excluded the ICD-9 codes, CPT codes, and EGA used to ascertain delivery type and date. On the training set, we used the Tree of Parzen Estimators algorithm as implemented in hyperopt v0.1.1 (47) to optimize hyperparameters by maximizing the mean average precision. The best set of hyperparameters was selected after 1,000 trials using 3-fold cross-validation over the training set (80:20 split with equal proportions of preterm cases). We evaluated the performance of all models on the held-out set using Scikit-learn v0.20.2 (48). All performance metrics reported are on the held-out set. For precision-recall curves, we define baseline chance for each model as the prevalence of preterm cases. To ensure no data leaks were present in our training protocol, we trained and evaluated a model using a randomly generated dataset (n=1,000 samples) with a 22% preterm prevalence. As expected, this model did not do better than chance (ROC-AUC=0.50, PR-AUC=0.22, data not shown). All trained models with their optimized hyperparameters are provided at https://github.com/abraham-abin13/ptb_predict_ml.
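The stratified split and the precision-recall baseline can be sketched in plain Python. This is only an illustration under stated assumptions: the study used Scikit-learn and XGBoost, whereas the helper names, the fixed seed, and the synthetic labels below are hypothetical.

```python
import random


def stratified_split(labels, held_out_fraction=0.2, seed=0):
    """Split indices so each class keeps the same proportion in both sets."""
    rng = random.Random(seed)
    train, held_out = [], []
    for value in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == value]
        rng.shuffle(idx)
        n_held = round(len(idx) * held_out_fraction)
        held_out.extend(idx[:n_held])
        train.extend(idx[n_held:])
    return sorted(train), sorted(held_out)


def prevalence(labels, indices):
    """Fraction of positive (preterm = 1) labels among the given indices."""
    return sum(labels[i] for i in indices) / len(indices)


# Synthetic labels with a 22% preterm prevalence (1 = preterm, 0 = not-preterm).
labels = [1] * 220 + [0] * 780
train_idx, held_out_idx = stratified_split(labels)
pr_baseline = prevalence(labels, held_out_idx)  # chance level for PR curves
```

Because the split preserves the preterm proportion, the chance-level precision on the held-out set equals the cohort prevalence, which is the baseline used for the precision-recall curves.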
Identifying preterm births using features derived from EHRs
As a first step, we evaluated whether billing codes could discriminate between delivery types. Models were trained to predict preterm birth using the total counts of each ICD-9, CPT, or ICD-9 and CPT code across a woman’s EHR. We excluded any codes used to ascertain delivery type or date. All three models were trained and evaluated on the same cohort of women who had at least one ICD-9 and CPT code.
In addition to billing codes, we extracted structured and unstructured features from the EHRs (Fig. 3A). Structured data included self or third-party reported race (Fig. 1E), age at delivery, past medical and family history (92 features, see Supplementary Materials and Methods), and clinical labs. For training models, we only included clinical labs obtained during the first pregnancy and excluded values greater than four standard deviations from the mean. To capture the trajectory of each clinical lab (307 clinical labs, see Supplementary Materials and Methods), we trained models using the mean, median, minimum, and maximum values across the pregnancy as features. For unstructured clinical text in obstetric and nursing clinical notes, we applied CLAMP (49) to extract UMLS (Unified Medical Language System) concept unique identifiers (CUIs) and included those with positive assertions and > 0.5% frequency across all EHRs. When training preterm birth prediction models, we one-hot encoded categorical features. No transformations were applied to the continuous features.
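The clinical lab summary features can be sketched as follows. The sketch assumes the four-standard-deviation filter is applied against a cohort-level mean and standard deviation for each lab; that choice, the function name, and the example values are assumptions rather than details confirmed above.

```python
from statistics import mean, median


def summarize_lab(values, cohort_mean, cohort_sd, max_sd=4.0):
    """Summarize one lab across a pregnancy after dropping extreme values.

    Values more than `max_sd` standard deviations from the cohort mean are
    excluded; returns None when no value remains.
    """
    kept = [v for v in values if abs(v - cohort_mean) <= max_sd * cohort_sd]
    if not kept:
        return None
    return {
        "mean": mean(kept),
        "median": median(kept),
        "min": min(kept),
        "max": max(kept),
    }


# Hypothetical measurements for one lab; 100 lies far outside 12 +/- 4 * 2.
features = summarize_lab([10, 12, 14, 100], cohort_mean=12, cohort_sd=2)
```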
A subset of women (n=905) was genotyped on the Illumina MEGAEX platform. We applied standard GWAS quality control steps (50) using PLINK v1.90b4s (51). We calculated a polygenic risk score for each white woman with genotype data based on the largest available preterm birth GWAS (24) using PRSice-2 (52, 53). We assumed an additive model and summed the number of risk alleles at single nucleotide polymorphisms (SNPs) weighted by their strength of association with preterm birth (effect size). PRSice determined the optimum number of SNPs by testing the polygenic risk score for association with preterm birth in our delivery-cohort at different GWAS p-value thresholds. We included date of birth and five genetic principal components to control for ancestry. Our final polygenic risk score used 356 preterm birth associated SNPs (GWAS p-value < 0.00025).
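The additive model amounts to a weighted sum of risk-allele counts. The sketch below shows only that arithmetic; it does not reproduce PRSice-2 or its scoring options, and the SNP identifiers, effect sizes, and the choice to skip SNPs without a genotype are hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Sum risk-allele counts (0, 1, or 2) weighted by GWAS effect size.

    SNPs without a genotype in `dosages` are skipped (an assumption).
    """
    return sum(
        effect_sizes[snp] * dosages[snp]
        for snp in effect_sizes
        if snp in dosages
    )


# Hypothetical SNPs and weights, for illustration only.
dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
effect_sizes = {"rs1": 0.5, "rs2": -0.25, "rs4": 0.1}
score = polygenic_risk_score(dosages, effect_sizes)
```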
Next, we evaluated whether adding EHR features could improve preterm birth prediction. Since the number of women varied across EHR features, we created subsets of the delivery cohort for each EHR feature. Each subset included women with at least one recorded value for the EHR feature and billing codes. Then we trained three models as described above for each subset: 1) using only the EHR feature being evaluated, 2) using ICD-9 & CPT codes, and 3) using the EHR feature with ICD-9 & CPT codes. Thus, all three models for a given EHR feature were trained and evaluated on the same cohort of deliveries (Fig. 3A).
Predicting preterm birth before delivery using billing codes and clinical risk factors
In addition to training models using features across a woman’s EHR, we also evaluated models using features present before delivery. Subtracting the estimated gestational age (recorded within three days of delivery) from the date of delivery, we obtained the date of conception. Next, we trained models using ICD-9 and CPT codes timestamped before different gestational timepoints (Fig. 4A): 0, 13, 28, 32, and 35 weeks of gestation. We compared the performance of models using only billing codes to models using clinical risk factors obtained from the EHR. All risk factors were encoded as binary features. Risk factors such as diabetes status, fetal abnormalities, and sickle cell disease status were defined based on at least one corresponding ICD-9 code occurring before the date of delivery. The remaining risk factors, such as race (Black, Asian, or Hispanic was encoded as higher risk), age at delivery (> 34 or <18 years old), pre-pregnancy BMI ≥ 35, and pre-pregnancy hypertension (>120/80), were extracted from structured fields in the EHR. The pre-pregnancy value was defined as the most recent measurement taken more than nine months before the delivery date. The association between risk factors and preterm birth was evaluated using a chi-squared test of independence implemented in SciPy v1.2.0 (54).
In addition to evaluating models based on the date of conception, we trained models at different timepoints before the date of delivery (Fig. S3) using the same cohort of women, requiring that every individual in this cohort have at least one ICD-9 or CPT code before each timepoint. Evaluating models before the date of delivery increased the sample size (n=15,481) compared to a prospective conception-based design (n=12,410) and yielded similar results.
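The conception-based design can be sketched as two small steps: estimate the date of conception, then count only the codes timestamped before a given gestational timepoint. The helper names and the example record are hypothetical, the codes are placeholders, and counting codes dated before conception toward every timepoint is an assumption.

```python
from collections import Counter
from datetime import date, timedelta


def conception_date(delivery_date, ega_weeks):
    """Estimate conception by subtracting gestational age from delivery date."""
    return delivery_date - timedelta(weeks=ega_weeks)


def code_counts_before(codes, conception, weeks):
    """Count billing codes timestamped before `weeks` of gestation."""
    cutoff = conception + timedelta(weeks=weeks)
    return Counter(code for code, day in codes if day < cutoff)


# Hypothetical record: delivery at 39 weeks with four timestamped codes.
conception = conception_date(date(2016, 10, 1), ega_weeks=39)
codes = [
    ("V22.1", date(2016, 2, 1)),
    ("V22.1", date(2016, 3, 1)),
    ("648.83", date(2016, 8, 1)),
    ("401.9", date(2015, 6, 1)),
]
features_by_timepoint = {
    weeks: code_counts_before(codes, conception, weeks)
    for weeks in (0, 13, 28, 32, 35)
}
```

Each entry of `features_by_timepoint` corresponds to the input of one model, so a code recorded later in pregnancy only appears in the feature set of later timepoints.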
Evaluating model performance on spontaneous preterm births, by delivery type, and recurrent preterm birth
We compared how models trained using billing codes (ICD-9 & CPT) performed in different clinical contexts. First, we evaluated the accuracy of predicting spontaneous preterm birth using models trained to predict all types of preterm births. From all preterm cases in the held-out set, we excluded women who met any of the following criteria to create a cohort of spontaneous preterm births: medically induced labor, delivery by cesarean section, or preterm premature rupture of membranes. The ICD-9 and CPT codes used to identify exclusion criteria are provided in Supplementary Materials and Methods. We calculated recall/sensitivity as the number of predicted spontaneous preterm births out of all spontaneous preterm births in the held-out set. We used the same approach to quantify performance of models trained using clinical risk factors (Fig. 4C).
We trained models to predict preterm birth among cesarean sections and vaginal deliveries separately using billing codes (ICD-9 & CPT) as features. Deliveries were labeled as cesarean sections or vaginal deliveries if they had at least one relevant billing code (ICD-9 or CPT) occurring within ten days of the date of the first delivery in the EHR. Billing codes used to determine delivery type are provided in Supplementary Materials and Methods. Deliveries with billing codes for both cesarean and vaginal deliveries were excluded. We trained separate models for cesarean and vaginal deliveries (Fig. 5 and Fig. S8).
We evaluated how well models using billing codes could predict recurrent preterm birth. From our delivery cohort, we retained women whose first delivery in the EHR was preterm and who had a second delivery for which we ascertained the type (preterm vs. not-preterm) as described above for the first delivery. We trained models using billing codes (ICD-9 & CPT) at timepoints before the date of delivery because the majority of this cohort did not have reliable EGA at the second delivery. As described earlier, separate models were trained using billing codes timestamped before the timepoint being evaluated.
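The spontaneous preterm birth recall can be sketched as a filter followed by a proportion. The exclusion codes are those listed in the Supplementary Materials and Methods; the function names and the toy cases are hypothetical.

```python
# Exclusion codes as listed in the Supplementary Materials and Methods.
INDUCED = {"73.01", "73.1", "73.4", "73.0", "73.09"}
CESAREAN = {
    "669.7", "669.70", "669.71", "763.4", "74.0", "74.1", "74.2", "74.4",
    "74.9", "74.99", "59510", "59514", "59515", "59618", "59620", "59622",
}
PPROM = {"658.13", "658.10", "658.11"}
EXCLUSION_CODES = INDUCED | CESAREAN | PPROM


def is_spontaneous(codes):
    """A preterm case counts as spontaneous if it has no exclusion code."""
    return EXCLUSION_CODES.isdisjoint(codes)


def spontaneous_recall(preterm_cases):
    """Recall among spontaneous preterm births.

    `preterm_cases` holds (codes, predicted_preterm) pairs for true preterm
    deliveries in the held-out set.
    """
    flags = [pred for codes, pred in preterm_cases if is_spontaneous(codes)]
    return sum(flags) / len(flags) if flags else float("nan")


# Toy held-out preterm cases: two spontaneous, two excluded.
cases = [
    ({"644.21"}, True),
    ({"644.21", "59510"}, True),
    ({"V22.1"}, False),
    ({"658.11"}, False),
]
recall = spontaneous_recall(cases)
```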
Preterm birth prediction in independent UCSF cohort
We evaluated how well models trained at Vanderbilt using billing codes would replicate in an external cohort assembled at UCSF. Only the first delivery in the EHR was used for prediction. Women with twins or multiple gestations, identified using billing codes (Supplementary Materials and Methods), were excluded. Delivery type (preterm vs. not preterm) was assigned based on the presence of ICD-10 codes. Term (or not-preterm) deliveries were determined by the presence of an ICD-10 code beginning with the characters “O80”, specifying an encounter for full-term delivery. Preterm deliveries were determined by both the absence of ICD-10 codes beginning with “O80” and the presence of codes beginning with “O60.1”, the family of codes for preterm labor with preterm delivery. We trained models using ICD-9 codes present before 28 weeks of gestation on the Vanderbilt cohort to predict preterm birth. CPT codes were not used since they were not available from the UCSF EHR system. The 28-week model was evaluated on the Vanderbilt held-out set and the independent UCSF cohort.
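The UCSF labeling rule can be written as a short prefix check. The function name is hypothetical, and returning None for deliveries with neither code family is an assumption about how unlabeled records are handled.

```python
def ucsf_delivery_label(icd10_codes):
    """Label a delivery from ICD-10 code prefixes.

    Any code starting with "O80" marks a not-preterm delivery; otherwise a
    code starting with "O60.1" marks a preterm delivery.
    """
    if any(code.startswith("O80") for code in icd10_codes):
        return "not-preterm"
    if any(code.startswith("O60.1") for code in icd10_codes):
        return "preterm"
    return None  # neither family present: delivery type not assigned


labels = [
    ucsf_delivery_label(["O80"]),
    ucsf_delivery_label(["O60.14X0"]),
    ucsf_delivery_label(["O80", "O60.14X0"]),
]
```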
Feature interpretation from boosted decision tree models
To determine feature importance, we used SHapley Additive exPlanation values (SHAP) (36, 37, 55) to determine the marginal additive contribution of each feature. For the held-out Vanderbilt cohort and the UCSF cohort, a SHAP value was calculated for each feature per individual. Feature importance was summarized by taking the mean of the absolute value of SHAP scores across individuals. The top fifteen features based on the mean absolute SHAP value in either the Vanderbilt or UCSF cohorts are reported. To compare how feature importance varies at Vanderbilt and UCSF, we computed the Pearson correlation of the mean absolute SHAP values.
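The summary and comparison of feature importance can be sketched in plain Python, assuming per-individual SHAP values have already been computed. The function names and the small matrices are hypothetical; rows are individuals and columns are features.

```python
from math import sqrt


def mean_abs_shap(shap_rows):
    """Mean absolute SHAP value per feature across individuals."""
    n = len(shap_rows)
    n_features = len(shap_rows[0])
    return [sum(abs(row[j]) for row in shap_rows) / n for j in range(n_features)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical SHAP matrices for the same three features in two cohorts.
vanderbilt = [[0.5, -0.1, 0.0], [-0.3, 0.1, 0.2]]
ucsf = [[0.4, 0.0, 0.1], [-0.6, 0.2, -0.1]]
importance_vu = mean_abs_shap(vanderbilt)
importance_ucsf = mean_abs_shap(ucsf)
r = pearson(importance_vu, importance_ucsf)
```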
Data Availability
All code and models in this study are available at https://github.com/abraham-abin13/ptb_predict_ml.
Funding
AA was supported by the American Heart Association fellowship 20PRE35080073, National Institutes of Health (NIH, T32GM007347), the March of Dimes, and the Burroughs Wellcome Fund. MS, IK, and BL were supported by the March of Dimes and NIH (NLM K01LM012381). JAC was supported by the NIH (R35GM127087), March of Dimes, and the Burroughs Wellcome Fund. This work was conducted in part using the resources of the Advanced Computing Center for Research and Education at Vanderbilt University. The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center’s BioVU, which is supported by numerous sources: institutional funding, private agencies, and federal grants. These include the NIH-funded Shared Instrumentation Grant S10RR025141; and CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975. Genomic data are also supported by investigator-led projects that include U01HG004798, R01NS032830, RC2GM092618, P50GM115305, U01HG006378, U19HL065962, R01HD074711; and additional funding sources listed at https://victr.vumc.org/biovu-funding/. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the March of Dimes, or the Burroughs Wellcome Fund.
Author contributions
Conceptualization, Methodology
A.A. and J.A.C. conceived and designed the study.
Data curation
C.A.B. extracted billing codes and clinical notes and performed concept extraction on the Vanderbilt cohort. P.S. and L.K.D. extracted, cleaned, and provided clinical laboratory data during pregnancy on the Vanderbilt cohort.
Resources
D.R.V. provided obstetric and nursing notes on the Vanderbilt cohort. B.L., I.K., and M.S. extracted the delivery cohort from UCSF.
Formal Analysis, Investigation
A.A. performed all analyses on the Vanderbilt cohort under supervision from J.A.C. B.L. and I.K. evaluated models on UCSF cohorts under supervision from M.S.
Funding acquisition
J.A.C.
Writing
A.A. wrote the manuscript with guidance from J.A.C., M.S., L.M., and A.R.
Competing interests
LJM is a consultant for Mirvie, Inc.
Supplementary Materials
Materials and Methods
Delivery-specific ICD-9/10 codes used to ascertain delivery type
The following ICD-9/10 codes were used to ascertain delivery type as described in the Methods section.
Preterm Birth: ‘O60.1’, ‘O60.10’, ‘O60.10X0’, ‘O60.10X1’, ‘O60.10X2’, ‘O60.10X3’, ‘O60.10X4’, ‘O60.10X5’, ‘O60.10X9’, ‘O60.12’, ‘O60.12X0’, ‘O60.12X1’, ‘O60.12X2’, ‘O60.12X3’, ‘O60.12X4’, ‘O60.12X5’, ‘O60.12X9’, ‘O60.13’, ‘O60.13X0’, ‘O60.13X1’, ‘O60.13X2’, ‘O60.13X3’, ‘O60.13X4’, ‘O60.13X5’, ‘O60.13X9’, ‘O60.14’, ‘O60.14X0’, ‘O60.14X1’, ‘O60.14X2’, ‘O60.14X3’, ‘O60.14X4’, ‘O60.14X5’, ‘O60.14X9’, ‘644.2’, ‘644.20’, ‘644.21’
Term Birth: ‘O60.20’, ‘O60.20X0’, ‘O60.20X1’, ‘O60.20X2’, ‘O60.20X3’, ‘O60.20X4’, ‘O60.20X5’, ‘O60.20X9’, ‘O60.22’, ‘O60.22X0’, ‘O60.22X1’, ‘O60.22X2’, ‘O60.22X3’, ‘O60.22X4’, ‘O60.22X5’, ‘O60.22X9’, ‘O60.23’, ‘O60.23X0’, ‘O60.23X1’, ‘O60.23X2’, ‘O60.23X3’, ‘O60.23X4’, ‘O60.23X5’, ‘O60.23X9’, ‘O80’, ‘O48.0’, ‘650’, ‘645.1’, ‘645.10’, ‘645.11’, ‘645.13’, ‘649.8’, ‘649.81’, ‘649.82’
Postterm Birth: ‘O48.1’, ‘645.2’, ‘645.20’, ‘645.21’, ‘645.23’, ‘645.00’, ‘645.01’, ‘645.03’
Delivery-specific CPT codes used to ascertain delivery date
The following CPT codes were used to ascertain delivery date: ‘59400’, ‘59409’, ‘59410’, ‘59414’, ‘59510’, ‘59514’, ‘59515’, ‘59525’, ‘59610’, ‘59612’, ‘59614’, ‘59618’, ‘59620’, ‘59622’.
Identifying multiple gestations using billing codes
Pregnancies with multiple gestations were identified using the presence of any of the following billing codes. For singleton-only analyses, we excluded women with multiple gestations.
ICD-9 Multiple Gestations:
‘651’, ‘651.7’, ‘651.70’, ‘651.71’, ‘651.8’, ‘651.81’, ‘651.83’, ‘651.9’, ‘651.91’, ‘651.93’, ‘652.6’, ‘652.60’, ‘652.61’, ‘652.63’, ‘V91’, ‘V91.9’, ‘V91.90’, ‘V91.91’, ‘V91.92’, ‘V91.99’, ‘651’, ‘651.0’, ‘651.00’, ‘651.01’, ‘651.03’, ‘651.1’, ‘651.10’, ‘651.11’, ‘651.13’, ‘651.2’, ‘651.20’, ‘651.21’, ‘651.23’, ‘651.3’, ‘651.30’, ‘651.31’, ‘651.33’, ‘651.4’, ‘651.40’, ‘651.41’, ‘651.43’, ‘651.5’, ‘651.50’, ‘651.51’, ‘651.53’, ‘V91’, ‘V91.0’, ‘V91.00’, ‘V91.01’, ‘V91.02’, ‘V91.03’, ‘V91.09’, ‘V91.1’, ‘V91.10’, ‘V91.11’, ‘V91.12’, ‘V91.19’, ‘V91.2’, ‘V91.20’, ‘V91.21’, ‘V91.22’, ‘V91.29’, ‘V91.9’, ‘V91.90’, ‘V91.91’, ‘V91.92’, ‘V91.99’
CPT Twin codes: ‘74713’, ‘76802’, ‘76810’, ‘76812’, ‘76814’
ICD-10 Multiple Gestations:
‘BY4BZZZ’, ‘BY4DZZZ’, ‘BY4GZZZ’, ‘O30.801’, ‘O30.802’, ‘O30.803’, ‘O30.809’, ‘O30.811’, ‘O30.812’, ‘O30.813’, ‘O30.819’, ‘O30.821’, ‘O30.822’, ‘O30.823’, ‘O30.829’, ‘O30.891’, ‘O30.892’, ‘O30.893’, ‘O30.899’, ‘O30.91’, ‘O30.92’, ‘O30.93’, ‘O31.BX10’, ‘O31.BX11’, ‘O31.BX12’, ‘O31.BX13’, ‘O31.BX14’, ‘O31.BX15’, ‘O31.BX19’, ‘O31.BX20’, ‘O31.BX21’, ‘O31.BX22’, ‘O31.BX23’, ‘O31.BX24’, ‘O31.BX25’, ‘O31.BX29’, ‘O31.BX30’, ‘O31.BX31’, ‘O31.BX32’, ‘O31.BX33’, ‘O31.BX34’, ‘O31.BX35’, ‘O31.BX39’, ‘O31.BX90’, ‘O31.BX91’, ‘O31.BX92’, ‘O31.BX93’, ‘O31.BX94’, ‘O31.BX95’, ‘O31.BX99’
Past medical and family history extracted from EHRs used to predict preterm birth
The following past medical and family history features were extracted from EHRs for women with at least one recorded delivery at Vanderbilt Hospital.
Maternal History
‘Abortion’, ‘Alcohol ‘, ‘Baby’s father had a child with birth defect not listed’, ‘Baby’s father’s family has history of birth defect not listed’, ‘Drugs ‘, ‘Endocrine Metabolic Patient ‘, ‘Endocrine metaboloic Patient History ‘, ‘Gravidity’, ‘Hematoligic ‘, ‘Maternal metabolic or endocrine disorders (Diabetes, PKU) ‘, ‘Menses every 28 to 30 days ‘, ‘Patient History Breast Disease ‘, ‘Patient History Congential Heart Defect ‘, ‘Patient History Cystic Fibrosis ‘, ‘Patient History Down Syndrome ‘, ‘Patient History GI Problems ‘, ‘Patient History Genetic other’, ‘Patient History Gyn Problems ‘, ‘Patient History Heart Disease ‘, ‘Patient History Hemophilia or other blood disorders ‘, ‘Patient History Huntington’s Chorea ‘, ‘Patient History Hypertension ‘, ‘Patient History Immune or Infectious Disease ‘, ‘Patient History Infertility or Recurrent Spontaneous Abortions ‘, ‘Patient History Malignancies ‘, ‘Patient History Mental Retardation ‘, ‘Patient History Multiple births ‘, ‘Patient History Muscular Dystrophy ‘, ‘Patient History Neural Tube Defect ‘, ‘Patient History Neurological Disorder ‘, ‘Patient History Operations or Accidents ‘, ‘Patient History Other Hospitalizations ‘, ‘Patient History Other ‘, ‘Patient History Other inherited or chromosomal disorders ‘, ‘Patient History Other structural Birth defect ‘, ‘Patient History Phlebitis or varicocities ‘, ‘Patient History Pulmonary Disease ‘, ‘Patient History Recurrent Pregnancy loss defined as more than 2 or stillbirth’, ‘Patient History STDs ‘, ‘Patient History Sickle Cell Disease (African or Carribean American) ‘, ‘Patient History Thalessemia (Italian, Greek, Mediterranean, or Asian Background); MCV <80 ‘, ‘Patient History Tobacco, Alcohol, Drugs ‘, ‘Patient History Urinary tract problems including UTIs and Pyel ‘, ‘Patient History of Seizure’, ‘Patient History of sexual/physical abuse or trauma ‘, ‘Patient’s age greater than 34 at delivery ‘, ‘Pregnancy Induced Hypertension’, ‘Prior Preterm_births’, ‘Regular exercise ‘, ‘Term_births’, ‘Tobacco ‘, ‘Urinary tract infection’, ‘Live_Children’
Family History
‘Familly History Thalessemia (Italian, Greek, Mediterranean, or Asian Background); MCV <80 ‘, ‘Family History Breast Disease ‘, ‘Family History Congential Heart Defect ‘, ‘Family History Cystic Fibrosis ‘, ‘Family History Down Syndrome ‘, ‘Family History GI Problems ‘, ‘Family History Genetic other’, ‘Family History Gyn Problems ‘, ‘Family History Heart Disease ‘, ‘Family History Hemophilia or other blood disorders ‘, ‘Family History Huntington’s Chorea ‘, ‘Family History Hypertension ‘, ‘Family History Immune or Infectious Disease ‘, ‘Family History Infertility or Recurrent Spontaneous Abortions ‘, ‘Family History Jewish, Cajun, French Canadian (Tay Sachs) ‘, ‘Family History Jewish: Canavan Disease, Gauchers ‘, ‘Family History Malignancies ‘, ‘Family History Mental Retardation ‘, ‘Family History Metabolic or endocrine disorders (Diabetes, PKU) ‘, ‘Family History Multiple births ‘, ‘Family History Muscular Dystrophy ‘, ‘Family History Neural Tube Defect ‘, ‘Family History Neuroligcal Disorder ‘, ‘Family History Operations or Accidents ‘, ‘Family History Other Hospitalizations ‘, ‘Family History Other ‘, ‘Family History Other inherited or chromosomal disorders ‘, ‘Family History Other structural Birth defect ‘, ‘Family History Phlebitis or varicocities ‘, ‘Family History Pulmonary Disease ‘, ‘Family History Recurrent Pregnancy loss defined as more than 2 or stillbirth’, ‘Family History STDs ‘, ‘Family History Sickle Cell Disease (African or Carribean American) ‘, ‘Family History Tobacco, Alcohol, Drugs ‘, ‘Family History Urinary tract problems including UTIs and Pyel ‘, ‘Family History of Seizure’, ‘Family History of sexual/physical abuse or trauma ‘, ‘Jewish, Cajun, French Canadian (Tay Sachs) ‘, ‘Jewish: Canavan Disease, Gauchers’
Clinical labs measured during pregnancy used to predict preterm birth
‘albumin urine, lactic acid venous, cd3 #/cumm, total protein urine, glucose blood, wbc blood, eo automated abs, atyp lymphs (abs), reaction time, lmw heparin assay, rdwsd, glucose spinal fluid, control ptt, rbc folate, calcium blood, gentamicin level, urea nitrogen ur spot, mch, aldosterone, magnesium blood, mchc, factor viii activity, sodium blood, igg quantitative blood, bicarbonate (calc), hcg beta (3rd irp), dhea sulfate, hdl cholesterol, protein csf, f t4, alt blood, neutrophil %, k-time, metamyelo %, estriol unconjugated, sodium urine spot, cellano antigen, icterus index, nucleated rbc, protein total blood, eosoinophil (abs), erythropoietin, neutrophils %, immature retic fraction, zinc serum, c-peptide, imm granulocytes %, lipemia index, monocytes %, ssb (la)(ena) ab, igg, beta-hcg serum, protein urine, bedside glucose, troponin t, intact-pth, sm (smith) autoabs eia, ferritin, absolute cd8, sex hormone bind globulin, eosinophils %, protein c activity, cd8(cd3+)/cd45 #/cumm, glucose tol 50g, basophils %, wbc, albumin, mcv, gamma globulin, testosterone free, fio2, lymph %, pan t cd3 %, troponin-i, mono (abs), rheumatoid factor, quant d-dimer for dic, pcv blood, hgb a1c glycated poc, 25-hydroxy d3, eosoinophil (abs), carboxyhemoglobin, urea nitrogen blood, hgb a1c glycated, cholesterol blood, lamotrigine, cystatin-c, carbon dioxide blood, apa-igg, neutrophils %, myelocytes %, hdl cholesterol, vit e(alpha-tocopherol), glucose whole blood, calcium ionized, gamma glut trans blood, follicle stimulating hrm, total hemoglbin, creatinine g/24 hour, atyp lymphs %, wbc urine micro, nt automated abs, chloride blood, imm platelet fraction, fasting glucose, po2/fio2, sodium whole blood, ast blood, albumin/creatinine ratio, angle(alpha), rbc, vit d 1,25-dihydroxy, c3 quantitative blood, lymphs (abs), ldl cholesterol, triglycerides blood, testosterone, ed troponin-i wbld, o2 saturation, creatinine urine per day, triiodothyronine free, eosinophil %, rbc, rbc urine micro, thyroid stim hormone, anti-myeloperoxidase, c-reactive protein, deamidated gliadin iga abs, hyaline cast, ammonia, igg beta 2 glycoprotein i, progesterone rapid, vitamin d 25-oh total, t helper cd4 #/cumm, patient (pt), schedule q hr, keppra (levetiracetam), creatine kinase total, maternal alphafeto pr0, creatinine urine “spot”, ret ct, creatinine urine “timed”, specific gravity ua, iron blood, kappa light chain quant, lithium blood, 2 hour glucose, vancomycin level, anion gap, luteinizing hormone, iga quantitative blood, phenytoin (dilantin), methemoglobin, alpha-1 globulin, thyroglobulin serum, renin activity, c4 quantitative blood, rdw, urobilinogen, maternal weight, venous ph, % cd3, protein urine spot, carbamazepine (tegretol), hep b surface ab value, anti-protease 3, hemoglobin s, sed rate, amylase blood, ssa (ro)(ena) ab, igg, 25-hydroxy d total, total gamma globulin, adrenocorticotropic horm, retic hgb equiv, neut (abs), insulin, albumin, lymphs %, antithrombin iii act, myelocytes (abs), lymps %, nucleated rbc, alkaline phosphatase bld, # wbc’s counted, fibrinogen, ed creatinine wbld, ph arterial, metamyelocytes (abs), kappa/lambda ratio, ret abs, beta globulin, basophils %, albumin blood, ed inr wbld, anti thyroid peroxab, tc:hdl ratio, afp tumor, vitamin a (retinol), albumin/creat ratio, patient location, ck-mb ratio, total volume, total t4, creatinine blood, absolute cd3, collection time, current gest age, apa-igm, ck blood, hemoglobin blood, max amplitude, transferrin blood, cd8(cd3+)/cd45 %, cd4:cd8 ratio, monocytes %, protein urine timed, beta globulin, dose, % cd8, estradiol, nucleated rbc#, cortisol, prolactin, lymphs (abs), granular cast, protein-s-activity, pcv blood, mono (abs), brain natriuretic peptide, fk-506 (tacrolimus), bilirubin conjugated, bilirubin total blood, chloride whole blood, 25-hydroxy d2, hemoglobin a, haptoglobin blood, folate serum, ck-mb, glucose urine, nucleated cell, absolute cd4, baso (abs), creatinine urine, scl-70 autoabs eia, infusion start time, squamous epithelial, g parameter, osmolality blood, baso (abs), vitamin b-12, hours of collection, inr, lipase blood, hemoglobin a2, potassium urine spot, factor v leiden coag, phosphorus inorganic, percent saturation, valproate(depakane), ldh blood, anti-dna(sle)current, lambda light chain quant, bcrabl/bcr ratio, free phenytoin, t helper cd4 %, % cd4, mean platelet volume, creatinine urine, ketone urine, igm quantitative blood, patient ptt, glucose body fluid, vit e(gamma-tocopherol), igm beta 2 glycoprotein i, maternal b-hcg, drvvt, protein total blood, ph urine, lymph abs, alpha-2 globulin, retinyl palminate, d-dimer (patient), lymphs %, o2 saturation (calc), uric acid blood, ldl cholesterol calc, non-hdl, triiodothyronine, total, testerone free female, potassium whole blood, deamidated gliadin igg abs, urobilinogen, monocytes %, protein /24 hour, neut (abs), tibc blood, apa-iga, platelet count, albumin urine, o2 saturation(venous, prealbumin blood, basophils %, eosinophil %’
Identifying cesarean section and vaginal deliveries
The following ICD-9 and CPT codes were used to label deliveries as cesarean sections or vaginal deliveries. We excluded deliveries if they had codes for both types of deliveries.
Cesarean section: ‘669.7’, ‘669.70’, ‘669.71’, ‘763.4’, ‘74.0’, ‘74.1’, ‘74.2’, ‘74.4’, ‘74.9’, ‘74.99’, ‘59510’, ‘59514’, ‘59515’, ‘59618’, ‘59620’, ‘59622’
Vaginal Deliveries: ‘59409’, ‘59410’, ‘59610’, ‘59612’, ‘59614’
Identifying spontaneous preterm births from electronic health records
From all preterm cases, we excluded women meeting any of the following criteria: medically induced labor, delivery by cesarean section, or preterm premature rupture of membranes. The following ICD-9 and CPT codes were used to identify these exclusion criteria.
Medically induced labor: ‘73.01’, ‘73.1’, ‘73.4’, ‘73.0’, ‘73.09’
Cesarean Section delivery: ‘669.7’, ‘669.70’, ‘669.71’, ‘763.4’, ‘74.0’, ‘74.1’, ‘74.2’, ‘74.4’, ‘74.9’, ‘74.99’, ‘59510’, ‘59514’, ‘59515’, ‘59618’, ‘59620’, ‘59622’
Preterm premature rupture of membranes: ‘658.13’, ‘658.10’, ‘658.11’
Acknowledgments
We thank the members of the Capra lab for thoughtful discussion on this project.