Abstract
Objective To develop an algorithm that infers patient delivery dates (PDDs) and delivery-specific details from Electronic Health Records (EHRs) with high accuracy.
Materials and Methods We obtained EHR data from 1,060,100 female patients treated at Penn Medicine hospitals or outpatient clinics between 2010 and 2017. We developed an algorithm called MADDIE (Method to Acquire Delivery Date Information from Electronic Health Records) that infers a PDD for each distinct delivery based on the EHR encounter dates assigned a delivery code, the frequency of code usage, and the time differential between code assignments. We validated MADDIE’s PDDs against a birth log independently maintained by the Department of Obstetrics and Gynecology.
Results MADDIE identified 50,560 patients having 63,334 distinct deliveries. MADDIE was 98.6% accurate (F1-score 92.1%) when compared to the birth log. On average, the PDD was 0.68 days earlier (± 1.43 days) than the true delivery date for patients with only one delivery and 0.52 days earlier (± 1.11 days) for patients with more than one delivery episode.
Discussion MADDIE is the first algorithm to successfully infer PDD information using only structured delivery codes and to identify multiple deliveries per patient. MADDIE is also the first to validate the accuracy of the PDD against an external gold standard of known delivery dates rather than manual chart review of a sample.
Conclusion MADDIE infers delivery dates and delivery-specific details from the EHR with high accuracy and relies only on structured EHR elements while harnessing temporal information and the frequency of code usage to identify accurate PDDs.
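The general idea of combining code frequency with the time between code assignments can be illustrated with a minimal sketch. This is not MADDIE's published rule set: the function name `infer_delivery_dates`, the 180-day episode gap, and the tie-breaking rule (earliest date wins) are all assumptions chosen for illustration. The sketch groups a patient's delivery-coded encounter dates into episodes wherever consecutive dates are far apart, then picks the date carrying the most delivery codes within each episode.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical gap (in days) separating two distinct delivery episodes.
# This value is an assumption for illustration, not taken from the paper.
EPISODE_GAP_DAYS = 180


def infer_delivery_dates(coded_encounters, gap_days=EPISODE_GAP_DAYS):
    """Return {patient_id: [inferred delivery date per episode, ...]}.

    coded_encounters: iterable of (patient_id, encounter_date) pairs,
    one pair per delivery code assigned on that date.
    """
    # Count how many delivery codes each patient received on each date.
    counts_by_patient = defaultdict(Counter)
    for patient_id, encounter_date in coded_encounters:
        counts_by_patient[patient_id][encounter_date] += 1

    inferred = {}
    for patient_id, counts in counts_by_patient.items():
        # Split the sorted dates into episodes at large time gaps.
        episodes, current = [], []
        for d in sorted(counts):
            if current and (d - current[-1]).days > gap_days:
                episodes.append(current)
                current = []
            current.append(d)
        episodes.append(current)

        # Within an episode, choose the most frequently coded date;
        # ties go to the earliest date (an assumed tie-break).
        inferred[patient_id] = [
            max(ep, key=lambda d: (counts[d], -d.toordinal()))
            for ep in episodes
        ]
    return inferred


# Toy, fabricated-for-illustration input: patient "p1" has two episodes.
rows = [
    ("p1", date(2012, 3, 1)),
    ("p1", date(2012, 3, 2)), ("p1", date(2012, 3, 2)), ("p1", date(2012, 3, 2)),
    ("p1", date(2012, 3, 4)),
    ("p1", date(2015, 6, 10)), ("p1", date(2015, 6, 10)),
    ("p1", date(2015, 6, 11)),
    ("p2", date(2011, 1, 5)),
]
result = infer_delivery_dates(rows)
```

For the toy input, `result["p1"]` holds one date per episode (2012-03-02 and 2015-06-10), which shows how a single patient can yield multiple inferred deliveries. A real implementation would need the study's actual code sets and thresholds.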
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research is funded by the University of Pennsylvania.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the Institutional Review Board of the University of Pennsylvania.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All diagnosis and procedure code sets used in this analysis are included as a supplemental file with the publication of the paper. The patient-level data are not available for sharing due to privacy concerns, but relevant summary statistics are provided in the manuscript.