ABSTRACT
Background Text in electronic health records (EHRs), combined with big data tools, offers the opportunity for surveillance of adverse events (AEs; patient harm associated with medical care) in unstructured notes. Writers may explicitly state an apparent association between treatment and adverse outcome (“attributed”) or simply state the treatment and outcome without an association (“unattributed”). We chose the case of transfusion adverse events (TAEs) and potential TAEs (PTAEs) because real dates were obscured in the study data, and new TAE types were becoming recognized during the study data period.
Objective Develop a new method to identify attributed and unattributed potential adverse events using the unstructured text of EHRs.
Methods We used EHRs for adult critical care admissions at a major teaching hospital, 2001-2012. We formed a transfusion (T) group (21,443 admissions treated with packed red blood cells, platelets, or plasma), excluded 2,373 ambiguous admissions, and formed a comparison (C) group of 25,468 admissions. We concatenated the text notes for each admission, sorted by date, into one document, and deleted replicate sentences and lists. We identified words that occurred with statistically significant difference in T vs. C. We filtered the T documents to those words and then applied topic modeling to the filtered T documents, producing 45 topics.
For each topic, the three documents with the maximum topic scores were manually reviewed to identify events that occurred shortly after the first transfusion; documents with clear alternative explanations for heart, lung, and volume overload problems (e.g., advanced cancer, lung infection) were excluded. We also reviewed documents with the most topics, as well as 20 randomly selected T documents without alternate explanations.
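The document-building and review-selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `(admission_id, note_date, text)` tuple layout, the regex sentence splitter, the exact-match duplicate rule, and both function names are hypothetical, and the per-topic scores are assumed to come from whatever topic model is used.

```python
import re
from collections import defaultdict


def build_admission_documents(notes):
    """Concatenate each admission's notes in date order, dropping repeated sentences.

    `notes` is an iterable of (admission_id, note_date, text) tuples.
    This schema is an assumption made for illustration only.
    """
    by_admission = defaultdict(list)
    for admission_id, note_date, text in notes:
        by_admission[admission_id].append((note_date, text))

    documents = {}
    for admission_id, dated_notes in by_admission.items():
        seen = set()
        kept = []
        # Sort by date only, so notes sharing a date keep their input order.
        for _, text in sorted(dated_notes, key=lambda pair: pair[0]):
            # Naive sentence split on terminal punctuation (an assumption).
            for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
                key = " ".join(sentence.lower().split())
                if key and key not in seen:
                    seen.add(key)
                    kept.append(sentence.strip())
        documents[admission_id] = " ".join(kept)
    return documents


def top_documents_per_topic(topic_scores, k=3):
    """Return the k highest-scoring document ids for each topic.

    `topic_scores` maps a document id to a list of per-topic scores.
    """
    if not topic_scores:
        return {}
    n_topics = len(next(iter(topic_scores.values())))
    top = {}
    for topic in range(n_topics):
        ranked = sorted(
            topic_scores,
            key=lambda doc_id: topic_scores[doc_id][topic],
            reverse=True,
        )
        top[topic] = ranked[:k]
    return top
```

With `k=3`, `top_documents_per_topic` yields the three candidate documents per topic that would then go to manual review; the exclusion of documents with clear alternative explanations remains a human judgment and is not modeled here.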
Results Topics centered around medical conditions. The average number of significant topics was 6.1. Most PTAEs were not attributed to transfusion in the notes.
Admissions with a top-scoring cardiovascular topic (heart valve repair, tapped pericardial effusion, coronary artery bypass graft, heart attack, or vascular repair) were more likely than random T admissions to have at least one heart PTAE (heart rhythm changes or hypotension, proportion difference = 0.47, p = 0.022). Admissions with a top-scoring pulmonary topic (mechanical ventilation, acute respiratory distress syndrome, inhaled nitric oxide) were more likely than random T admissions (proportion difference = 0.37, p = 0.049) to have at least one lung PTAE (hypoxia, mechanical ventilation, bilateral pulmonary effusion, or pulmonary edema).
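A comparison of this kind (the proportion of admissions with at least one PTAE in a topic group versus the random T group) can be sketched as below. The abstract does not name the statistical test that was used, so the two-sided Fisher exact test here is an assumption chosen because the reviewed groups are small, and the function names are hypothetical.

```python
from math import comb


def proportion_difference(events_a, n_a, events_b, n_b):
    """Difference in event proportions between group A and group B."""
    return events_a / n_a - events_b / n_b


def fisher_exact_two_sided(events_a, n_a, events_b, n_b):
    """Two-sided Fisher exact p-value for a 2x2 table with fixed margins.

    Sums the hypergeometric probabilities of every table that is no more
    likely than the observed one. The choice of this test is an assumption.
    """
    total_events = events_a + events_b
    denominator = comb(n_a + n_b, total_events)

    def table_probability(x):
        # Probability that group A contains exactly x of the events.
        return comb(n_a, x) * comb(n_b, total_events - x) / denominator

    observed = table_probability(events_a)
    lowest = max(0, total_events - n_b)
    highest = min(n_a, total_events)
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        # Small tolerance guards against floating-point ties.
        if table_probability(x) <= observed * (1 + 1e-9)
    )
```

Calling both functions with the reviewed counts for a topic group and for the random T sample gives a proportion difference and a p-value of the kind reported above; the counts themselves are not given in this abstract, so none are shown here.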
Conclusions The “Shakespeare Method” could be a useful supplement to AE reporting and surveillance of structured EHR data. Future improvements should include automation of the manual review process.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The Food and Drug Administration supported work by Drs. Bright, Bright-Ponte, and Palmer, and contracted with Booz Allen Hamilton to perform work done by Drs. Rankin and Blok, and Ms. Dowdy.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The data are managed by the Massachusetts Institute of Technology. Our use of the data was approved by their IRB. The research was designated not human subjects research by the Food and Drug Administration Institutional Review Board.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
We used the Medical Information Mart for Intensive Care III (MIMIC-III) data available from https://mimic.physionet.org/about/mimic/. The administrators of this website control all data releases.
ABBREVIATIONS USED MORE THAN ONCE
- AE: adverse events
- ARDS: acute respiratory distress syndrome
- C: comparison group of admissions
- CABG: coronary artery bypass graft
- DIC: disseminated intravascular coagulation
- EHR: electronic healthcare record
- FDA: Food and Drug Administration
- GI: gastrointestinal
- MIMIC-III: Medical Information Mart for Intensive Care III
- MVA: motor vehicle accident
- NLP: natural language processing
- PTAE: potential transfusion adverse event
- T: group of admissions that received transfusion of blood components
- TACO: transfusion-associated circulatory overload
- TAE: transfusion adverse event
- TPA: tissue plasminogen activator
- TRALI: transfusion-related acute lung injury
- TTP: thrombotic thrombocytopenic purpura