PT  - JOURNAL ARTICLE
AU  - Subramaniam, Suganya
AU  - Rizvi, Sara
AU  - Ramesh, Ramya
AU  - Sehgal, Vibhor
AU  - Gurusamy, Brinda
AU  - Arif, Hikamtullah
AU  - Tran, Jeffrey
AU  - Thamman, Ritu
AU  - Anyanwu, Emeka
AU  - Mastouri, Ronald
AU  - Mackensen, G. Burkhard
AU  - Arnaout, Rima
TI  - Mapping echocardiogram reports to a structured ontology: a task for statistical machine learning or large language models?
AID  - 10.1101/2024.02.20.24302419
DP  - 2024 Jan 01
TA  - medRxiv
PG  - 2024.02.20.24302419
4099  - http://medrxiv.org/content/early/2024/02/21/2024.02.20.24302419.short
4100  - http://medrxiv.org/content/early/2024/02/21/2024.02.20.24302419.full
AB  - Background: Big data has the potential to revolutionize echocardiography by enabling novel research and rigorous, scalable quality improvement. Text reports are a critical part of such analyses, and ontology is a key strategy for promoting interoperability of heterogeneous data through consistent tagging. Currently, echocardiogram reports include both structured and free text and vary across institutions, hampering attempts to mine text for useful insights. Natural language processing (NLP) can help and includes both non-deep learning and deep-learning (e.g., large language model, or LLM) based techniques. Challenges to date in using echo text with LLMs include small corpus size, domain-specific language, and high need for accuracy and clinical meaning in model results. Methods: We tested whether we could map echocardiography text to a structured, three-level hierarchical ontology using NLP. We used two methods: statistical machine learning (EchoMap) and one-shot inference using the Generative Pre-trained Transformer (GPT) large language model. We tested against eight datasets from 24 different institutions and compared both methods against clinician-scored ground truth. Results: Although all institutions adhered to clinical guidelines, they differed notably in what information their data dictionaries for structured reporting included. EchoMap performed best in mapping test set sentences to the ontology, with validation accuracy of 98% for the first level of the ontology, 93% for the first and second levels, and 79% for the first, second, and third levels. EchoMap retained good performance across external test datasets and displayed the ability to extrapolate to examples not initially included in training.
EchoMap’s accuracy was comparable to that of one-shot GPT at the first level of the ontology, and EchoMap outperformed GPT at the second and third levels. Conclusions: We show that statistical machine learning can achieve good performance on text mapping tasks and may be especially useful for small, specialized text datasets. Furthermore, this work highlights the utility of a high-resolution, standardized cardiac ontology to harmonize reports across institutions.
Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by the National Institutes of Health and the U.S. Department of Defense, both to R.A. Author Declarations: Research was conducted with approval from the University of California, San Francisco Institutional Review Board. The UCSF data dictionary may be made available upon reasonable request for non-commercial use and with approval. Abbreviations: NLP, natural language processing; ML, machine learning; LLM, large language model; GPT, generative pre-trained transformer; NER, named entity recognition.