Structured Abstract
Importance Large language models (LLMs) have proven useful for extracting data from publicly available sources, but their uses in clinical settings and with clinical data are unknown.
Objective To determine the accuracy of data extraction using “Versa Chat,” a chat implementation of the general-purpose OpenAI gpt-35-turbo LLM model, versus manual chart review for hepatocellular carcinoma (HCC) imaging reports.
Design We engineered a prompt for the data extraction task of six distinct data elements and input 182 abdominal imaging reports that were also manually tagged. We evaluated performance by calculating accuracy, precision, recall, and F1 scores.
Setting/Participants Cross-sectional abdominal imaging reports of patients diagnosed with hepatocellular carcinoma enrolled in the Functional Assessment in Liver Transplantation (FrAILT) study.
Competing Interest Statement
The authors of this manuscript have the following potential conflicts of interest to disclose: -Dr. Jin Ge receives research support from Merck and Co; and consults for Astellas Pharmaceuticals/Iota Biosciences. -Dr. Jennifer C. Lai receives research support from Lipocene and Vir Biotechnologies; receives an education grant from Nestle Nutrition Sciences; serves on an advisory board for Novo Nordisk; and consults for Genfit, Third Rock Ventures, and Boehringer Ingelheim.
Funding Statement
The authors of this study were supported in part by the KL2TR001870 (National Center for Advancing Translational Sciences, Ge), P30DK026743 (UCSF Liver Center Grant, Ge, Li and Lai), ACG Junior Faculty Development Award (American College of Gastroenterology Institute, Li), and R01AG059183/K24AG080021 (National Institute on Aging, Lai). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or any other funding agencies. The funding agencies played no role in the analysis of the data or the preparation of this manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the UCSF Institutional Review Board in Study #11-07513.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Financial/Grant Support: The authors of this study were supported in part by the KL2TR001870 (National Center for Advancing Translational Sciences, Ge), P30DK026743 (UCSF Liver Center Grant, Ge, Li, and Lai), ACG Junior Faculty Development Award (American College of Gastroenterology Institute, Li), and R01AG059183/K24AG080021 (National Institute on Aging, Lai). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or any other funding agencies. The funding agencies played no role in the analysis of the data or the preparation of this manuscript.
Disclosures: The authors of this manuscript have the following potential conflicts of interest to disclose:
- Dr. Jin Ge receives research support from Merck and Co; and consults for Astellas Pharmaceuticals/Iota Biosciences.
- Dr. Jennifer C. Lai receives research support from Lipocene and Vir Biotechnologies; receives an education grant from Nestle Nutrition Sciences; serves on an advisory board for Novo Nordisk; and consults for Genfit, Third Rock Ventures, and Boehringer Ingelheim.
Writing Assistance: None.
Minor edits to the abstract.
Data Availability
Aggregate data produced in the present study may be available upon reasonable request to and with approval by the authors.
Abbreviations
- CI
- confidence interval
- FrAILT
- Functional Assessment in Liver Transplantation
- GAI
- generative artificial intelligence
- HCC
- hepatocellular carcinoma
- LI-RADS
- Liver Imaging Reporting and Data System
- LLM
- large language model
- NLP
- natural language processing
- PHI
- protected health information
- UCSF
- University of California, San Francisco