Abstract
This study evaluates the reliability of the largest public-facing large language models (LLMs) in providing accurate breast cancer radiotherapy recommendations. We assessed ChatGPT 3.5, ChatGPT 4, ChatGPT 4o, Claude 3.5 Sonnet, and ChatGPT o1 in three common clinical scenarios. The clinical cases were as follows: (1) post-lumpectomy radiotherapy in a 40-year-old woman, (2) postmastectomy radiation in a 40-year-old woman with 4+ lymph nodes, and (3) postmastectomy radiation in an 80-year-old woman with an early-stage tumor and negative axillary dissection. Each case was designed to be unambiguous with respect to the approach supported by Level I evidence and clinical guidelines. The evidence-supported radiation treatments were as follows: (1) whole breast irradiation with boost, (2) regional nodal irradiation, and (3) omission of post-operative radiotherapy. Each prompt was presented to each LLM multiple times to assess reproducibility. Results indicate that the free, public-facing models often fail to provide accurate treatment recommendations, particularly when omission of radiotherapy is the correct course of action. Many of the recommendations suggested by the LLMs would increase morbidity and mortality in patients if followed. Models accessible only through paid subscription (ChatGPT o1 and o1-mini) demonstrated greatly improved accuracy. Some prompt-engineering techniques (rewording and chain-of-reasoning) enhanced the accuracy of the LLMs, while true/false questioning significantly worsened results. While public-facing LLMs show potential for medical applications, they are not currently reliable enough for clinical decision-making.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript.