Abstract
Vision-language models (VLMs) can analyze multimodal medical data. However, a significant weakness of VLMs, as we have recently described, is their susceptibility to prompt injection attacks. In such an attack, the model receives conflicting instructions, which can lead to harmful outputs. In this study, we hypothesized that handwritten labels and watermarks on pathology images could act as inadvertent prompt injections, influencing decision-making in histopathology. We conducted a quantitative study with a total of N = 3888 observations on the state-of-the-art VLMs Claude 3 Opus, Claude 3.5 Sonnet and GPT-4o. We designed various scenarios inspired by real-world practice, in which we show that VLMs rely entirely on labels and watermarks, including false ones, when these are presented next to the tissue. All models reached almost perfect accuracies (90-100%) for labels that leaked the ground truth and abysmal accuracies (0-10%) for misleading watermarks, despite baseline accuracies of 30-65% across various multiclass problems. Overall, all VLMs accepted human-provided labels as infallible, even when those inputs contained obvious errors. Furthermore, these effects could not be mitigated by prompt engineering. It is therefore imperative to consider the presence of labels or other influencing features during future evaluation of VLMs in medicine and other fields.
Competing Interest Statement
DT received honoraria for lectures from Bayer, Philips and AstraZeneca and holds shares in StratifAI GmbH, Germany and Synagen GmbH, Germany. SF has received honoraria from MSD and BMS. JNK declares consulting services for Owkin, France, DoMore Diagnostics, Norway, Panakeia, UK, Scailyte, Switzerland, Cancilico, Germany, Mindpeak, Germany, MultiplexDx, Slovakia, and Histofy, UK; furthermore, he holds shares in StratifAI GmbH, Germany and in Synagen GmbH, Germany, has received a research grant from GSK, and has received honoraria from AstraZeneca, Bayer, Eisai, Janssen, MSD, BMS, Roche, Pfizer and Fresenius. ICW has received honoraria from AstraZeneca. No other competing financial interests are declared by any of the authors.
Clinical Protocols
https://github.com/KatherLab/patho_prompt_injection
Funding Statement
JC is supported by the Mildred-Scheel-Postdoktorandenprogramm of the German Cancer Aid (grant 70115730). CVS is supported by a grant from the Interdisciplinary Centre for Clinical Research within the Faculty of Medicine at the RWTH Aachen University (PTD 1-13/IA 532313), the Junior Principal Investigator Fellowship Program of RWTH Aachen Excellence Strategy, the NRW Rueckkehr Programme of the Ministry of Culture and Science of the German State of North Rhine-Westphalia, and by the CRC 1382 (ID 403224013) funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation). SF is supported by the German Federal Ministry of Education and Research (SWAG, 01KD2215C), the German Cancer Aid (DECADE, 70115166 and TargHet, 70115995), and the German Research Foundation (504101714). MJ is supported by the German Cancer Aid (TargHet, 70115995). FK is funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) via the Clinician Scientist Program in Evolutionary Medicine (413490537). DT is funded by the German Federal Ministry of Education and Research (TRANSFORM LIVER, 031L0312A), the European Union's Horizon Europe and Innovation Programme (ODELIA, 101057091), and the German Federal Ministry of Health (SWAG, 01KD2215B). JNK is supported by the German Cancer Aid (DECADE, 70115166), the German Federal Ministry of Education and Research (PEARL, 01KD2104C; CAMINO, 01EO2101; SWAG, 01KD2215A; TRANSFORM LIVER, 031L0312A; TANGERINE, 01KT2302 through ERA-NET Transcan; Come2Data, 16DKZ2044A; DEEP-HCC, 031L0315A), the German Academic Exchange Service (SECAI, 57616814), the German Federal Joint Committee (TransplantKI, 01VSF21048), the European Union's Horizon Europe and Innovation Programme (ODELIA, 101057091; GENIAL, 101096312), the European Research Council (ERC, NADIR, 101114631), the National Institutes of Health (EPICO, R01 CA263318), and the National Institute for Health and Care Research (NIHR, NIHR203331) Leeds Biomedical Research Centre.
The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care. This work was funded by the European Union. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics committee/IRB of the Medical Faculty of the Technical University Dresden gave ethical approval for this work (BO-EK-412102024).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data availability
The original data (images, prompts, model outputs, ratings, summary statistics) are available in the supplementary data and supplementary information. The following publicly accessible data were used for this study: the COAD-READ, THCA, OV, and PRAD cohorts of TCGA, which can be found at https://portal.gdc.cancer.gov/. Slides for the lymph node metastasis scenario can be found at https://www.cancerimagingarchive.net/collection/sln-breast/.