Systematic benchmarking demonstrates large language models have not reached the diagnostic accuracy of traditional rare-disease decision support tools

Justin T Reese; Leonardo Chimirri; Yasemin Bridges; Daniel Danis; J Harry Caufield; Kyran Wissink; Julie A McMurry; Adam SL Graefe; Elena Casiraghi; Giorgio Valentini; Julius OB Jacobsen; Melissa Haendel; Damian Smedley; Christopher J Mungall; Peter N Robinson

doi:10.1101/2024.07.22.24310816

Justin T Reese

1Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

2Monarch Initiative


Leonardo Chimirri

2Monarch Initiative

3Berlin Institute of Health at Charite Universitaetsmedizin Berlin, Berlin, Germany


Yasemin Bridges

2Monarch Initiative

4William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK


Daniel Danis

2Monarch Initiative

3Berlin Institute of Health at Charite Universitaetsmedizin Berlin, Berlin, Germany


J Harry Caufield

2Monarch Initiative

5University of North Carolina at Chapel Hill, Chapel Hill, NC, USA


Kyran Wissink

3Berlin Institute of Health at Charite Universitaetsmedizin Berlin, Berlin, Germany


Julie A McMurry

2Monarch Initiative

5University of North Carolina at Chapel Hill, Chapel Hill, NC, USA


Adam SL Graefe

3Berlin Institute of Health at Charite Universitaetsmedizin Berlin, Berlin, Germany


Elena Casiraghi

6Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

7AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano, Italy

8ELLIS-European Laboratory for Learning and Intelligent Systems


Giorgio Valentini

7AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano, Italy

8ELLIS-European Laboratory for Learning and Intelligent Systems


Julius OB Jacobsen

2Monarch Initiative

4William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK


Melissa Haendel

2Monarch Initiative

5University of North Carolina at Chapel Hill, Chapel Hill, NC, USA


Damian Smedley

2Monarch Initiative

4William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK


Christopher J Mungall

2Monarch Initiative

6Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA


Peter N Robinson

2Monarch Initiative

3Berlin Institute of Health at Charite Universitaetsmedizin Berlin, Berlin, Germany

8ELLIS-European Laboratory for Learning and Intelligent Systems

9The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA

For correspondence: peter.robinson@bih-charite.de

Data Availability

All data produced are available online on Zenodo at: https://zenodo.org/records/14008477.

Posted November 07, 2024.

medRxiv 2024.07.22.24310816; doi: https://doi.org/10.1101/2024.07.22.24310816

Subject Area

Genetic and Genomic Medicine