Abstract
Diabetic retinopathy (DR) is a sight-threatening condition caused by diabetes. Screening programmes for DR include eye examinations, where the patient’s fundi are photographed, and the findings, including DR severity, are recorded in the medical report. However, statistical analyses based on DR severity require structured labels, which call for a laborious manual annotation process when the report format is unstructured. In this work, we propose a large language model, DR-GPT, for classifying DR severity from unstructured medical reports. On a clinical set of medical reports, DR-GPT reaches a quadratic weighted Cohen’s kappa of 0.975 on a truncated Early Treatment Diabetic Retinopathy Study scale. When DR-GPT annotations for unlabeled data are paired with the corresponding fundus images, the additional data yield a statistically significant improvement in image classifier performance. Our analysis shows that large language models can be applied to unstructured medical report databases to classify diabetic retinopathy, with a variety of applications.
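For readers unfamiliar with the agreement metric reported above, the following is a minimal sketch of how a quadratic weighted Cohen's kappa can be computed for ordinal severity grades. It is not the evaluation code used in this work. The function name, the integer encoding of the grades, and the number of classes are assumptions made for illustration only.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratic weighted Cohen's kappa for two lists of integer grades.

    Grades are assumed to be integers in [0, num_classes); this encoding
    is an illustrative assumption, not the one used in the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("at least two classes are required")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts, rater A
    hist_b = Counter(rater_b)                  # marginal counts, rater B

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreement penalty grows with the squared grade distance.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * hist_a[i] * hist_b[j] / (n * n)

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one identical grade")
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and the quadratic weights penalise a disagreement of two grades four times as heavily as a disagreement of one grade.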
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Yes
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Not Applicable
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study is based on a retrospective and registry-based dataset, and as such does not involve experiments on humans and/or the use of human tissue samples, and no patients were imaged for this study. Studies based on retrospective and registry-based datasets do not need ethical permission or informed consent from subjects according to the law of Finland (Medical Research Act (488/1999) and Act on Secondary Use of Health and Social Data (552/2019)) and according to the European General Data Protection Regulation (GDPR) rules 2016/679. The research permit was granted by the Helsinki University Hospital Chief Medical Officer (decision number 67/2020), Helsinki, Finland, July 1, 2020.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Not Applicable
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Not Applicable
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Not Applicable
Data Availability
Data cannot be shared publicly because the data protection law of Finland, the General Data Protection Regulation (GDPR) of the European Union, and our research permission granted by Helsinki University Hospital do not allow sharing of individual patients’ data. Data are available from Helsinki University Hospital for researchers who meet the criteria for access to confidential data.