Abstract
Recent evidence indicates that Type 2 Diabetes Mellitus (T2DM) is a complex and highly heterogeneous disease involving various pathophysiological and genetic pathways, which presents clinicians with challenges in disease management. While deep learning models have made significant progress in helping practitioners manage T2DM treatments, several important limitations persist. In this paper, we propose DARE, a model based on the transformer encoder and designed for analyzing longitudinal heterogeneous diabetes data. The model can be easily fine-tuned for various clinical prediction tasks, enabling a computational approach to assist clinicians in the management of the disease. We trained DARE using data from over 200,000 diabetic subjects from the primary healthcare SIDIAP database, which includes diagnosis and drug codes, along with various clinical and analytical measurements. After an unsupervised pre-training phase, we fine-tuned the model to predict three specific clinical outcomes: i) occurrence of comorbidity, ii) achievement of target glycaemic control (defined as glycated hemoglobin, HbA1c, < 7%) and iii) changes in glucose-lowering treatment. In cross-validation, the embedding vectors generated by DARE outperformed those from baseline models (comorbidity prediction task AUC = 0.88, treatment prediction task AUC = 0.91, HbA1c target prediction task AUC = 0.82). Our findings suggest that attention-based encoders outperform other deep learning and classical baseline models when used to predict clinically relevant outcomes from T2DM longitudinal data.
Competing Interest Statement
JF-N and DM report a relationship with: AstraZeneca Pharmaceuticals LP that includes: funding grants and speaking and lecture fees; Ascensia Diabetes Care Spain SL that includes: speaking and lecture fees; Boehringer Ingelheim GmbH that includes: funding grants and speaking and lecture fees; GSK that includes: funding grants and speaking and lecture fees; Lilly Spain that includes: funding grants and speaking and lecture fees; MSD that includes: funding grants and speaking and lecture fees; Novartis Pharmaceuticals Corporation that includes: funding grants and speaking and lecture fees; Novo Nordisk Inc that includes: funding grants and speaking and lecture fees; Sanofi that includes: funding grants and speaking and lecture fees. EM, BV, JE, AG, ER, EA, IP, and AP-L declare no conflict of interest.
Funding Statement
This work was supported by the Grant PID2021-122952OB-I00 funded by AEI 10.13039/501100011033 and by ERDF "A way of making Europe"; the Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), initiatives of Instituto de Investigación Carlos III (ISCIII); ISCIII (grant AC22/00035); and the CERCA Programme / Generalitat de Catalunya. B2SLab is certified as 2021 SGR 01052. This study was possible thanks to the commitment of physicians and nurses working in the Catalan Health Institute to provide optimal care to patients with diabetes. CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM) is an initiative from Instituto de Salud Carlos III, Madrid, Spain. This analysis is part of the DiaCare Project of Novo Nordisk and the Fundació TicSalut (Departament de Salut, Generalitat de Catalunya), in collaboration with Evidenze Health España, for the benefit of people with type 2 diabetes.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The Ethics Committee of Institut Universitari d'Investigació en Atenció Primària (IDIAP Jordi Gol) gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
* Enrico Manzini: enrico.manzini{at}upc.edu; Didac Mauricio: didacmauricio{at}gmail.com
Data Availability
The data analysed in this study are subject to the following licenses/restrictions: restrictions apply to the availability of some or all data generated or analysed during this study because they were used under license. On request, the corresponding authors will detail the restrictions and any conditions under which access to some data may be provided.