ABSTRACT
Objective To propose a novel approach for enhancing clinical prediction models by combining structured and unstructured data with multimodal data fusion.
Methods We presented a comprehensive framework that integrated multimodal data sources, including textual clinical notes, structured electronic health records (EHRs), and relevant clinical data from National Electronic Injury Surveillance System (NEISS) datasets. We proposed a novel hybrid fusion method, which incorporated state-of-the-art pre-trained language model, to integrate unstructured clinical text with structured EHR data and other multimodal sources, thereby capturing a more comprehensive representation of patient information.
Results The experimental results demonstrated that the hybrid fusion approach significantly improved the performance of clinical prediction models compared to traditional fusion frameworks and unimodal models that rely solely on structured data or text information alone. The proposed hybrid fusion system with RoBERTa language encoder achieved the best prediction of the Top 1 injury with an accuracy of 75.00% and Top 3 injuries with an accuracy of 93.54%.
Conclusion Our study highlights the potential of integrating natural language processing (NLP) techniques with multimodal data fusion for enhancing clinical prediction models’ performances. By leveraging the rich information present in clinical text and combining it with structured EHR data, the proposed approach can improve the accuracy and robustness of predictive models. The approach has the potential to advance clinical decision support systems, enable personalized medicine, and facilitate evidence-based health care practices. Future research can further explore the application of this hybrid fusion approach in real-world clinical settings and investigate its impact on improving patient outcomes.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
↵† Jiancheng Ye and Jiarui Hai are co-first authors.
DATA AVAILABILITY STATEMENT
All data referred to in the manuscript are publicly available. The relevant code and analyses are available at: https://github.com/haidog-yaqub/Clinical_HybridFusion.