RT Journal Article
SR Electronic
T1 AI-driven Integration of Multimodal Imaging Pixel Data and Genome-wide Genotype Data Enhances Precision Health for Type 2 Diabetes: Insights from a Large-scale Biobank Study
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.07.25.24310650
DO 10.1101/2024.07.25.24310650
A1 Huang, Yi-Jia
A1 Chen, Chun-houh
A1 Yang, Hsin-Chou
YR 2024
UL http://medrxiv.org/content/early/2024/07/26/2024.07.25.24310650.abstract
AB The rising prevalence of Type 2 Diabetes (T2D) presents a critical global health challenge. Effective risk assessment and prevention strategies not only improve patient quality of life but also alleviate national healthcare expenditures. The integration of medical imaging and genetic data from extensive biobanks, driven by artificial intelligence (AI), is revolutionizing precision and smart health initiatives. In this study, we applied these principles to T2D by analyzing medical images (abdominal ultrasonography and bone density scans) alongside whole-genome single nucleotide variations in 17,785 Han Chinese participants from the Taiwan Biobank. Rigorous data cleaning and preprocessing procedures were applied. Imaging analysis utilized densely connected convolutional neural networks, augmented by graph neural networks to account for intra-individual image dependencies, while genetic analysis employed Bayesian statistical learning to derive polygenic risk scores (PRS). These modalities were integrated through eXtreme Gradient Boosting (XGBoost), yielding several key findings. First, pixel-based image analysis outperformed feature-centric image analysis in accuracy, automation, and cost efficiency. Second, multi-modality analysis significantly enhanced predictive accuracy compared to single-modality approaches.
Third, this comprehensive approach, combining medical imaging, genetic, and demographic data, represents a promising frontier for fusion modeling, integrating AI and statistical learning techniques in disease risk assessment. Our model achieved an Area under the Receiver Operating Characteristic Curve (AUC) of 0.944, with an accuracy of 0.875, sensitivity of 0.882, specificity of 0.875, and a Youden index of 0.754. Additionally, the analysis revealed significant positive correlations between the multi-image risk score (MRS) and T2D, as well as between the PRS and T2D, identifying high-risk subgroups within the cohort. This study pioneers the integration of multimodal imaging pixels and genome-wide genetic variation data for precise T2D risk assessment, advancing the understanding of precision and smart health.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study was funded by research grants from the Academia Sinica (AS-PH-109-01 and AS-SH-112-01).
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Data application and use were approved by the Taiwan Biobank and the Institutional Review Board (AS-IRB01-17049 and AS-IRB01-21009).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
The data analyzed in this study were obtained from the Taiwan Biobank with proper approval. The Taiwan Biobank retains ownership rights, so the data have not been deposited in a public repository. Researchers interested in accessing the data must apply through the Taiwan Biobank's formal process. Detailed instructions for data access requests can be found on the Taiwan Biobank's official website (https://www.twbiobank.org.tw/index.php). Source data are provided in the Supplementary Information and Source Data files. Meta-GWAS summary statistics of T2D in multiple populations from the DIAGRAM Consortium are available at https://diagram-consortium.org/downloads.html. The linkage disequilibrium reference from various populations of the 1000 Genomes Project can be downloaded from https://github.com/getian107/PRScsx.