RT Journal Article
SR Electronic
T1 Improved Type 2 Diabetes Risk Stratification in the Qatar Biobank Cohort by Ensemble Learning Classifier Incorporating Multi-Trait, Population-Specific, Polygenic Risk Scores
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.06.23.23291830
DO 10.1101/2023.06.23.23291830
A1 Ahmed, Ikhlak
A1 Ziab, Mubarak
A1 Taheri, Shahrad
A1 Chagoury, Odette
A1 Hussain, Sura A.
A1 Lakshmi, Jyothi
A1 Bhat, Ajaz A.
A1 Fakhro, Khalid A.
A1 Al-Shabeeb Akil, Ammira S.
YR 2023
UL http://medrxiv.org/content/early/2023/09/05/2023.06.23.23291830.abstract
AB Background: Type 2 Diabetes (T2D) is a pervasive chronic disease influenced by a complex interplay of environmental and genetic factors. To enhance T2D risk prediction, leveraging genetic information is essential, with polygenic risk scores (PRS) offering a promising tool for assessing individual genetic risk. Our study focuses on the comparison between multi-trait and single-trait PRS models and demonstrates how the incorporation of multi-trait PRS into risk prediction models can significantly augment T2D risk assessment accuracy and effectiveness. Methods: We conducted genome-wide association studies (GWAS) on 12 distinct T2D-related traits within a cohort of 14,278 individuals, all sequenced under the Qatar Genome Programme (QGP). This in-depth genetic analysis yielded several novel genetic variants associated with T2D, which served as the foundation for constructing multiple weighted PRS models. To assess the cumulative risk from these predictors, we applied machine learning (ML) techniques, which allowed for a thorough risk assessment. Results: Our research identified genetic variations tied to T2D risk and facilitated the construction of ML models integrating PRS predictors for an exhaustive risk evaluation.
The top-performing ML model demonstrated robust performance, with an accuracy of 0.8549, AUC of 0.92, AUC-PR of 0.8522, and an F1 score of 0.757, reflecting its strong capacity to differentiate cases from controls. We are currently working on acquiring independent T2D cohorts to validate the efficacy of our final model.
Conclusion: Our research underscores the potential of PRS models in identifying individuals within the population who are at elevated risk of developing T2D and its associated complications. The use of multi-trait PRS and ML models for risk prediction could inform early interventions, potentially identifying T2D patients who stand to benefit most based on their individual genetic risk profile. This combined approach signifies a stride forward in the field of precision medicine, potentially enhancing T2D risk prediction, prevention, and management.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This research was funded by the Path Towards Precision Medicine (PPM) of Qatar National Research Fund (QNRF), a subsidiary of Qatar Research Development and Innovation Council (QRDI), and Sidra Medicine Research Division, grant numbers PPM 03-0311-190017 and SDR1000043, respectively. Author A.S.A. is the project lead principal investigator, budget holder, and the recipient of the research funding support from both organizations.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: QBB Cohort Study QF-QBB-RES-ACC-0075. QBB IRB Approval number is: Full Board-2017-QF-QBB-RES-ACC-0075-0023. Deidentified research conducted under the project E-2020-QF-QBB-RES-ACC-0150-0143. QBB IRB Approval number, E-2017-QF-QBB-RES-ACC-0026-000.
Sidra IRB MOPH Assurance: IRB-A-Sidra-2019-0020. Sidra IRB MOPH Registration: IRB-Sidra-2020-009. Sidra IRB DHHS Assurance: FWA00022378. Sidra IRB DHHS Registration: IRB00009930. Sidra Medicine, IRB protocol # 1660756, May 30, 2024. Approval category: Expedited.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
Data described in the manuscript, including all relevant raw data, will be freely available to any scientist wishing to use them for non-commercial purposes without breaching participant confidentiality.
Requests should be sent directly to the corresponding author.
T2D: Type 2 Diabetes
PRS: polygenic risk scores
GWAS: genome-wide association studies
QGP: Qatar Genome Programme
ML: machine learning
AUC: Area under the ROC Curve
SNPs: single nucleotide polymorphisms
MENA: Middle East and North Africa
SVM: Support Vector Machine
QBB: Qatar Biobank cohort
HMC: Hamad Medical Corporation
WGS: Whole genome sequencing
HDL: high-density lipoprotein cholesterol
LDL: low-density lipoprotein cholesterol
TSH: Thyroid stimulating hormone
TGL: Triglycerides
INT: Inverse Normal Transformation
HWE: Hardy-Weinberg equilibrium
LMM: linear mixed-effect model
FTO: Fat mass and obesity associated
MC4R: Melanocortin 4 Receptor
TMEM18: Transmembrane Protein 18
WHR: Waist-Hip Ratio