ABSTRACT
Background Type 2 Diabetes (T2D) is a pervasive chronic disease shaped by a complex interplay of environmental and genetic factors. Genetic information is essential for improving T2D risk prediction, and polygenic risk scores (PRS) offer a promising tool for assessing individual genetic risk. Our study compares multi-trait and single-trait PRS models and demonstrates how incorporating multi-trait PRS into risk prediction models can significantly improve the accuracy and effectiveness of T2D risk assessment.
Methods We conducted genome-wide association studies (GWAS) on 12 distinct T2D-related traits within a cohort of 14,278 individuals, all sequenced under the Qatar Genome Programme (QGP). This in-depth genetic analysis yielded several novel genetic variants associated with T2D, which served as the foundation for constructing multiple weighted PRS models. To assess the cumulative risk from these predictors, we applied machine learning (ML) techniques, which allowed for a thorough risk assessment.
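As an illustration of what a weighted PRS is, the minimal sketch below computes a score as the sum of risk-allele dosages multiplied by per-variant effect sizes. This is the standard textbook form, not necessarily the exact pipeline used in this study; the function name `weighted_prs`, the effect sizes, and the genotypes are all hypothetical values chosen for the example.

```python
def weighted_prs(dosages, effect_sizes):
    """Weighted PRS for one individual.

    Standard form: sum over variants of (effect size x risk-allele dosage).
    Assumption: dosages are risk-allele counts (0, 1, or 2) and effect sizes
    are per-allele weights such as GWAS betas.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Hypothetical per-allele effect sizes for three variants.
effect_sizes = [0.12, -0.05, 0.30]

# Hypothetical risk-allele counts for two individuals.
genotypes = {
    "person_a": [2, 0, 1],
    "person_b": [0, 1, 0],
}

# One score per individual; in a multi-trait setting, one such score per
# trait could then be supplied to a downstream classifier as a predictor.
scores = {pid: weighted_prs(d, effect_sizes) for pid, d in genotypes.items()}
```

In a multi-trait design, a separate set of weights would typically be derived from each trait's GWAS, giving each individual one score per trait.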
Results Our research identified genetic variations tied to T2D risk and facilitated the construction of ML models that integrate PRS predictors for a comprehensive risk evaluation. The top-performing ML model achieved an accuracy of 0.8549, an AUC of 0.92, an AUC-PR of 0.8522, and an F1 score of 0.757, reflecting a strong capacity to differentiate cases from controls. We are currently working on acquiring independent T2D cohorts to validate the efficacy of our final model.
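For readers less familiar with these metrics, the sketch below shows how accuracy and the F1 score follow from the counts of a binary confusion matrix. The counts are hypothetical and are not the study's data; AUC and AUC-PR are omitted because they require ranked prediction scores rather than a single set of counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 30 true cases found, 10 false alarms,
# 50 true controls, 10 missed cases.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

Because F1 ignores true negatives, it can sit well below accuracy when cases are the minority class, which is one reason both are usually reported together.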
Conclusion Our research underscores the potential of PRS models in identifying individuals within the population who are at elevated risk of developing T2D and its associated complications. The use of multi-trait PRS and ML models for risk prediction could inform early interventions, potentially identifying T2D patients who stand to benefit most based on their individual genetic risk profile. This combined approach signifies a stride forward in the field of precision medicine, potentially enhancing T2D risk prediction, prevention, and management.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research was funded by the Path Towards Precision Medicine (PPM) programme of the Qatar National Research Fund (QNRF), a subsidiary of the Qatar Research Development and Innovation Council (QRDI), and by the Sidra Medicine Research Division, under grant numbers PPM 03-0311-190017 and SDR1000043, respectively. Author A.S.A. is the project lead principal investigator, budget holder, and recipient of the research funding support from both organizations.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
QBB Cohort Study QF-QBB-RES-ACC-0075 QBB IRB Approval number is: Full Board-2017-QF-QBB-RES-ACC-0075-0023. Deidentified research conducted under the project E-2020-QF-QBB-RES-ACC-0150-0143. QBB IRB Approval number, E-2017-QF-QBB-RES-ACC-0026-000. Sidra IRB MOPH Assurance: IRB-A-Sidra-2019-0020 Sidra IRB MOPH Registration: IRB-Sidra-2020-009 Sidra IRB DHHS Assurance: FWA00022378. Sidra IRB DHHS Registration: IRB00009930. Sidra Medicine, IRB protocol # 1660756, May 30, 2024. Approval category: Expedited.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
* This work is supported by the PPM award (PPM 03-0311-190017) from QNRF to Dr. Ammira Al-Shabeeb Akil.
No changes except the removal of one author, at his request, because of his involvement in another, conflicting project.
Data Availability
Data described in the manuscript, including all relevant raw data, will be freely available to any scientist wishing to use them for non-commercial purposes without breaching participant confidentiality. Requests should be sent directly to the corresponding author.
ABBREVIATIONS
- T2D: Type 2 Diabetes
- PRS: polygenic risk scores
- GWAS: genome-wide association studies
- QGP: Qatar Genome Programme
- ML: machine learning
- AUC: Area under the ROC Curve
- SNPs: single nucleotide polymorphisms
- MENA: Middle East and North Africa
- SVM: Support Vector Machine
- QBB: Qatar Biobank cohort
- HMC: Hamad Medical Corporation
- WGS: Whole genome sequencing
- HDL: high-density lipoprotein cholesterol
- LDL: low-density lipoprotein cholesterol
- TSH: Thyroid stimulating hormone
- TGL: Triglycerides
- INT: Inverse Normal Transformation
- HWE: Hardy-Weinberg equilibrium
- LMM: linear mixed-effect model
- FTO: Fat mass and obesity-associated
- MC4R: Melanocortin 4 Receptor
- TMEM18: Transmembrane Protein 18
- WHR: Waist-Hip Ratio